Test case usual_cli::multiple_names sorts expected value incorrectly on some file systems about parallel-disk-usage HOT 20 CLOSED

peret commented on June 3, 2024

Test case usual_cli::multiple_names sorts expected value incorrectly on some file systems

from parallel-disk-usage.

Comments (20)

peret commented on June 3, 2024

> test succeeds if run in a vm with ext4 fs. machine it is failing on is using zfs and btrfs.

This was a great hint and I was able to figure it out now. The test code sorts each individual data_tree (nested/, flat/, empty_dir/), but not the root data_tree.

So, why was it working on most systems and only failing on some? That seems to have to do with different file systems reporting different sizes for the SampleWorkspace. I inspected the actual output of the test on a system with an ext4-filesystem with cargo test multiple_names -- --show-output. This is the result:

---- multiple_names stdout ----
ACTUAL:
 40      ┌──empty-dir│██████████                                                               │ 14%
  3      │ ┌──3      │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                         │  1%
126      ├─┴flat     │████████████████████████████████                                         │ 43%
  6      │   ┌──1    │██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░                                         │  2%
 66      │ ┌─┴0      │█████████████████░░░░░░░░░░░░░░░                                         │ 23%
126      ├─┴nested   │████████████████████████████████                                         │ 43%
292    ┌─┴(total)    │█████████████████████████████████████████████████████████████████████████│100%

EXPECTED:
 40      ┌──empty-dir│██████████                                                               │ 14%
  3      │ ┌──3      │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                         │  1%
126      ├─┴flat     │████████████████████████████████                                         │ 43%
  6      │   ┌──1    │██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░                                         │  2%
 66      │ ┌─┴0      │█████████████████░░░░░░░░░░░░░░░                                         │ 23%
126      ├─┴nested   │████████████████████████████████                                         │ 43%
292    ┌─┴(total)    │█████████████████████████████████████████████████████████████████████████│100%

Note that the two folders nested/ and flat/ are actually reported to have the exact same size! Therefore, they already are in the correct order and no sorting is required. The test succeeds. In comparison, in the failing test output, you can see that these folders have different sizes (due to a different file system), and therefore the EXPECTED value is not sorted correctly.

That also means, you can actually reproduce this failure on any system, by making sure the size of flat/ is larger than nested/ on all systems, e.g. by just adding a single byte:

--- a/tests/_utils.rs
+++ b/tests/_utils.rs
@@ -79,7 +79,7 @@ impl Default for SampleWorkspace {
                 "0" => file!("")
                 "1" => file!("a")
                 "2" => file!("ab")
-                "3" => file!("abc")
+                "3" => file!("abcd")
             }
             "nested" => dir! {
                 "0" => dir! {
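
Why equal sizes hide the problem can be sketched in a few lines. This is only an illustration, not code from the repository: `is_sorted_descending` is a hypothetical helper, and the order in which the root's children are listed is an assumption. The sizes are the ones shown in the two test outputs in this thread.

```rust
/// Returns true when `sizes` is already ordered largest-first,
/// i.e. when a descending sort by size would leave it unchanged.
fn is_sorted_descending(sizes: &[u64]) -> bool {
    sizes.windows(2).all(|pair| pair[0] >= pair[1])
}

fn main() {
    // Root children in an assumed insertion order: nested, flat, empty-dir.
    let ext4_sizes: [u64; 3] = [126, 126, 40]; // sizes from the passing ext4 run
    // nested and flat from the failing run, in the same assumed order.
    let other_sizes: [u64; 2] = [10, 14];

    // With a tie between nested and flat, skipping the root sort goes unnoticed.
    println!("ext4 already sorted: {}", is_sorted_descending(&ext4_sizes));
    // Once flat is larger than nested, the unsorted root becomes visible.
    println!("other fs already sorted: {}", is_sorted_descending(&other_sizes));
}
```

When the check returns true, leaving out the root sort makes no difference, which is why the test passed on ext4.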

I will add a PR with a fix in a minute :)


KSXGitHub commented on June 3, 2024

This is weird. pdu always sorts its output by size (IIRC, I haven't touched this in a long time). What stopped working on Nix CI?


KSXGitHub commented on June 3, 2024

Anyway, I use rayon's into_par_sorted to sort the results. This is a rayon bug. If it has already been fixed, then fixing this issue is as simple as updating the rayon version. If not, we would have to forward this issue to the rayon repo and fall back to a regular sort on aarch64-linux.


commented on June 3, 2024

> What stopped working on Nix CI?

the package was just added to nixpkgs (PR submitted 3 weeks ago) and always failed on the aarch64 linux CI.


KSXGitHub commented on June 3, 2024

~~Can you add a patch to the Nix aarch64 build that replaces rayon's sort with the regular Vec sort, then tell me if it passes?~~


KSXGitHub commented on June 3, 2024

Actually, pdu's rayon is outdated. You should try updating the rayon version first to see if it passes. If it does, I will update rayon and release a new version.


commented on June 3, 2024

if i run cargo test on master my machine fails the same test -- AMD Ryzen 7 5700X 8-Core Processor

just updating the Cargo.lock will fail shell completion tests. just updating the rayon and rayon-core by copy / pasting the ones from the new lock file to the old lock file fails in the same way.


KSXGitHub commented on June 3, 2024

> just updating the Cargo.lock will fail shell completion tests

This is trivial, just execute ./generate-completions.sh.

> just updating the rayon and rayon-core by copy / pasting the ones from the new lock file to the old lock file fails in the same way.

I don't know how cargo actually works, but I suspect that it didn't actually update because it detected wrongly that the lockfile is up-to-date.


commented on June 3, 2024

tried again, same issue. verified new rayon by noting

   Compiling rayon-core v1.12.1
   Compiling rayon v1.8.1

printed. same issue.


commented on June 3, 2024

test succeeds if run in a vm with ext4 fs. machine it is failing on is using zfs and btrfs. not sure if that matters. not sure what the aarch64 nix CI is using as a filesystem.


commented on June 3, 2024

also tried taskset 01 cargo test to just run on one core and test still fails.


KSXGitHub commented on June 3, 2024

It still sorts incorrectly then?

So I reexamined the log viewer (from the link you posted) and saw that in the failing test there's one output called ACTUAL and one called EXPECTED. It's actually the EXPECTED one that is sorted wrong, because it has 42% (nested) under 58% (flat).

The EXPECTED value was generated by this code:

parallel-disk-usage/tests/usual_cli.rs

Lines 632 to 639 in 8e29f89

    let visualizer = Visualizer::<OsStringDisplay, _> {
        data_tree: &data_tree,
        bytes_format: BytesFormat::MetricUnits,
        direction: Direction::BottomUp,
        bar_alignment: BarAlignment::Left,
        column_width_distribution: ColumnWidthDistribution::total(100),
        max_depth: 10.try_into().unwrap(),
    };

In short, it's the test code that is buggy; the main code works fine.


commented on June 3, 2024

yeah, still incorrect -- test fails. i thought that the failure was posted at the top of this message -- guess not. it's the same as the link but pasting it here too. I guess we can just disable the test then for nix.

---- multiple_names stdout ----
ACTUAL:
 6          ┌──1 │███████████████████▒▒▒▒▒▒▒░░░░░░                                             │ 25%
 8        ┌─┴0   │██████████████████████████░░░░░░                                             │ 33%
10      ┌─┴nested│████████████████████████████████                                             │ 42%
 1      │ ┌──1   │███░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                │  4%
 2      │ ├──2   │██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                │  8%
 3      │ ├──3   │██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                │ 13%
14      ├─┴flat  │█████████████████████████████████████████████                                │ 58%
24    ┌─┴(total) │█████████████████████████████████████████████████████████████████████████████│100%

EXPECTED:
 1        ┌──1   │███░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                │  4%
 2        ├──2   │██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                │  8%
 3        ├──3   │██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                │ 13%
14      ┌─┴flat  │█████████████████████████████████████████████                                │ 58%
 6      │   ┌──1 │███████████████████▒▒▒▒▒▒▒░░░░░░                                             │ 25%
 8      │ ┌─┴0   │██████████████████████████░░░░░░                                             │ 33%
10      ├─┴nested│████████████████████████████████                                             │ 42%
24    ┌─┴(total) │█████████████████████████████████████████████████████████████████████████████│100%

thread 'multiple_names' panicked at 'assertion failed: `(left == right)`


KSXGitHub commented on June 3, 2024

The EXPECTED value was sorted by this line:

parallel-disk-usage/tests/usual_cli.rs

Line 620 in 8e29f89

data_tree.par_sort_by(|left, right| left.data().cmp(&right.data()).reverse());

which calls Vec::sort_by recursively:

parallel-disk-usage/src/data_tree/sort.rs

Lines 11 to 17 in 8e29f89

    /// Sort all descendants recursively, in parallel.
    pub fn par_sort_by(&mut self, compare: impl Fn(&Self, &Self) -> Ordering + Copy + Sync) {
        self.children
            .par_iter_mut()
            .for_each(|child| child.par_sort_by(compare));
        self.children.sort_by(compare);
    }

(It turns out I didn't use a rayon method for sorting, only for iterating.)

I wonder what the difference is between the test code and the compiled binary. Could it be a race condition?

Anyway, can you try replacing par_iter_mut with iter_mut to see if it still fails?

parallel-disk-usage/src/data_tree/sort.rs

Line 14 in 8e29f89

.par_iter_mut()
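
As a rough way to reason about the race-condition question, here is a std-only sketch that compares a sequential recursive sort with a threaded one. It is an illustration under assumptions, not the crate's code: `Tree`, `sort_sequential`, `sort_threaded`, `leaf` and `sample` are hypothetical names, the sizes are made up, and `std::thread::scope` stands in for rayon's `par_iter_mut`.

```rust
use std::thread;

#[derive(Debug, Clone, PartialEq)]
struct Tree {
    size: u64,
    children: Vec<Tree>,
}

/// Sequential variant: sort every subtree, then this node's children (largest first).
fn sort_sequential(tree: &mut Tree) {
    tree.children.iter_mut().for_each(sort_sequential);
    tree.children.sort_by(|left, right| right.size.cmp(&left.size));
}

/// Threaded variant: one scoped thread per child. Fine for a toy tree,
/// but a thread pool would be needed for anything large.
fn sort_threaded(tree: &mut Tree) {
    thread::scope(|scope| {
        for child in tree.children.iter_mut() {
            // Each thread receives an exclusive `&mut` to a different child.
            scope.spawn(move || sort_threaded(child));
        }
    });
    // Every spawned thread has joined once `scope` returns.
    tree.children.sort_by(|left, right| right.size.cmp(&left.size));
}

fn leaf(size: u64) -> Tree {
    Tree { size, children: Vec::new() }
}

/// A small made-up tree whose children start in ascending order.
fn sample() -> Tree {
    Tree {
        size: 24,
        children: vec![
            Tree { size: 10, children: vec![leaf(4), leaf(6)] },
            Tree { size: 14, children: vec![leaf(3), leaf(5), leaf(6)] },
        ],
    }
}

fn main() {
    let mut sequential = sample();
    let mut threaded = sample();
    sort_sequential(&mut sequential);
    sort_threaded(&mut threaded);
    println!("same result: {}", sequential == threaded);
}
```

Because each child is borrowed mutably by exactly one worker and the parent's own sort runs only after all workers finish, both variants should produce the same tree. That is consistent with the later finding in this thread that switching to iter_mut did not change the outcome.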


commented on June 3, 2024

modified code as shown by the following diff -- test still fails with the same error. I am not fluent in rust.

--- a/src/data_tree/sort.rs
+++ b/src/data_tree/sort.rs
@@ -1,6 +1,6 @@
 use super::DataTree;
 use crate::size::Size;
-use rayon::prelude::*;
+//use rayon::prelude::*;
 use std::cmp::Ordering;
 
 impl<Name, Data> DataTree<Name, Data>
@@ -11,7 +11,7 @@ where
     /// Sort all descendants recursively, in parallel.
     pub fn par_sort_by(&mut self, compare: impl Fn(&Self, &Self) -> Ordering + Copy + Sync) {
         self.children
-            .par_iter_mut()
+            .iter_mut()
             .for_each(|child| child.par_sort_by(compare));
         self.children.sort_by(compare);
     }


KSXGitHub commented on June 3, 2024

@a-n-n-a-l-e-e Can you restore the code (back to par_iter_mut) then run cargo test --release instead?


commented on June 3, 2024

> @a-n-n-a-l-e-e Can you restore the code (back to par_iter_mut) then run cargo test --release instead?

done -- test still fails.


KSXGitHub commented on June 3, 2024

At this point, I'm out of ideas.

I guess you can make a little patch for your special build that adds #[ignore] above the failing tests and call it a day, since it is the test that gets it wrong anyway.


KSXGitHub commented on June 3, 2024

@peret One thing I don't understand: both the test code and the main code are run on the same SampleWorkspace, so they work on the same filesystem. How is it possible that the same filesystem reports different results?


peret commented on June 3, 2024

@KSXGitHub, it's because test code and main code do slightly different things. The main code builds the entire DataTree first and then sorts that tree. The test code sorts each individual sub_tree first (flat/, nested/, empty_dir/) and then constructs the overall DataTree from those children. It doesn't, however, sort the overall tree again.

At least that's how I read the code and it seems to make sense, to me.
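
The difference between the two code paths can be sketched with a toy tree. This is a minimal illustration assuming made-up sizes in which flat/ is larger than nested/; `Node`, `sort_desc`, `build_then_sort` and `sort_then_build` are hypothetical names, not the crate's API.

```rust
#[derive(Debug, Clone)]
struct Node {
    name: &'static str,
    size: u64,
    children: Vec<Node>,
}

impl Node {
    fn file(name: &'static str, size: u64) -> Node {
        Node { name, size, children: Vec::new() }
    }

    /// A directory whose size is the sum of its children's sizes.
    fn dir(name: &'static str, children: Vec<Node>) -> Node {
        let size: u64 = children.iter().map(|child| child.size).sum();
        Node { name, size, children }
    }

    /// Sort all descendants, then this node's own children, largest first.
    fn sort_desc(&mut self) {
        for child in &mut self.children {
            child.sort_desc();
        }
        self.children.sort_by(|left, right| right.size.cmp(&left.size));
    }

    fn child_names(&self) -> Vec<&'static str> {
        self.children.iter().map(|child| child.name).collect()
    }
}

// Made-up sizes: nested totals 10, flat totals 14.
fn nested_dir() -> Node {
    Node::dir("nested", vec![Node::dir("0", vec![Node::file("1", 10)])])
}

fn flat_dir() -> Node {
    Node::dir("flat", vec![Node::file("1", 3), Node::file("2", 5), Node::file("3", 6)])
}

/// What the main code is described as doing: build the whole tree, then sort it.
fn build_then_sort() -> Node {
    let mut root = Node::dir("(total)", vec![nested_dir(), flat_dir()]);
    root.sort_desc();
    root
}

/// What the test is described as doing: sort each subtree, then assemble the root.
fn sort_then_build() -> Node {
    let (mut nested, mut flat) = (nested_dir(), flat_dir());
    nested.sort_desc();
    flat.sort_desc();
    // The root's own children are never sorted here.
    Node::dir("(total)", vec![nested, flat])
}

fn main() {
    println!("build then sort: {:?}", build_then_sort().child_names());
    println!("sort then build: {:?}", sort_then_build().child_names());
}
```

With these sizes the first path lists flat before nested, while the second keeps the insertion order (nested, flat). Calling `sort_desc` once more on the assembled root makes the two agree, which is the kind of fix described above.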
