glidesort's Issues

Median candidate analysis

Have had this half-finished for a while; I'm writing up what I can even though there's more to investigate. CC @scandum, @Voultapher as I believe ipn's picked up glidesort's candidate approach.

I looked into how the choice of pivot candidates affects median accuracy using a simulation on random piecewise-linear inputs. These seem like they capture one type of order that might be expected in an input, and I don't have any other promising ideas for testing that.

I found that the accuracy depends not only on the way candidates are grouped into sets of 3 (what I initially wanted to test) but also on the positions of those candidates (unexpected, but obvious in retrospect). So I measured both, for an even distribution and for Glidesort's recursive selection, using midpoints of the nine intervals that come out of the recursion. I copied the arithmetic for this by hand; it looks right, but it's possible I made a mistake here. I've also tested various positions not shown here.

Error is measured as cost, assuming sorting each of the two partitions takes time proportional to n log(n). For context, the baseline cost is 8965784, so a relatively high cost difference of 30000 adds 0.33% to total sorting cost due to that step (and it compounds at every partition). The tables below show cost for the best and worst groupings after testing every possibility, as well as some others:

  • 000111222 is the current glidesort scheme, which I find is fairly poor regardless of positions
  • 012012012 transposes it, and seems generally good
  • 012102120 stands out as consistently very good

The arrangement 011201220 that I've used and recommended previously does badly, often worse than 000111222.

Beyond any particular choice of grouping it doesn't seem like Glidesort's positions do well: the "true of 9" row is the true median of those and is worse than most pseudomedians with even spacing. The middle 4 makes Glidesort's positions skewed and prevents perfect performance on 1-piece (totally linear) inputs, but changing it to 3.5 isn't much of an improvement (in fact the true median is worse for 2-piece inputs). So I think it's mainly the clustering that weakens it, although I can't say exactly why the effect is so strong. Something that seems a little better is to use 3 as the midpoint for one or two intervals and 4 for the others.
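For reference, the nine candidate positions used in the second table can be reproduced from the recursion's midpoint arithmetic; this is a Python restatement of the `glide` definition in the BQN source at the end of this issue:

```python
# Glidesort-style recursive candidate positions, mirroring the BQN `glide`
# definition: nine interval midpoints from a 3x3 recursion with outer
# offsets 0/8, 4/8, 7/8.
t = [0 / 8, 4 / 8, 7 / 8]              # outer interval offsets
inner = [(x + 1 / 16) / 8 for x in t]  # midpoints of the inner intervals
positions = sorted(a + b for a in t for b in inner)
print([round(p, 3) for p in positions])
# → [0.008, 0.07, 0.117, 0.508, 0.57, 0.617, 0.883, 0.945, 0.992]
```

These match the approximate positions listed before the second table; note the clustering around 0, the skewed middle cluster around 0.57, and the cluster near 1.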

I also did some theoretical analysis on the median-of-median-of-... idea. I find that 3^k candidates processed with recursive medians have about as much power as a true median of (π/2)*2.25^k pivots. For example the pseudomedian of 243 for a 32768-element list is worth a true median of 91. The probability distributions for the median value found this way with random candidates are equal at the exact midpoint, and the shapes seem similar (if anything the pseudomedian is wider). I can write the math up if you're interested.
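As a small restatement of that claim (my sketch, not the full math): the recursion below is what "recursive medians" means here, and the helper evaluates the claimed equivalence for k = 5:

```python
import math

def pseudomedian(vals):
    """Recursive median of 3**k values: median-of-3 applied at every level."""
    if len(vals) == 1:
        return vals[0]
    third = len(vals) // 3
    meds = [pseudomedian(vals[i * third:(i + 1) * third]) for i in range(3)]
    return sorted(meds)[1]

def equivalent_true_median(k):
    """The issue's estimate: a pseudomedian of 3**k candidates has about the
    power of a true median of (pi/2) * 2.25**k candidates."""
    return (math.pi / 2) * 2.25 ** k

print(round(equivalent_true_median(5)))       # 3**5 = 243 candidates → 91
assert pseudomedian(list(range(27))) == 13    # exact median on sorted input
```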

Even spacing (0.5/9, ..., 8.5/9):

| Partition | Diagram | 1 piece | 2 pieces | 3 pieces | 7 pieces |
|-----------|-------------|------:|------:|------:|------:|
| True of 3 | ~147 | 0 | 13413 | 25575 | 77986 |
| True of 9 | 012345678 | 0 | 1557 | 2940 | 8960 |
| 012102120 | 048/136/257 | 0 | 1557 | 3174 | 16728 |
| 012012012 | 036/147/258 | 0 | 1557 | 4276 | 18077 |
| 000111222 | 012/345/678 | 0 | 13413 | 20636 | 28802 |
| 001112022 | 016/234/578 | 35848 | 31942 | 29785 | 30945 |

Glidesort arrangement (approx 0.008, 0.07, 0.117, 0.508, 0.57, 0.617, 0.883, 0.945, 0.992):

| Partition | Diagram | 1 piece | 2 pieces | 3 pieces | 7 pieces |
|-----------|-------------|------:|------:|------:|------:|
| True of 3 | ~147 | 0 | 13413 | 25575 | 77986 |
| True of 9 | 012345678 | 14184 | 69439 | 77943 | 74393 |
| 001122102 | 017/236/458 | 184 | 26029 | 43807 | 74776 |
| 012102120 | 048/136/257 | 14184 | 69439 | 79653 | 81007 |
| 012012012 | 036/147/258 | 14184 | 69439 | 84625 | 85439 |
| 000111222 | 012/345/678 | 14184 | 79120 | 97432 | 98255 |
| 011120220 | 058/123/467 | 39866 | 141889 | 139266 | 99083 |

The source for this is below; run it with CBQN. I can translate it to some other language on request. The inputs are created with •rand, which changes between runs, but the results above don't change significantly.

# All partitions of 9 candidates into 3+3+3
part ← (⊐⊸≡∧(3⥊3)≡/⁼)¨⊸/⥊↕3⌊1+↕9
pfmt ← > ('0'⊸+ ⋈ ·∾⟜"/"⊸∾´ '0'+⊔)¨ part  # Display
# Hard-coded partitions of interest
pint ← "000111222"‿"012012012"‿"012102120" #‿"011201220"

# Make random piecewise-linear functions
GetFn ← {
  R ← •rand.Range
  p←0∾1∾˜∧(𝕩-2)R 0 ⋄ q←𝕩 R 0  # x endpoints, y endpoints
  m←q÷○(«⊸-)p ⋄ y←q-m×p       # Slope, y intercept
  {(⊏⟜y+𝕩×⊏⟜m)(1↓p)⍋𝕩}
}
dist ← GetFn¨ 1e3/≍2‿3‿4‿8    # 1e3 sets with each of 2, 3, 4, 8 vertices
Sample ← dist {𝕎𝕩}¨ <         # Sample values given positions

# Candidate sampling
Mid ← {(0.5+↕𝕩)÷𝕩}                  # Midpoints of equal-sized intervals, 𝕩 total
glide ← {t←0‿4‿7÷8 ⋄ ⥊t+⌜(t+÷16)÷8} # Midpoints of glidesort candidate intervals

# Scoring: cost increase relative to true median on 1e3 elements
list ← Sample Mid l←1e3
ScoreAll ← {+˝˘ (2×{𝕩×2⋆⁼𝕩}l÷2) -˜ +○{𝕩×2⋆⁼𝕩}⟜(l⊸-) list +´∘≤¨⎉∞‿¯1 𝕩}
MakeTable ← {
  cand ← Sample 𝕩
  med ← part (1⊑∧){𝔽𝔽¨}∘⊔⌜ cand                # Pseudomedians
  score ← +˝˘ scoremat ← ScoreAll med
  t39 ← > (⌊2÷˜3‿9) ⊑⟜∧¨¨ ⟨Sample Mid 3, cand⟩ # True medians
  ∾⟨
    ["True of 3"‿"147","True of 9"‿('0'+↕9)] ∾˘ ⌊ ScoreAll t39
    ((⌽⊸∨0=↕∘≠)∨pint∊˜⊏˘)⊸/ score ⍋⊸⊏ (⌊scoremat) ∾˘˜ pfmt
  ⟩
}
•Show∘MakeTable¨ ⟨Mid 9, glide⟩
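Pending a proper translation, here's a rough Python sketch of a single trial of the simulation (a loose reading of the BQN above — the piecewise-linear generator and the scoring model are my interpretation, and `pseudomedian9` takes a grouping string like those in the tables):

```python
import math
import random

rng = random.Random(0)

def xlogx(k):
    return k * math.log2(k) if k > 0 else 0.0

def make_piecewise_linear(vertices):
    """Random piecewise-linear function on [0, 1) with `vertices` y-values."""
    xs = [0.0] + sorted(rng.random() for _ in range(vertices - 2)) + [1.0]
    ys = [rng.random() for _ in range(vertices)]
    def f(x):
        i = max(j for j in range(vertices - 1) if xs[j] <= x)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])
    return f

def pseudomedian9(vals, grouping):
    """Median of three medians-of-3; grouping[i] names candidate i's group."""
    groups = [[], [], []]
    for g, v in zip(grouping, vals):
        groups[int(g)].append(v)
    return sorted(sorted(grp)[1] for grp in groups)[1]

n = 1000
f = make_piecewise_linear(4)                        # 4 vertices = 3 pieces
data = [f((i + 0.5) / n) for i in range(n)]
candidates = [f((i + 0.5) / 9) for i in range(9)]   # even candidate spacing
pivot = pseudomedian9(candidates, "012012012")
k = sum(v <= pivot for v in data)                   # size of left partition
extra = xlogx(k) + xlogx(n - k) - 2 * xlogx(n / 2)  # cost vs balanced split
```

The real script averages this `extra` over 1e3 random functions for each vertex count and each of the 280 groupings; this sketch only shows the shape of one measurement.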

parallel version

Do you plan a parallel version using Rayon? That would be cool/fast.

Host visualizations on Github

This is just a hack to host visualization videos on Github.

glidesort_merge_example.mp4
glidesort_adaptiveness_example.mp4

Quadratic-time run scanning

The minimum run length commit seems to introduce quadratic behavior for runs somewhat shorter than sqrt(n / 2), because the run is repeatedly followed and discarded. If so, this would cause worst-case performance of O(n^(3/2)) by multiplying O(n) time to get past each run by O(sqrt(n)) runs that fit in a length-n array. I took the following timings on a length 1e8 array to confirm that this has a practical impact; the input data is just 0, 1, ... r-1 repeated, for run length r. sqrt(1e8 / 2) is about 7071; strangely, performance improves gradually from about 6000 to that number instead of sharply as I'd expected. The "% create" here is a loose estimate from perf top of fraction of time spent in LogicalRun<B,T>::create, and "Time create" is that multiplied by total time.

| Run length | Time (s) | Fraction in create | Time in create (s) |
|-----------:|---------:|-------------------:|-------------------:|
| 500 | 2.44 | 0.20 | 0.49 |
| 1000 | 2.96 | 0.25 | 0.74 |
| 2000 | 3.74 | 0.38 | 1.42 |
| 4000 | 4.48 | 0.45 | 2.02 |
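For reproduction, the test input and the threshold the argument hinges on are easy to generate (a sketch; the quadratic model is the hypothesis above, not something verified against the implementation):

```python
import math

def make_input(n, r):
    """Sawtooth test data: 0, 1, ..., r-1 repeated out to length n."""
    return list(range(r)) * (n // r)

n = 100_000_000
threshold = math.sqrt(n / 2)   # runs shorter than this are hypothesized
print(round(threshold))        # to be repeatedly re-scanned → 7071

print(make_input(10, 5))       # → [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
```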

Add license

Add some kind of license, so that modifying and using the code becomes legal.

Potentially replace standard library sort?

If this sorting algorithm is stable and strictly better performance-wise than the standard library sort, it seems like the standard library implementation could be replaced entirely (as happened previously with hashbrown and crossbeam-channel)?

Provide a way to easily reproduce benchmarks

Hey, glidesort is very impressive.

Could you provide an easy way to run the benchmarks on our own machines and reproduce these results? (I'm willing to help if necessary.)

(EDIT: by chance, I have a 4800 MHz dual-channel system available and a 2666 MHz single-channel one; I'm curious to compare bench results on both.)

Frustrations

Well, I'll be blunt. I find the glidesort codebase hard to work with even by the standards of sorting research. I feel that in your desire to give a polished view of the algorithm to the world, you've actually made it harder for people who want to dig into the details. Given that pdqsort (Rust version included) was basically my entry point into high-performance sorting, seeing the next step like this is tough. Worse, from what I understand of the algorithm, it doesn't seem to be much more complicated than pdqsort, given that it throws out a lot of things like pivot scrambling and heapsort. I'll describe my difficulties as best I can, to give you the most information in case you'd like to help.

I believe a real commit history would be very useful, and find the decision to publish as a single commit surprising for software presented at an open source conference. You apparently had enough of an implementation to benchmark for the talk in May, without full panic safety infrastructure. Because I can't access any version like this, I have no way to test your claim that panic safety accounts for 10-15% of time taken by the algorithm. Could it be different across processors? I have no insight into how tuning decisions were made, which is often available in the history too.

You've shared benchmarks from an ARM machine that's presumably your M1, and an unspecified AMD processor, as a png. Could you include, or link to, the processor specs and raw data in this repository?

Generally it feels that while sorting concepts are well explained, the way they are implemented isn't. As I understand it the way you use Rust isn't typical, so I expect even fluent Rust readers (I'm not one) could use some help. For example gap_guard.rs makes no attempt to explain what "the gap" is. And much of branchless_merge.rs is taken up by implementation of the BranchlessMergeState structure with no explanation of how this structure will be used.

Other structures have no comments at all. I suppose the names are supposed to be self-documenting. Take enum PartitionStrategy<T> in quicksort. LeftWithPivot is meaningless to me. What goes left? And of course there's a pivot, you're partitioning! Eventually I figured out that the un-named parameter is the pivot value to be used. Is left pivoting the variety used for the left side in a bidirectional partition? Because the block comment at the top never connects to any specific part of the code, I can't tell. Are LeftIfNewPivotEquals and LeftIfNewPivotEqualsCopy identical other than the way they store the pivot? The definition of partition_left certainly suggests this, but later less_strategy and geq_strategy recognize only the Copy version.

Where does the recursive median-based strategy for pivot selection come from? Is there a reference? To me it seems obviously questionable because if just two of the three systematically-chosen regions based on a, b, and c have lower median values, then you'll get a low pivot. For example, what happens with an array consisting of three up-down patterns?
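To make the concern concrete, here's a toy version of that scenario (hypothetical candidate positions, not glidesort's exact ones): three identical up-down "triangles" give every region the same skewed median-of-3, so the pseudomedian lands near the 33rd percentile instead of the 50th:

```python
def median3(a, b, c):
    return sorted((a, b, c))[1]

def pseudomedian9(data):
    """Toy recursive median: three evenly spaced candidates from each third
    of the array (hypothetical positions, not glidesort's)."""
    n = len(data)
    meds = []
    for t in range(3):
        lo, seg = t * n // 3, n // 3
        meds.append(median3(data[lo + seg // 6],
                            data[lo + seg // 2],
                            data[lo + 5 * seg // 6]))
    return median3(*meds)

n = 900
triangle = [min(i, n // 3 - i) for i in range(n // 3)]  # rises then falls
data = triangle * 3                                     # three up-down patterns
pivot = pseudomedian9(data)
rank = sum(v < pivot for v in data) / n
print(pivot, rank)  # → 50 0.33: a badly skewed pivot
```

Each third contributes two low candidates and one high one, so every region's median-of-3 is low, and so is the final pivot.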

What is tracking? I couldn't even google cfg(feature = "tracking").
