Heapsort, Quicksort, and Entropy




Numerous web-pages compare heapsort and quicksort.

Most of them say something along the lines of `both take an average time scaling as N log N, but a good implementation of quicksort usually beats heapsort in practice.'

Some take this folklore a bit further, giving quantitative details: `On average the number of comparisons done in HEAPSORT is about twice as much as in QUICKSORT, but HEAPSORT avoids the slight possibility of a catastrophic degradation of performance.'

But few seem to ask the question `why should heapsort use twice as many comparisons?' People spend a lot of effort on trying to `get the best of both worlds', making hybrid sorting algorithms such as `introspective sort', which applies quicksort recursively and occasionally switches to heapsort if the recursion depth gets big.

Quicksort and heapsort have been thoroughly compared by Paul Hsieh. He says `I suspected that heapsort should do better than its poor reputation and I think these results bear that out.' In his tests, the best compiler (for either heapsort or quicksort) produced a heapsort that was about 20% faster than quicksort, in total CPU time.

Total CPU time is a different measure from the number of comparisons made. In those tests, heapsort used an average of 61,000 comparisons and quicksort 22,000 to sort lists of about 3000 objects. See his article for the explanation of the contrast between the comparison-count result and the CPU-time result.
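Comparison counts like these are easy to measure for yourself. The sketch below instruments a textbook heapsort and a textbook quicksort so that every key comparison goes through a counter. It is not Paul Hsieh's test code: the class `CompareCounter`, the functions `heapsort`, `quicksort`, and `compare_counts`, the random-pivot Lomuto partition, and the use of random floats as input are all assumptions made for this example, so the numbers it produces need not match the figures quoted above.

```python
import random


class CompareCounter:
    """Counts how many times two keys are compared."""

    def __init__(self):
        self.count = 0

    def less(self, x, y):
        self.count += 1
        return x < y


def heapsort(items, counter):
    """Return a sorted copy of items using a textbook max-heap heapsort."""
    a = list(items)
    n = len(a)

    def sift_down(root, end):
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            # One comparison to pick the larger child ...
            if child + 1 < end and counter.less(a[child], a[child + 1]):
                child += 1
            # ... and one to decide whether the root must move down.
            if counter.less(a[root], a[child]):
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    for start in range(n // 2 - 1, -1, -1):  # build the heap
        sift_down(start, n)
    for end in range(n - 1, 0, -1):          # repeatedly extract the maximum
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a


def quicksort(items, counter, rng):
    """Return a sorted copy of items using quicksort with a random pivot."""
    a = list(items)
    stack = [(0, len(a))]  # explicit stack of half-open ranges
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2:
            continue
        r = rng.randrange(lo, hi)
        a[r], a[hi - 1] = a[hi - 1], a[r]
        pivot = a[hi - 1]
        i = lo
        for j in range(lo, hi - 1):
            if counter.less(a[j], pivot):
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi - 1] = a[hi - 1], a[i]
        stack.append((lo, i))
        stack.append((i + 1, hi))
    return a


def compare_counts(n, seed=0):
    """Sort n random floats both ways; return (heapsort, quicksort) counts."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    heap_counter, quick_counter = CompareCounter(), CompareCounter()
    assert heapsort(data, heap_counter) == sorted(data)
    assert quicksort(data, quick_counter, rng) == sorted(data)
    return heap_counter.count, quick_counter.count
```

For example, `compare_counts(3000)` returns the two counts for one random list of 3000 items. The absolute numbers depend on the variants chosen here (pivot rule, partition scheme, how the heap is built), so treat them as indicative rather than as a reproduction of the published tests.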

The question I'd like to address, however, is why heapsort uses more comparisons than quicksort. Paul Hsieh says `what struck me is that I could not see really why heapsort is slower than quicksort. And I've not heard or read a credible explanation for this either.'

I think there is a simple explanation, based on the idea of expected information content. To make this readable, let's ramble our way via a classic puzzle.



