Talk:Quicksort: Difference between revisions
m →Wrongly placed reference? & text attribution: {{unsigned}} |
|||
Line 303: | Line 303: | ||
The reference text again: Yaroslavskiy, Vladimir (2009). "Dual-Pivot Quicksort" (PDF). Archived from the original (PDF) on 2 October 2015. |
The reference text again: Yaroslavskiy, Vladimir (2009). "Dual-Pivot Quicksort" (PDF). Archived from the original (PDF) on 2 October 2015. |
||
<!-- Template:Unsigned --><small class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Geke|Geke]] ([[User talk:Geke#top|talk]] • [[Special:Contributions/Geke|contribs]]) 25 August 2020 (UTC)</small> |
Revision as of 23:07, 12 September 2020
This is the talk page for discussing improvements to the Quicksort article. This is not a forum for general discussion of the article's subject. |
Article policies
|
Find sources: Google (books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL |
Archives: 1, 2, 3Auto-archiving period: 4 months |
This article has not yet been rated on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||||||||||||
Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.
Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.
|
| |||
This page has archives. Sections older than 120 days may be automatically archived by Lowercase sigmabot III when more than 5 sections are present. |
This is the talk page for discussing improvements to the Quicksort article. This is not a forum for general discussion of the article's subject. |
Article policies
|
Find sources: Google (books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL |
Archives: 1, 2, 3Auto-archiving period: 4 months |
Blog interview
The new blog post that purported holds an interview with Tony Hoare isn't needed — most of the history contained in it has already been divulged by Hoare in other places. I'll try to find out where so we can get rid of the blog. QVVERTYVS (hm?)
Addition of Content on Quicksort for Linked List
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Information to be added to the end of the Variants section:
Linked List Quicksort
The Quicksort is typically taught using a bidirectional partition method appropriate for in-place exchange of array elements[1]. However, the Quicksort algorithm allows partitioning to be performed in any manner[2]. In this variant, the partitioning method is performed by a unidirectional iteration in which nodes with a lesser key than the pivot node's key are unlinked from their current list location and then inserted into the sublist before the pivot node[3], which can be implemented in a way that makes the Quicksort stable for linked lists[4]. This variant was compared empirically with linked list merge sort, and it performed more slowly due to non-optimal list dividing, which resulted in more key comparisons[3].
- A bottom up linked list merge sort is faster still than a top down merge sort, and doesn't involve any halving of lists. A bottom up merge sort considers a list of n nodes as n sub-lists of size 1, merging one node at a time into a small (26 to 32) array of lists where array_of_list[i] has 2^i nodes. Merge_sort#Bottom-up_implementation_using_lists Rcgldr (talk) 21:17, 1 January 2020 (UTC)
- Hi @Rcgldr:, I agree that what you say is true here, but it seems like it is most relevant where it already is. In this quicksort page, existence of linked list quicksort is missing along with the point that it is empirically slower than merge sort. That linked list quicksort might therefore be even slower than a bottom up merge sort is likely true but draws a conclusion that crosses the line into originality (no direct source to back up the direct comparison). So, it seemed better to reword to talk about the quicksort's suboptimal dividing (which increased key comparisons) and sidestep talking specifically about the exact operation of the merge sort that was actually used. The quicksort would need more comparisons in any case. Thanks for commenting, JohnBoyerPhd (talk) 00:30, 4 January 2020 (UTC)
- @JohnBoyerPhd: I'll have to see if I can find web site sources for this. I had correspondence about this issue with P._J._Plauger - one of the authors of the Dinkumware template library used by Microsoft for Visual Studio and other compiler makers, noted that the scanning of lists to split them added significant overhead to merge sort that bottom up merge sort avoids. This is probably mentioned in one of his published works. I've also seen this mentioned in some old textbooks (an analysis showed something like a 40% overhead), but I don't recall the name of the textbook names, so finding a reference there would be difficult. Rcgldr (talk) 16:27, 9 January 2020 (UTC)
Explanation of issue:
The quicksort is most typically taught as a partition _exchange_ sort (see Knuth and many others). The exchange design was evidently due to Quicksort being developed originally for an array, and this design precluded its use on singly linked lists (which have no backward pointer). Via web search, one can easily find implementations of linked list quick sort that have appeared on many knowledge sharing platforms in the last 10 years (For example, see [cboard post]). As the content in this edit request shows, linked list quicksort has been available since the 1990s. Since it is easily found on other knowledge sharing platforms, the proposed content fills an omission from Wikipedia as a knowledge sharing platform.
References supporting change:
- [1] Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
- [2] Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
- [3] Boyer, John M. (May 1998). "Sorting and Searching Linked Lists in Java". Dr. Dobb's Journal (285): 126–129, 137–138. Retrieved November 23, 2019.
- [4] Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
References
- ^ a b Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
- ^ a b Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
- ^ a b c Boyer, John M. (May 1998). "Sorting and Searching Linked Lists in Java". Dr. Dobb's Journal (285): 126–129, 137–138. Retrieved November 23, 2019.
- ^ a b Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
Reply 24-NOV-2019
- The policies against using only self-published sources as references for these particular claims, as well as those advising against using original research, prevent these changes from being made. The COI editor is urged to provide references from reliable, independent, WP:SECONDARY sources in order to verify the claims proposed here.
Regards, Spintendo 23:56, 24 November 2019 (UTC)
Response 15-DEC-2019
The matter has been discussed extensively on the @Spintendo:'s talk page. The discussion has resulted in the above substantially revised proposed text. Based on the discussion, the sources were not self-published sources. The one article authored (not published) by this author in the revised proposed content is permitted by Wikipedia policy here as it is a single citation published by a reliable source. The revised proposed content also does not violate the no original research policy because the revised proposed content follows the policy requirements for citing a primary source. Specifically, the revised proposed text does not "analyze, evaluate, interpret, or synthesize material found in a primary source," it changes less than 1% of the Quicksort article and so does not "base an entire article on primary sources," and the revised proposed content makes only two "straightforward, descriptive statements of facts that can be verified by any educated person with access to the primary source but without further, specialized knowledge" and so does not "add unsourced material from [my] personal experience, because that would make Wikipedia a primary source of that material." It's clear that the no original research policy exists to prevent Wikipedia from becoming a primary source, not to prevent it from citing primary sources. Therefore, please reconsider this revised edit request. JohnBoyerPhd (talk) 22:35, 15 December 2019 (UTC)
- I still don't see any independent sourcing for the variant. Without that, this seems like undue weight. - MrOllie (talk) 23:07, 15 December 2019 (UTC)
- There is an error in grammar in the newer request: It states
the Quicksort algorithm allows partitioning to be performed in any manner.
(referenced by the Hoare source). The very next sentence begins withIn this variant, the partitioning method is performed by a unidirectional iteration..
(referenced by the Boyer source). My question is, which of Hoare's "any manners" does this variant in the second sentence refer to? My guess is it's the variant that Boyer has proposed in their research. If that's the case, then considering the Hoare source was written 38 years before Boyer's was published, how does that source verify Boyer's claims? Please advise. I would also add that the changes you've made to your request did not follow the guidelines for changing previous posts at WP:REDACTED, thus there is no simple way to see what you've changed as opposed to what you previously submitted, which is unfortunate. Would you please repost the references that you removed (and place them below this post). Spintendo 23:51, 15 December 2019 (UTC)
- There is an error in grammar in the newer request: It states
- Hi @Spintendo:, If you click through to Hoare's paper to see what it says, you see that Algorithm 64 is really a short piece of text that presents only the algorithm Partition, Quicksort, Quicksort. It is and is intended to be the most broadly applicable algorithm that covers any method that partitions first in any way then recursively sorts the two partitions. This is exactly how broadly stated algorithms can apply to other papers written later. Hoare's algorithm applies to all other papers cited in the Quicksort page as well as the DDJ source. In the variant that is discussed in the proposed subsection, the any manner is given in the cited source as a unidirectional iteration, which is what the sentence says. Regarding the WP:REDACTED, noted for the future, as nothing on the edit request page describes this process. In the meantime, the design of Wikipedia platform is such that the source appear on the [history page]. Thanks, JohnBoyerPhd (talk) 06:03, 16 December 2019 (UTC)
- Hi @MrOllie:, The undue weight characterization is not at all applicable to this type of content in general, nor this specific content. This content proposes 2 sentences that provide one point of view among the plethora appearing in the rest of the 5000+ word article. Thanks, JohnBoyerPhd (talk) 06:03, 16 December 2019 (UTC)
- Hi @Spintendo:, I'm just checking in to see if you have been able to decide if you want still want to decline the revised comment for a revised reason, or if you see that the compromises made in revising the content along with the answers to your questions above and the discussion of the applicable policies are now sufficient for you to accept the content? Thank you, JohnBoyerPhd (talk) 03:34, 20 December 2019 (UTC)
- The intro to the Boyer paper states
"This paper presents a novel approach for a generalized comparison by transforming the problem into comparing executed code size of a benchmark imperative algorithm with a partially declarative variant of the same algorithm."
The key words here are a novel approach for a generalized comparison.[a] Please indicate what it is, about this novel approach, which negates its novelty simply because the generalized comparison it is performing is upon a broadly-stated algorithm. Regards, Spintendo 21:37, 30 December 2019 (UTC)
- The intro to the Boyer paper states
Notes
- ^ The main problem here are the words novel approach, simply because Wikipedia is not generally predisposed to publishing novel research without secondary sources verifying it, per WP:NOR. That includes any analysis or synthesis of published materials that serve to reach or imply a conclusion not stated by those sources. You must be able to cite reliable, published sources that directly support the material being presented — in this case that material would be the "novel approach for a generalized comparison" as stated in the paper. The question is thus: Do you consider your paper to be a new analysis or synthesis (in your words, a novel approach) of the Hoare source that serves to advance a position not clearly advanced by Hoare? If your answer is yes, then your paper is original research. If your answer is no, then you must provide other sources that also describe the same novel approach you discuss in your paper.
- Hi @Spintendo:, The paper that you are quoting from is not cited in and not a part of the linked list quicksort content proposal that we are discussing. At the very beginning of this, you merged two separate and distinct content requests that I made (variants of quicksort: Linked List Quicksort and Partially Declarative Quicksort). Since then, you've expressed such an amount of concern about the second of those requests, based on the article you continue to object to, that I removed the whole content and citation related to that second request (partially declarative quicksort). Please see the actual revised content request above. The article you object to just simply is not there. The only thing that we are discussing is my first content addition request about Linked List Quicksort-- not partially declarative quicksort and so not the article you have been objecting to. Can you please review the content currently proposed? Also, to reflect the current state of the discussion, please remove your prior objection for the cause of SPS since we got past that on your talk page a long time ago. Thank you, JohnBoyerPhd (talk) 22:25, 2 January 2020 (UTC)
- Hi @Spintendo:, I've now added a secondary reference to substantiate the variant, per your request. The secondary source support comes from none other than Robert Sedgewick, so hopefully that will be satisfactory to both you and @MrOllie:. Thanks, JohnBoyerPhd (talk) 00:30, 4 January 2020 (UTC)
@JohnBoyerPhd: First of all, the changes that you've made on this talk page to your older posts were not made according to the guidelines at WP:REDACTED.[a] Those guidelines state that you must use underlined font for additions and strikeout font for subtractions. Nothing is supposed to be deleted. The fact that you've gone back and deleted content that we've been talking about here is not at all helpful. How are editors supposed to evaluate your "newer" request when you've deleted the older request, so much so that there is now no longer a starting point to compare the newer material to. Editors are no longer able to easily see how and in what way a problematic reference that you removed applies to the evolution of your request, and your recommendation that editors simply "check the history" is hardly constructive to the overall request in general.[b] In order to proceed with full transparency, I would ask that you restore the deleted content in a post below, and that you then explain how it differs from what you're requesting now. I think you'll find that editors are better able to follow your explanations when you don't delete content they've already read. Regards, Spintendo 09:41, 5 January 2020 (UTC)
Notes
- ^ That guideline states that you are allowed to make as many changes as you want until another editor posts after you, then older posts of yours must use underlined and strikeout fonts for any changes made.
- ^ I would add that the requirements at WP:REDACTED apply to all talk pages — not just in edit requests. It's only on a user's own talk pages where these requirements are lax, and even then, going against them is not considered to be editing in good faith.
- Hi @Spintendo:, I did not delete the so-called problematic reference from my request. You merged two different requests I made, a fact that I have mentioned to you many times. I removed the text of the second request to undo the content merge that you performed. The linked list quicksort variant never cited that reference.
- Furthermore, there is no lack of full transparency. It's four sentences. And, it's very much aligned with what I proposed during our discussion on your talk page a long time ago. This was a discussion that I assumed was following a cooperative Wikipedia:Negotiation process, so yes the content is different now-- that's what compromise looks like.
- And... it's four sentences now. So, no, I will not restore the deleted content because 1) it would restore your damaging merge of my two requests, 2) it is easy enough to see in the history anyway, and 3) the revised content represents sufficient compromise that it is much easier to simply consider it as if it were a new request... to review four sentences.
- Since it is the same as withdrawing my request and just submitting a new one with the four sentences, I'd suggest it be handled as if that happened. Because I'm not really going to be jumping through any more hoops. This is simple undergraduate material at or below the level of the rest of the content in the Quicksort article-- not something "scientists" might dispute, as you asserted during our discussion on your talk page and not undue emphasis on some flat-earth-like fringe viewpoint as @MrOllie: seems to believe. It's important for a content editor to have command of the content in order to appropriately apply policies to the content rather than just partial pattern matching on policies out of the context of the content. Frankly, this edit request isn't even necessary anymore. Even the incorrectly labelled "problematic" source is not problematic. Wikipedia policy supports the addition without an edit request of neutrally worded content with a single self-cite from a reliable primary source if the content added to the article performs no analysis or synthesis and just states facts reported in the reliable primary source. Despite your insistence, it does not add original research to a wikipedia article to add a fact to the article that was original research in the primary source. The source and the article are two different things. I've already pointed this out, including occurrences in the Quicksort page itself, but ultimately I grew tired of explaining the policy language, so I simply withdrew the second request containing that so-called problematic reference. You know, this page asks those who can improve the content to do so, but it's too difficult to do so in the face of having more important things to do than continue arguing about this level of content, so the astringency of this process needs to be substantially reduced. Please process my request as it is now (four sentences) or simply indicate that you refuse. JohnBoyerPhd (talk) 04:19, 7 January 2020 (UTC)
- It's not the same as submitting a new request, because you've left behind a bunch of discussion that appears to be about your new request, but is really about an old request. It should be obvious by now what problems this leads to. Please just start a new section with your new request. Also, I don't believe this is 'some flat-earth-like fringe viewpoint', I think it is probably correct, but also something that hasn't been impactful enough on the field that it should be discussed in the article. - MrOllie (talk) 12:11, 7 January 2020 (UTC)
- You can see from this diff that the only items I changed were in how your request was presented. I didn't change any of the content:
- It's not the same as submitting a new request, because you've left behind a bunch of discussion that appears to be about your new request, but is really about an old request. It should be obvious by now what problems this leads to. Please just start a new section with your new request. Also, I don't believe this is 'some flat-earth-like fringe viewpoint', I think it is probably correct, but also something that hasn't been impactful enough on the field that it should be discussed in the article. - MrOllie (talk) 12:11, 7 January 2020 (UTC)
Reformatted request
| |||||
---|---|---|---|---|---|
|
- The changes to the content were of your own making, they were not mine, and I respectfully ask that you retract your accusation that I altered the substance of your request. In contrast to the formatting changes I made, your changes were far more extensive, they were not redacted, and they made the process of understanding what you wanted here much more difficult.
- @Spintendo: Thanks for clearly showing that the two requests became one, which was not my intent. That is the damage I was talking about because you complained repeatedly about a source in proposal 2 that you merged into the request for proposal 1 but which didn't have to do with proposal 1. In order to try to unblock your rejection of the edit request I removed the proposal 2 content that you merged into the edit request. Really need to stop arguing about this, so I'll do as you and MrOllie suggest and do it as a separate request. Still going to need you to revise and update your understanding of the Wikipedia policy on ability to use a primary source to support a content addition, as is done throughout wikipedia, including in the page we are talking about. JohnBoyerPhd (talk) 08:52, 9 January 2020 (UTC)
- Here in Wikipedia the WP:BURDEN is on the person wishing to add content to an article to persuade others, first by providing the necessary sources, and then by providing a coherent rationale. If you find others unwilling to join you after making that argument then it's time to reexamine your reasoning. I agree with Ollie that you should start all over from square one with a new approach to your request. Regards, Spintendo 04:55, 8 January 2020 (UTC)
Hi @MrOllie:, OK, sure, let’s try it as a new request to review the four sentences. If you do consider the new request, I’d ask that you use the Wikipedia policy on weighing impact based on the content being supported by a reliable source, rather than using your own evaluative policy. The problem is that the stating that the material is “probably correct” (rather than knowing that it is) demonstrates movement toward the Comprehension level of Bloom’s taxonomy, which is 4 levels below the evaluative level required to make that call on the impact. Therefore, please instead decide on the sufficiency of impact based on the facts that 1) it is only a few sentences added as a variant, 2) that it is supported by a citation of a publication in a reliable source (DDJ, software developer journal for 38 years with 160K circulation), and 3) that the explanation provided with the edit request includes that this variant appears in programmer-focused knowledge sharing platforms. If I were anyone but the author of #2, I would expect to add this material without it being reverted by anyone. Since my adding the content while also being the author of #2 is supported by Wikipedia policy, there should be no obstacle to what would otherwise be permissible content. Also, I don’t really understand why a simple web search wasn’t used to assuage your concerns. While getting no results doesn’t prove a negative, a plethora of results can provide the extra support you desire. In this case, rather than rejecting so quickly, one could just search for something reasonable like “how to do a linked list quicksort in java”. Many knowledge sharing platforms for programmers come up in the results, including stack overflow, geeks for geeks, tutorial points dev, code speedy, and compute patterns. Because I really don’t want to argue this anymore, I’ll put links to half a dozen or so of these into the explanation section of the new request (despite not being required by Wikipedia sourcing policies). JohnBoyerPhd (talk) 08:52, 9 January 2020 (UTC)
June 2020
Hello @MrOllie:, Although it's been more than 120 days for this inactive discussion to be deleted from this page, the automated archiving agent has either malfunctioned in not deleting it or is maldesigned to keep dormant discussions to have some minimum number of discussions around (which would give a false appearance of the number of active discussions). You have also informed me of your preference that I not manually delete this inactive discussion. Although deleted conversations are still archived in the page's version history, if you require this discussion to be kept around, then it makes less sense to me to start a new request and discussion on the same topic as this one, rather than just just reactivating and finishing the discussion here. It seems particularly low cognitive load considering you're actually the one who last expressed a dissenting opinion, so here goes:
The content we're discussing has been reduced, by the discussion, to 4 sentences of about 60 words within an article of about 10,000 words, so your classification of it as "undue emphasis" seems to mean you think it is "unimportant" (how else could less than 1% content be undue emphasis?). Can you please say upon what do you base your opinion? Would you be able to change your opinion and remove your objection if a web search reveals that this is important enough for others to have written about it on, say, half a dozen other knowledge sharing web sites? Here are half a dozen that easily come up using a web search for linked list quicksort: stackoverflow, geeksforgeeks, tutorialcup, studytonight, computepatterns, and codespeedy
These are just the ones that show up at the top of a web search results list. So, if the topic is discussed by Sedgwick, accepted by DDJ editors, and widely written about by others in knowledge sharing sites, can you say whether or not you still assert that this content really fits the intent of "undue emphasis" when it is, as I said, well less than 1% content in the article? If not, great; if so, why?
Thank you, JohnBoyerPhd
- As I mentioned on your user talk page, there is indeed a minimum count of threads kept on this page by the archiving bot. This is working as intended. It is configured to archive discussions no earlier than 120 days after last activity, this is not some sort of guarantee that discussions be archived at exactly 120 days. They'll go when we actually need the space.
- I looked at the Sedgewick book, and while he does give an overview of Quicksort, he doesn't cite you at all. Wikipedia is not a howto site, so the fact that Q&A sites, tutorials, and other howto style sites write about something doesn't mean we should, any more than the existence of tutorials on fixing drywall means that our article on Drywall should include tips for mixing drywall compound. I also note that those links don't cite you either. We continue to need independent sources and they just aren't here. - MrOllie (talk) 04:04, 11 June 2020 (UTC)
- Hello @MrOllie:, I mentioned those sites and the Sedgwick book because they talk about this topic, not me. The problem is that you justified your dissenting opinion by citing the Wikipedia policy on undue weight. If you read that policy, you see it is saying to forbid a minority viewpoint, in other words a topic that is unimportant or not supported by a reliable source. I had already established that DDJ is a reliable source. If you read the reliable sources text linked by your "undue weight" citation, you will see exact language supporting the use of a non-academic magazine publication such as the DDJ source I used. Therefore, I concluded that you must be saying this is an unimportant topic. The sources I gave were meant to refute that, only.
- So that misunderstanding is cleared up, as it's clear now that your actual concern doesn't fit the definition of "undue weight" but rather is about the use of a reliable source _that is_ a self citation. And in that case, the problem remains that, despite your concerns, it is expressly permitted by the appropriate Wikipedia policies that I already provided in explanations above.
- We can test this hypothesis while being in keeping with the intent of a talk page (to refine content through discussion). Suppose I were to propose the following revised content (in Variants section):
- The Quicksort is typically taught using a bidirectional partition method appropriate for in-place exchange of array elements[1]. However, the Quicksort algorithm allows partitioning to be performed in any manner[2]. This variant iterates through a singly linked list and can be implemented in a way that makes the Quicksort stable for linked lists[3].
- This revised content would introduce linked list quicksort but simply removes information that was supported by my DDJ article. The question is, would you still claim "undue weight" on this version of the content, or would you approve this revised content?
- Thank you, JohnBoyerPhd
References
- ^ Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
- ^ Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
- ^ Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
- These remaining cites talk about quicksort in general, but not this variant you'd like to add. We need secondary sources specifically about the variant. FYI, I took a look at the closest algorithms book to my desk, which happened to be Cormen, Leiserson & Rivest's Introduction to Algorithms, and they also don't mention this. - MrOllie (talk) 11:36, 12 June 2020 (UTC)
- It's not clear to me what the issues are here. Lomuto partition scheme is unidirectional. I found a 1993 university paper on quick sort for linked list qucksort linked list pdf. In that paper, there is a reference to "Weg82" which specifically mentions linked lists, but I don't know if that reference uses quick sort. That paper also describes a top down merge sort for linked lists, but Merge_sort#Bottom-up_implementation_using_lists is faster. It's also not clear to me if a quick sort partition without swaps but instead partition by splitting a list into 2 or 3 lists would be considered to be a quick sort or just quick sort like. Lomuto partition scheme using the first element as a pivot could be implemented for a linked list using swaps. On a side note, Hoare's original implementation of quicksort was done on a computer without a stack, and used what he called a "nest", a stand-alone stack, to store indexes, for an iterative implementation of quick sort using a stand alone stack. Rcgldr (talk) 20:27, 12 June 2020 (UTC)
- Hello @MrOllie:, I appreciate you looked in a book to support you viewpoint, but a counterexample is used to disprove universality, not to prove something is unimportant. Despite having 'Introduction' in its name, the CLR book's audience is not first semester where discussion of this topic would more naturally belong.
- The other citations used in the content do say many things about quicksort but they do also provide ample support of the existence of a quicksort for linked lists, including quoting those exact words in this tweaked version:
- The Quicksort is typically taught using a bidirectional partition method appropriate for in-place exchange of array elements[1]. However, the Quicksort algorithm allows partitioning to be performed in any manner[2]. Forward iteration can be used to implement "a stable quicksort for linked lists" [3].
- How about now?
- For what it's worth, I also agree with @Rcgldr:'s assertion that it's not really clear what the issue with this latest content. By removing the DDJ article, I also removed all COI, so at this point I'm really just trying to see if anything would be found agreeable or if disagreement is just the order of the day. The top of the talk page invites users to make improvements if they can, so it'd be useful if that were actually possible, much less if the COI submission policies are understood well enough to use them here. To answer Rcgldr's question about whether a unidirectional quicksort would still be a quicksort or just quicksort like, that is the intent of the second sentence. It is generally accepted that Hoare's invention is a generalized algorithm that covers anything with the pattern of [partition, sort, sort]. You can see this from reading the source that this is really all it says, and you can also see it in the direct words of Sedgwick (he didn't say "a stable quicksort like algorithm for linked lists". I also ran across the 1993 paper, and there is of course this from 1990, but works from sources like these were asserted by others to be not sufficiently reliable sources. While I don't happen to agree, it's not germane to the current trajectory of finding out whether there is any content on this topic of linked list quicksort that could be deemed passable.
- Thank you, JohnBoyerPhd
References
- ^ Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
- ^ Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
- ^ Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
- What I meant by "unidirectional" was the scanning to compare elements versus pivot, which is what Lomuto does. "Bi-directional" would be Hoare type scheme, scanning from both ends inwards until the two scans cross. This is independent of how out of order elements are handled, normally by swapping, which could be done with a linked list, or what might be better for a linked list, separating into 3 lists (< pivot, == pivot, > pivot) and using recursion for list != pivot, then concatenating. Another side note, quick sort isn't quite like other divide and conquer algorithms, the reordering of data takes places in each partition step, before a recursion step, and nothing happens on the return chain. Compare this to top down merge sort, where nothing but splitting happens during divide steps, and reordering (merging) takes place on the return chain. Rcgldr (talk) 17:14, 13 June 2020 (UTC)
- @JohnBoyerPhd: You said:
It is generally accepted that Hoare's invention is a generalized algorithm that covers anything with the pattern of [partition, sort, sort]. You can see this from reading the source that this is really all it says (...)
- and AFAIR the Tony Hoare's work, you must have made that up.
- @JohnBoyerPhd: @CiaPan: Tony Hoare's first implementation of quicksort predates Algol 60, and the computer system was probably an Elliott_803 or similar system, one without a native stack. His early papers refer to a "nest", which is a stand alone stack, used by an iterative algorithm, such as this 1962 paper than includes some benchmarks on the older (1956) Elliott 405: quicksort pdf. The introduction of the recursive language Algol 60 made a recursive implementation of quicksort possible, but few computer systems had implemented Algol 60, and even his later papers like the 1962 paper I used as a reference, still refer to using a "nest" with an iterative implementation. That paper also mentions bidirectional scanning from both ends of an array, as well as pushing the parameters (indexes) for the larger partition onto the nest and iterating on the smaller partition to limit "nest" (stack) space complexity to O(log2(n)). Rcgldr (talk) 16:00, 14 June 2020 (UTC)
- I read the PDF available at https://doi.org/10.1093/comjnl/5.1.10 again, just to make sure, and here is what actually is written in a few initial paragraphs of Hoare's description of the Hoare's quicksort:
A problem of sorting a mass of items, occupying consecutive locations in the store of a computer, may be reduced to that of sorting two lesser segments of data, provided that it is known that the keys of each of the items held in locations lower than a certain dividing line are less than the keys of all the items held in locations above this dividing line. (...)
The first step of the partition process is to choose a particular key value which is known to be within the range of the keys of the items in the segment to be sorted. (...)
The items to be sorted are scanned by two pointers: one of them, the lower pointer, starts at the item of the lowest address, and moves upwards in the store, while the other, the upper pointer, starts at the item with the highest address and moves downwards. (...)
- And so on.
- This clearly indicates that the algorithm Hoare described is designed specifically for a compact, continuous block of homogeneous data, stored one after one in consecutive locations in the computer's memory. Which is exactly an array as defined by Fortran, Pascal, C and many other programming languages.
- Of course that can be easily adapted for data scattered accross the storage and indexed with some array of references, or (possibly not that easily) to dynamically linked lists – but that would be an adaptation, a quicksort-like algorithm, as Rcgldr called it, not the Tony Hoare's quicksort itself. --CiaPan (talk) 23:26, 13 June 2020 (UTC)
Hoare's paper specifically mentions bidirectional scanning: "The items to be sorted are scanned by two pointers: one of them, the lower pointer, starts at the item of the lowest address, and moves upwards in the store, while the other, the upper pointer, starts at the item with the highest address and moves downwards ", so how did Lomuto unidirectional scanning with only a lower pointer end up being accepted as a type of quicksort? Rcgldr (talk) 01:54, 15 June 2020 (UTC)
- @Rcgldr: Apparently that quicksort is not a Hoare's quicksort. Possibly it should be specifically mentioned every time, whose version of quicksort is being discussed. And maybe it should be each time defined, whether it is still a quicksort, or a quicksort-like algorithm. :) Just kidding. --CiaPan (talk) 21:10, 15 June 2020 (UTC)
Hello @Rcgldr:, The 1961 CACM paper appearing as my second reference (and already referenced in the wikipedia quicksort page) articulates the quicksort algorithm as the recursive pattern of partition, sort, sort. If you click on that doi link and read the pdf, you'll see that I'm not making up anything. Algorithm 64: Quicksort doesn't include the partitioning scheme as the algorithm is intended to be as broad as possible, as this is what algorithm designers (and inventors) tend to do. The 1962 paper you recall and provided is a deeper articulation that also includes the partitioning scheme. Various algorithm authors, including Cormen et al. that @MrOllie: mentioned, describe Lomuto's work as another partitioning method within the broad scope of the quicksort algorithm. Here is another nice explanation of Hoare versus Lomuto partitioning schemes within the quicksort algorithm.
This brings us back to the fact that quicksort on a linked list is still a quicksort, as the Sedgwick words I've now directly quoted indicate. Furthermore, you pointed out that Lomuto is unidirectional, but my second sentence isn't just about the fact that partitioning can be unidirectional. The first sentence mentions that the Hoare partition scheme that is usually taught is bidirectional and exchanges elements. Element exchange is what makes quicksort not stable, even if Lomuto's unidirectional partition method is used. When Sedgwick refers to linked list quicksort being stable, he is alluding to the fact that the unidirectional iteration need only move elements with a lesser key than the pivot node to a position immediately before the node. Elements are not exchanged. But these are just variations of implementation of the quicksort algorithm that are suitable for a different data structure than an array. All this being said, we can pop the stack and ask @MrOllie:, do you still object to this content that, I would emphasize, doesn't even have a COI for me anymore?
Thank you, JohnBoyerPhd
- stability - exchange of adjacent items doesn't break stability (for example bubble sort), but exchange of non-adjacent items can break stability. Rcgldr (talk) 17:45, 15 June 2020 (UTC)
- linked list - I don't have an opinion on calling such a sort for linked list "quicksort" (similar to Lomuto versus Hoare). Since quicksort is not stable, I don't see a requirement for an equivalent sort for linked list to be stable, although it would be nice. Quicksort for a linked list is slower than bottom up merge sort for linked list, but being slower doesn't make it an invalid method. Rcgldr (talk) 17:45, 15 June 2020 (UTC)
- Hi @Rcgldr:, I agree there is no requirement for the linked list quicksort variant to be stable. In fact, in my DDJ paper and 1990 works, the partition was implemented slightly differently to eliminate a line of code and run slightly faster, at the expense of stability (because it wasn't required). These days, I'd prefer the stability. Anyway, I mentioned the stability because Sedgwick mentioned it, at which point it's clear he's talking about a better variant than something trivially array-like, such as just converting the linked list to a doubly linked list and then using Hoare's partition method. Thank you, JohnBoyerPhd
- Side note - The Hoare paper actually has errors in it. " divided ... keys (on left) ... less than keys (on right)", when it should be less than or equal. It correctly stated later with "The aim ... divided ... equal or less than bound ... equal or greater than bound". The lower pointer scan description is wrong, stops at key "greater than bound", when it should be "equal to or greater than bound". The upper pointer is described correctly. Keys equal to the bound (pivot) or the bound (pivot) itself can end up in either part of a partition. In Hoare's bet with his boss about a sort faster than insertion sort, his boss probably wasn't considering a non-stable sort, since insertion sort was typically used as a stable sort to create runs of sorted data for a tape sort (merge sort(1945), polyphase merge sort(1960)), where stability was expected. I'm not sure how common it was at the time to sort a file small enough to fit in memory and where stability was not expected. Rcgldr (talk) 17:45, 15 June 2020 (UTC)
Arbitrary partitioning method
Hi, JohnBoyerPhd!
Here: Special:Diff/930928823 you wrote 'the Quicksort algorithm allows partitioning to be performed in any manner' and referenced https://doi.org/10.1145%2F366622.366644 – alas, I can't see the linked page supports the claim. Can you, please, explain in two or three sentences, how 'performing in any manner' follows from the linked source? --CiaPan (talk) 21:46, 15 June 2020 (UTC)
- Hi @CiaPan:, Yes, I can do that in a few sentences. The linked article does not have a quotable sequence of characters that says you can do partitioning in any way. Rather, it is the fact that the article describes the quicksort algorithm as the recursive pattern around partition-sort-sort while placing no bound on how partitioning is done. Placing no bound in how partitioning is done means that it can be implemented in any way because there is no bound on how it is to be implemented. Thanks, JohnBoyerPhd
- @JohnBoyerPhd: That's right! There is no quotable sequence of characters in the article that says you can do partitioning in any way. Instead, there is an explicit definition of the partitioning algorithm, just on the previous page as Algoritm 63: https://dl.acm.org/doi/pdf/10.1145/366622.366642 --CiaPan (talk) 07:23, 17 June 2020 (UTC)
- @JohnBoyerPhd: @CiaPan: - In "algorithm 64", there is a statement that no extra space is required, yet the recursive calls require worst case O(n) stack space, since the code is missing the handling of the smaller partition first. Rcgldr (talk) 20:48, 16 June 2020 (UTC)
- @Rcgldr: That's a completely another topic. --CiaPan (talk)
- @CiaPan: - I somehow lost a comment about about algorithm 64 not specifying at all what partition does, other than returning I and J, which could seem to imply Hoare (bidirectional) scheme, but Lomuto (unidirectional) scheme could set I to pivot index - 1 and J to pivot index + 1. Rcgldr (talk) 16:37, 17 June 2020 (UTC)
- @Rcgldr: That's a completely another topic. --CiaPan (talk)
- @JohnBoyerPhd: @CiaPan: - In "algorithm 64", there is a statement that no extra space is required, yet the recursive calls require worst case O(n) stack space, since the code is missing the handling of the smaller partition first. Rcgldr (talk) 20:48, 16 June 2020 (UTC)
Conclusion
FYI @Rcgldr:, @CiaPan:. Hello @MrOllie:, Although we've had discussions recently in which you haven't exactly agreed with the veracity of my proposed content, I've noticed that the issue is a bit of a moot point now. I would reiterate that the objections to my sources are not consistent with the language of the Wikipedia policies that were cited. However, a month after my request was initially declined, it looks like somebody else went on in and added [new content about linked list quicksort] that is along the lines of what I'd hoped could be added. And, wonderfully, they didn't even cite any sources to support the content!
I'm so pleased the content is represented, and I quite agree with the person who added the content that "quicksort can be implemented as a stable sort using linked lists." Of course, I did also provide sources such as the reliably sourced DDJ article and the quoted words from Sedgwick to support it. Plus, added bonus, the new content also implies that the mergesort is a better choice for linked lists. I really wanted this to also appear in the content, but the main source I had for supporting that fact was the DDJ article.
The person who added the content did wobble a bit at the end of the sentence by claiming that poor pivot choices would be made due to not having random access. This isn't really the reason. For one thing, in a random list, the first element is just as random as any other, so using it as the pivot choice is no poorer. More generally, though, partition is O(n) anyway, so a sequential implementation of median-of-three or any other pivot selection method does not increase the order of magnitude of partitioning and hence does not change the analysis of the linked list quicksort. The reason mergesort on linked list outperforms quicksort on a linked list is because mergesort always performs optimal divisions of the list, whereas quicksort often makes pivot choices that result in suboptimal divisions. The mergesort on a linked list can also be done "in place" (i.e. the data structure already contains O(n) extra space for pointers). I discuss this and show it empirically in the DDJ article that the linked list mergesort is faster than the linked list quicksort. Since DDJ is a reliable source, pretty much anybody except me could, without a COI, have said something along the lines of "Although quicksort can be implemented as a stable sort using linked lists[1], the linked list mergesort is faster than linked list quicksort because it can be done in place and always performs optimal divisions.[2]". So, it's a little bit unfortunate that the new content referred to no references to support the point. Maybe somebody should fix it.
Cheers, JohnBoyerPhd
- @JohnBoyerPhd: @CiaPan: A bottom up merge sort for a linked list is faster since it avoids the repeated scanning of lists to divide them. Instead, it uses a small (25 to 32) array of lists, typically pointers to first nodes of lists, and immediately starts merging nodes into the array, where array[i] is either nullptr or points to a list with 2^i nodes. Similar to bottom up merge sort for arrays, you could think of the division process as dividing a list of n nodes into n lists of 1 node each, merging them into the array one node at a time. Once all nodes are merged into the array, the lists in the array are merged to form the sorted linked list. On a large list with scattered nodes (a lot of cache misses), bottom up can be ~30% faster than top down (== top down is ~40% slower than bottom up). Merge_sort#Bottom-up_implementation_using_lists Rcgldr (talk) 00:59, 17 June 2020 (UTC)
- Hi @Rcgldr:, I think the important bit is to support assertions with external reliable sources such as the DDJ article[2]. If a "bottom up" (non-recursive) implementation is even faster than a recursive one, that wouldn't undermine the point being made by the external reliable source[2] that merge sort is better than quicksort for linked lists. That being said, if you check out the external reliable source[2], you'll see that it does cover two variants of non-recursive linked list mergesort. It may be that non-recursive is faster on arrays and/or languages that are "close to the metal" (or with a better implementation), but for linked lists and with Java at that time, the difference in the implementations tried was negligibly slower. So, another source would be needed to support the point that linked list bottom-up/non-recursive mergesort is faster than recursive mergesort. That being said, such a source would be used to support the point if it were being added to the mergesort page, rather than the sections of the quicksort page that describes its relation to other algorithms like the mergesort. For the quicksort page, which we're talking about here, it suffices to point out (and support with an external reliable source) that linked list merge sort (whether recursive or not) is better than linked list quicksort[2]. Cheers, JohnBoyerPhd (talk) 14:44, 17 June 2020 (UTC)
- @JohnBoyerPhd: - I assumed that the recursive DDJ was top down, but the DDJ recursive merge sort for linked list is different than a typical top down merge sort for linked lists, which splits lists in half by scanning. The DDJ version avoids the scanning by not setting F2 until the base case of 1 node is reached and merge returns the value to use for F2 in NP (node pair), going back up the call chain. This reduces the overhead to O(log2(n)) for the passing and updating of parameters, versus bottom up iteration through the array (which I thought would be slightly faster - (update - turns out it is slightly faster)). Essentially the DDJ recursive version is using the stack in the same manner as bottom up uses the array. I was thinking that the recursion, calculation of sizes, and stack overhead would create a small time penalty versus the array access. If the list doesn't have a size member, the recursive method would have to do a 1 time pass to get the count of nodes. Rcgldr (talk) 03:10, 18 June 2020 (UTC)
- @JohnBoyerPhd: - I benchmarked recursive versus bottom up and found bottom up to be slightly faster, but not much, typically less than 4% faster. I added the source code for both to my github repository: github/jeffareid/misc. For the recursive, since the test code always passes a "before" node, I removed the conditional. I also switched to primitive integer type (int). For bottom up, merge() uses a static temporary node ("temp"), which would need to be local for a multi-threaded implementation. The benchmark runs twice using pseudo random integers. The first sort will have sequentially allocated nodes, the second sort with scattered nodes. On my system, both sorts take ~7 seconds for the first sort, ~12 seconds for the second sort. Rcgldr (talk) 23:08, 17 June 2020 (UTC)
- Hi @Rcgldr:, I appreciate your instinct toward testing via coding and execution. Indeed, before suggesting the addition of text for linked list quicksort and mergesort last year, I reimplemented the recursive versions in straight C, just to double-check with close-to-the-metal code. And sure enough, the linked list quicksort still had O(n log n) performance on random data, but the merge sort was still about 10% faster. I also had the code count key comparisons, and those were down about 10% too. However, empirical results from our code are really only useful for discussion here. They can't be added to the page because they are "original research" (in the way Wikipedia means it). Still, all results in this discussion only serve to reinforce what has been published in the reliable source DDJ article. In particular, any version of the merge sort is quicker than the linked list quicksort, and not because of being unable to use a good pivot selection technique like median of three. As I said earlier, partition is O(n) anyway, so one can afford to spend O(n) time on median of 3 or any other pivot selection technique at the front end of partition. Rather, the linked list quicksort is efficient but the linked list merge sort is quicker because it does less key comparisons, and that happens because quicksort partitions are likely to be suboptimally divided by the pivot key. Anyway, although Wikipedia policies do permit me to add content along the lines I described above after a COI review to ensure the rules are met, which they are, there have been numerous misunderstandings of the words used in those policies, so the COI review process has proven to be impractical for this page. That doesn't prevent anyone else who is qualified to read the DDJ article from using it as a reliable source to correct and support the content that is now in the page. Cheers, JohnBoyerPhd (talk) 04:18, 18 June 2020 (UTC)
- @JohnBoyerPhd: - I wasn't intending to make any changes to Quicksort. I have wondered if the DDJ recursive version for linked list should be added to Merge_sort, but I never proposed it. Rcgldr (talk) 08:56, 18 June 2020 (UTC)
- Hi @Rcgldr:, That's fine, I meant that anyone who is so inclined could fix the incorrectness at the end of that sentence. I'm not inclined to do so since I don't want to experience any further acrimony related to sources and since I am satisfied that the two main gist points I wanted do now appear in the text. I'm delighted they were added without the unnecessary obstructions related to sources that transpired earlier in this thread, so as I alluded in the section title, for me this is concluded. JohnBoyerPhd (talk) 17:20, 18 June 2020 (UTC)
References
- ^ Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
- ^ a b c d e Boyer, John M. (May 1998). "Sorting and Searching Linked Lists in Java". Dr. Dobb's Journal (285): 126–129, 137–138. Retrieved June 16, 2020.
"quicksort" or "Quicksort"?
What should it be, "quicksort" or "Quicksort"?
The article currently uses both (is inconsistent).
--Mortense (talk) 22:51, 8 March 2020 (UTC)
Intro claims a well tuned quicksort 2 or 3 times faster than merge sort?
This claim should be clarified. Merge sort does more moves but fewer compares than average case quicksort, so the relative speed depends on the cost of moving elements versus comparing elements. Based on various benchmarks, if sorting an array of integers, quicksort is about 15% faster than merge sort on a typical PC. If sorting an array of records where the keys that are compared are a relatively small part of the record, then the advantage of quicksort increases, due to fewer moves. At some record size, it's faster to sort an array of pointers to records, in which case merge sort ends up faster, because moving pointers takes less time than compares that require indirection via the pointers. Rcgldr (talk) 07:45, 15 June 2020 (UTC)
External sort
I redid this section, after finding a reference source that doesn't require paying a fee. The sort is not stable and has worst case of O(n) passes, equivalent to time complexity O(n^2) for internal sort, and the average number of passes is worse than a typical external merge sort. The reference was written in 1983, at which time mainframe high end hard drives had reached 1 GB capacity, but PC hard drives were 10 MB. My impression is that this was an academic exercise, versus what was being used in commercial products that included file sorting, such as databases. Rcgldr (talk) 00:59, 19 June 2020 (UTC)
Lomuto partition scheme
In Lomuto partition scheme implementation there is a mistake. In the worst case scenario j grows as i, giving a memory out of bounds error on the last swap. I correct it, see the history. — Preceding unsigned comment added by Dante DDM (talk • contribs) 23:16, 5 August 2020 (UTC)
- @Dante DDM: No mistake there. The i variable will not be incremented in the last iteration of the loop, so we have 'i is less or equal hi' after the loop termination. No risk of out-of-bounds swap (unless you make a mistake in translating the algorithm into the code). --CiaPan (talk) 23:59, 5 August 2020 (UTC)
- @CiaPan: I've checked out and misunderstood the code. I apologize for bothering. Dante DDM (talk) 11:49, 8 September 2020 (UTC)
- @Dante DDM: That's okay. We all learn from our mistakes. Don't let it to discourage you – It's not the mistake that matters. --CiaPan (talk) 12:35, 11 September 2020 (UTC)
Wrongly placed reference? & text attribution
1. Reference 16 (at time of writing) is about the Yaroslavskiy algorithm, but it’s placed in the section about the Lomuto implementation. Shouldn’t it be removed there and inserted elsewhere, maybe in the History section?
The reference text (for reference :) Wild, Sebastian (2012). "Java 7's Dual Pivot Quicksort". Technische Universität Kaiserslautern.
2. The algorithm given in the Algorithm section seems to be taken almost verbatim from Yaroslavskiy (Reference 10). Would it be good to attribute it to that paper?
The reference text again: Yaroslavskiy, Vladimir (2009). "Dual-Pivot Quicksort" (PDF). Archived from the original (PDF) on 2 October 2015.
— Preceding unsigned comment added by Geke (talk • contribs) 25 August 2020 (UTC)