
Talk:Persistent data structure: Difference between revisions

Revision as of 11:40, 17 November 2019

WikiProject Computing (Unassessed)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on Wikipedia's content assessment scale.
This article has not yet received a rating on the project's importance scale.

This article is or was the subject of a Wiki Education Foundation-supported course assignment. Further details are available on the course page. Student editor(s): Ethan McCue (article contribs).

I think this article needs to be merged from or to Purely functional. Persistent data structure is a better name for the page (functional is an adjective which therefore violates Wikipedia naming conventions). On the other hand, purely functional has some nice diagrams which I drew to explain the concept :-) What does everyone think? Richard W.M. Jones 18:52, 17 January 2006 (UTC)[reply]

I've never seen "persistent data structure" used in the way this article describes. It may be used that way in some obscure book or two, but in my experience, a persistent data structure is one that is automatically saved when the program exits, as is described in the link to persistence. Dave Gudeman 04:40, 29 August 2006 (UTC)[reply]

It's a matter of your field of study. The sense in which you use it is encountered in some papers on OODBMS's. But the functional data structure meaning is widely known in research (everyone I've talked to about them immediately understood) and used in an entire subfield of widely cited papers. To list a few:

  • J. Driscoll, N. Sarnak, D. D. Sleator, and R. Tarjan. Making Data Structures Persistent. Journal of Computer and System Sciences, 38:86–124, 1989. [1] (123 citations)
  • N. Sarnak and R. E. Tarjan. Planar point location using persistent search trees. Communications of the ACM, 29(7):669–679, 1986. [2] (115 citations)
  • P. F. Dietz. Fully persistent arrays. In Workshop on Algorithms and Data Structures, volume 382 of Lecture Notes in Computer Science, pages 67–74. Springer-Verlag, August 1989. [3] (28 citations)
  • J. Driscoll, D. Sleator, and R. Tarjan. Fully persistent lists with catenation. Journal of the ACM, 41(5):943–959, 1994. [4] (22 citations)
  • Okasaki, C. (1998). Purely Functional Data Structures. Cambridge University Press. [5][6] (104 citations) Used throughout this seminal book, occurring dozens of times. Section 9.2 is entitled "Persistent Data Structures".

The name collision with the idea of disk-persisted data structures is unfortunate, and I wouldn't object to adding a note to clarify this. Deco 05:36, 29 August 2006 (UTC)[reply]

Persistent data structure and Purely Functional should not be merged

Purely Functional describes how 'purely functional' computer languages deal with variables. So 1) the title Purely Functional is perfectly valid; it is the name of a category of computer programming languages, in the same way as Object Oriented, Imperative, Declarative, Duck-typed, etc. 2) This article should not be merged with any other discussion of data structures because this article is not about the data structures themselves, but rather the way that purely functional computer languages deal with them - whether or not they are persistent. —Preceding unsigned comment added by 202.63.49.61 (talk) 10:45, 21 January 2010 (UTC)[reply]

Not all purely functional data structures are persistent

For example, consider a queue. A standard representation uses two lists, one of them reversed. When adding an element we add it to the first list; when removing an element we remove it from the second. When the second list is empty and we want to remove an element from the queue, we have to reverse the first list (reversal is O(N)), and this reversed list becomes the new second list. We can then remove the element from the second list. Because we don't have to reverse most of the time, this queue is amortized O(1). But occasionally O(N)!
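A minimal Haskell sketch of the two-list queue described above (names are illustrative, not taken from any particular library):

```haskell
-- Queue inList outList: elements are enqueued onto the first list and
-- dequeued from the second; when the second runs empty, the first is
-- reversed (O(n)) to refill it.
data Queue a = Queue [a] [a]

emptyQ :: Queue a
emptyQ = Queue [] []

-- O(1): cons onto the input list.
enqueue :: a -> Queue a -> Queue a
enqueue x (Queue ins outs) = Queue (x : ins) outs

-- O(1), except when the output list is empty, in which case the input
-- list is reversed to become the new output list.
dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue [] [])         = Nothing
dequeue (Queue ins [])        = dequeue (Queue [] (reverse ins))
dequeue (Queue ins (x:outs))  = Just (x, Queue ins outs)
```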

This queue is *not* persistent: suppose we have a queue with an empty second list and a nonempty first list of length N. If we keep that version around and remove an element from it several times, each removal takes O(N) time, because each use starts again from the same unreversed first list. So under persistent use this is no longer an O(1) queue! —Preceding unsigned comment added by 84.25.34.87 (talk) 18:31, 12 October 2007 (UTC)[reply]
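Continuing the sketch above, a hypothetical usage example of the failure mode being described (it reuses the Queue definitions from the previous block):

```haskell
-- 'q' has an empty second (output) list and a first (input) list of
-- length n, so every dequeue below starts from that same version and
-- pays the full O(n) reversal again. The amortized O(1) bound assumed
-- each version is used only once (single-threaded, non-persistent use).
main :: IO ()
main = do
  let n = 100000 :: Int
      q = foldl (flip enqueue) emptyQ [1 .. n]
  print (fst <$> dequeue q)   -- O(n) reversal
  print (fst <$> dequeue q)   -- O(n) reversal again, on the same version
  print (fst <$> dequeue q)   -- and again
```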

If you want to claim non-persistence, then why do you argue inefficiency of implementation? Nothing prevents this data structure from being persistent AFAICT. --MarSch (talk) 12:06, 13 February 2008 (UTC)[reply]
Persistence has nothing to do with the amortized runtime of the operations. Additionally, you're wrong - the remove takes amortized O(1) time - this means that a sequence of n insert and remove operations takes O(n) time. See amortized analysis. Dcoetzee 20:10, 13 February 2008 (UTC)[reply]

Not persistent at all

I would argue there is no implicit persistence. Although there is internal reference sharing, as is the case with list types in ML, there is no way to implicitly refer to these elements in the list. The programmer would have to declare variables for every sublist they wanted a reference to, but changing that reference would create a new list. Specifically, a list which references the values of the old list up to the location where they differ and adds its unique elements, and which is therefore considered a new list. At this point the programmer is at a loss without language support for such a feature. There are similarities between the notion of persistence and functional data structures, but nothing tangibly useful. Every change made to a data structure in a functional language returns a new reference to the updated structure and provides no way to access the version previously modified. Now consider garbage collection on unreachable references, such as the removed element of the list.

In a functional language you don't "change" data structures, you create new ones. If you create a new data structure from an old one there is no reason why your reference to the old one should be lost unless you choose to lose it. --MarSch (talk) 12:03, 13 February 2008 (UTC)[reply]
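A small Haskell sketch of that reply (the names xs/ys are purely illustrative): building a new list from an old one leaves the old version reachable, and the two versions share the unchanged tail.

```haskell
xs :: [Int]
xs = [1, 2, 3]

-- "Replace" the head of xs without touching xs itself: ys is a new
-- version that shares the tail [2, 3] with the old one.
ys :: [Int]
ys = 0 : tail xs

main :: IO ()
main = do
  print xs  -- [1,2,3]: the old version is still there under its own name
  print ys  -- [0,2,3]: the new version, sharing structure with xs
```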

Vlists

I have removed the reference to the VList data structure. Vlists are not really a "persistent counterpart to the array". They are not persistent at all if you update them as an array. They do allow O(1) random read access if you update them as a list, but that's only if you update them in a partially persistent way. If you update them in a fully persistent way, then the random read access is not efficient. —Preceding unsigned comment added by Ken Hirsch (talkcontribs) 07:40, 21 May 2008 (UTC)[reply]

How (much more or less) efficient in practice?

"While persistence can be achieved by simple copying, this is inefficient in CPU and RAM usage" While true, I came here from Rope page and in general I would think as for Rope, some operations are slower (there e.g. indexing). Some data structures might be accessed a lot such as strings (ropes) in an editor say, but updated/edited at a human speed (to be fair data structute would also be read infrequently, when e.g. scrolling page?).

It seems to me that structures based on pointers, such as ropes and (I assume) all persistent structures, would have less locality of reference than simpler ones such as arrays. In general a hybrid might reduce that drawback arbitrarily (while possibly still being persistent).

I could look into all of this (does anyone have answers, and do they belong here?); in general it seems to me that lazy languages such as Haskell have not taken the world by storm, while the newer, eager Julia has, and it is fast. I wonder if the eager, side-effecting functional variant is the way to go (I want to find out whether Julia has a fatal flaw..). comp.arch (talk) 18:38, 1 November 2014 (UTC)[reply]


Section on modern usage?

I was thinking that this article could be more useful with a description of Persistent Hash Array Mapped Tries and how they and other persistent data structures have been adopted by languages like Clojure, Scala, Java, Javascript, etc. I know at least that the "Persistent Map" based on that technique has a good deal of adoption in the React community. Curious as to anyone else's thoughts on that. — Preceding unsigned comment added by Ethan McCue (talkcontribs) 21:28, 9 October 2018 (UTC)[reply]
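For reference, a hedged Haskell sketch of the kind of persistent-map usage being described, using Data.HashMap.Strict from the unordered-containers package (which, as I understand it, is implemented as a hash array mapped trie); the keys and values are made up for illustration:

```haskell
import qualified Data.HashMap.Strict as HM

main :: IO ()
main = do
  let v1 = HM.fromList [("a", 1), ("b", 2)] :: HM.HashMap String Int
      v2 = HM.insert "c" 3 v1   -- a new version; most structure is shared with v1
  print (HM.lookup "c" v1)      -- Nothing: the old version is preserved
  print (HM.lookup "c" v2)      -- Just 3: the new version sees the update
```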

Intro too long?

The intro blurb is 4 paragraphs and mentions efficiency issues. Not that these aren't worth noting, but it seems dubious that this information would be useful at a glance. Maybe it should be broken down into a new or existing section? — Preceding unsigned comment added by Ethan McCue (talkcontribs) 19:07, 26 November 2018 (UTC)[reply]


Persistency vs Immutability

The page contains multiple statements like this one:

> Haskell is a pure functional language and therefore does not allow for mutation. Therefore all data structures in the language are persistent, as it is impossible to not preserve the previous state of a data structure with functional semantics.[15] This is because any change to a data structure that would render previous versions of a data structure invalid would violate referential transparency.

I have the feeling that the terms 'persistency' and 'immutability' are conflated here: it would be more correct to say that a pure language requires _immutable_ data structures (so as not to violate referential transparency), and that persistent data structures are the only (known) _efficient_ implementations of these.

Qqwy (talk) 07:02, 13 May 2019 (UTC)[reply]
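A brief Haskell sketch of the distinction being drawn (my own illustration, not from the article): both functions below are pure and leave the old version intact, i.e. they work with immutable data, but only the second shares structure between versions, which is what makes an efficient persistent implementation possible.

```haskell
import qualified Data.Map.Strict as M

-- Immutable but naive: copy the whole association list on every "update".
-- Old versions survive, but each update costs O(n) time and space.
setAssoc :: Eq k => k -> v -> [(k, v)] -> [(k, v)]
setAssoc k v kvs = (k, v) : filter ((/= k) . fst) kvs

-- Persistent: a balanced-tree map shares most of its structure between
-- versions, so an update costs only O(log n) time and space.
setMap :: Ord k => k -> v -> M.Map k v -> M.Map k v
setMap = M.insert

main :: IO ()
main = do
  let v0 = M.fromList [(1, "a"), (2, "b")] :: M.Map Int String
      v1 = setMap 3 "c" v0
  print (M.toList v0)  -- old version unchanged
  print (M.toList v1)  -- new version with the update
```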

Copy on Write has exponential performance characteristics?!?

There is a claim in the article about the performance characteristics of copy-on-write algorithms. It is already marked "citation needed". I think it is factually false: the CPU cost of a single such write is proportional to the length of the data structure, and since the length of the data structure is itself proportional to the number of writes so far, a sequence of n writes costs O(n^2) in total. The space cost is also O(n^2) for similar reasons.

I am not an editor, and I have very little experience on how such issues should be handled.
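A small sketch of that arithmetic (my own illustration): if the i-th write copies the whole structure, whose length is proportional to i, then n writes copy 1 + 2 + ... + n = n(n+1)/2 elements in total, which is O(n^2) work and extra space across the sequence, i.e. quadratic rather than exponential.

```haskell
-- Total number of elements copied by n copy-on-write updates, under the
-- simplifying assumption that the i-th update copies a structure of length i.
copyOnWriteCost :: Integer -> Integer
copyOnWriteCost n = sum [1 .. n]   -- equals n * (n + 1) `div` 2, i.e. O(n^2)
```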