CAP theorem: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 10:56, 10 April 2022

In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can only provide two of the following three guarantees:[1][2][3]

Consistency
Every read receives the most recent write or an error.
Availability
Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
Partition tolerance
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

When a network partition failure happens, it must be decided whether to

  • cancel the operation and thus decrease the availability but ensure consistency or to
  • proceed with the operation and thus provide availability but risk inconsistency.

Thus, if there is a network partition, one has to choose between consistency and availability. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions.[4]

Eric Brewer argues that the often-used "two out of three" concept can be somewhat misleading: system designers only need to sacrifice consistency or availability in the presence of partitions, and in many systems partitions are rare.[5][6]

Explanation

No distributed system is safe from network failures, so network partitioning generally has to be tolerated.[7][8] In the presence of a partition, one is then left with two options: consistency or availability. When choosing consistency over availability, the system returns an error or times out if it cannot guarantee that particular information is up to date because of the partition. When choosing availability over consistency, the system always processes the query and tries to return the most recent available version of the information, even if it cannot guarantee that this version is up to date.
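The two options can be illustrated with a minimal Python sketch. It is a toy model and not the behaviour of any particular database: the `Replica` class, the `Unavailable` exception, the `replicate` helper and the `demo` function are all hypothetical names introduced here, and a partition is modelled simply as a flag that causes replication messages to be dropped.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a consistency-first replica that cannot confirm freshness."""


@dataclass
class Replica:
    # Hypothetical replica holding one value and its version number.
    value: str
    version: int
    prefer_consistency: bool   # True: consistency first, False: availability first
    partitioned: bool = False  # True while the replica cannot reach its peers

    def read(self) -> str:
        if self.partitioned and self.prefer_consistency:
            # The local copy may be stale and this cannot be checked: return an error.
            raise Unavailable("cannot confirm latest version during partition")
        # No partition, or availability is preferred: answer from the local copy.
        return self.value


def replicate(source: Replica, target: Replica) -> None:
    """Copy a newer value to the target unless a partition drops the message."""
    if source.partitioned or target.partitioned:
        return  # message lost
    if source.version > target.version:
        target.value, target.version = source.value, source.version


def demo() -> tuple:
    primary = Replica("v1", 1, prefer_consistency=True)
    cp = Replica("v1", 1, prefer_consistency=True)
    ap = Replica("v1", 1, prefer_consistency=False)

    # A partition cuts both followers off, then the primary accepts a write.
    cp.partitioned = ap.partitioned = True
    primary.value, primary.version = "v2", 2
    replicate(primary, cp)
    replicate(primary, ap)

    stale = ap.read()  # availability first: responds, but with the old value
    try:
        cp.read()
        cp_result = "ok"
    except Unavailable:
        cp_result = "error"  # consistency first: refuses to answer

    # Once the partition heals, both replicas catch up and both guarantees hold.
    cp.partitioned = ap.partitioned = False
    replicate(primary, cp)
    replicate(primary, ap)
    return stale, cp_result, cp.read(), ap.read()
```

In this sketch `demo()` returns the stale value `"v1"` from the availability-first replica and `"error"` from the consistency-first replica while the partition lasts; after the partition heals, both replicas return `"v2"`.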

CAP is often misunderstood as requiring a permanent choice of which one of the three guarantees to abandon. In fact, the choice between consistency and availability arises only when a network partition or failure happens. When there is no network failure, both availability and consistency can be satisfied.[9][10] SQL relational databases such as YugabyteDB, CockroachDB, LeanXcale, NuoDB, and Google Spanner are counter-examples to this misconception.

CAP has been used by many NoSQL database vendors as a justification for not providing transactional ACID consistency, claiming that the CAP theorem “proves” that it is impossible to provide scalability and ACID consistency at the same time. However, a closer look at the CAP theorem and, in particular, the formalisation by Gilbert & Lynch, reveals that the CAP theorem does not refer at all to scalability, but only availability (the A in CAP).[11]

Database systems designed with traditional ACID guarantees in mind, such as relational database management systems (RDBMS), choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement for example, choose availability over consistency.[5]

The PACELC theorem builds on CAP by stating that even in the absence of partitioning, there is another trade-off between latency and consistency.

History

According to University of California, Berkeley computer scientist Eric Brewer, the theorem first appeared in autumn 1998.[5] It was published as the CAP principle in 1999[12] and presented as a conjecture by Brewer at the 2000 Symposium on Principles of Distributed Computing (PODC).[13] In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem.[1]

In 2012, Brewer clarified some of his positions, including why the often-used "two out of three" concept can be somewhat misleading: system designers only need to sacrifice consistency or availability in the presence of partitions, and partition management and recovery techniques exist. Brewer also noted that the definition of consistency used in the CAP theorem differs from the one used in ACID.[5][6]

A similar theorem stating the trade-off between consistency and availability in distributed systems was published by Birman and Friedman in 1996.[14] Birman and Friedman's result restricted this lower bound to non-commuting operations.

Blockchain technology sacrifices immediate consistency for availability and partition tolerance. By requiring a specific number of "confirmations", blockchain consensus algorithms are basically reduced to eventual consistency.[15]
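As a rough illustration of the confirmation idea, the following Python sketch treats a transaction as settled only once enough blocks have been built on top of the block that contains it. The function names and the default threshold of 6 are assumptions made for this example; the source does not specify a number, and real systems choose their own.

```python
def confirmations(tip_height: int, block_height: int) -> int:
    """Count blocks from the including block up to the chain tip, inclusive."""
    return max(0, tip_height - block_height + 1)


def is_settled(tip_height: int, block_height: int, required: int = 6) -> bool:
    """Treat a transaction as final only after `required` confirmations.

    `required` is a hypothetical, application-chosen threshold.
    """
    return confirmations(tip_height, block_height) >= required
```

Until the threshold is reached, a reader may see a transaction that is later dropped in a reorganisation, which is why the guarantee is described as eventual and not immediate.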

References

  1. ^ a b Seth Gilbert and Nancy Lynch, "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services", ACM SIGACT News, Volume 33 Issue 2 (2002), pg. 51–59. doi:10.1145/564585.564601.
  2. ^ "Brewer's CAP Theorem", julianbrowne.com, Retrieved 02-Mar-2010
  3. ^ "Brewers CAP theorem on distributed systems", royans.net
  4. ^ Liochon, Nicolas. "The confusing CAP and ACID wording". This long run. Retrieved 1 February 2019.
  5. ^ a b c d Eric Brewer, "CAP twelve years later: How the 'rules' have changed", Computer, Volume 45, Issue 2 (2012), pg. 23–29. doi:10.1109/MC.2012.37.
  6. ^ a b Carpenter, Jeff; Hewitt, Eben (July 2016). "Cassandra: The Definitive Guide, 2nd Edition [Book]". www.oreilly.com. Archived from the original on 2020-08-07. Retrieved 2020-12-21. In February 2012, Eric Brewer provided an updated perspective on his CAP theorem [..] Brewer now describes the "2 out of 3" axiom as somewhat misleading. He notes that designers only need sacrifice consistency or availability in the presence of partitions, and that advances in partition recovery techniques have made it possible for designers to achieve high levels of both consistency and availability.
  7. ^ Kleppmann, Martin (2015-09-18). "A Critique of the CAP Theorem". arXiv:1509.05393. Bibcode:2015arXiv150905393K. doi:10.17863/CAM.13083. S2CID 1991487. Retrieved 24 November 2019.
  8. ^ Kleppmann, Martin. "Please stop calling databases CP or AP". Martin Kleppmann's Blog. Retrieved 24 November 2019.
  9. ^ "Better explaining the CAP Theorem". DZone Big Data. Retrieved 2016-09-02.
  10. ^ Abadi, Daniel (2010-04-23). "DBMS Musings: Problems with CAP, and Yahoo's little known NoSQL system". DBMS Musings. Retrieved 2018-01-23.
  11. ^ "Understanding the cap theorem and its not relationship to scalability". Jimenez-Peris, Ricardo and Valduriez, Patrick (2021).
  12. ^ Armando Fox and Eric Brewer, "Harvest, Yield and Scalable Tolerant Systems", Proc. 7th Workshop Hot Topics in Operating Systems (HotOS 99), IEEE CS, 1999, pg. 174–178. doi:10.1109/HOTOS.1999.798396
  13. ^ Eric Brewer, "Towards Robust Distributed Systems"
  14. ^ Ken Birman and Roy Friedman, "Trading Consistency for Availability in Distributed Systems", April 1996. hdl:1813/7235.
  15. ^ Bashir, Imran. (2018). Mastering blockchain. Birmingham, England: Packt Publishing. p. 41. ISBN 978-1-78883-904-4.