Last ten users to contribute to the page (page_recent_contributors ) | [
0 => '220.225.78.67',
1 => '126.126.127.183',
2 => '141.226.11.27',
3 => 'Lord Belbury',
4 => 'Michael Shields',
5 => 'Bernardoduarte',
6 => 'Reedetch',
7 => '198.27.212.2',
8 => 'Abhishek.0909ch',
9 => '164.67.127.52'
] |
Old page wikitext, before the edit (old_wikitext ) | '{{Redirect|2PC|the play in American and Canadian football|Two-point conversion|the cryptographic protocol|Commitment scheme}}
In [[transaction processing]], [[database]]s, and [[computer networking]], the '''two-phase commit protocol''' ('''2PC''') is a type of [[Atomic commit|atomic commitment protocol]] (ACP). It is a [[distributed algorithm]] that coordinates all the processes that participate in a [[Distributed transaction|distributed atomic transaction]] on whether to ''[[Commit (data management)|commit]]'' or ''abort'' (''roll back'') the transaction (it is a specialized type of [[Consensus (computer science)|consensus]] protocol). The protocol achieves its goal even in many cases of temporary system failure (involving either process, network node, communication, etc. failures), and is thus widely used.<ref name="bernstein1987">[[Phil Bernstein|Philip A. Bernstein]], Vassos Hadzilacos, Nathan Goodman (1987): [http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx ''Concurrency Control and Recovery in Database Systems''], Chapter 7, Addison Wesley Publishing Company, {{ISBN|0-201-10715-5}}</ref><ref name="weikum2001">[[Gerhard Weikum]], Gottfried Vossen (2001): [http://www.elsevier.com/wps/find/bookdescription.cws_home/677937/description#description ''Transactional Information Systems''], Chapter 19, Elsevier, {{ISBN|1-55860-508-8}}</ref><ref name=Bern2009>Philip A. Bernstein, Eric Newcomer (2009): [http://www.elsevierdirect.com/product.jsp?isbn=9781558606234 ''Principles of Transaction Processing'', 2nd Edition], Chapter 8, Morgan Kaufmann (Elsevier), {{ISBN|978-1-55860-623-4}}</ref>
However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome. To accommodate recovery from failure (automatic in most cases) the protocol's participants use [[Server log|logging]] of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's [[recovery procedure]]s. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol.
In a "normal execution" of any single [[distributed transaction]] (i.e., when no failure occurs, which is typically the most frequent situation), the protocol consists of two phases:
#The ''commit-request phase'' (or ''voting phase''), in which a ''coordinator'' process attempts to prepare all the transaction's participating processes (named ''participants'', ''cohorts'', or ''workers'') to take the necessary steps for either committing or aborting the transaction and to ''vote'', either "Yes": commit (if the transaction participant's local portion execution has ended properly), or "No": abort (if a problem has been detected with the local portion), and
#The ''commit phase'', in which, based on ''voting'' of the participants, the coordinator decides whether to commit (only if ''all'' have voted "Yes") or abort the transaction (otherwise), and notifies the result to all the participants. The participants then follow with the needed actions (commit or abort) with their local transactional resources (also called ''recoverable resources''; e.g., database data) and their respective portions in the transaction's other output (if applicable).
Note that the two-phase commit (2PC) protocol should not be confused with the [[two-phase locking]] (2PL) protocol, a [[concurrency control]] protocol.
==Assumptions==
The protocol works in the following manner: one node is a designated '''coordinator''', which is the master site, and the rest of the nodes in the network are designated the '''participants'''. The protocol assumes that there is [[stable storage]] at each node with a [[Write ahead logging|write-ahead log]], that no node crashes forever, that the data in the write-ahead log is never lost or corrupted in a crash, and that any two nodes can communicate with each other. The last assumption is not too restrictive, as network communication can typically be rerouted. The first two assumptions are much stronger; if a node is totally destroyed then data can be lost.
The protocol is initiated by the coordinator after the last step of the transaction has been reached. The participants then respond with an '''agreement''' message or an '''abort''' message depending on whether the transaction has been processed successfully at the participant.
==Basic algorithm==
===Commit request (or voting) phase===
#The coordinator sends a '''query to commit''' message to all participants and waits until it has received a reply from all participants.
#The participants execute the transaction up to the point where they will be asked to commit. They each write an entry to their ''undo log'' and an entry to their ''[[redo log]]''.
#Each participant replies with an '''agreement''' message (participant votes '''Yes''' to commit), if the participant's actions succeeded, or an '''abort''' message (participant votes '''No''', not to commit), if the participant experiences a failure that will make it impossible to commit.
===Commit (or completion) phase===
====Success====
If the coordinator received an '''agreement''' message from ''all'' participants during the commit-request phase:
#The coordinator sends a '''commit''' message to all the participants.
#Each participant completes the operation, and releases all the locks and resources held during the transaction.
#Each participant sends an '''acknowledgment''' to the coordinator.
#The coordinator completes the transaction when all acknowledgments have been received.
====Message flow====
<pre>
Coordinator Participant
QUERY TO COMMIT
-------------------------------->
VOTE YES/NO prepare*/abort*
<-------------------------------
commit*/abort* COMMIT/ROLLBACK
-------------------------------->
ACKNOWLEDGMENT commit*/abort*
<--------------------------------
end
</pre>
An * next to the record type means that the record is forced to stable storage.<ref name="mohan1986">[[C. Mohan]], Bruce Lindsay and R. Obermarck (1986): [http://dl.acm.org/citation.cfm?id=7266 "Transaction management in the R* distributed database management system"],''ACM Transactions on Database Systems (TODS)'', Volume 11 Issue 4, Dec. 1986, Pages 378 - 396</ref>
==Disadvantages==
The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: After a participant has sent an '''agreement''' message to the coordinator, it will block until a '''commit''' or '''rollback''' is received.
==Implementing the two-phase commit protocol==
===Common architecture===
In many cases the 2PC protocol is distributed in a computer network. It is easily distributed by implementing multiple dedicated 2PC components similar to each other, typically named ''[[Transaction manager]]s'' (TMs; also referred to as ''2PC agents'' or Transaction Processing Monitors), that carry out the protocol's execution for each transaction (e.g., [[The Open Group]]'s [[X/Open XA]]). The databases involved with a distributed transaction, the ''participants'', both the coordinator and participants, ''register'' to close TMs (typically residing on respective same network nodes as the participants) for terminating that transaction using 2PC. Each distributed transaction has an ad hoc set of TMs, the TMs to which the transaction participants register. A leader, the coordinator TM, exists for each transaction to coordinate 2PC for it, typically the TM of the coordinator database. However, the coordinator role can be transferred to another TM for performance or reliability reasons. Rather than exchanging 2PC messages among themselves, the participants exchange the messages with their respective TMs. The relevant TMs communicate among themselves to execute the 2PC protocol schema above, "representing" the respective participants, for terminating that transaction. With this architecture the protocol is fully distributed (does not need any central processing component or data structure), and scales up with number of network nodes (network size) effectively.
This common architecture is also effective for the distribution of other [[Atomic commit|atomic commitment protocol]]s besides 2PC, since all such protocols use the same voting mechanism and outcome propagation to protocol participants.<ref name="bernstein1987" /><ref name="weikum2001" />
===Protocol optimizations===
[[Database]] research has been done on ways to get most of the benefits of the two-phase commit protocol while reducing costs by ''protocol optimizations''<ref name="bernstein1987" /><ref name="weikum2001" /><ref name="Bern2009" /> and protocol operations saving under certain system's behavior assumptions.
====Presumed Abort and Presumed Commit====
''Presumed abort'' or ''Presumed commit'' are common such optimizations.<ref name="weikum2001" /><ref name=Bern2009/><ref name="mohan1983">[[C. Mohan]], Bruce Lindsay (1985): [http://portal.acm.org/citation.cfm?id=850772 "Efficient commit protocols for the tree of processes model of distributed transactions"],''ACM SIGOPS Operating Systems Review'',
19(2),pp. 40-52 (April 1985)</ref> An assumption about the outcome of transactions, either commit, or abort, can save both messages and logging operations by the participants during the 2PC protocol's execution. For example, when presumed abort, if during system recovery from failure no logged evidence for commit of some transaction is found by the recovery procedure, then it assumes that the transaction has been aborted, and acts accordingly. This means that it does not matter if aborts are logged at all, and such logging can be saved under this assumption. Typically a penalty of additional operations is paid during recovery from failure, depending on optimization type. Thus the best variant of optimization, if any, is chosen according to failure and transaction outcome statistics.
==See also==
*[[Atomic commit]]
*[[Commit (data management)]]
*[[Three-phase commit protocol]]
*[[X/Open XA|XA]]
*[[Paxos algorithm]]
*[[Two Generals' Problem]]
==References==
{{Reflist}}
==External links==
*[http://exploredatabase.blogspot.in/2014/07/two-phase-commit-protocol-in-pictures.html Two Phase Commit protocol explained in Pictures] by exploreDatabase
{{DEFAULTSORT:Two-Phase Commit Protocol}}
[[Category:Data management]]
[[Category:Transaction processing]]' |
New page wikitext, after the edit (new_wikitext ) | '{{Redirect|2PC|the play in American and Canadian football|Two-point conversion|the cryptographic protocol|Commitment scheme}}
In [[transaction processing]], [[database]]s, and [[computer networking]], the '''two-phase commit protocol''' ('''2PC''') is a type of [[Atomic commit|atomic commitment protocol]] (ACP). It is a [[distributed algorithm]] that coordinates all the processes that participate in a [[Distributed transaction|distributed atomic transaction]] on whether to ''[[Commit (data management)|commit]]'' or ''abort'' (''roll back'') the transaction (it is a specialized type of [[Consensus (computer science)|consensus]] protocol). The protocol achieves its goal even in many cases of temporary system failure (involving either process, network node, communication, etc. failures), and is thus widely used.<ref name="bernstein1987">[[Phil Bernstein|Philip A. Bernstein]], Vassos Hadzilacos, Nathan Goodman (1987): [http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx ''Concurrency Control and Recovery in Database Systems''], Chapter 7, Addison Wesley Publishing Company, {{ISBN|0-201-10715-5}}</ref><ref name="weikum2001">[[Gerhard Weikum]], Gottfried Vossen (2001): [http://www.elsevier.com/wps/find/bookdescription.cws_home/677937/description#description ''Transactional Information Systems''], Chapter 19, Elsevier, {{ISBN|1-55860-508-8}}</ref><ref name=Bern2009>Philip A. Bernstein, Eric Newcomer (2009): [http://www.elsevierdirect.com/product.jsp?isbn=9781558606234 ''Principles of Transaction Processing'', 2nd Edition], Chapter 8, Morgan Kaufmann (Elsevier), {{ISBN|978-1-55860-623-4}}</ref>
However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome. To accommodate recovery from failure (automatic in most cases) the protocol's participants use [[Server log|logging]] of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's [[recovery procedure]]s. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol.
In a "normal execution" of any single [[distributed transaction]] (i.e., when no failure occurs, which is typically the most frequent situation), the protocol consists of two phases:
#The ''commit-request phase'' (or ''voting phase''), in which a ''coordinator'' process attempts to prepare all the transaction's participating processes (named ''participants'', ''cohorts'', or ''workers'') to take the necessary steps for either committing or aborting the transaction and to ''vote'', either "Yes": commit (if the transaction participant's local portion execution has ended properly), or "No": abort (if a problem has been detected with the local portion), and
#The ''commit phase'', in which, based on ''voting'' of the participants, the coordinator decides whether to commit (only if ''all'' have voted "Yes") or abort the transaction (otherwise), and notifies the result to all the participants. The participants then follow with the needed actions (commit or abort) with their local transactional resources (also called ''recoverable resources''; e.g., database data) and their respective portions in the transaction's other output (if applicable).
Note that the two-phase commit (2PC) protocol should not be confused with the [[two-phase locking]] (2PL) protocol, a [[concurrency control]] protocol.
==Assumptions==
The protocol works in the following manner: one node is a designated '''coordinator''', which is the master site, and the rest of the nodes in the network are designated the '''participants'''. The protocol assumes that there is [[stable storage]] at each node with a [[Write ahead logging|write-ahead log]], that no node crashes forever, that the data in the write-ahead log is never lost or corrupted in a crash, and that any two nodes can communicate with each other. The last assumption is not too restrictive, as network communication can typically be rerouted. The first two assumptions are much stronger; if a node is totally destroyed then data can be lost.
The protocol is initiated by the coordinator after the last step of the transaction has been reached. The participants then respond with an '''agreement''' message or an '''abort''' message depending on whether the transaction has been processed successfully at the participant.
==Basic algorithm==
===Commit request (or voting) phase===
#The coordinator sends a '''query to commit''' message to all participants and waits until it has received a reply from all participants.
#The participants execute the transaction up to the point where they will be asked to commit. They each write an entry to their ''undo log'' and an entry to their ''[[redo log]]''.
#Each participant replies with an '''agreement''' message (participant votes '''Yes''' to commit), if the participant's actions succeeded, or an '''abort''' message (participant votes '''No''', not to commit), if the participant experiences a failure that will make it impossible to commit.
===Commit (or completion) phase===
====Success====
If the coordinator received an '''agreement''' message from ''all'' participants during the commit-request phase:
#The coordinator sends a '''commit''' message to all the participants.
#Each participant completes the operation, and releases all the locks and resources held during the transaction.
#Each participant sends an '''acknowledgment''' to the coordinator.
#The coordinator completes the transaction when all acknowledgments have been received.
====Message flow====
<pre>
Coordinator Participant
QUERY TO COMMIT
-------------------------------->
VOTE YES/NO prepare*/abort*
<-------------------------------
commit*/abort* COMMIT/ROLLBACK
-------------------------------->
ACKNOWLEDGMENT commit*/abort*
<--------------------------------
end
</pre>
An * next to the record type means that the record is forced to stable storage.<ref name="mohan1986">[[C. Mohan]], Bruce Lindsay and R. Obermarck (1986): [http://dl.acm.org/citation.cfm?id=7266 "Transaction management in the R* distributed database management system"],''ACM Transactions on Database Systems (TODS)'', Volume 11 Issue 4, Dec. 1986, Pages 378 - 396</ref>
==Disadvantages==
The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: After a participant has sent an '''agreement''' message to the coordinator, it will block until a '''commit''' or '''rollback''' is received.
==Implementing the two-phase commit protocol==
===Common architecture===
In many cases the 2PC protocol is distributed in a computer network. It is easily distributed by implementing multiple dedicated 2PC components similar to each other, typically named ''[[Transaction manager]]s'' (TMs; also referred to as ''2PC agents'' or Transaction Processing Monitors), that carry out the protocol's execution for each transaction (e.g., [[The Open Group]]'s [[X/Open XA]]). The databases involved with a distributed transaction, the ''participants'', both the coordinator and participants, ''register'' to close TMs (typically residing on respective same network nodes as the participants) for terminating that transaction using 2PC. Each distributed transaction has an ad hoc set of TMs, the TMs to which the transaction participants register. A leader, the coordinator TM, exists for each transaction to coordinate 2PC for it, typically the TM of the coordinator database. However, the coordinator role can be transferred to another TM for performance or reliability reasons. Rather than exchanging 2PC messages among themselves, the participants exchange the messages with their respective TMs. The relevant TMs communicate among themselves to execute the 2PC protocol schema above, "representing" the respective participants, for terminating that transaction. With this architecture the protocol is fully distributed (does not need any central processing component or data structure), and scales up with number of network nodes (network size) effectively.
This common architecture is also effective for the distribution of other [[Atomic commit|atomic commitment protocol]]s besides 2PC, since all such protocols use the same voting mechanism and outcome propagation to protocol participants.<ref name="bernstein1987" /><ref name="weikum2001" />
===Protocol optimizations===
[[Database]] research has been done on ways to get most of the benefits of the two-phase commit protocol while reducing costs by ''protocol optimizations''<ref name="bernstein1987" /><ref name="weikum2001" /><ref name="Bern2009" /> and protocol operations saving under certain system's behavior assumptions.
==See also==
*[[Atomic commit]]
*[[Commit (data management)]]
*[[Three-phase commit protocol]]
*[[X/Open XA|XA]]
*[[Paxos algorithm]]
*[[Two Generals' Problem]]
==References==
{{Reflist}}
==External links==
*[http://exploredatabase.blogspot.in/2014/07/two-phase-commit-protocol-in-pictures.html Two Phase Commit protocol explained in Pictures] by exploreDatabase
{{DEFAULTSORT:Two-Phase Commit Protocol}}
[[Category:Data management]]
[[Category:Transaction processing]]' |
Unified diff of changes made by edit (edit_diff ) | '@@ -59,8 +59,4 @@
===Protocol optimizations===
[[Database]] research has been done on ways to get most of the benefits of the two-phase commit protocol while reducing costs by ''protocol optimizations''<ref name="bernstein1987" /><ref name="weikum2001" /><ref name="Bern2009" /> and protocol operations saving under certain system's behavior assumptions.
-
-====Presumed Abort and Presumed Commit====
-''Presumed abort'' or ''Presumed commit'' are common such optimizations.<ref name="weikum2001" /><ref name=Bern2009/><ref name="mohan1983">[[C. Mohan]], Bruce Lindsay (1985): [http://portal.acm.org/citation.cfm?id=850772 "Efficient commit protocols for the tree of processes model of distributed transactions"],''ACM SIGOPS Operating Systems Review'',
-19(2),pp. 40-52 (April 1985)</ref> An assumption about the outcome of transactions, either commit, or abort, can save both messages and logging operations by the participants during the 2PC protocol's execution. For example, when presumed abort, if during system recovery from failure no logged evidence for commit of some transaction is found by the recovery procedure, then it assumes that the transaction has been aborted, and acts accordingly. This means that it does not matter if aborts are logged at all, and such logging can be saved under this assumption. Typically a penalty of additional operations is paid during recovery from failure, depending on optimization type. Thus the best variant of optimization, if any, is chosen according to failure and transaction outcome statistics.
==See also==
' |
Lines removed in edit (removed_lines ) | [
0 => false,
1 => '====Presumed Abort and Presumed Commit====',
2 => '''Presumed abort'' or ''Presumed commit'' are common such optimizations.<ref name="weikum2001" /><ref name=Bern2009/><ref name="mohan1983">[[C. Mohan]], Bruce Lindsay (1985): [http://portal.acm.org/citation.cfm?id=850772 "Efficient commit protocols for the tree of processes model of distributed transactions"],''ACM SIGOPS Operating Systems Review'',',
3 => '19(2),pp. 40-52 (April 1985)</ref> An assumption about the outcome of transactions, either commit, or abort, can save both messages and logging operations by the participants during the 2PC protocol's execution. For example, when presumed abort, if during system recovery from failure no logged evidence for commit of some transaction is found by the recovery procedure, then it assumes that the transaction has been aborted, and acts accordingly. This means that it does not matter if aborts are logged at all, and such logging can be saved under this assumption. Typically a penalty of additional operations is paid during recovery from failure, depending on optimization type. Thus the best variant of optimization, if any, is chosen according to failure and transaction outcome statistics.'
] |