Transaction processing
Transaction: abstraction of an atomic and reliable execution of a program / a set of logically coherent operations
Notation #
- $r_i[x]$ = Transaction $T_i$ reads object $x$
- $w_i[x]$ = Transaction $T_i$ writes object $x$
- $c_i$ = Transaction $T_i$ executes a commit operation
- $a_i$ = Transaction $T_i$ executes a abort operation
ACID properties #
- Atomicity: all operations of a transaction are executed or none.
- Consistency: a transaction transforms a consistent state into another consistent state.
- Isolation: the effects of a transaction are only visible after its successful completion.
- Durability: the result of a successful transaction is permanent.
Architectural Model #
Every node has a:
- Transaction Manager (TM)
- Registers and interacts with transactions
- Decides the node to execute the transaction on and hands the operation over to the Scheduler of that node
- Scheduler (S)
- Controls the concurrent execution of transactions with the goal of creating a serializable execution sequence
- Can perform the following operations: Execute / Reject / Delay
- Data Manager (DM)
- Reads and writes the data to the database
- Commitment: transaction changes are made permanent
- Abort: transaction changes are discarded
Serializable #
Serializable: when concurrent execution of transactions is equivalent to any serial execution
- Lost Update Problem: two or more transactions read the same data and update it simultaneously, leading to the final state reflecting only the last update, "losing" the earlier update.
- Dirty Read Problem: a transaction reads uncommitted changes made by another concurrent transaction, leading to inconsistencies in the output.
The execution of the transactions in these problems is not serializable and hence incorrect!
Correctness #
$(T_i, <_i)$ is a partial order on the operations of $T_i$ where:
- $T_i \subseteq \{ r_i[x], w_i[w] \mid \text{x is a data object} \} \cup \{ a_i, c_i \}$ (defines the possible operations)
- $a_i \in T_i$ if $c_i \notin T_i$ (either contains a commit or an abort)
- $t_i = a_i \lor t_i = c_i \ \Rightarrow\ \forall p_i \in T_i \setminus \{ t_i \}: p_i <_i t_i$ (commit or abort is the last operation in the sequence)
- $r_i[x], w_i[x] \in T_i \ \Rightarrow\ r_i[x] <_i w_i[x] \lor w_i[x] <_i r_i[x] $ ($<_i$ has to define an order between a read and write operation on every object $x$)
Two operations of different transactions have a conflict when:
- they both access the same object
- at least one of them is a write operation
History #
A history is a partial order $(S, <_H)$ defining the execution order of all transactions.
Set of all transactions $T = \{ T_1, T_2, \ldots, T_n \}$
- Contains all operations of al transactions: $$ S = \bigcup_{i=1}^n T_i $$
- The order of all transactions is maintained $$ <_H \ \supseteq\ \bigcup_{i=1}^n <_i $$
- For all conflicting operations $p, q \in S \Rightarrow p <_H q \lor q <_H p$
Histories are equivalent if they are defined over the same set of transactions AND conflicting operations are put in same order.
Serializability #
- A history $H$ is serial if for any $(T_i, T_k) \in H$ all operations of $T_i$ appear before $T_k$
- A history $H$ is serializable if it is equivalent to a serial history $H_s$
- A serial(izable) history is correct
- Serialization Graph $SG(H)$ of history $H$: a directed graph where the vertices are the transactions and edges are the conflicting pairs
- A history $H$ is serializable iff $SG(H)$ has no cycles
Synchoronization algorithms #
Ensure that the outcome of executing concurrent transactions is equivalent to some serial execution of those transactions, i.e that the transaction histories are serializable (and thus correct!).
- Pessimistic algorithms: locking / timestamp-based
- Optimistic algorithms: 3 phases, validates and aborts if cannot do commit without violating serialisability
Locking algorithms #
- Each object has an associated lock.
- Transactions can only access objects if they have obtained the lock.
- The Scheduler (S) guarantees that at most one transaction can hold the lock of a object. If an object is already locked then the transaction gets delayed until the lock is released.
- Locks can have different locking modes that might be compatible with each other, e.g. either multiple readers or one writer
Two-Phase Locking (2PL) #
Notation:
- $T_i$ obtains a read lock $rL_i[x]$ or write lock $wL_i[x]$ for object $x$
- $T_i$ releases a read lock $rU_i[x]$ or write lock $wU_i[x]$ for object $x$
If the Scheduler (S) receives an operation that is in conflict with an already obtained lock, the operation is delayed until the lock can be obtained. Otherwise, it obtains the lock and sends the operation to the Data Manager (DM) and holds to lock until execution is confirmed.
2-phase locking rule: once S has released a lock for a transaction, it may not obtain any lock for the same transaction again.
The 2-phase locking rule guarantees the history is serializable, however there is a danger of deadlocks!
Cascading abort: when the abort of one transaction forces other transactions to be aborted.
Strict Two-Phase Locking: changes the 2-phase locking rule to only release the locks for $T_i$ after the DM has confirmed the execution of $c_i$ or $a_i$. This simplifies the scheduler and prevents cascading aborts.
Recovery #
Local recovery: Intentions List #
Architectural Model includes two storage types:
- Volatile storage: main memory, lost in case of node failure
- Stable storage: survives node failures, write operation is atomic. Fault-tolerance can be increased by disk replication.
Intentions lists (IL): is kept for each transaction, write operations are not executed immediately but inserted in the IL. After commitment of the transaction the write operations in the IL are executed so no undo operations are necessary for aborted transactions.
On commit the IL is written to stable storage before the write operations are executed, this allows the IL to be recovered in case of node failures.
Distributed recovery: 2-Phase Commit Protocol (2PC) #
Phase 1: Coordinator (node which initiated transaction) asks all participants (other nodes) if they are ready to commit. Participants vote yes or no if they are ready or not. If all participants vote yes the coordinator decides
Phase 2: Coordinator propagates its decision and the participating DMs terminate consistently.
Failure modes:
- Communication failures: coordinator periodically resends COMMIT/ABORT until all it receives ACK from all participants. Cooperative Termination: instead of blocking participant nodes send a DECISION-REQ to all other participants.
- Node failures: each TM maintains a TM log on stable storage which is written before a message is send