Data replication
Distributed systems replicate data across nodes for various reasons including increasing availability, decreasing response times and improving throughput while keeping replication transparency: the system has identical behaviour to system without replication.
Architectural model #
- Node: computer inside a distributed system with unique identifier
- Logical data object: unique value with read/write operations
- Physical copy: replicate of logical object on a node
Classification #
- Syntactic: only consider sets of objects that are read/written by transactions
- Semantic: exploit application knowledge about semantics of transactions and data
(only syntactic methods are covered)
- Pessimistic methods / strong consistency: guarantee consistent data in every failure situation
- Optimistic methods / weak consistency: increase availability by accepting temporary inconsistencies
Type of protocols #
one-copy serializability: equivalent to a serial execution of these transactions without replication, i. e. guarantees that the distributed system behaves like a single, non-distributed system.
- Synchoronisation protocol: synchronizes physical operations on objects copies
- Replication Management Protocol (RMP): maps logical operations to physical operations
- Update Protocol (UP) for physical copies: failures can make physical copies stale 'oudated', ensures that copies are updated after recovery
1 and 2 together guarantee "one-copy" serializability
Pessimistic replication #
Two extremes: "Read one – write all" and "Read all – write one"
- Read one – write all
- Increased availability for reading, decreased availability for writing
- Cost of write operations gets higher as degree of replication increases
- Read all - write one
- Reads every node to find the latest version (based on timestamp!) but writes only to one node
- Decreased availability for reading, increased availability for writing
- Cost of read operations gets higher as degree of replication increases
- Primary copy: has a primary copy $x^*$, the lock for $x^*$ is required for all read or write operations, for reading any copy can be read but for writing $x^*$ and all other copies must be written
- Reading is cheaper and writing is more expensive
Voting protocols #
Each copy gets a number of votes which it uses to vote on read or write operations
Version numbers $VN$: distinguish current from stale version. Each copy has a version number starting at $0$ and is increased to a value greater than all previous version numbers on a write.
Quorums: set of copies with a minimum number of votes
- Every logical object $x$ has a read threshold $q_r[x]$ and write threshold $q_w[w]$.
- Every physical copy $x_p$ has $q[x_p]$ votes
- A set of copies is a read/write quorum if their total number of votes $\geq$ read/write threshold
- Minimal quorum: removing the votes of a single copy would make it invalid
Condition: each write columns must overlap with each other write or read quorum:
$q_w[x] > q[x]/2$
$q_r[x] > q_w[x] > q[x]$
Majority consensus
- Each copy has one vote
- $2n + 1$ copies are needed to tolerate $n$ node failures
- In case of partitioning, the nodes in at most one partition can do read or write operations
- Reading: availability lower than "read one, write all" but higher than non-replicted
- Writing: availability is higher than non-replicted or "read one, write all"
Weighted voting
- Each copy has an individual number of votes
- Every logical object can have an individual read/write threshold
- Copies on reliable notes get more votes, increasing availability, unreliable local copies can have 0 notes, failure does not impact availability
Optimistic replication #
Increase availability by decreasing the guaranteed degree of consistency.
Eventual consistency: all replicates will eventually converge into a mutually consistent state.
Update as Soon as Possible #
Node with primary copy $x^*$ has the request queue RQ(x)
An Epidemic Protocol #
- Every object has a associated timestamp
- Writing: any copy which is then propagated to all other nodes, a node updates its copy only when the timestamp is higher
- Reading: any available copy
- Anti-entropy (Update Protocol): each node periodically resolves the difference with another pull/push
CAP Theorem #
(also known as Brewer's theorem)
In a distributed system is is impossible to simultaneously provide all three of the following guarantees: strong Consistency, Availability, Partition tolerance.
- Pessimistic replication ensures strong consistency with improved availability but assuming no partitioning
- Optimistic replication does not provide strong consistency but partitioning is possible