Data replication

Distributed systems replicate data across nodes for various reasons including increasing availability, decreasing response times and improving throughput while keeping replication transparency: the system has identical behaviour to system without replication.

Architectural model #

Node: computer inside a distributed system with unique identifier
Logical data object: unique value with read/write operations
Physical copy: replicate of logical object on a node

Classification #

Syntactic: only consider sets of objects that are read/written by transactions
Semantic: exploit application knowledge about semantics of transactions and data

(only syntactic methods are covered)

Pessimistic methods / strong consistency: guarantee consistent data in every failure situation
Optimistic methods / weak consistency: increase availability by accepting temporary inconsistencies

Type of protocols #

one-copy serializability: equivalent to a serial execution of these transactions without replication, i. e. guarantees that the distributed system behaves like a single, non-distributed system.

Synchoronisation protocol: synchronizes physical operations on objects copies
Replication Management Protocol (RMP): maps logical operations to physical operations
Update Protocol (UP) for physical copies: failures can make physical copies stale 'oudated', ensures that copies are updated after recovery

1 and 2 together guarantee "one-copy" serializability

Pessimistic replication #

Two extremes: "Read one – write all" and "Read all – write one"

Read one – write all
- Increased availability for reading, decreased availability for writing
- Cost of write operations gets higher as degree of replication increases
Read all - write one
- Reads every node to find the latest version (based on timestamp!) but writes only to one node
- Decreased availability for reading, increased availability for writing
- Cost of read operations gets higher as degree of replication increases
Primary copy: has a primary copy $x^*$, the lock for $x^*$ is required for all read or write operations, for reading any copy can be read but for writing $x^*$ and all other copies must be written
- Reading is cheaper and writing is more expensive

Voting protocols #

Each copy gets a number of votes which it uses to vote on read or write operations

Version numbers $VN$: distinguish current from stale version. Each copy has a version number starting at $0$ and is increased to a value greater than all previous version numbers on a write.

Quorums: set of copies with a minimum number of votes

Every logical object $x$ has a read threshold $q_r[x]$ and write threshold $q_w[w]$.
Every physical copy $x_p$ has $q[x_p]$ votes
A set of copies is a read/write quorum if their total number of votes $\geq$ read/write threshold
Minimal quorum: removing the votes of a single copy would make it invalid

Condition: each write columns must overlap with each other write or read quorum:

$q_w[x] > q[x]/2$
$q_r[x] > q_w[x] > q[x]$
Majority consensus
- Each copy has one vote
- $2n + 1$ copies are needed to tolerate $n$ node failures
- In case of partitioning, the nodes in at most one partition can do read or write operations
- Reading: availability lower than "read one, write all" but higher than non-replicted
- Writing: availability is higher than non-replicted or "read one, write all"
Weighted voting
- Each copy has an individual number of votes
- Every logical object can have an individual read/write threshold
- Copies on reliable notes get more votes, increasing availability, unreliable local copies can have 0 notes, failure does not impact availability

Optimistic replication #

Increase availability by decreasing the guaranteed degree of consistency.

Eventual consistency: all replicates will eventually converge into a mutually consistent state.

Update as Soon as Possible #

Node with primary copy $x^*$ has the request queue RQ(x)

An Epidemic Protocol #

Every object has a associated timestamp
Writing: any copy which is then propagated to all other nodes, a node updates its copy only when the timestamp is higher
Reading: any available copy
Anti-entropy (Update Protocol): each node periodically resolves the difference with another pull/push

CAP Theorem #

(also known as Brewer's theorem)

In a distributed system is is impossible to simultaneously provide all three of the following guarantees: strong Consistency, Availability, Partition tolerance.

Pessimistic replication ensures strong consistency with improved availability but assuming no partitioning
Optimistic replication does not provide strong consistency but partitioning is possible