Data replication

Distributed systems replicate data across nodes for various reasons including increasing availability, decreasing response times and improving throughput while keeping replication transparency: the system has identical behaviour to system without replication.

Architectural model #

  • Node: computer inside a distributed system with unique identifier
  • Logical data object: unique value with read/write operations
  • Physical copy: replicate of logical object on a node

Classification #

  • Syntactic: only consider sets of objects that are read/written by transactions
  • Semantic: exploit application knowledge about semantics of transactions and data

(only syntactic methods are covered)

  • Pessimistic methods / strong consistency: guarantee consistent data in every failure situation
  • Optimistic methods / weak consistency: increase availability by accepting temporary inconsistencies

Type of protocols #

one-copy serializability: equivalent to a serial execution of these transactions without replication, i. e. guarantees that the distributed system behaves like a single, non-distributed system.

  1. Synchoronisation protocol: synchronizes physical operations on objects copies
  2. Replication Management Protocol (RMP): maps logical operations to physical operations
  3. Update Protocol (UP) for physical copies: failures can make physical copies stale 'oudated', ensures that copies are updated after recovery

1 and 2 together guarantee "one-copy" serializability

Pessimistic replication #

Two extremes: "Read one – write all" and "Read all – write one"

  • Read one – write all
    • Increased availability for reading, decreased availability for writing
    • Cost of write operations gets higher as degree of replication increases
  • Read all - write one
    • Reads every node to find the latest version (based on timestamp!) but writes only to one node
    • Decreased availability for reading, increased availability for writing
    • Cost of read operations gets higher as degree of replication increases
  • Primary copy: has a primary copy $x^*$, the lock for $x^*$ is required for all read or write operations, for reading any copy can be read but for writing $x^*$ and all other copies must be written
    • Reading is cheaper and writing is more expensive

Voting protocols #

Each copy gets a number of votes which it uses to vote on read or write operations

Version numbers $VN$: distinguish current from stale version. Each copy has a version number starting at $0$ and is increased to a value greater than all previous version numbers on a write.

Quorums: set of copies with a minimum number of votes

  • Every logical object $x$ has a read threshold $q_r[x]$ and write threshold $q_w[w]$.
  • Every physical copy $x_p$ has $q[x_p]$ votes
  • A set of copies is a read/write quorum if their total number of votes $\geq$ read/write threshold
  • Minimal quorum: removing the votes of a single copy would make it invalid

Condition: each write columns must overlap with each other write or read quorum:

  • $q_w[x] > q[x]/2$

  • $q_r[x] > q_w[x] > q[x]$

  • Majority consensus

    • Each copy has one vote
    • $2n + 1$ copies are needed to tolerate $n$ node failures
    • In case of partitioning, the nodes in at most one partition can do read or write operations
    • Reading: availability lower than "read one, write all" but higher than non-replicted
    • Writing: availability is higher than non-replicted or "read one, write all"
  • Weighted voting

    • Each copy has an individual number of votes
    • Every logical object can have an individual read/write threshold
    • Copies on reliable notes get more votes, increasing availability, unreliable local copies can have 0 notes, failure does not impact availability

Optimistic replication #

Increase availability by decreasing the guaranteed degree of consistency.

Eventual consistency: all replicates will eventually converge into a mutually consistent state.

Update as Soon as Possible #

Node with primary copy $x^*$ has the request queue RQ(x)

An Epidemic Protocol #

  • Every object has a associated timestamp
  • Writing: any copy which is then propagated to all other nodes, a node updates its copy only when the timestamp is higher
  • Reading: any available copy
  • Anti-entropy (Update Protocol): each node periodically resolves the difference with another pull/push

CAP Theorem #

(also known as Brewer's theorem)

In a distributed system is is impossible to simultaneously provide all three of the following guarantees: strong Consistency, Availability, Partition tolerance.

  • Pessimistic replication ensures strong consistency with improved availability but assuming no partitioning
  • Optimistic replication does not provide strong consistency but partitioning is possible