A good hash function:

  • Is deterministic (always produces the same output given the same input)
  • Has a uniform distribution over the output range
  • Can be computed fast
  • Each key hashes independently of other keys
  • Uses all of the input key
  • Slight changes of the input should result in a completely different output

Hash table methods:

  • Chaining: linked-list for each location
  • Open addressing: when cell is filled uses a probing sequence

Probing methods:

  • Linear probing: $h(k,i) = (h'(k) + i)\mod m$
    • Causes primary clustering: values close to each other
  • Quadratic probing
    • Also causes secondary hashing
  • Double hashing