Hash table data structure (aka dictionary, hash map, associate array) is a key-value pairs mapping backed by a resizeable array data structure

### Attributes

Hash table uses a load factor to decide when to grow the internal array capacity

A hash function is used to generate a hash code and array index to properly distribute entries across the array buckets

Hash codes generated by the hash function may be duplicated. Separate chaining through a linked list or red-black tree data structure is the most common solution to resolve the problem

### Public operations

Search, aka Get: retrieve the value by key

Add, aka Put: add a key-value entry into the hash table

Delete, aka Remove: remove an entry from the hash table by key

### Internal operations

Hashing: properly distribute entries across the hash table

Resize: auto grow hash table capacity

Rehash: re-execute hashing operation for every entry in the buckets

### Time complexity

- Hash table offers constant-time O(1) on average and linear-time O(n) in worst case performance for search, add and delete operations

Let's walk through this tutorial to explore them in more details

### The load factor

Hash table is backed by an array and it uses the load factor to decide when to grow the array capacity

Load factor is defined as below

```
loadFactor = numberOfEntries / arrayCapacity
arrayCapacity, aka the number of buckets
```

As a general rule, `loadFactor = 0.75`

offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup costs and vice versa

### Hashing operation

Hashing is the hash table internal operation which distributes entries across an array

Hash table uses a hash function to compute the key object to a hash code number

```
hashCode = hashFunction(keyObject)
```

Hash code is reduced to array index via modulo operator %

```
index = hashCode % arrayCapacity
```

Then the computed index is used to assign value object to the array

```
array[index] = valueObject
```

### Hash collision

There's no perfect hash function. The hash codes computed by the hash function may be duplicate. However, the less duplicate hash codes, the better performance gain

There're some solutions to resolve the hash collision. Use a linked list or red-black tree data structure to save duplicate hash code entries in separate chaining is the most common way

### Resize and rehash operations

Resize happens when the number of entries in the hash table is greater than the product of load factor and the array capacity

```
numberOfEntries > loadFactor * arrayCapacity
```

Hashing operation has to be executed again for every entry in the buckets right after the execution of resize operation

Resize and rehash operations are costly in term of performance, so choosing the initial capacity when creating a new Hash table is important to minimize the performance impact

### Implementation example

```
```

Output

```
k3=3
k1=1
k2=2
3
2
1
null
2
```

### Applications

Hash table can be used to implement associative arrays, caches, database indexing and sets data structure