Java Interview Questions #1 - HashMap

in #java6 years ago

How does a HashMap work?

Basic answer:

HashMap contains a number of buckets where it stores objects. Object's hashcode function determines which bucket will get the object. Many different objects may have same hashcode, so they can end up in the same bucket, but "equals" relation makes it possible to distinguish different objects.

Few more facts:

1. Buckets and hashcode

It's worth mentioning that same bucket will be chosen for different objects NOT ONLY when they have same hashcode. Under the hood values are stored in the array and proper index is calculated by transforming hashcode to one value between 0 and array size. Array size should be always a power of two, so it basically takes less significant bytes by applying binary "AND" operation for hashcode and (arraySize-1). Example:

hashCode = (HEX) A41E = (BIN) 1010 0100 0001 1110
arraySize = (DEC) 32 = (HEX) 20 = (BIN) 0010 0000
arraySize-1 = (BIN) 0001 1111

hashCode & (arraySize-1) = (BIN) 1010 0100 0001 1110 & 0001 1111 = (BIN) 0001 1110 = (DEC) 30

So in this example, table index would be 30 which is lower than arraySize = 32.

Initially array size is 16 and it grows only when load factor rises over the load factor. Initial values for Java 9 are:

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16


    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

2. Internal structure

HashMap is wrapping objects with Entries (Java < 8) or Nodes (Java >=8). This structure holds information about key, value, and also links to the next object in the same bucket. This allows to store many entries in the same bucket. Structure looks like:

array:
[0] -> null
[1] -> Node -> Node -> null
[2] -> Node -> null
[3] -> null
...

so basically, it's an array of lists.

for Java>=8 implementation looks like:

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
    ...

but when number of elements in bucket is bigger than the threshold it's converted into tree. To be more specific it's a red-black tree: https://en.wikipedia.org/wiki/Red%E2%80%93black_tree .

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;

Nodes are converted to TreeNodes which contains more info about tree structure:

    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;

3. Computational complexity.

We can assume that hash map with not so many object with well-designed hashcode function will be O(1) for insertion, selection and removal. Unfortunately, while it stores more objects, more and more objects are put into same bucket which gives us O(b) complexity for Java<8 and O(lg(b)) for Java>=8 (b is bucket size). Badly designed hashcode would also cause same problem.

=======

Few words about the project.
I've been working as a technical interviewer for a few years in 2 companies. Unfortunately, what I notice is that many candidates do not take care about subjects I find important. Like computational complexity or proper usage of collections. I've decided to collect questions with some example answers so maybe someone can learn what kind of knowledge is expected on most interviews for a Java developer.

Coin Marketplace

STEEM 0.27
TRX 0.11
JST 0.030
BTC 70765.98
ETH 3797.96
USDT 1.00
SBD 3.46