HashMap in c++

HashMap in c++

Data structures

Data structures are the way to store and retrieve data efficiently. So Let's find out the basic data structures in c/c++ we have discussed so far from previous blogs.

  • numbers (int, float, double)
  • character (char)
  • pointers
  • arrays

There are several types of DS out there but we have not discussed them yet.

Hashmap/map

Now suppose we need to store data in a key-value pair fashion, where both key and value are data. Can we store data in such a manner with the above data structures? No, we can't.

You may argue why we can't write string role = "Iron man" where role is key and Iron man is value.

But here only Iron man is our data, role is just a variable name. Now if you can use structure as key-value pair, but this logic does not work when key names are dynamic.

So hashmapis the solution. This is a new type of data structure, where key and value both are our data and key is unique.

Let's code

Suppose we want to store details about our favorite marvel's character as per the picture.

ironman.png

The standard library for hashmap in c++ is unordered_map, all operations(CRUD) in hash map take O(1).

#include <iostream>
#include <string>
#include <unordered_map> // header file to use hash map

using namespace std;

int main()
{

    // <key_type, value_type> define the type of key and value
    // here key and value both are of string type
    unordered_map<string, string> favMarvelCharacter;

    // different ways to insert data
    favMarvelCharacter.insert({"real name", "Robert Jr."});
    favMarvelCharacter["reel name"] = "Toney";
    pair<string, string> p = {"role", "Iron man"};
    favMarvelCharacter.insert(p);
    favMarvelCharacter.insert({"sex", "Male"});

    // accessing the value with key
    cout << favMarvelCharacter["role"]; // Iron man

    // retriving with [] operator insert default value if key doesn't exist, <key, default_value>
    // default value of string is null
    cout << favMarvelCharacter["height"] << endl; // will print nothing but insert {"height", '\n} as height is not inserted before

    // retrieving the value with at() function, cause exception if key doesn't exist
    cout << favMarvelCharacter.at("role"); // Iron man

    cout << favMarvelCharacter.at("age"); // exception, program crashes

    // modify the exisitng value
    favMarvelCharacter["height"] = "6 ft";
    favMarvelCharacter.at("height") = "5.7ft";

    return 0;
}

Syntax details

details.png

Check if a key exists?

If the key doesn't exist then accessing value with at( ) causes an exception and [ ] operator inserts the default value, so what's the solution then. Well, there is an inbuilt function count which outputs either 0 if the key doesn't exist or 1 in case the key exists. It can't output > 1 since keys are unique.

    favMarvelCharacter.count("role"); // 1
    favMarvelCharacter.count("height"); // 1
    favMarvelCharacter.count("weight"); // 0

Accessing all keys

A Hash map doesn't guarantee the order of elements.

for ( auto pair: favMarvelCharacter)
{
    cout << pair.first << ", ";
}

// real name, reel name, role, height, sex

Accessing all values

for (const auto &pair: favMarvelCharacter)
{
    cout << pair.second << ", ";
}

// Robert Jr., Toney, Iron man, Male, 5.7ft

Map of string and function

Every newbie thinks it's a complex thing but no, it's not complex at all think like the key is of string type and value is a function pointer type which will take 2 int arg and return int.

map_of_function.png

#include <iostream>
#include <unordered_map>

using namespace std;

// a normal fun which takes 2 int args and return int
int myfun(int x, int y)
{
    if (y == 0)
        return -1;
    return x / y;
};

int main()
{

    //<key_type, value_type>
    unordered_map<string, int (*)(int, int)> mapOfFunctions; // value type is funtion_pointer

    // mapping a lambda function ;
    mapOfFunctions["add"] = [](int x, int y)
    { return x + y; };

    mapOfFunctions["multiply"] = [](int x, int y)
    { return x * y; };

    // mapping a normal function
    mapOfFunctions["divide"] = myfun;

    // calling the function
    cout << mapOfFunctions["add"](5, 10);     //  15
    cout << mapOfFunctions["divide"](20, 10); // 2
    cout << endl;

    return 0;
}

Deep dive

In c++ there is another library map which stores key-value pairs using BST, Binary Search Tree, where each operation takes O(log n) and order is guaranteed either ascending or descending way, But we won't discuss it here.

unordered_map uses hashtable. Hash table is the actual key-value pair underneath the unordered_map. We will look in a basic manner at how the hash table stores the data. To work with a hash table we need a hash function, a function that takes any arbitrary size input and returns a fixed-size value called a hash value. A specific hash function always maps the same hash value for specific input. Consider a basic hash function that takes a string and maps an integer.

int hash_fun(string key)
{
    int hash = 557;
    int c, i = 0;
    while (c = key[i++])
    {
        hash = (hash + c);
    }
    return hash;
}

An actual hash function is optimized to work with large data set to prevent collision for two different values, i.e. doesn't produce the same result for different inputs. And if a collision occurs it is handled by using a linked list for entries. But in our case, hash_fun is neither collision prone nor optimized. Now, let's understand how hashmap works through the picture.

  1. The hash function produces a hash value for each key.
  2. The hash value is used as an index of a bucket array, where it points to another array or linked list having an actual key and value.

map.png hashmap.png

If you want to implement a basic hash map in your way in c++ then an article by Abdullah Ozturk may be helpful.