HashSets and HashTables in Python

Hashtables And Hashsets

There are many ways in which data can be stored and retrieved. Let us say you want to store the data related to a cricket match. One way to do it is to create a list of lists for each attribute and its value as shown below:

cricketData = [['Name', 'Anthony'], ['Runs', 46], ['Wickets', 2], ['Wides', 3]]
print(cricketData)

Output:

[['Name', 'Anthony'], ['Runs', 46], ['Wickets', 2], ['Wides', 3]]

Here, the data is stored in contiguous memory locations as shown in the figure:

Array Storage

In this representation, the data can be retrieved using a for loop. If you want to know how many runs were scored, you can write a simple for loop as:

for ele in cricketData:
    if ele[0]=='Runs':
        print(ele[1])

Output:

46

This is a time-consuming process, especially when the dataset is large. A better way to do this is to turn the data into pairs of key-value format.


Working with Hashtables in Python

Sometimes, instead of the sequential data storage structure, we need to map the data to its corresponding information. This is known as mapping data structure.

One of the most popular and important such representations of data is a hash table. 

A hash table, also known as a hash map, stores information in the form of key-value pairs. Each key is unique and is used as a reference to identify the data associated with it. This is a reason why hash tables are used as a look-up data structure. Hash tables provide fast insertion and access of key-value pairs stored in them.

The keys are mapped to values and stored in the memory using a hash function. Data of any size can be mapped to fixed-size values using the hashing algorithm. The hash function can be any function like mod (%), plus(+) or any custom function based on the need.

Key Value Pairs

Hash tables are implemented in Python using the built-in data-type called a dictionary. Dictionary is a Python specific implementation of a hash table. Let us see how to perform different operations on hash tables using Python.

Creating a hash table in Python using a dictionary

You can use the curly brackets {} or the dict() keyword to create a dictionary in Python.

cricket = {'Name': 'Anthony', 'Runs': 46, 'Wickets': 2, 'Wides': 3}
print(cricket)
print(type(cricket))
{'Name': 'Anthony', 'Runs': 46, 'Wickets': 2, 'Wides': 3}
<class 'dict'>

Now that we have created a hash table, we can access the data stored in it.

Accessing the data inside a hash table

Dictionaries allow you to access items using its key name.

print(cricket['Name'], "scored", cricket['Runs'], "runs and took", cricket['Wickets'], "wickets")
Anthony scored 46 runs and took 2 wickets

You can also use the get() method to get the value for a particular key.

print(cricket.get('Runs'))
46

You can get only the keys and only the values using the keys() and values() function respectively.

print(cricket.keys())
dict_keys(['Name', 'Runs', 'Wickets', 'Wides'])
print(cricket.values())
dict_values(['Anthony', 46, 2, 3])

Updating the values in a hash table

Since dictionaries are mutable, we can change the values for the keys inside it as per our requirements.
If you want to change the runs scored to 58, you can write:

cricket['Runs'] = 58
print(cricket['Name'], "scored", cricket['Runs'], "runs")
Anthony scored 58 runs

Adding more key-value pairs:

cricket['No Balls'] = 2
cricket['Players'] = 11
print(cricket)
{'Name': 'Anthony', 'Runs': 58, 'Wickets': 2, 'Wides': 3, 'No Balls': 2, 'Players': 11}

Deleting the values from a hash table in Python

There are many functions that let you delete values from a dictionary.

del(): Removes the key-value pair from the dictionary for the given key.

del cricket['Wides']
print(cricket)
{'Name': 'Anthony', 'Runs': 58, 'Wickets': 2, 'No Balls': 2, 'Players': 11}

pop(): Similar to del, removes the key-value pair for given key.

cricket.pop('Wickets')
print(cricket)
{'Name': 'Anthony', 'Runs': 58, 'No Balls': 2, 'Players': 11}

popitem(): This function removes the last inserted item from hash table.

cricket.popitem()
print(cricket)
{'Name': 'Anthony', 'Runs': 58, 'No Balls': 2}

Working with HashSets in Python

Similar to hash table, a hash set is also a collection of objects. In hash table, data was stored in the form of key-value pairs, whereas in hash sets, the data is stored as objects. A hash set internally uses the hash table data structure to store data items. Just like a set, a hash set also does not allow storage of duplicate elements.

In Python, the implementation of a hash set is done by using a set itself. You can perform insert, update, delete and other operations on the hash set using Python.

Creating a hash set using Python set

You can either use {} or the set() constructor to create a set.

colours = {'red', 'green', 'pink', 'blue'}
print(colours)
{'green', 'blue', 'red', 'pink'}

Or,

coloursList = ['red', 'green', 'pink', 'blue']
coloursSet = set(colours)
print(coloursSet)
{'green', 'blue', 'red', 'pink'}

Accessing elements in a set

You can the belonging of an element to a set using in keyword. This is a Pythonic way of doing it.

if 'pink' in colours:
    print('Pink is present in the set')
else:
    print('Pink not found in the set')
Pink is present in the set

All elements in the set can be printed as follows:

for c in colours:
    print(c)
green
blue
red
pink

Updating a HashSet

Since the values in sets cannot be, you can only add new elements to the set using various functions.

add(): Using this function you can add one new item to the set.

print(colours)
colours.add('yellow')
print(colours)
{'green', 'blue', 'red', 'pink'}
{'yellow', 'pink', 'blue', 'green', 'red'}

update(): Add elements from any iterable object list, tuple, etc. You can also add elements from another set into the current set using this function.

numbers = [2, 5, 4]
print(colours)
colours.update(numbers)
print(colours)
{'yellow', 'pink', 'blue', 'green', 'red'}
{2, 4, 5, 'yellow', 'pink', 'blue', 'green', 'red'}

Remove elements from the hashset

remove() and discard(): Both these functions remove the mentioned element from the set.

colours.remove(4)
print(colours)

colours.discard('red')
print(colours)
{2, 5, 'yellow', 'pink', 'blue', 'green', 'red'}
{2, 5, 'yellow', 'pink', 'blue', 'green'}

pop(): This method removes any random element from the set and returns it.

colours.pop()
print(colours)
  • clear(): It completely empties the set.
  • del(): This function will delete the set.
colours.clear()
print(colours)
set()

Conclusion

In conclusion, Hash Table and Hash Set are the data structures used to store data in the form of key-value pairs and objects respectively. The hashing function works for only one element in a hash set as compared to two elements in a hash table. Usage-wise, we utilise Set implementation when the task is to carry out a check for the existence of an element. The code is simpler and easier to grasp. We employ Map implementation when the task calls for storing data for elements or when key-based faster search operations are needed.


Reference