Generic selectors
Exact matches only
Search in title
Search in content
Search in posts
Search in pages
wb_sunny

Implementing a Trie Data Structure in Python

Trie Implementation

Trie data structure is very efficient when it comes to information retrieval. It is majorly used in the implementation of dictionaries and phonebooks.

It is also useful for implementing auto-text suggestions you see while typing on a keyboard.

In this tutorial, we will understand how to implement our own trie data structure in Python.

In this tutorial you will learn the following:

  • How to implement your own implementation for a Trie data structure.
  • How to make insertions into a Trie data structure.
  • How to query for words in a Trie data structure.

Implementing the TrieNode class

Let’s start by writing the code for TrieNode class. Each trie node needs have the following fields:

  1. A Character
  2. List of children
  3. A Boolean that tells whether a word ends at that node.

Let’s write the code for our TrieNode class:

class TrieNode:

    def __init__(self, char):

        self.char = char

        self.is_end = False

        self.children = {}

While initializing a TrieNode we need to provide a character.

.is_end marks whether a word ends at the current node or not. It is set to false by default.

Writing the Trie Data Structure class

Let’s move on to writing code for our Trie class.

To initialize a trie, we need to initialize a trie node and provide methods for inserting and searching in the trie.

class Trie(object):

    def __init__(self):

        self.root = TrieNode("")

This part takes care of initializing an empty TrieNode.

How to perform Insertions into our Trie?

Let’s look at how insertions happen in a Trie data structure.

To make an insertion we need to traverse the word to be inserted character by character.

Simultaneously we need to move down the Trie from the root and see if the list of children has that character. In case the character is not present then we need to make a new TrieNode with that character and add it to the list of children.

When we reach the end of the word we need to set is_end to true for the node corresponding to the last character of the word.

Here’s the implementation of the approach discussed above.

def insert(self, word):

        node = self.root

#traverse the word character by character 
        for char in word:
#check if the character is there in the list of children 

            if char in node.children:
                node = node.children[char]
            else:
# else make a new TrieNode corresponding to that character 

                new_node = TrieNode(char)

# add the new node to the list of children 

                node.children[char] = new_node
                node = new_node

#after traversig the word set .is_end to true for the last #char
        node.is_end = True

This will take care of all our insertions.

Consider a Trie with the following words :

  • Here
  • Hear
  • Her
  • He
  • Hello
  • How

The trie corresponding to these words will look like :

Trie
Trie

Here green nodes correspond to is_end being true for that node.

How to Search in our Trie?

Now let’s see how to search for words in our trie. We don’t want to perform an exact match for searching. Rather what we want is to get the list of words that start with the string we are searching for.

While searching we will only provide the prefix and the search function should be able to return all the words starting with that prefix.

For example if we search for “He”, we should get the following words.

  • He
  • Here
  • Hear
  • Her
  • Hello

These are the words that start with “He”. This aspect of a trie makes it useful for implementing auto-completion in keyboards.

While searching for words we search in a DFS fashion. Therefore we need to write a function for performing a DFS search in our trie.

 def dfs(self, node, pre):

        if node.is_end:
            self.output.append((pre + node.char))
        
        for child in node.children.values():
            self.dfs(child, pre + node.char)

While calling the function we need to pass a node and the prefix searched so far. Whenever the search reaches a node with is_end as true it appends the word to the output list.

Otherwise it keeps on searching among the children in a DFS fashion.

The search function is as follows:

def search(self, x):
       
        node = self.root
# traverse the search query and move down the trie        
        for char in x:
            if char in node.children:
                node = node.children[char]
            else:
              #if query doesn't match the nodes in trie
                return []
        
        self.output = []
#call DFS 
        self.dfs(node, x[:-1])

        return self.output

While searching we traverse through the search query and move down the trie simultaneously.

Then we call DFS on the node corresponding to the last character of the query.

The DFS function then moves down from this last character and adds all the complete words to our output list.

Complete Code

The complete code of this tutorial is given below :

class TrieNode:

    def __init__(self, char):

        self.char = char

        self.is_end = False

        self.children = {}

class Trie(object):

    def __init__(self):

        self.root = TrieNode("")
    
    def insert(self, word):

        node = self.root

        for char in word:
            if char in node.children:
                node = node.children[char]
            else:

                new_node = TrieNode(char)
                node.children[char] = new_node
                node = new_node
        
        node.is_end = True
        
    def dfs(self, node, pre):

        if node.is_end:
            self.output.append((pre + node.char))
        
        for child in node.children.values():
            self.dfs(child, pre + node.char)
        
    def search(self, x):
       
        node = self.root
        
        for char in x:
            if char in node.children:
                node = node.children[char]
            else:
              
                return []
        
        self.output = []
        self.dfs(node, x[:-1])

        return self.output

Trie in Action

Let’s try adding some words to a trie and make a search query.

tr = Trie()
tr.insert("here")
tr.insert("hear")
tr.insert("he")
tr.insert("hello")
tr.insert("how ")
tr.insert("her")

This will make a trie and add these five words to it.

Now we can make a query using the following line :

tr.search("he")

Output :

['he', 'her', 'here', 'hear', 'hello']

Let’s make another query :

tr.search("her")

Output :

['her', 'here']

Conclusion

This tutorial covered the implementation for Trie data structure in Python. We learnt how to make a Trie class, how to perform insertions and how to query for words in the trie.