Remove Duplicate Elements from List in Python

In this article, we’ll look at how to remove duplicate elements from a list in Python. There are several ways to approach this problem, and we will walk through some of them.


Methods to Remove Duplicate Elements from List – Python

1. Using iteration

To remove duplicate elements from a list in Python, we can manually iterate through the list and add each element to a new list only if it is not already present there. Otherwise, we skip that element.

The code is shown below:

a = [2, 3, 3, 2, 5, 4, 4, 6]

b = []

for i in a:
    # Add to the new list
    # only if not present
    if i not in b:
        b.append(i)

print(b)

Output

[2, 3, 5, 4, 6]

The same logic can be written with a list comprehension to save a few lines, although it does exactly the same work as before.

a = [2, 3, 4, 2, 5, 4, 4, 6]
b = []
# The comprehension runs only for the side effect of b.append()
[b.append(i) for i in a if i not in b]
print(b)

The problem with this approach is that it is slow: for every element of the original list, the check "i not in b" scans the new list, so the running time grows roughly quadratically with the size of the list.

This is computationally expensive for large lists, and there are better methods for that case. Use this approach only when the list is small. Otherwise, refer to the other methods.
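One common way to keep the explicit loop but avoid the repeated list scan is to track the values already seen in a set, whose membership test takes constant time on average. The sketch below assumes every element is hashable; the function name unique_in_order is hypothetical and chosen only for illustration.

```python
def unique_in_order(items):
    """Return a new list without duplicates, keeping first occurrences."""
    seen = set()   # values encountered so far (elements must be hashable)
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


a = [2, 3, 3, 2, 5, 4, 4, 6]
print(unique_in_order(a))  # [2, 3, 5, 4, 6]
```

The extra set costs some memory, but each element is now checked once instead of being compared against the whole result list.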

2. Using set()

A simple and fast way to remove duplicate elements from a list in Python is to pass the list to the built-in set() constructor, which keeps only the unique elements, and then convert the resulting set back into a list.

first_list = [1, 2, 2, 3, 3, 3, 4, 5, 5, 6]

# Convert to a set first
set_list = set(first_list)

# Now convert the set into a List
print(list(set_list))

second_list = [2, 3, 3, 2, 5, 4, 4, 6]

# Does the same as above, in a single line
print(list(set(second_list)))

Output

[1, 2, 3, 4, 5, 6]
[2, 3, 4, 5, 6]

The problem with this approach is that the original order of the list is not guaranteed to be kept, because the new list is built from an unordered set. In the second list above, for example, the order of first appearance is 2, 3, 5, 4, 6, but the result is [2, 3, 4, 5, 6]. If you need to preserve the relative ordering, avoid this method.
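To make the ordering caveat concrete, the short sketch below uses strings, for which the iteration order of a set can differ from one run of the interpreter to the next. It only checks that the set-based result holds the same unique values, and falls back on sorted() when a predictable order is needed. The variable names are illustrative.

```python
words = ["pear", "apple", "pear", "fig", "apple"]

unique_words = list(set(words))

# The order of unique_words is not guaranteed, so compare without order
print(set(unique_words) == {"pear", "apple", "fig"})  # True
print(len(unique_words))                              # 3

# If any consistent order will do, sort the result
print(sorted(unique_words))  # ['apple', 'fig', 'pear']
```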

3. Preserving Order: Use OrderedDict

If you want to preserve the order while you remove duplicate elements from a list in Python, you can use the OrderedDict class from the collections module.

More specifically, we can use OrderedDict.fromkeys(list) to obtain a dictionary whose keys are the list elements with the duplicates removed, in their original order. We can then convert it into a list using the list() constructor.

from collections import OrderedDict

a = [2, 3, 3, 2, 5, 4, 4, 6]

b = list(OrderedDict.fromkeys(a))

print(b)

Output

[2, 3, 5, 4, 6]

NOTE: If you have Python 3.7 or later, you can use the built-in dict.fromkeys(list) instead. Regular dictionaries are guaranteed to keep insertion order from that version on, so this also preserves the order.

As you can observe, the order is indeed maintained, so we get the same output as with the first method. But this is much faster! This is the recommended solution to this problem. For illustration, we will still show you a couple more approaches to remove duplicate elements from a list in Python.
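As a minimal sketch of the dict.fromkeys() variant mentioned in the note above, assuming Python 3.7 or later and the same input list as before:

```python
a = [2, 3, 3, 2, 5, 4, 4, 6]

# dict keys are unique, and a dict keeps insertion order in Python 3.7+
b = list(dict.fromkeys(a))

print(b)  # [2, 3, 5, 4, 6]
```

As with set(), the elements must be hashable, since they are used as dictionary keys.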

4. Using list.count()

The list.count() method returns the number of occurrences of a value in the list. We can use it along with the remove() method to eliminate any duplicate elements. But again, this does not preserve the order of first occurrences, because remove() always deletes the earliest matching element.

Note that this method modifies the input list in place, so the changes are visible in the original list.

a = [0, 1, 2, 3, 4, 1, 2, 3, 5]

for i in a:
    if a.count(i) > 1:
        a.remove(i)

print(a)

Output

[0, 4, 1, 2, 3, 5]

Everything seems fine, doesn’t it?

But there is a subtle issue with the above code.

When we iterate over a list with a for loop and remove elements from it at the same time, the loop skips the element that follows each removed one, because the remaining elements shift left while the loop keeps advancing its position. So the result depends on the list contents, and the bug shows up only for some inputs. Let’s look at this scenario with a simple example.

a = [1, 2, 3, 2, 5]

for i in a:
    if a.count(i) > 1:
        a.remove(i)
    print(a, i)

print(a)

Output:

[1, 2, 3, 2, 5] 1
[1, 3, 2, 5] 2
[1, 3, 2, 5] 2
[1, 3, 2, 5] 5
[1, 3, 2, 5]

You can see that the for loop runs only four times and skips 3, the element right after the one deleted by the remove() call. If you pass the input list as [1, 1, 1, 1], the final list will be [1, 1].

So, is there any workaround?

Of course, there is a workaround. Iterate over a copy of the list in the for loop, but remove the elements from the main list. A simple way to create a copy of the list is through slicing. Here is the updated code, which works in all cases.

a = [1, 1, 1, 1]

for i in a[:]:  # using list copy for iteration
    if a.count(i) > 1:
        a.remove(i)
    print(a, i)

print(a)

Output:

[1, 1, 1] 1
[1, 1] 1
[1] 1
[1] 1
[1]
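The copy-based workaround can be wrapped in a small helper, sketched below. The function name dedupe_in_place is hypothetical. Because remove() deletes the first matching element, it is the last occurrence of each duplicated value that survives, which is why the original order of first occurrences is not kept.

```python
def dedupe_in_place(items):
    """Remove duplicates from items in place; last occurrences survive."""
    for value in items[:]:            # iterate over a copy of the list
        if items.count(value) > 1:
            items.remove(value)       # deletes the first remaining match


a = [1, 2, 3, 2, 5]
dedupe_in_place(a)
print(a)  # [1, 3, 2, 5]
```

Both count() and remove() scan the list, so this approach is best kept for small lists.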

5. Using sort()

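We can also take the de-duplicated list obtained in approach 2 and sort it back into the original order, by calling sort() with key=a.index so that each element is ordered by the position of its first occurrence in a. This removes the duplicates while preserving the order, but a.index scans the list for every element, so it is slower than the dict.fromkeys() approach.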

a = [0, 1, 2, 3, 4, 1, 2, 3, 5]
b = list(set(a))
b.sort(key=a.index)
print(b)

Output

[0, 1, 2, 3, 4, 5]
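If you still want the sort-based approach on a longer list, one option is to look up the first position of each value in a dictionary built in a single pass, instead of calling a.index repeatedly. This is only a sketch under the assumption that the elements are hashable; the name first_index is illustrative. It uses the list from the first method, so the effect on the order is visible.

```python
a = [2, 3, 3, 2, 5, 4, 4, 6]

# Map each value to the index of its first occurrence
first_index = {}
for index, value in enumerate(a):
    first_index.setdefault(value, index)  # keeps the first index seen

# Sort the unique values by where they first appeared in a
b = sorted(set(a), key=first_index.get)
print(b)  # [2, 3, 5, 4, 6]
```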

6. Using pandas module

If we are already working with the pandas library, we can put the list into a pandas Series, call its drop_duplicates() method to remove the duplicates, and then convert the result back into a list with tolist(). This also preserves the order.

import pandas as pd

a = [0, 1, 2, 3, 4, 1, 2, 3, 5]

print(pd.Series(a).drop_duplicates().tolist())

Output

[0, 1, 2, 3, 4, 5]
