Get difference between two lists with Unique Entries

In this article, we learn how to find the difference between two lists in Python using set(), the .union() method, and NumPy functions. A Python list is an ordered, mutable collection of elements, which lets us perform operations on it.

By finding the difference between two lists we can identify the elements unique to each list, compare data, and perform data validation. We will implement several approaches below to compute the difference between lists.

Python list difference can also be defined as returning elements that are present in one list but not another.

What Are Asymmetric and Symmetric Differences?

There exist two major types of differences:

Asymmetric Difference
Symmetric Difference

Asymmetric Difference

The asymmetric difference is a set-theory operation: list1 - list2 returns the values that are present in list1 but not in list2. Values that are in list2 but not in list1 are not returned.
For a better understanding, look at the example below.

fruit1 = ['Apple' , 'Grapes']
fruit2 = ['Grapes' , 'Orange']
difference = list(set(fruit1) - set(fruit2))
print(difference)

When we implement this code, the expected output is ['Apple'] and not ['Apple', 'Orange'], since only ‘Apple’ is present in fruit1 and not in fruit2.

Output:

['Apple']
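
Converting to sets discards duplicates and the original order. If those matter, a list comprehension is one possible alternative. The following is a minimal sketch, not part of the original examples; the helper name ordered_difference and the extra fruits are hypothetical.

```python
# Hypothetical helper: an order-preserving asymmetric difference.
def ordered_difference(first, second):
    """Return the items of `first` that are absent from `second`, in original order."""
    exclude = set(second)  # a set gives fast membership tests
    return [item for item in first if item not in exclude]

fruit1 = ['Apple', 'Grapes', 'Apple', 'Banana']
fruit2 = ['Grapes', 'Orange']
print(ordered_difference(fruit1, fruit2))  # ['Apple', 'Apple', 'Banana']
```

Unlike set subtraction, this keeps the repeated 'Apple' and the order in which items appear in fruit1.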

To get the values that are only in list2 (here, fruit2), we swap the operands:

fruit1 = ['Apple' , 'Grapes']
fruit2 = ['Grapes' , 'Orange']
difference = list(set(fruit2) - set(fruit1))
print(difference)

['Orange']

Each asymmetric difference covers only one side. To get the elements unique to either list in a single operation, we use the symmetric difference.

Symmetric Difference

In symmetric difference, all the elements of list1 that are not present in list2 and all the elements of list2 that are not present in list1 will be returned. For example, list1 = [1, 2, 3, 4] and list2 = [3, 4, 5], so the symmetric difference = [1, 2, 5].

fruit1 = ['Apple' , 'Grapes']
fruit2 = ['Grapes' , 'Orange']
difference = list(set(fruit1).symmetric_difference(set(fruit2)))
print(difference)

Fruits that occur in only one of the two sets are returned; fruits that occur in both are dropped.

Output (sets are unordered, so the order may vary):

['Orange', 'Apple']
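
The same result can be written with the ^ operator, which is the operator form of symmetric_difference for sets. This sketch reuses the fruit lists above and adds sorted() only to make the printed order predictable.

```python
fruit1 = ['Apple', 'Grapes']
fruit2 = ['Grapes', 'Orange']

# ^ is the operator form of symmetric_difference for two sets
difference = set(fruit1) ^ set(fruit2)

# sorted() returns a list in alphabetical order, independent of set ordering
result = sorted(difference)
print(result)  # ['Apple', 'Orange']
```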

Code Implementation to Find The Difference Between Two Lists

In Python, to find the difference between two lists, you can use set subtraction, the .union() method, or the NumPy function setdiff1d. Set subtraction and setdiff1d each return the elements present in one list but not in the other. Combining the two one-sided differences, with .union() for sets or np.concatenate for arrays, returns the entries unique to either list.

Example 1: Difference Using Set Subtraction

list_a = [1,2,3,4,5]
list_b = [3,5,6,7,8]
difference = list(set(list_a) - set(list_b))
print(difference)

We convert the lists to sets and find the asymmetric difference between them. 1, 2, and 4 are present in list_a and not in list_b, so they are returned.

Output:

[1, 2, 4]

List with unique entries

list_a = [1,2,3,4,5]
list_b = [6,7,8,9,10]
difference = list(set(list_a) - set(list_b))
print(difference)
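
As a small variation, the .difference() method accepts any iterable, so the second list does not have to be converted to a set first. The sketch below covers both the overlapping and the non-overlapping case; the name list_c is introduced here only for illustration.

```python
list_a = [1, 2, 3, 4, 5]
list_b = [3, 5, 6, 7, 8]
list_c = [6, 7, 8, 9, 10]  # shares no elements with list_a

# .difference() accepts any iterable, so the second list is passed as-is
overlap_case = sorted(set(list_a).difference(list_b))
disjoint_case = sorted(set(list_a).difference(list_c))

print(overlap_case)   # [1, 2, 4]
print(disjoint_case)  # [1, 2, 3, 4, 5]
```

When the lists share no elements, nothing is removed and the difference is simply every element of list_a.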

Example 2: Difference Using .union()

list_a = [1,2,3,4,5]
list_b = [3,5,6,7,8]
first_set = set(list_a)
sec_set = set(list_b)
differences = (first_set - sec_set).union(sec_set - first_set)
print('Differences between two lists: ')
print(differences)

We declare two lists, list_a and list_b, and convert them to sets using set(). We then calculate the difference between the lists by forming the union of the two one-sided differences.
(first_set - sec_set) contains the elements 1, 2, 4 and (sec_set - first_set) contains 6, 7, 8; the two results are combined and returned.

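Output (the result is a set, so the display order may vary):

Differences between two lists: 
{1, 2, 4, 6, 7, 8}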

List with unique entries

list_a = [1,2,3,4,5]
list_b = [6,7,8,9,10]
first_set = set(list_a)
sec_set = set(list_b)
differences = (first_set - sec_set).union(sec_set - first_set)
print('Differences between two lists: ')
print(differences)
print(first_set - sec_set)
print(sec_set - first_set)
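
The union of the two one-sided differences is the symmetric difference described earlier. The following sketch checks that equivalence on both pairs of lists; the function name symmetric_via_union is hypothetical.

```python
# Hypothetical helper wrapping the .union() approach from Example 2
def symmetric_via_union(first, second):
    first_set, sec_set = set(first), set(second)
    return (first_set - sec_set).union(sec_set - first_set)

overlapping = symmetric_via_union([1, 2, 3, 4, 5], [3, 5, 6, 7, 8])
disjoint = symmetric_via_union([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])

print(sorted(overlapping))  # [1, 2, 4, 6, 7, 8]
print(sorted(disjoint))     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# The result matches the built-in symmetric difference operator
print(overlapping == (set([1, 2, 3, 4, 5]) ^ set([3, 5, 6, 7, 8])))  # True
```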

Example 3: Using Numpy Function

import numpy as np
list_a = [1,2,3,4,5]
list_b = [3,5,6,7,8]
dif_1 = np.setdiff1d(list_a, list_b)
dif_2= np.setdiff1d(list_b, list_a)
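new_list = np.concatenate((dif_1, dif_2)).tolist()
print("Difference :",new_list)

We calculate the one-sided differences of list_a and list_b using np.setdiff1d and then combine them using np.concatenate. Calling .tolist() converts the combined array to a plain Python list instead of wrapping the array inside another list.

Output:

Difference : [1, 2, 4, 6, 7, 8]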

Let's consider unique entries in both lists

import numpy as np
list_a = [1,2,3,4,5]
list_b = [6,7,8,9,10]
dif_1 = np.setdiff1d(list_a, list_b)
dif_2= np.setdiff1d(list_b, list_a)
new_list = np.concatenate((dif_1, dif_2)).tolist()
print("Difference :",new_list)
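
NumPy also provides np.setxor1d, which returns the sorted values found in exactly one of the two inputs, so the two setdiff1d calls and the concatenation can be replaced by one call. This is a sketch of an alternative, not a method used in the examples above.

```python
import numpy as np

list_a = [1, 2, 3, 4, 5]
list_b = [3, 5, 6, 7, 8]

# setxor1d returns the sorted values that appear in exactly one input
sym_diff = np.setxor1d(list_a, list_b).tolist()
print("Difference :", sym_diff)  # Difference : [1, 2, 4, 6, 7, 8]
```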

Conclusion

We have explored three popular methods for finding the difference between lists in Python: set subtraction, .union(), and NumPy's setdiff1d function. These methods help us identify unique elements, compare data, and perform data validation. Remember that for set subtraction and setdiff1d the order of the operands determines which list's unique elements are returned. Are there any other methods you find useful for comparing lists in Python?
