Hello everyone! In this tutorial, we will learn about the isin() method in the Pandas module and look at how it behaves when different types of values are passed. So let's get started.
DataFrame.isin() method
The Pandas isin() method is used to filter the data in a DataFrame. It checks whether each element in the DataFrame is contained in the specified values and returns a DataFrame of booleans with the same shape. Where an element is present in the specified values, the returned DataFrame contains True; otherwise it contains False. This makes the method useful for filtering DataFrames, as we will see in the examples below.
The syntax of the isin() method is shown below. It takes a single parameter:
DataFrame.isin(values)
Here, the values parameter can be any one of the following:
- List or Iterable
- Dictionary
- Pandas Series
- Pandas DataFrame
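As a quick check of the description above, the sketch below uses a small made-up frame (the name toy and its values are illustrative only and are not part of this tutorial's data) to show that the result has the same shape as the input and holds only booleans.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
toy = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})

# True wherever an element equals 3 or 4
mask = toy.isin([3, 4])

print(mask)
print(mask.shape == toy.shape)       # the result has the same shape as the input
print((mask.dtypes == bool).all())   # every column of the result is boolean
```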
Let's see the result of the isin() method when each of these types is passed to it.
Examples of the isin() method
Let's consider some examples of the isin() method, passing values of different types. For the examples below, we will use the following data:
import pandas as pd
data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
print(data)
    Name  Age      Department
0   John   25           Sales
1    Sam   45     Engineering
2   Luna   23     Engineering
3  Harry   32  Human Resource
isin() method when value is a List
When a list is passed as the values parameter, the isin() method checks whether each element in the DataFrame is present in the list and shows True where it is. For example, if we pass a list containing some departments, the matching entries in the Department column will be marked as True.
import pandas as pd
# Creating DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
# List of Departments to filter
departments_to_filter = ['Engineering', 'Sales', 'Finance']
result = data.isin(departments_to_filter)
print(result)
    Name    Age  Department
0  False  False        True
1  False  False        True
2  False  False        True
3  False  False       False
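The boolean result above covers every column. To keep only the matching rows, one common approach is to call isin() on a single column and use the result as a row mask; the ~ operator inverts the mask. A minimal sketch on the same data (the names in_dept, matching and not_matching are introduced here for illustration):

```python
import pandas as pd

data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
departments_to_filter = ['Engineering', 'Sales', 'Finance']

# Boolean Series: True where the Department value is in the list
in_dept = data['Department'].isin(departments_to_filter)

matching = data[in_dept]        # rows whose department is in the list
not_matching = data[~in_dept]   # rows whose department is not in the list

print(matching)
print(not_matching)
```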
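In the same way, we can filter the rows of a DataFrame depending on the situation. For example, to find employees aged 20 to 30 (both inclusive), we can use the isin() method on the Age column with a range of ages. Note that this only works as intended when the ages are whole numbers.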
import pandas as pd
# Creating DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
start_age = 20
end_age = 30
# Using isin() method to filter employees on age
age_filter = data['Age'].isin(range(start_age, end_age + 1))
# Using the filter to retrieve the data
result = data[age_filter]
print(result)
   Name  Age   Department
0  John   25        Sales
2  Luna   23  Engineering
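Because isin() tests membership in the values it is given, a range only matches whole numbers. If the column might hold fractional values, Series.between() is a possible alternative for a range check. The sketch below compares the two on made-up ages (the ages Series is hypothetical and not part of the tutorial data):

```python
import pandas as pd

# Hypothetical ages including a fractional value, for illustration only
ages = pd.Series([25, 29.5, 45], name='Age')

# Membership test: only the whole numbers 20..30 can match
by_isin = ages.isin(range(20, 31))

# Range test: inclusive on both ends by default
by_between = ages.between(20, 30)

print(by_isin.tolist())
print(by_between.tolist())
```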
isin() method when value is a Dictionary
When a dictionary is passed as the values parameter, its keys are column names and each key maps to the list of values to look for in that column. This lets us search each column separately with its own set of values. Columns that are not listed in the dictionary are False everywhere. For example, we can pass separate lists for Name and Department, as shown below.
import pandas as pd
# Creating DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
# Dictionary data to filter DataFrame
dict_data_to_filter = {'Name': ['Sam', 'Harry'], 'Department': ['Engineering']}
result = data.isin(dict_data_to_filter)
print(result)
    Name    Age  Department
0  False  False       False
1   True  False        True
2  False  False        True
3   True  False       False
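The per-column result above can be turned into a row filter. One way, sketched here under the assumption that a row should match on every column named in the dictionary, is to restrict the check to those columns and combine them with all(axis=1). The names cols and per_column are introduced for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
dict_data_to_filter = {'Name': ['Sam', 'Harry'], 'Department': ['Engineering']}

# Check only the columns that appear as keys in the dictionary
cols = list(dict_data_to_filter)
per_column = data[cols].isin(dict_data_to_filter)

# Keep the rows where every checked column matched
result = data[per_column.all(axis=1)]
print(result)
```

Using any(axis=1) instead would keep rows that match on at least one of the listed columns.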
isin() method when value is a Series
When a Pandas Series is passed as the values parameter, the index of the Series becomes important. The Series is aligned with the DataFrame by index label, and every column of the DataFrame is compared, row by row, with the Series value that has the same index label. Consider the example below.
import pandas as pd
# Creating DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
# Series data, with 'Sam' and 'Luna' swapped relative to the Name column
series_data = pd.Series(['John', 'Luna', 'Sam', 'Harry'])
result = data.isin(series_data)
print(result)
    Name    Age  Department
0   True  False       False
1  False  False       False
2  False  False       False
3   True  False       False
Although the Series contains all the names present in the data DataFrame, the result at index 1 and 2 is False because 'Sam' and 'Luna' are swapped in the Series, so their index labels no longer match those in data. Hence the index matters when a Series is passed as the value.
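To make the role of the index concrete, the sketch below gives the same values an explicit index so that each name lines up with its row in data, and then shows that converting the Series to a plain list falls back to an ordinary membership test. The names aligned, name_aligned and name_membership are introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})

# Same values in the same order, but labelled so that 'Luna' sits at
# index 2 and 'Sam' at index 1, matching their rows in data
aligned = pd.Series(['John', 'Luna', 'Sam', 'Harry'], index=[0, 2, 1, 3])
name_aligned = data.isin(aligned)['Name']
print(name_aligned.tolist())

# A plain list carries no index, so only membership is checked
series_data = pd.Series(['John', 'Luna', 'Sam', 'Harry'])
name_membership = data.isin(series_data.tolist())['Name']
print(name_membership.tolist())
```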
isin() method when value is a DataFrame
When a Pandas DataFrame is passed as the values parameter, both the index and the column labels of the passed DataFrame must match. If the two DataFrames hold the same data but a column name differs, the result is False for that column. If the data is the same but the rows are in a different order, the result is False for the rows whose values no longer line up. Thus both index and columns matter when a DataFrame is passed. Consider the example below.
import pandas as pd
# Creating DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
# DataFrame to filter, here the column name Age is lowercased to age
df = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})
result = data.isin(df)
print(result)
print("-----------------")
# DataFrame to filter, here the last 2 rows are swapped
df = pd.DataFrame({
    'Name': ['John', 'Sam', 'Harry', 'Luna'],
    'Age': [25, 45, 32, 23],
    'Department': ['Sales', 'Engineering', 'Human Resource', 'Engineering']
})
result = data.isin(df)
print(result)
   Name    Age  Department
0  True  False        True
1  True  False        True
2  True  False        True
3  True  False        True
-----------------
    Name    Age  Department
0   True   True        True
1   True   True        True
2  False  False       False
3  False  False       False
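If row order should not matter, one possible workaround is to convert the second DataFrame to a dictionary of lists, so that each column is checked for membership instead of being aligned by index. This is only a sketch: it tests each column independently and does not verify that whole rows match. The name values_by_column is introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'Name': ['John', 'Sam', 'Luna', 'Harry'],
    'Age': [25, 45, 23, 32],
    'Department': ['Sales', 'Engineering', 'Engineering', 'Human Resource']
})

# Same data as above, with the last 2 rows swapped
df = pd.DataFrame({
    'Name': ['John', 'Sam', 'Harry', 'Luna'],
    'Age': [25, 45, 32, 23],
    'Department': ['Sales', 'Engineering', 'Human Resource', 'Engineering']
})

# {column name: list of values}, which isin() treats as per-column membership
values_by_column = df.to_dict(orient='list')
result = data.isin(values_by_column)
print(result)
```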
Conclusion
In this tutorial, we learned about the Pandas isin() method, its different use cases, and how it helps in filtering data from a DataFrame. You now know how to use the isin() method to filter data in a DataFrame, so congratulations!
Thanks for reading!!