Hello, readers! In this article, we will be focusing on how to get unique values from a DataFrame in Python.
So, let us get started!
What is a Python DataFrame?
The Python Pandas module offers various data structures and functions to store and manipulate large volumes of data.
A DataFrame is a data structure offered by the Pandas module for working with large two-dimensional datasets (rows and columns), such as big CSV or Excel files.
Because a DataFrame can hold a large volume of data, we often need to find the unique values in a dataset that contains redundant or repeated values.
This is where the pandas.unique() function, along with the equivalent Series.unique() method, comes into the picture. Note that a DataFrame itself has no unique() method; unique values are fetched one column (Series) at a time.
Let us now focus on how the unique() function works in the upcoming section.
Python pandas.unique() Function to Get Unique Values From a Dataframe
The pandas.unique() function returns the unique values present in a one-dimensional dataset.
It uses a technique based on hash tables to return the non-redundant values from a one-dimensional structure such as a Series or a single column of a DataFrame.
Let us try to understand the role of the unique() function through an example:
Consider a dataset containing values as follows: 1,2,3,2,4,3,2
Now, if we apply unique() function, we would obtain the following result: 1,2,3,4. By this, we have found the unique values of the dataset easily.
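The example above can be reproduced with a short sketch. The first half is a plain-Python illustration of the hash-table idea (dictionary keys are hashed, so repeats are dropped); it is only an analogy and not how Pandas is implemented internally. The variable names are chosen for illustration.

```python
import pandas as pd

values = [1, 2, 3, 2, 4, 3, 2]

# Plain-Python illustration of the hash-table idea: dict keys are hashed,
# so repeated values collapse while first-seen order is kept.
# (An analogy only, not the internal Pandas implementation.)
seen_once = list(dict.fromkeys(values))
print(seen_once)  # [1, 2, 3, 4]

# The same result through Pandas, using a Series as the 1-D container.
unique_values = pd.unique(pd.Series(values))
print(unique_values)  # [1 2 3 4]
```

Both approaches keep the values in the order in which they first appear.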
Now, let us discuss the syntax of the pandas.unique() function in the next section.
Syntax of Python unique() function
Have a look at the syntax below:
pandas.unique(data)
Here, data stands for any one-dimensional object, such as a Series. This syntax is useful when the data is one-dimensional. It returns the unique values from the one-dimensional data (Series data structure).
But what if the data has more than one dimension, i.e. rows and columns? In that case, we select a single column first, as in the syntax below:
pandas.unique(data['column_name'])
This syntax enables us to find the unique values of a particular column of a dataset. Here, 'column_name' is a placeholder for the name of the column.
The unique() function is most informative on categorical or discrete columns, where values repeat; on a continuous column, almost every value may turn out to be unique. Moreover, the values are returned in the order of their first occurrence in the dataset and are not sorted.
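As a minimal sketch of the column-wise usage, consider the toy DataFrame below. The data and the column names are made up for illustration and are not taken from any real dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
data = pd.DataFrame({
    "season": ["winter", "summer", "winter", "spring", "summer"],
    "cnt": [120, 340, 120, 210, 355],
})

# Unique values of one column, in order of first appearance
# (they are not sorted alphabetically).
season_values = pd.unique(data["season"])
print(season_values)

# The Series method is an equivalent way to get the same values.
print(data["season"].unique())
```

Here "winter" comes before "spring" in the result because it appears first in the column.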
Python unique() function with Pandas Series
In the below example, we have created a list which contains redundant values.
Further, we have converted the list into a series data structure because it has a single dimension. Finally, we have applied the unique() function to fetch the unique values from the data.
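import pandas

lst = [1, 2, 3, 4, 2, 4]
series = pandas.Series(lst)  # one-dimensional container for the list
print("Unique values:")
print(pandas.unique(series))  # values in order of first appearance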
Unique values:
[1 2 3 4]
Python unique() function with Pandas DataFrame
Let us first load the dataset into the environment as shown below–
import pandas
BIKE = pandas.read_csv("Bike.csv")
You can find the dataset here.
The pandas.DataFrame.nunique() function returns the number of unique values present in each column of the DataFrame, not the values themselves.
BIKE.nunique()
season          4
yr              2
mnth           12
holiday         2
weathersit      3
temp          494
hum           586
windspeed     636
cnt           684
dtype: int64
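Since the bike dataset may not be at hand, the behaviour of nunique() can be tried on a small stand-in. The DataFrame below is hypothetical; only the column names are borrowed from the output above.

```python
import pandas as pd

# Hypothetical stand-in for the bike dataset (values are made up).
bike_sample = pd.DataFrame({
    "season": [1, 1, 2, 3, 4, 4],
    "yr": [0, 0, 0, 1, 1, 1],
    "holiday": [0, 1, 0, 0, 0, 0],
})

# nunique() counts the distinct values per column; it does not list them.
counts = bike_sample.nunique()
print(counts)
```

The result is a Series indexed by column name, so a single count can be read with counts["season"].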
Further, we have fetched the unique values present in the column ‘season’ using the below piece of code:
BIKE['season'].unique()
array([1, 2, 3, 4], dtype=int64)
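A DataFrame has no unique() method of its own, so one possible way to collect the unique values of every column is to loop over the columns. The sketch below uses a hypothetical stand-in for the bike data, and the dictionary name is chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the bike dataset (values are made up).
bike_sample = pd.DataFrame({
    "season": [1, 1, 2, 3, 4, 4],
    "weathersit": [2, 1, 1, 3, 1, 2],
})

# Unique values of a single column.
season_values = bike_sample["season"].unique()
print(season_values)  # [1 2 3 4]

# Unique values of every column, gathered one Series at a time.
unique_per_column = {
    col: bike_sample[col].unique().tolist() for col in bike_sample.columns
}
print(unique_per_column)  # {'season': [1, 2, 3, 4], 'weathersit': [2, 1, 3]}
```

A dictionary is used here because the columns can have different numbers of unique values, which would not fit neatly into a rectangular DataFrame.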
By this, we have come to the end of this topic. Feel free to comment below if you have any questions.
For more such posts related to Python, stay tuned, and till then, Happy Learning!! 🙂