Pandas Melt – Unpivot a Data Frame From Wide to Long Format

How To Unpivot A DataFrame From Wide To Long Format Using Pandas

pandas is a powerful, widely used open-source Python library for data manipulation and analysis. Its melt function is one of the tools for reshaping DataFrames, which is especially helpful in data science. It is available both as the DataFrame method .melt() and as the top-level function pd.melt(). This article explains what melt does and how to use it. We will first go over the syntax and parameters, and then work through a few examples that illustrate each parameter of pd.melt().

What Does Pandas Melt Do?

The melt function reshapes a DataFrame from wide format to long format. Long format is often easier to filter, group, and plot. pandas.melt() returns a new DataFrame in which one or more columns are kept as identifiers (id_vars) and the remaining columns are treated as measured values (value_vars). The value columns are unpivoted into two columns: one holding the original column name ('variable' by default) and one holding the corresponding value ('value' by default). In simple terms, each value column of each original row becomes its own row in the result.

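Note that an identifier column in this sense is not the same thing as a Python identifier. In Python, an identifier is the name given to a variable, class, function, and so on. For example:

Name = "ABC"

In the code above, Name is a variable (a Python identifier) that holds the value "ABC". In melt, by contrast, an identifier column is a column, such as a name or an ID, that tells you which observation each row belongs to.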

Wide and Long DataFrames:

To better understand the pandas.melt() function, let us look at what wide and long DataFrames look like. Below is a DataFrame in wide format:

[Figure: Wide Format DataFrame]

Tabular data is usually created in wide format, with one row per record and one column per attribute. Let us look at the same DataFrame in long format:

[Figure: Long Format DataFrame]
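As a text stand-in for the two figures, here is a minimal sketch of the same idea. The two-row table and the names wide_df and long_df are made up for illustration and are not necessarily the data shown in the figures.

```python
import pandas as pd

# Hypothetical wide-format table: one row per person, one column per attribute.
wide_df = pd.DataFrame({
    "Name": ["Aman", "Rohini"],
    "Age": [56, 23],
    "Rank": [1, 2],
})

# Long format: one row per (person, attribute) pair.
long_df = wide_df.melt(id_vars="Name")

print(wide_df.shape)  # (2, 3)
print(long_df.shape)  # (4, 3)
print(long_df)
#      Name variable  value
# 0    Aman      Age     56
# 1  Rohini      Age     23
# 2    Aman     Rank      1
# 3  Rohini     Rank      2
```

The wide table has fewer rows and more columns; the long table trades columns for rows while keeping Name on every row.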

Syntax and Parameters:

Syntax: melt can be called in two ways, as a DataFrame method or as a top-level pandas function:

DataFrame.melt(id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

Parameters:

frame [DataFrame]: The DataFrame to unpivot. Only needed with pandas.melt(); DataFrame.melt() operates on the DataFrame it is called on.
id_vars [tuple, list, or array, optional]: Column(s) to use as identifier variables.
value_vars [tuple, list, or array, optional]: Column(s) to unpivot. If not specified, all columns not listed in id_vars are used.
var_name [scalar, optional]: Name to use for the ‘variable’ column. If None, the name of the column index is used if it has one, otherwise ‘variable’.
value_name [scalar, default ‘value’]: Name to use for the ‘value’ column.
col_level [int or string, optional]: If the columns are a MultiIndex, the level to melt on.
ignore_index [bool, default True]: If True, the original index is ignored. If False, the original index is retained.
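The two call styles listed under Syntax should give the same result. The sketch below checks this on a small sample frame; the variable names via_method and via_function are introduced here only for illustration.

```python
import pandas as pd

# Small sample frame: one row per person, one column per attribute.
df = pd.DataFrame({
    "Name": ["Aman", "Rohini", "Neha"],
    "Age": [56, 23, 32],
    "Rank": [1, 2, 3],
})

# Method form: the DataFrame is the object the method is called on.
via_method = df.melt(id_vars=["Name"], value_vars=["Age", "Rank"])

# Function form: the DataFrame is passed as the first argument (frame).
via_function = pd.melt(df, id_vars=["Name"], value_vars=["Age", "Rank"])

print(via_method.equals(via_function))  # True
```

Which form to use is mostly a matter of style; the method form chains more naturally with other DataFrame methods.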

Implementing the .melt() Function:

Now that the syntax of the melt() function is clear, let me demonstrate how to use it.

Importing Pandas and Creating Dataframe:

First, we import the pandas library with import pandas as pd. The alias pd is a widely used convention that keeps the code short, and the import gives us access to melt() and the rest of the pandas API. After that, we create a DataFrame so that the behavior of melt is easy to follow. You could also read Excel or CSV files and apply melt to them, but for the sake of clarity, I am creating a small DataFrame by hand.

To learn more on the basics of pandas library check out this tutorial.

import pandas as pd

# Sample data: one row per person, one column per attribute
data = {
    'Name': ['Aman', 'Rohini', 'Neha'],
    'Age': [56, 23, 32],
    'Rank': [1, 2, 3]
}

df = pd.DataFrame(data)
df

[Figure: Pandas DataFrame]

Applying .melt() Function:

Now, I will apply .melt() directly on the DataFrame (df). Let’s see the output.

df.melt()

[Figure: output of the melt function]
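As a text version of this output, the sketch below rebuilds the same sample DataFrame and inspects the result of calling melt with no arguments. With no id_vars, every column is treated as a value column, so the 3 x 3 frame becomes 9 rows with two columns. The variable name melted is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Aman", "Rohini", "Neha"],
    "Age": [56, 23, 32],
    "Rank": [1, 2, 3],
})

# No id_vars and no value_vars: all three columns are unpivoted.
melted = df.melt()

print(list(melted.columns))                  # ['variable', 'value']
print(melted.shape)                          # (9, 2)
print(melted["variable"].unique().tolist())  # ['Name', 'Age', 'Rank']
```

Because names and numbers end up in the same value column, this form is rarely useful on its own, which is why id_vars is usually supplied.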

Let us see what happens when we add the id_vars and value_vars parameters to .melt().

Adding Single Value in id_vars and value_vars:

df2 = df.melt(id_vars=['Name'], value_vars=['Age'])
df2

[Figure: melt function with parameters]

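From the above output, we can see that Age is the only variable that was unpivoted, and Name is used as the identifier column. The Rank column is dropped because it is listed in neither id_vars nor value_vars.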

Adding Multiple Values to id_vars and value_vars:

Let us see what happens when we pass more than one column to the value_vars parameter. We can likewise pass multiple columns to id_vars. An example of each is given below:

df2 = df.melt(id_vars=['Name'], value_vars=['Age', 'Rank'])
df2

[Figure: melt function with parameters]

df2 = df.melt(id_vars=['Name', 'Age'], value_vars=['Rank'])
df2

[Figure: melt function with parameters]
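The shape of each result follows a simple rule: the number of rows is the original row count times the number of value_vars, and the columns are the id_vars plus the variable and value columns. The sketch below checks this on the same sample data; the names one_id and two_ids are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Aman", "Rohini", "Neha"],
    "Age": [56, 23, 32],
    "Rank": [1, 2, 3],
})

# One identifier column, two value columns: 3 rows * 2 value_vars = 6 rows.
one_id = df.melt(id_vars=["Name"], value_vars=["Age", "Rank"])

# Two identifier columns, one value column: 3 rows * 1 value_var = 3 rows.
two_ids = df.melt(id_vars=["Name", "Age"], value_vars=["Rank"])

print(one_id.shape, list(one_id.columns))
# (6, 3) ['Name', 'variable', 'value']
print(two_ids.shape, list(two_ids.columns))
# (3, 4) ['Name', 'Age', 'variable', 'value']
```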

Customization:

We can also customize the names of the two output columns with the var_name and value_name parameters. One such example is given below:

df.melt(id_vars=['Name'], value_vars=['Rank', 'Age'],
        var_name='Newname', value_name='Newnumerical')

[Figure: melt customization]

Adding col_level:

The col_level argument selects which level of the column index to melt on when the columns are a MultiIndex. Our DataFrame has only a single column level, so col_level=0 simply refers to that level, as seen in the example below:

df.melt(id_vars=['Name'], value_vars=['Rank', 'Age'],
        var_name='Newname', value_name='Newnumerical', col_level=0)

[Figure: melt function with parameter]
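The effect of col_level is easier to see on a frame whose columns really are a MultiIndex. The following is a minimal sketch under that assumption; the two-level frame multi_df and its second-level labels ("text", "years", "position") are hypothetical and are not part of the sample data used elsewhere in this article.

```python
import pandas as pd

# Hypothetical frame with two column levels.
multi_df = pd.DataFrame(
    [["Aman", 56, 1], ["Rohini", 23, 2]],
    columns=pd.MultiIndex.from_arrays([
        ["Name", "Age", "Rank"],         # level 0
        ["text", "years", "position"],   # level 1
    ]),
)

# Melt using the labels found in level 0 of the column index.
melted = multi_df.melt(col_level=0, id_vars=["Name"], value_vars=["Age", "Rank"])

print(list(melted.columns))  # ['Name', 'variable', 'value']
print(melted.shape)          # (4, 3)
```

Passing col_level=1 instead would require id_vars and value_vars to use the level 1 labels.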

Adding ignore_index:

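To control whether the original index is kept, we use the ignore_index parameter. If it is True (the default), the original index is discarded and the result gets a fresh index 0, 1, 2, and so on. If it is False, the original index is kept, so each original index label repeats once per melted column. Below are examples of both cases: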

df.melt(id_vars=['Name'], value_vars=['Rank', 'Age'],
        var_name='Newname', value_name='Newnumerical', col_level=0, ignore_index=True)

[Figure: melt function with ignore_index=True]

df.melt(id_vars=['Name'], value_vars=['Rank', 'Age'],
        var_name='Newname', value_name='Newnumerical', col_level=0, ignore_index=False)

[Figure: melt function with ignore_index=False]
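The difference between the two settings shows up only in the index of the result. The sketch below prints the index for both cases on the same sample data; the names reset and kept are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Aman", "Rohini", "Neha"],
    "Age": [56, 23, 32],
    "Rank": [1, 2, 3],
})

# Default: the original index is dropped and replaced by 0..n-1.
reset = df.melt(id_vars=["Name"], value_vars=["Rank", "Age"], ignore_index=True)

# The original index is kept, so each label repeats once per value column.
kept = df.melt(id_vars=["Name"], value_vars=["Rank", "Age"], ignore_index=False)

print(reset.index.tolist())  # [0, 1, 2, 3, 4, 5]
print(kept.index.tolist())   # [0, 1, 2, 0, 1, 2]
```

Keeping the index can be handy when it carries meaning, such as a record ID, and you want to trace each melted row back to its source row.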

Conclusion:

In this tutorial, you learned about the Pandas melt() function that enables you to reshape a data frame from wide format to long format. I hope you understood the ideas and illustrations presented above and are now prepared to apply them to your Pandas DataFrame. Thanks for reading! Keep checking back with us for more fantastic Python programming learning materials.

Reference:

See the pandas documentation page for pandas.melt() (hosted on pydata.org), which explains how to use the function, including its parameters, their allowed values, and how the function works.