Melt and Unmelt data using Pandas melt() and pivot() function

Pandas Melt And Unmelt Data Variables

Hello, readers! This article will focus on Melting and Unmelting data values in Pandas data frame using melt() and pivot() function.

So, let us get started! 🙂


Melting and Unmelting data in Pandas – Crisp Overview

Before diving deep into the concept of melting and unmelting data, want to grab your attention to this tempting word – ICE CREAM 🙂

Yes! Ice-cream… When you have it on your plate and you get a call right when you are about to take the first bit. The ice cream will obviously melt and turn into a milkshake.

On similar lines, now think of the concept of Melting data values in a data frame. Melting of data values is used to configure and alter the data values from a wider format to a more narrow and long format. The basic purpose of Melting is to create a particular format of Dataframe where one or more data columns play the role of identifiers for the data attributes.

In this scenario, the remaining data variables are actually considered as data values and there exist only two columns:: variable and value.

On the other hand, we perform Unmelting on the data variables to get the values back to the original format.

Having understood Melting and Unmelting data, let us now understand the Pandas functions that enable us to achieve the same.


1. Melting data variables in Pandas

To perform Melting on the data variables, the Python Pandas module provides us with the melt() function.

Syntax:

pandas.melt(frame, id_vars=None, value_vars=None,
 var_name=None, value_name='value')
  • frame: the actual dataframe that needs to be melted.
  • id_vars: Column names that will act as identifiers.
  • value_vars: All the variable names that will be moulded to values (other than the identifier variables).
  • value_name: The name for the column value, default being value.

Example:

In this example, we have created a data frame with the variables: City, ID, and Fav using the DataFrame() function.

Further, we now pass the entire data frame to the melt() function, passing ID as the identifier variable, City and Fav as the value variables.

import pandas as pd

data = {"City": ["Pune", "Satara", "Solapur"], "ID": [1, 2, 3], "Fav": ["1", "3", "10"]}

dataf = pd.DataFrame(data)
print("Before melting..")
print(dataf)

melt_df = pd.melt(dataf, id_vars=["ID"], value_vars=["City", "Fav"])
print("After melting..")
print(melt_df)

Output:

As a result, the data post melting has only three columns: ID (the identifier variable), variable, and value. This way it turns the data frame from wider to a long format.

Before melting..
      City  ID Fav
0     Pune   1   1
1   Satara   2   3
2  Solapur   3  10
After melting..
   ID variable    value
0   1     City     Pune
1   2     City   Satara
2   3     City  Solapur
3   1      Fav        1
4   2      Fav        3
5   3      Fav       10

We can also skip columns from the original data frame while passing it to the melt() function for it to exclude certain columns.

import pandas as pd

data = {"City": ["Pune", "Satara", "Solapur"], "ID": [1, 2, 3], "Fav": ["1", "3", "10"]}

dataf = pd.DataFrame(data)
print("Before melting..")
print(dataf)

melt_df = pd.melt(dataf, id_vars=["City"], value_vars=["Fav"])
print("After melting..")
print(melt_df)

Output–

Here, as we have excluded the variable ID, it is nowhere taken into consideration while melting the data variables.

Before melting..
      City  ID Fav
0     Pune   1   1
1   Satara   2   3
2  Solapur   3  10
After melting..
      City variable value
0     Pune      Fav     1
1   Satara      Fav     3
2  Solapur      Fav    10

2. Unmelting the data values using Pandas pivot() function

Having melted the data variables, it is now time to get the data frame back in shape. For the same, Python provides us with the pivot() function.

Syntax:

pandas.pivot(index, columns) 
  • index: Labels that need to be applied to have the new data frame’s index in place.
  • columns: Labels that need to be applied to have the new data frame’s columns in place.

Example:

  1. At first, we have created a data frame with the columns: ID, City and Fav.
  2. Then, we apply melting and enlongate the data frame using melt() function against ID variable as the identifier, with Expression as the variable name and Value as the name for column represeting the unpivoted variables.
  3. Lastly, we unmelt the data using pivot() function, providing ID as the index set for the new data frame.
import pandas as pd

data = {"City": ["Pune", "Satara", "Solapur"], "ID": [1, 2, 3], "Fav": ["1", "3", "10"]}

dataf = pd.DataFrame(data)
print("Before melting..")
print(dataf)

melt_df = pd.melt(dataf, id_vars=["ID"], value_vars=["City","Fav"], var_name="Expression", value_name="Value")
print("After melting..")
print(melt_df)

unmelt = melt_df.pivot(index='ID', columns='Expression')
print("Post unmelting..")
print(unmelt)

Output–

Before melting..
      City  ID Fav
0     Pune   1   1
1   Satara   2   3
2  Solapur   3  10
After melting..
   ID Expression    Value
0   1       City     Pune
1   2       City   Satara
2   3       City  Solapur
3   1        Fav        1
4   2        Fav        3
5   3        Fav       10
Post unmelting..
              Value    
Expression     City Fav
ID
1              Pune   1
2            Satara   3
3           Solapur  10

Conclusion

By this, we have come to the end of this topic. Feel free to comment below, in case you come across any questions. For more such posts related to Python programming, Stay tuned with us.

Till then, Happy Learning!! 🙂