How to Read Pickle Files in Pandas?

Reading Pickled Files

Data is most often stored as CSV, Excel, or plain text files and loaded into DataFrames. But we can also save data as pickle files. Pickles are a way of representing Python objects on disk. They store the object in a serialized format, which can be used to reconstruct the object later. Because the object is restored as it was saved, a pickled DataFrame comes back with its index and column data types intact, with no parsing step. In this article, we are going to learn how to store and read data in Pandas using pickle files. Let’s get started!
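To make the idea of serialization concrete, here is a minimal sketch that uses Python's built-in pickle module rather than Pandas. The record dictionary is made-up sample data, used only for illustration.

```python
import pickle

# Hypothetical sample object, used only for illustration
record = {'Name': 'Tesla, Inc.', 'Icon': 'TSLA', 'Market Shares': 160}

# Serialize the object to bytes, then rebuild it from those bytes
payload = pickle.dumps(record)
restored = pickle.loads(payload)

print(type(payload))       # <class 'bytes'>
print(restored == record)  # True
```

The Pandas functions covered in the rest of the article apply the same idea to DataFrames and handle the file for you.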

Reading Pickle Files Using Pandas

Pandas provides functions for reading and writing pickle files. The most basic way to read a pickle file is to use the read_pickle() function. This function takes the path of the pickle file as an argument and returns the object stored in it, which is a DataFrame when a DataFrame was pickled.

Syntax of the function:

pd.read_pickle(filepath_or_buffer, compression='infer', storage_options=None)

Similar to the read_csv() function, read_pickle() returns a Pandas DataFrame when the file contains a pickled DataFrame.

For example:

df = pd.read_pickle('data.pkl')
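The compression parameter deserves a short illustration. With the default compression='infer', Pandas picks the compression format from the file extension, such as .gz, .bz2, .zip, or .xz. The sketch below is only an example under assumptions: the sample data and the file name data.pkl.gz are made up, and the file is written to a temporary directory so nothing is left behind.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'Icon': ['MSFT', 'GOOG'], 'Market Shares': [100, 50]})

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.pkl.gz')

    # compression='infer' is the default: the '.gz' suffix selects gzip
    df.to_pickle(path)
    df_gz = pd.read_pickle(path)

    # The same file can be read with the compression named explicitly
    df_explicit = pd.read_pickle(path, compression='gzip')

print(df_gz.equals(df))
print(df_explicit.equals(df))
```

Naming the compression explicitly is mainly useful when the file extension does not reflect how the file was compressed.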

Let’s now see how to save data to a pickle file in Python. We will start by creating a DataFrame.

import pandas as pd

# Sample company data, one list per column
data = {
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
             'Apple Inc.', 'Netflix, Inc.'],
    'Icon': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Field': ['Tech', 'Tech', 'Automotive', 'Tech', 'Entertainment'],
    'Market Shares': [100, 50, 160, 300, 80]
}
df = pd.DataFrame(data)
# print the dataframe
print(df)

Output

                    Name  Icon          Field  Market Shares
0  Microsoft Corporation  MSFT           Tech            100
1            Google, LLC  GOOG           Tech             50
2            Tesla, Inc.  TSLA     Automotive            160
3             Apple Inc.  AAPL           Tech            300
4          Netflix, Inc.  NFLX  Entertainment             80

Now let’s save the DataFrame into a pickle file.

df.to_pickle('company info.pkl')

Now let’s read the pickle file.

df2 = pd.read_pickle('company info.pkl')
# print the dataframe
print(df2)

Output

                    Name  Icon          Field  Market Shares
0  Microsoft Corporation  MSFT           Tech            100
1            Google, LLC  GOOG           Tech             50
2            Tesla, Inc.  TSLA     Automotive            160
3             Apple Inc.  AAPL           Tech            300
4          Netflix, Inc.  NFLX  Entertainment             80
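Rather than comparing the two printouts by eye, the round trip can be checked in code. The following is a minimal sketch, assuming a shortened version of the sample data and a temporary directory for the file; DataFrame.equals() compares the values, and the dtypes comparison confirms the column types survived.

```python
import os
import tempfile

import pandas as pd

# Shortened sample data, used only for illustration
df = pd.DataFrame({
    'Icon': ['MSFT', 'GOOG', 'TSLA'],
    'Market Shares': [100, 50, 160],
})

# Save and reload inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'company info.pkl')
    df.to_pickle(path)
    df2 = pd.read_pickle(path)

# Check that both the values and the column data types match
same_values = df2.equals(df)
same_dtypes = bool((df2.dtypes == df.dtypes).all())
print(same_values, same_dtypes)
```

If either check came back False, the file read would not be the one that was just written, or it was produced from a different DataFrame.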

Conclusion

In summary, we learned how to save a DataFrame with to_pickle() and read it back with the read_pickle() function in Pandas. Pickle files are convenient for storing data, but only load pickle files that come from a trusted source: unpickling can execute arbitrary code, so a malicious file can harm your system.