How to Install and Use Pickle for Python 3

Pickle is an incredibly useful Python module for serializing and deserializing Python object structures. By “serializing”, we mean converting a Python object hierarchy into a byte stream. And by “deserializing”, we mean reconstructing the object hierarchy from the byte stream. In this guide, we will cover these topics in detail using practical, real-world examples you can try for yourself. By the end, you’ll be a Pickle pro!

What is Pickle, and Why Use It?

Pickle allows you to save Python objects to files (or other storage) and load them later. This is incredibly handy for:

  • Persisting Python objects across program runs: Store an object now, close the program, and then access the object later by loading it. This lets you resume right where you left off!
  • Sending Python objects across networks: Serialize objects on one computer, send the bytes over the network, and then deserialize them on another computer.
  • Storing Python objects in databases: Serialize and store objects in database blob columns. Works great with NoSQL databases.

Without Pickle, you must manually extract object data into simple primitive types before saving or sending it, and then manually reassemble it into complex objects afterward. This gets tedious fast!
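
As a rough illustration of the database use case above, here is a minimal sketch that stores a pickled object in a BLOB column using the standard-library sqlite3 module. The table name, column names, and the sample settings dict are hypothetical and chosen only for this example.

```python
import pickle
import sqlite3

# Hypothetical table and column names, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snapshots (key TEXT PRIMARY KEY, payload BLOB)")

settings = {"theme": "dark", "recent_files": ["a.txt", "b.txt"]}

# Serialize the object and store the resulting bytes in the BLOB column.
conn.execute(
    "INSERT INTO snapshots (key, payload) VALUES (?, ?)",
    ("settings", pickle.dumps(settings)),
)
conn.commit()

# Later: read the bytes back and rebuild the object.
row = conn.execute(
    "SELECT payload FROM snapshots WHERE key = ?", ("settings",)
).fetchone()
restored_settings = pickle.loads(row[0])
conn.close()

print(restored_settings == settings)
```

An in-memory database keeps the sketch self-contained. With a real database file, the same pattern lets the object survive across program runs.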

Installing Pickle on Python

The good news is that Pickle comes pre-installed with Python 3! There’s nothing extra to install.

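To confirm this, open a Python 3 interpreter and import the module:

import pickle
print(pickle.HIGHEST_PROTOCOL)

In Python 3, the pickle module does not expose a __version__ attribute, because it ships as part of the interpreter. If the import succeeds, Pickle is available. The second line prints the highest pickle protocol your Python supports. On Python 3.8 and later, this is:

5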

And that’s it! Pickle is ready to use.

The only extra you might want is pickleDB, a separate third-party package that provides a simple key-value store. Despite its name, it is not needed for pickling, and it stores its data as JSON rather than through the pickle module.

You can install PickleDB via pip:

pip install pickledb

Otherwise, the built-in pickle module works out of the box with Python 3.

Serializing Python Objects with Pickle

Serializing objects is where Pickle really shines.

The main function for this is pickle.dumps(). It serializes a Python object hierarchy into a stream of bytes.

Here’s a quick example of serializing a simple Python dict:

import pickle

data = {
  'name': 'Daniel',
  'age': 25,
  'favorite_colors': ['blue', 'green']
}

serialized_data = pickle.dumps(data)

print(serialized_data)

Which outputs a bytes object. The exact bytes depend on the pickle protocol in use. With protocol 4 (the default on many Python 3 versions), the output looks like this:

b'\x80\x04\x95B\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x04name\x94\x8c\x06Daniel\x94\x8c\x03age\x94K\x19\x8c\x0ffavorite_colors\x94]\x94(\x8c\x04blue\x94\x8c\x05green\x94eu.'

We could then write these bytes to a file, a database, or a network socket. When we want to access the data again, we just deserialize!
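
To make the network case concrete, here is a minimal sketch that sends a pickled object over a connected socket pair. The helper names send_object, recv_exactly, and recv_object are hypothetical, and the 4-byte length prefix is just one common framing choice, not something Pickle requires.

```python
import pickle
import socket
import struct


def send_object(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exactly(sock, count):
    """Read exactly count bytes, looping because recv() may return less."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_object(sock):
    """Read the length prefix, then the payload, and unpickle it."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return pickle.loads(recv_exactly(sock, length))


# socketpair() gives two connected sockets in one process, standing in
# for two machines in this sketch.
sender, receiver = socket.socketpair()
try:
    send_object(sender, {"name": "Daniel", "age": 25})
    received = recv_object(receiver)
finally:
    sender.close()
    receiver.close()

print(received)
```

In a real deployment, the receiving side should only unpickle data from peers it trusts.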

Some key notes on pickle.dumps():

  • Accepts a Python object hierarchy
  • Returns a bytes object
  • Use pickle.loads() to deserialize the bytes back into objects

You can serialize most Python objects like this, for example NumPy arrays, Pandas DataFrames, and instances of custom classes. Pickle recurses through the object structure and serializes each part. A few kinds of objects cannot be pickled, such as open file handles, locks, and lambda functions.
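
As a small sketch of how far this goes, the example below round-trips a dict holding several standard-library types, then tries an object that Pickle refuses to handle. The record contents are made up for illustration.

```python
import pickle
import threading
from datetime import date
from decimal import Decimal

# A nested structure mixing several standard-library types.
record = {
    "joined": date(2024, 1, 15),
    "balance": Decimal("19.99"),
    "tags": {"python", "pickle"},
    "point": (1.5, -2.0),
}

restored_record = pickle.loads(pickle.dumps(record))
print(restored_record == record)

# Objects tied to operating-system state, such as locks, cannot be pickled.
try:
    pickle.dumps(threading.Lock())
    lock_error = None
except TypeError as exc:
    lock_error = exc

print(type(lock_error).__name__)
```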

There are also complementary functions that work with files:

  • pickle.dump() – serialize object to open file
  • pickle.load() – deserialize from open file back into object

But pickle.dumps()/loads() give you the bytes to work with manually.
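
For the file-based pair, a minimal sketch might look like the following. The file name data.pkl and the temporary directory are assumptions made to keep the example self-contained. Note that the file must be opened in binary mode ("wb" and "rb").

```python
import os
import pickle
import tempfile

data = {"name": "Daniel", "age": 25, "favorite_colors": ["blue", "green"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.pkl")  # hypothetical file name

    # Serialize straight into a file opened for binary writing.
    with open(path, "wb") as f:
        pickle.dump(data, f)

    # Deserialize from the same file opened for binary reading.
    with open(path, "rb") as f:
        loaded = pickle.load(f)

print(loaded == data)
```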

Deserializing Byte Streams into Python Objects

Once you have serialized bytes, you can reconstitute them back into live Python objects using pickle.loads().

Continuing the example above:

import pickle

# Recreate the serialized bytes from the previous example
data = {'name': 'Daniel', 'age': 25, 'favorite_colors': ['blue', 'green']}
serialized_data = pickle.dumps(data)

# Deserialize bytes back into a Python dict
deserialized_data = pickle.loads(serialized_data)

print(deserialized_data)
print(type(deserialized_data))

Prints out the reconstructed data:

{'name': 'Daniel', 'age': 25, 'favorite_colors': ['blue', 'green']}
<class 'dict'>

We get back a dict equal to the original, thanks to Pickle!
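
One detail worth seeing in code: the reconstructed object is a new object with the same value, not the original object itself. The sketch below also shows that a list referenced twice inside the original structure is still shared after the round trip. The variable names are illustrative only.

```python
import pickle

colors = ["blue", "green"]
# The same list object appears under two keys.
original = {"favorites": colors, "also_favorites": colors}

restored = pickle.loads(pickle.dumps(original))

same_value = restored == original        # equal contents
same_object = restored is original       # but a distinct object
shared_kept = restored["favorites"] is restored["also_favorites"]

print(same_value, same_object, shared_kept)
```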

The complementary function to pickle.loads() is pickle.load(), which deserializes from an open file.

So in summary, the core serialization functions are:

  • pickle.dumps() – serialize to bytes
  • pickle.loads() – deserialize from bytes
  • pickle.dump() – serialize to file
  • pickle.load() – deserialize from file

Use these to get Python objects in and out of byte streams.

Pickle Best Practices and Downsides

Pickle is enormously useful, but does come with some downsides to be aware of:

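  • Security vulnerability: Unpickling can execute arbitrary code, because a pickle stream can instruct the unpickler to import and call any callable. Only unpickle data you trust!
  • Protocol and version dependence: Newer Python versions can read pickles written with older protocols, but older versions cannot read newer protocols. Python 2 pickles may also need the encoding argument of pickle.load() to load correctly in Python 3.
  • Class definition dependence: Pickle stores a reference to a class (its module and name), not its code, so the original class definitions must be importable during deserialization. If you rename or move a class while refactoring, old pickles may fail to load.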

The big one is security. Deserializing untrusted data can run attacker-chosen code inside your process.
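
To see why, consider this deliberately harmless sketch. The class name Sneaky is hypothetical. Its __reduce__ method tells Pickle to call sorted() at load time, and an attacker could name a far more dangerous callable in the same way.

```python
import pickle


class Sneaky:
    """Hypothetical class whose pickle asks the unpickler to call sorted()."""

    def __reduce__(self):
        # (callable, args): the unpickler will run sorted([3, 1, 2]).
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())

# Loading runs the callable named in the stream and returns its result,
# not a Sneaky instance.
result = pickle.loads(payload)

print(result)
print(isinstance(result, Sneaky))
```

The loaded value is whatever the callable returned, which shows that the byte stream, not your code, decided what ran during loading.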

So only deserialize Pickle streams from trusted sources, such as data you serialized yourself or data read from a secured database.

Avoid loading random Pickle streams from unknown Internet sources. Consider using JSON for untrusted data as it does not support arbitrary code execution during deserialization.
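
For comparison, here is a minimal sketch of the same kind of round trip using the standard-library json module. Keep in mind that JSON only covers basic types (dicts with string keys, lists, strings, numbers, booleans, and None), so it is not a drop-in replacement for every pickled object.

```python
import json

data = {"name": "Daniel", "age": 25, "favorite_colors": ["blue", "green"]}

# Serialize to a text string rather than bytes.
text = json.dumps(data)

# Parsing JSON only builds basic data types; it never calls arbitrary code.
parsed = json.loads(text)

print(text)
print(parsed == data)
```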

Beyond this, a pickle written with a newer protocol cannot be read by an older Python version that lacks that protocol, and loading always requires the original class definitions. So you need to be careful when sharing pickles between Python versions or refactoring code.
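
If you need a pickle that older interpreters can read, one option is to pass an explicit protocol argument when serializing. The sketch below is illustrative: it writes the same dict with protocol 2 and with the highest protocol available, then checks the protocol number recorded at the start of each stream.

```python
import pickle

data = {"name": "Daniel", "age": 25}

# Protocol 2 is old enough to be readable by very old Python versions.
old_style = pickle.dumps(data, protocol=2)

# The newest protocol this interpreter supports.
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# From protocol 2 onward, a pickle starts with the byte 0x80 followed by
# the protocol number.
print(old_style[:2])
print(newest[1] == pickle.HIGHEST_PROTOCOL)

# The current interpreter can read both.
print(pickle.loads(old_style) == pickle.loads(newest) == data)
```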

But used correctly on trusted data, Pickle is incredibly handy!

It can serialize nearly any Python object while allowing convenient data storage, network transmission, and workflow persistence.

I hope you now feel empowered to start using Pickle for your Python workflows! Happy coding!