The tarfile Module – How to work with tar files in Python?


In this tutorial, we will see what tar files are and how to create and manipulate them with the tarfile module of the Python programming language.

In this article, we’ll see how to:

  • Create a tar file using the tarfile module
  • Add and append files to the tar files
  • Get the list of files in the tar file
  • Extract the files from the tar file

What is a tar file?

The name tar stands for tape archive. A tar file is an archive that bundles many files into a single file.

Tar files are widely used for the distribution of open-source software. Generally, tar files have a .tar extension, but when they are compressed with a utility such as gzip, the extension becomes .tar.gz.
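To make the difference between a plain and a gzip-compressed tar file concrete, here is a minimal sketch that uses the tarfile module covered in the rest of this tutorial. The file names demo.tar, demo.tar.gz and sample.txt are placeholders chosen for illustration, and everything is written to a temporary directory.

```python
import os
import tarfile
import tempfile

# Work in a throwaway directory so the sketch does not touch real files.
workdir = tempfile.mkdtemp()
sample_path = os.path.join(workdir, "sample.txt")
with open(sample_path, "w") as f:
    f.write("hello tar\n" * 100)

plain_path = os.path.join(workdir, "demo.tar")
gzip_path = os.path.join(workdir, "demo.tar.gz")

# "w" writes an uncompressed archive; "w:gz" compresses it with gzip.
with tarfile.open(plain_path, "w") as archive:
    archive.add(sample_path, arcname="sample.txt")
with tarfile.open(gzip_path, "w:gz") as archive:
    archive.add(sample_path, arcname="sample.txt")

# Mode "r" detects the compression transparently when reading.
with tarfile.open(gzip_path, "r") as archive:
    gzip_names = archive.getnames()

print(gzip_names)
print(os.path.getsize(plain_path), os.path.getsize(gzip_path))
```

For repetitive content like this, the compressed archive should come out noticeably smaller than the plain one, while both list the same member.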

Working with the tarfile module in Python

Let’s get right into working with the tarfile module now. If you’re more interested in working with zip files in Python, the zipfile module tutorial is a better fit.

1. How to create a tar file using the tarfile module?

In Python, we can create tar files using the tarfile module. We open a tar file in write mode and then add other files to it. The following screenshot shows the files in the folder before creating the tar file.

Folder Before Tar Creation

The following code creates a tar file in Python. Here we use the tarfile.open() function to create the tar file and the add() method to add other files to it.

#import module
import tarfile

#declare filename
filename= "tutorial.tar"

#open file in write mode
file_obj= tarfile.open(filename,"w")

#Add other files to tar file
file_obj.add("plane.xml")
file_obj.add("sample.txt")
file_obj.add("person.ini")

#close file
file_obj.close()

Here, open() takes the name of the tar file to be created as its first argument and “w” as its second argument to open the file in write mode. The add() method takes the name of the file to be added to the tar file as its argument.

The following image shows the tar file created when the above code is run.

Folder After Creating Tar
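The same steps can also be written with a with statement, which closes the tar file automatically even if an error occurs while adding files. The sketch below is one possible variant, not the only way to do it: it creates placeholder copies of the three files in a temporary directory so that it runs on its own, and it uses the arcname argument of add() so that only the bare file name is stored in the archive.

```python
import os
import tarfile
import tempfile

# Create placeholder input files in a temporary directory (illustration only).
workdir = tempfile.mkdtemp()
member_names = ["plane.xml", "sample.txt", "person.ini"]
for name in member_names:
    with open(os.path.join(workdir, name), "w") as f:
        f.write("placeholder content for " + name + "\n")

archive_path = os.path.join(workdir, "tutorial.tar")

# The with statement closes the archive when the block ends.
with tarfile.open(archive_path, "w") as archive:
    for name in member_names:
        # arcname controls the name stored inside the archive.
        archive.add(os.path.join(workdir, name), arcname=name)

# Reopen the archive to confirm what was stored.
with tarfile.open(archive_path, "r") as archive:
    stored_names = archive.getnames()

print(stored_names)
```

Without arcname, add() stores the path that was passed to it, which here would include the temporary directory.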

2. How to check if a file is a tar file?

We can check whether a file is a tar archive by using the is_tarfile() function of the tarfile module. The check is based on whether the file can be read as a tar archive, not on its .tar extension. The following code shows how to use it.

#import module
import tarfile

#declare filename
filename= "tutorial.tar"

#Check for the file being tarfile
#this will give true
flag=tarfile.is_tarfile(filename)
print("tutorial.tar is a tar file?")
print(flag)

#this will give false
flag=tarfile.is_tarfile("plane.xml")
print("plane.xml is a tar file?")
print(flag)

Output of the above code is:

tutorial.tar is a tar file?
True
plane.xml is a tar file?
False
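Since is_tarfile() looks at the content of a file rather than its name, a short experiment can make this visible. The sketch below is an illustration with made-up file names (real.tar, archive.bin, fake.tar): it copies a real tar archive to a name without the .tar extension and copies a plain text file to a name with it.

```python
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# A plain text file to put into the archive.
text_path = os.path.join(workdir, "notes.txt")
with open(text_path, "w") as f:
    f.write("just text\n")

# A genuine tar archive.
real_tar = os.path.join(workdir, "real.tar")
with tarfile.open(real_tar, "w") as archive:
    archive.add(text_path, arcname="notes.txt")

# The same archive under a misleading extension.
disguised_tar = os.path.join(workdir, "archive.bin")
shutil.copy(real_tar, disguised_tar)

# A text file that only pretends to be a tar file.
fake_tar = os.path.join(workdir, "fake.tar")
shutil.copy(text_path, fake_tar)

disguised_result = tarfile.is_tarfile(disguised_tar)
fake_result = tarfile.is_tarfile(fake_tar)

print("archive.bin is a tar file?", disguised_result)
print("fake.tar is a tar file?", fake_result)
```

The renamed archive is still recognized, while the text file with a .tar extension is not.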

3. How to use the tarfile module to check contents of a tar file?

To check the contents of a tar file without extracting them, we can use the getnames() method of the TarFile object returned by tarfile.open(). The getnames() method returns a list with the names of the members in the tar file.

Here we open the file in read mode, so “r” is given as the second argument to open().

#import module
import tarfile

#declare filename
filename= "tutorial.tar"

#open file in read mode
file_obj= tarfile.open(filename,"r")

# get the names of files in tar file
namelist=file_obj.getnames()

#print the filenames
print("files in the tar file are:")
for name in namelist:
    print(name)

#close file
file_obj.close()

Output of the above code is:

files in the tar file are:
plane.xml
sample.txt
person.ini
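If we need more than the names, the getmembers() method returns TarInfo objects that also carry details such as the size of each member. This goes slightly beyond the listing above, so treat it as an optional sketch; the archive is built in a temporary directory with placeholder content so the example can run on its own.

```python
import os
import tarfile
import tempfile

# Build a small archive with placeholder content (illustration only).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "tutorial.tar")
contents = {"plane.xml": "<plane/>", "sample.txt": "abc", "person.ini": "[person]"}

with tarfile.open(archive_path, "w") as archive:
    for name, text in contents.items():
        path = os.path.join(workdir, name)
        with open(path, "w") as f:
            f.write(text)
        archive.add(path, arcname=name)

# getmembers() returns one TarInfo object per member.
with tarfile.open(archive_path, "r") as archive:
    listing = [(member.name, member.size, member.isfile())
               for member in archive.getmembers()]

for name, size, is_regular_file in listing:
    print(name, size, is_regular_file)
```

Each tuple holds the member name, its size in bytes, and whether it is a regular file.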

4. How to append new files directly to a tar file?

We can add extra files to an existing tar file using the add() method, just as we did while creating the tar file.

The only difference is that we have to open the file in append mode, so “a” is passed as the second argument to open(). Append mode works only for uncompressed tar files; it is not available for compressed archives such as .tar.gz.

#import module
import tarfile

#declare filename
filename= "tutorial.tar"

#open file in append mode
file_obj= tarfile.open(filename,"a")

# print initial content of tarfile
namelist=file_obj.getnames()
print("Initial files in the tar file are:")
for name in namelist:
    print(name)

# add a new file to the existing tar file
file_obj.add("sampleoutput.txt")

# print final content of tarfile
namelist=file_obj.getnames()
print("Final files in the tar file are:")
for name in namelist:
    print(name)

#close file
file_obj.close()

Output of the above code is:

Initial files in the tar file are:
plane.xml
sample.txt
person.ini
Final files in the tar file are:
plane.xml
sample.txt
person.ini
sampleoutput.txt
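To confirm that the appended file is really stored on disk, and not only visible on the open object, we can close the archive and open it again in read mode. The sketch below does this in a temporary directory; the file names mirror the ones used above, but their contents are placeholders.

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "tutorial.tar")


def make_file(name):
    """Create a small placeholder file and return its path."""
    path = os.path.join(workdir, name)
    with open(path, "w") as f:
        f.write("placeholder for " + name + "\n")
    return path


# Create the archive with one member.
with tarfile.open(archive_path, "w") as archive:
    archive.add(make_file("sample.txt"), arcname="sample.txt")

# Reopen in append mode and add a second member.
with tarfile.open(archive_path, "a") as archive:
    archive.add(make_file("sampleoutput.txt"), arcname="sampleoutput.txt")

# Reopen in read mode to check what is stored on disk.
with tarfile.open(archive_path, "r") as archive:
    names_after_append = archive.getnames()

print(names_after_append)
```

Opening the same archive with “w” instead of “a” in the second step would overwrite it, leaving only the newly added file.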

5. How to extract a single file from a tar file in Python?

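To read a single file from a tar file, we can use the extractfile() method of the TarFile object.

This method takes a member name as its argument and returns a file-like object from which we can read the member's content as bytes. It does not write anything to the working directory; to write a single member to disk, use the extract() method instead.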

#import module
import tarfile

#declare filename
filename= "tutorial.tar"

#open file in read mode
file_obj= tarfile.open(filename,"r")

#get a file-like object for the member; nothing is written to disk
file=file_obj.extractfile("sample.txt")
print("Content of the extracted file are")

#print content of extracted file
print(file.read())

#close file
file_obj.close()

Output of the above code is:

Content of the extracted file are
b'This is a sample file for tarfile tutorial in python on askpython.com'
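The b prefix in the output shows that extractfile() gives us bytes. The following sketch decodes those bytes into text and, for comparison, writes the same member to disk with extract(). The folder name single_file_output and the UTF-8 encoding are assumptions made for this example. The filter argument is only passed when the running Python version provides extraction filters (Python 3.12 and later, as well as some patch releases of earlier versions).

```python
import os
import tarfile
import tempfile

# Build a one-member archive with placeholder content (illustration only).
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "sample.txt")
with open(source_path, "w", encoding="utf-8") as f:
    f.write("This is a sample file")

archive_path = os.path.join(workdir, "tutorial.tar")
with tarfile.open(archive_path, "w") as archive:
    archive.add(source_path, arcname="sample.txt")

output_dir = os.path.join(workdir, "single_file_output")
os.makedirs(output_dir, exist_ok=True)

# Pass filter="data" only if this Python version supports extraction filters.
extract_options = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

with tarfile.open(archive_path, "r") as archive:
    # Read the member in memory and decode the bytes (UTF-8 is assumed).
    reader = archive.extractfile("sample.txt")
    text = reader.read().decode("utf-8")

    # Write the same member to disk.
    archive.extract("sample.txt", path=output_dir, **extract_options)

extracted_path = os.path.join(output_dir, "sample.txt")
print(text)
print(os.path.isfile(extracted_path))
```

Note that extractfile() returns None for members that are not regular files or links, such as directories, so real code may need to check the result before calling read().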

6. How to extract all files from a tarball using the tarfile module?

To extract the whole tar file instead of a single file, we can use the extractall() method of the TarFile object.

The image below shows the folder before extracting the contents of the tar file.

Folder Before Extracting From Tar

The extractall() method takes the path of the output folder as its first argument and extracts the entire content of the tar file into that folder. If no path is given, the files are extracted into the current working directory.

#import module
import tarfile

#declare filename
filename = "tutorial.tar"

#open file in read mode
file_obj = tarfile.open(filename,"r")

#extract all files (extractall() returns None, so there is nothing to store)
file_obj.extractall("extracted_tar_folder")

#close file
file_obj.close()

The following image shows the working directory after extracting the contents of the tar file.

Folder After Extracting From Tar
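When a tar file comes from a source we do not fully trust, extracting everything blindly can be risky, because member names may point outside the output folder. Recent Python versions (3.12 and later, as well as some patch releases of earlier versions) accept a filter argument for extractall() that rejects such members. The sketch below uses it when available and falls back to the plain call otherwise; the fallback offers no such protection and is only there so the example runs on older interpreters. The archive is built from placeholder files in a temporary directory.

```python
import os
import tarfile
import tempfile

# Build a small archive with placeholder files (illustration only).
workdir = tempfile.mkdtemp()
source_names = ["plane.xml", "sample.txt", "person.ini"]
archive_path = os.path.join(workdir, "tutorial.tar")

with tarfile.open(archive_path, "w") as archive:
    for name in source_names:
        path = os.path.join(workdir, name)
        with open(path, "w") as f:
            f.write("placeholder for " + name + "\n")
        archive.add(path, arcname=name)

output_dir = os.path.join(workdir, "extracted_tar_folder")

with tarfile.open(archive_path, "r") as archive:
    if hasattr(tarfile, "data_filter"):
        # The "data" filter refuses members that would escape output_dir.
        archive.extractall(output_dir, filter="data")
    else:
        # Older Python without extraction filters: no extra safety checks.
        archive.extractall(output_dir)

extracted = sorted(os.listdir(output_dir))
print(extracted)
```

For archives we created ourselves, as in this tutorial, both branches produce the same files in the output folder.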

Conclusion

In this tutorial, we have seen what tar files are and how to create, access and manipulate them using the tarfile module in Python. Happy Learning!