
How to List Files in a Directory Using Python?


In this tutorial, we’re covering everything you need to know about how to list files in a directory using Python.

Python is a general-purpose language used in fields such as data science, machine learning, and web development.

It is no surprise, then, that Python can also list the files and directories on a system. This article walks through the ways to do so.

List All Files in a Directory Using Python

To interact with directories, Python provides the os module in its standard library.

1. Using the ‘os’ library

The function we will use is os.listdir(). As the name suggests, it lists the items in a directory. The names are returned in arbitrary order.

# Importing the os library
import os

# The path for listing items
path = '.'

# The list of items
files = os.listdir(path)

# Loop to print each filename separately
for filename in files:
	print(filename)

Output:

game_file.py
hi-lo_pygame.py
Journaldev
list_files1.py
hi_lo_pygame.mp4
test.py
list_files.py
my_program.cpp
a.out
cut.cpp

Linux users can easily match the above output using the standard ls command on the terminal.

[Image: listing items using the 'ls' command]

As we can see, the outputs of the two methods match.


2. Using the ‘glob’ library

glob is mostly a filename pattern matching library, but it can be used to list items in the current directory by:

# Importing the glob library
import glob 

# Path to the directory
path = ''

# or 
# path = './'

# Extract the list of filenames
files = glob.glob(path + '*', recursive=False)

# Loop to print the filenames
for filename in files:
	print(filename)

Output:

game_file.py
hi-lo_pygame.py
Journaldev
list_files1.py
hi_lo_pygame.mp4
test.py
list_files.py
my_program.cpp
a.out
cut.cpp

The wildcard character '*' matches every item in the current directory except names that begin with a dot. The recursive argument only affects the '**' pattern and already defaults to False, so passing recursive=False here simply makes explicit that subdirectories are not traversed.
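The difference in how the two approaches treat dot-prefixed names can be seen in a small sketch. It assumes a throwaway directory created with tempfile; the helper name compare_listings and the sample file names are made up for illustration.

```python
import glob
import os
import tempfile

def compare_listings(directory):
    """Return (names from os.listdir, names from glob '*') for a directory."""
    listed = sorted(os.listdir(directory))
    # glob returns paths that include the directory prefix, so strip it.
    # glob.escape() guards against special characters in the directory name.
    pattern = os.path.join(glob.escape(directory), '*')
    globbed = sorted(os.path.basename(p) for p in glob.glob(pattern))
    return listed, globbed

# Hypothetical sample files, created only for this demonstration
with tempfile.TemporaryDirectory() as tmp:
    for name in ('test.py', '.hidden'):
        open(os.path.join(tmp, name), 'w').close()
    listed, globbed = compare_listings(tmp)
    print(listed)   # includes '.hidden'
    print(globbed)  # '.hidden' is skipped by '*'
```

If hidden files matter, os.listdir() is the safer choice of the two.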


3. List only files in the current directory

In the above methods, the Python code returned every item in the current directory, whether it was a file or a directory. We can keep only the files by using the os.path.isfile() function.

# Importing the os library
import os

# The path for listing items
path = '.'

# List of only files
files = [f for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))]

# Loop to print each filename separately
for filename in files:
	print(filename)

Output:

game_file.py
hi-lo_pygame.py
list_files1.py
hi_lo_pygame.mp4
test.py
list_files.py
my_program.cpp
a.out
cut.cpp

In the above code snippet, a list comprehension keeps only those items that are regular files.

Another key thing to note is that os.listdir() returns bare names rather than paths. Each name is therefore joined with path before it is passed to os.path.isfile(). Checking the bare name 'f' would resolve it against the current working directory and would work only when path is '.'.
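A minimal sketch of the same filter wrapped in a helper that joins each name with its directory, so it can be pointed at any directory and not just the current one. The name list_files is hypothetical, and the sample entries are created in a temporary directory purely for illustration.

```python
import os
import tempfile

def list_files(directory):
    """Return the names of regular files directly inside `directory`."""
    return sorted(
        name for name in os.listdir(directory)
        # Join with the directory so the check does not depend on the cwd
        if os.path.isfile(os.path.join(directory, name))
    )

# Hypothetical sample layout: one file and one subdirectory
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'test.py'), 'w').close()
    os.mkdir(os.path.join(tmp, 'Journaldev'))
    files_found = list_files(tmp)
    print(files_found)  # the subdirectory is left out
```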


List All Files in a Directory Recursively

In order to print the files inside a directory and its subdirectories, we need to traverse them recursively.

1. Using the ‘os’ library

With the help of the walk() method, we can traverse each subdirectory within a directory one by one.

# Importing the os library
import os

# The path for listing items
path = './Documents/'

# List of files in complete directory
file_list = []

"""
	Loop to extract files inside a directory

	path --> Name of each directory
	folders --> List of subdirectories inside current 'path'
	files --> List of files inside current 'path'

"""
for path, folders, files in os.walk(path):
	for file in files:
		file_list.append(os.path.join(path, file))

# Loop to print each filename separately
for filename in file_list:
	print(filename)

Output:

./Documents/game_file.py
./Documents/hi-lo_pygame.py
./Documents/list_files1.py
./Documents/hi_lo_pygame.mp4
./Documents/test.py
./Documents/list_files.py
./Documents/my_program.cpp
./Documents/a.out
./Documents/cut.cpp
./Documents/Journaldev/mastermind.py
./Documents/Journaldev/blackjack_terminal.py
./Documents/Journaldev/lcm.cpp
./Documents/Journaldev/super.cpp
./Documents/Journaldev/blackjack_pygame.py
./Documents/Journaldev/test.java

The os.walk() function visits the directory and each of its subdirectories, top-down by default. On every iteration it yields a tuple of three values, which the loop unpacks into:

  • path – This variable contains the present directory the function is observing during a certain iteration
  • folders – This variable is a list of directories inside the 'path' directory.
  • files – A list of files inside the 'path' directory.

The os.path.join() function combines the file name with its parent directory, giving us a path to the file that is relative whenever the starting path is relative.
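Because the traversal is top-down, the list of subdirectories can be edited in place to keep os.walk() from descending into some of them. The following is a sketch under assumptions: walk_files and skip_dirs are hypothetical names, and the directory layout is created in a temporary directory for the demonstration.

```python
import os
import tempfile

def walk_files(root, skip_dirs=()):
    """Collect file paths under `root`, skipping directories named in `skip_dirs`."""
    found = []
    for current, folders, files in os.walk(root):
        # In top-down mode, editing `folders` in place prunes the traversal
        folders[:] = [d for d in folders if d not in skip_dirs]
        for name in files:
            found.append(os.path.join(current, name))
    return sorted(found)

# Hypothetical layout: two subdirectories, one of which we want to skip
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'Journaldev'))
    os.makedirs(os.path.join(tmp, 'cache'))
    for rel in ('test.py',
                os.path.join('Journaldev', 'lcm.cpp'),
                os.path.join('cache', 'tmp.bin')):
        open(os.path.join(tmp, rel), 'w').close()
    all_files = [os.path.relpath(p, tmp) for p in walk_files(tmp)]
    kept_files = [os.path.relpath(p, tmp)
                  for p in walk_files(tmp, skip_dirs={'cache'})]
    print(all_files)
    print(kept_files)
```

Note the slice assignment folders[:] = ...; rebinding the name with folders = ... would not affect the traversal.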


2. Using the ‘glob’ library

Similar to the above procedure, glob can recursively visit each directory and extract all items and return them.

# Importing the glob library
import glob 

# Importing the os library
import os

# Path to the directory
path = './Documents/'

# Extract all the list of items recursively
files = glob.glob(path + '**/*', recursive=True)

# Filter only files
files = [f for f in files if os.path.isfile(f)]

# Loop to print the filenames
for filename in files:
	print(filename)

Output:

./Documents/game_file.py
./Documents/hi-lo_pygame.py
./Documents/list_files1.py
./Documents/hi_lo_pygame.mp4
./Documents/test.py
./Documents/list_files.py
./Documents/my_program.cpp
./Documents/a.out
./Documents/cut.cpp
./Documents/Journaldev/mastermind.py
./Documents/Journaldev/blackjack_terminal.py
./Documents/Journaldev/lcm.cpp
./Documents/Journaldev/super.cpp
./Documents/Journaldev/blackjack_pygame.py
./Documents/Journaldev/test.java

With recursive=True, the '**' pattern matches the starting directory and any number of nested subdirectories. The '*' that follows matches every item within each of those directories.

Since we want only the files, we keep the entries that pass the os.path.isfile() check used before.
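The effect of the recursive flag on '**' can be checked with a small sketch. It assumes a temporary directory with files at three depths; files_matching and the file names are hypothetical.

```python
import glob
import os
import tempfile

def files_matching(root, recursive):
    """Return files under `root` matched by '**/*', relative to `root`."""
    pattern = os.path.join(glob.escape(root), '**', '*')
    matches = glob.glob(pattern, recursive=recursive)
    return sorted(os.path.relpath(p, root) for p in matches if os.path.isfile(p))

# Hypothetical layout: a.py, sub/b.py and sub/deep/c.py
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub', 'deep'))
    for rel in ('a.py',
                os.path.join('sub', 'b.py'),
                os.path.join('sub', 'deep', 'c.py')):
        open(os.path.join(tmp, rel), 'w').close()
    deep = files_matching(tmp, recursive=True)
    shallow = files_matching(tmp, recursive=False)
    print(deep)     # files at every depth
    print(shallow)  # '**' acts like '*', so only files exactly one level down
```

Without recursive=True, '**' behaves like a plain '*', which is why only the file one level below the root is found in the second call.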


List All Subdirectories Inside a Directory

Instead of listing files, we can list all the subdirectories present in a specific directory.

# Importing the os library
import os

# The path for listing items
path = './Documents/'

# List of folders in complete directory
folder_list = []

"""
	Loop to extract folders inside a directory

	path --> Name of each directory
	folders --> List of subdirectories inside current 'path'
	files --> List of files inside current 'path'

"""
for path, folders, files in os.walk(path):
	for folder in folders:
		folder_list.append(os.path.join(path, folder))

# Loop to print each foldername separately
for foldername in folder_list:
	print(foldername)

Output:

./Documents/Journaldev

The only difference between listing files and listing directories is which value from os.walk() we iterate over. For files, we loop over the files list. Here, we loop over the folders list. Because os.walk() descends into every level, nested subdirectories are listed as well.
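If only the first level of subdirectories is wanted, one alternative to os.walk() is os.scandir(), which does not recurse. This is a sketch of that swapped-in technique, not the approach used above; immediate_subdirs and the sample layout are hypothetical.

```python
import os
import tempfile

def immediate_subdirs(directory):
    """Return names of directories directly inside `directory` (no recursion)."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())

# Hypothetical layout: Journaldev/nested plus one file at the top level
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'Journaldev', 'nested'))
    open(os.path.join(tmp, 'test.py'), 'w').close()
    top_level = immediate_subdirs(tmp)
    print(top_level)  # 'nested' is not reported because it is one level deeper
```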


List Files in a Directory with Absolute Path

Once we know how to list files in a directory, then displaying the absolute path is a piece of cake. The abspath() method provides us with the absolute path for a file.

# Importing the os library
import os

# The path for listing items
path = './Documents/'

# List of files in complete directory
file_list = []

"""
	Loop to extract files inside a directory

	path --> Name of each directory
	folders --> List of subdirectories inside current 'path'
	files --> List of files inside current 'path'

"""
for path, folders, files in os.walk(path):
	for file in files:
		file_list.append(os.path.abspath(os.path.join(path, file)))

# Loop to print each filename separately
for filename in file_list:
	print(filename)

Output:

/home/aprataksh/Documents/game_file.py
/home/aprataksh/Documents/hi-lo_pygame.py
/home/aprataksh/Documents/list_files1.py
/home/aprataksh/Documents/hi_lo_pygame.mp4
/home/aprataksh/Documents/test.py
/home/aprataksh/Documents/list_files.py
/home/aprataksh/Documents/my_program.cpp
/home/aprataksh/Documents/a.out
/home/aprataksh/Documents/cut.cpp
/home/aprataksh/Documents/Journaldev/mastermind.py
/home/aprataksh/Documents/Journaldev/blackjack_terminal.py
/home/aprataksh/Documents/Journaldev/lcm.cpp
/home/aprataksh/Documents/Journaldev/super.cpp
/home/aprataksh/Documents/Journaldev/blackjack_pygame.py
/home/aprataksh/Documents/Journaldev/test.java

One thing to note here is that abspath() resolves a relative path against the current working directory. It must therefore be given the file name joined with its directory, which is the purpose of the join() function.
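The following sketch shows why the join matters. The helper absolute_paths is a hypothetical name, and 'Documents' and 'test.py' are used only as sample strings; the files do not need to exist, since abspath() works purely on the path text.

```python
import os

def absolute_paths(directory, names):
    """Join each name with its directory before making it absolute."""
    return [os.path.abspath(os.path.join(directory, name)) for name in names]

# A bare file name is resolved against the current working directory
bare = os.path.abspath('test.py')

# Joining first keeps the 'Documents' component in the result
joined = absolute_paths('Documents', ['test.py'])[0]

print(bare)
print(joined)
```

Passing the bare name would silently produce a path that points into the working directory instead of into Documents.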


List Files in a Directory by Matching Patterns

There are multiple ways to filter out filenames matching a particular pattern. Let us go through each of them one by one.

1. Using the ‘fnmatch’ library

As the name suggests, fnmatch is a filename pattern matching library. Using fnmatch with our standard filename extracting libraries, we can filter out those files matching a specific pattern.

# Importing the os and fnmatch library
import os, fnmatch

# The path for listing items
path = './Documents/'

# List of files in complete directory
file_list = []

"""
	Loop to extract files containing word "file" inside a directory

	path --> Name of each directory
	folders --> List of subdirectories inside current 'path'
	files --> List of files inside current 'path'

"""
print("List of files containing \"file\" in them")
for path, folders, files in os.walk(path):
	for file in files:
		if fnmatch.fnmatch(file, '*file*'):
			file_list.append(os.path.join(path, file))

# Loop to print each filename separately
for filename in file_list:
	print(filename)

Output:

List of files containing "file" in them
./Documents/game_file.py
./Documents/list_files1.py
./Documents/list_files.py

The fnmatch() function takes two parameters: the filename followed by the pattern to match. In the above code, we select all the files whose names contain the string "file".
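The same module also offers fnmatch.filter(), which applies a pattern to a whole list, and fnmatch.fnmatchcase(), which always compares case-sensitively. A short sketch, using a made-up list of names (File_notes.TXT is hypothetical and included only to show the case handling):

```python
import fnmatch

names = ['game_file.py', 'test.py', 'list_files.py', 'File_notes.TXT']

# filter() applies one pattern to every name in the list.
# Its case sensitivity follows the operating system's conventions.
with_file = fnmatch.filter(names, '*file*')

# fnmatchcase() is case-sensitive on every platform
strict = [n for n in names if fnmatch.fnmatchcase(n, '*file*')]

print(with_file)
print(strict)
```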


2. Using the ‘glob’ library

As we mentioned before, glob's primary purpose is filename pattern matching.

# Importing the glob library
import glob 

# Importing the os library
import os

# Path to the directory
path = './Documents/'

# Extract items containing numbers in name
files = glob.glob(path + '**/*[0-9]*.*', recursive=True)

# Filter only files
files = [f for f in files if os.path.isfile(f)]

# Loop to print the filenames
for filename in files:
	print(filename)

Output:

./Documents/list_files1.py

The above glob pattern '**/*[0-9]*.*' (a shell-style wildcard pattern, not a regular expression) can be explained as:

  • '**' – Traverse all subdirectories inside the path
  • '/*' – The filename can start with any characters
  • '[0-9]' – The filename contains a digit
  • '*.*' – After the digit come any characters, a literal dot, and any characters, so the digit must appear before a dot
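The filename part of the pattern can be tried against a few names from the earlier listings with fnmatch, which applies the same shell-style wildcard rules to a single name. This is only a sketch for checking the pattern, not part of the glob call itself.

```python
import fnmatch

# Filename portion of the glob pattern used above
pattern = '*[0-9]*.*'
names = ['list_files1.py', 'hi_lo_pygame.mp4', 'test.py', 'a.out']

# Map each name to whether it matches the pattern
results = {name: fnmatch.fnmatchcase(name, pattern) for name in names}

for name, matched in results.items():
    print(name, matched)
```

Only list_files1.py matches. hi_lo_pygame.mp4 does contain a digit, but it comes after the dot, so the pattern rejects it.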

3. Using the ‘pathlib’ library

pathlib offers an object-oriented way of interacting with the filesystem. The rglob() method of a Path object recursively yields the entries below that path that match a pattern.

Passing a pattern to rglob() therefore filters the results.

# Importing the pathlib library
import pathlib

# Creating a Path object
path = pathlib.Path('./Documents/')

# Extracting a list of files starting with 'm'
files = path.rglob('m*')

# Loop to print the files separately
for file in files:
	print(file)

Output:

Documents/my_program.cpp
Documents/Journaldev/mastermind.py

The above code snippet lists all the items whose names start with the letter 'm'. rglob() matches directories as well as files, so a directory with such a name would also be listed.
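To keep only files, each result of rglob() can be checked with is_file(). A minimal sketch, assuming a temporary directory that contains a directory and two files whose names all start with 'm'; files_starting_with and the 'media' directory are hypothetical.

```python
import pathlib
import tempfile

def files_starting_with(root, prefix):
    """Return files (not directories) under `root` whose names start with `prefix`."""
    root = pathlib.Path(root)
    return sorted(p.relative_to(root).as_posix()
                  for p in root.rglob(prefix + '*') if p.is_file())

# Hypothetical layout: my_program.cpp and media/mastermind.py
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    (base / 'media').mkdir()
    (base / 'my_program.cpp').touch()
    (base / 'media' / 'mastermind.py').touch()
    m_files = files_starting_with(base, 'm')
    print(m_files)  # the 'media' directory itself is filtered out
```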


List Files in a Directory with a Specific Extension

Listing files with a specific extension in Python is somewhat similar to pattern matching. For this purpose, we need to create a pattern with respect to the file extension.

# Importing the os and fnmatch library
import os, fnmatch

# The path for listing items
path = './Documents/'

# List to store filenames 
file_list = []

"""
	Loop to extract python files 

	path --> Name of each directory
	folders --> List of subdirectories inside current 'path'
	files --> List of files inside current 'path'

"""
print("List of python files in the directory:")
for path, folders, files in os.walk(path):
	for file in files:
		if fnmatch.fnmatch(file, '*.py'):
			file_list.append(os.path.join(path, file))

# Loop to print each filename separately
for filename in file_list:
	print(filename)

Output:

List of python files in the directory:
./Documents/game_file.py
./Documents/hi-lo_pygame.py
./Documents/list_files1.py
./Documents/test.py
./Documents/list_files.py
./Documents/Journaldev/mastermind.py
./Documents/Journaldev/blackjack_terminal.py
./Documents/Journaldev/blackjack_pygame.py

The fnmatch() function keeps those files ending with '.py', that is, Python files. If we want to extract files with a different extension, we have to alter the pattern. For example, to fetch only C++ files, the pattern '*.cpp' must be used.
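When several extensions are of interest, one option is to compare the extension directly instead of building a pattern per extension. The sketch below uses os.path.splitext() for this; has_extension is a hypothetical helper, and SCRIPT.PY is a made-up name added to show the case-insensitive comparison.

```python
import os

def has_extension(filename, extensions):
    """Return True if `filename` ends with one of `extensions` (case-insensitive)."""
    # splitext() splits at the last dot: 'test.py' -> ('test', '.py')
    return os.path.splitext(filename)[1].lower() in extensions

names = ['game_file.py', 'my_program.cpp', 'a.out', 'SCRIPT.PY']

# Keep Python and C++ files in a single pass
selected = [n for n in names if has_extension(n, {'.py', '.cpp'})]
print(selected)
```

The same check can replace the fnmatch() call inside the os.walk() loop shown above.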

This sums up the ways to fetch a list of files in a directory using Python.


Conclusion

There are often multiple ways to solve a problem, and the most convenient one is not always the best. A Python programmer benefits from knowing the different ways to list files in a directory.

We hope this article was easy to follow. Feel free to comment below for any queries or suggestions.