Python | List all files in a Directory


Handling files and folders is a common task in programming. In Python, you can easily handle file and directory operations with the help of built-in functions and libraries. In this post, we will explore how to list all files in a directory or subdirectory (folder or subfolder) using Python.

Create a folder using Python

Before we begin, let’s create an empty directory and put some dummy files into it with the help of the following Python code.

import os

# create a new directory (exist_ok avoids an error if it already exists)
os.makedirs('dummy_directory', exist_ok=True)

# create some files in the directory
with open('dummy_directory/text_file1.txt', 'w') as f:
    f.write('This is test file 1.')
    
with open('dummy_directory/text_file2.txt', 'w') as f:
    f.write('This is test file 2.')
    
with open('dummy_directory/text_file3.txt', 'w') as f:
    f.write('This is test file 3.')

In the above code, we create an example folder (dummy_directory) inside the current working directory and then write three text files (text_file1.txt, text_file2.txt, text_file3.txt) into it.

Along with that, I am going to create three subdirectories (named sub_dir1, sub_dir2, sub_dir3) inside our main directory (dummy_directory). I am also going to insert some .png and .csv files into the main directory.

Next, I will insert one png file into each subdirectory (sub_dir1_png_file.png, sub_dir2_png_file.png, sub_dir3_png_file.png). Our folder structure should now look like this:

dummy_directory
    ├── sub_dir1
    |     └── sub_dir1_png_file.png
    ├── sub_dir2
    |     └── sub_dir2_png_file.png
    ├── sub_dir3
    |     └── sub_dir3_png_file.png
    ├── csv_file1.csv
    ├── csv_file2.csv
    ├── csv_file3.csv
    ├── png_file1.png
    ├── png_file2.png
    ├── png_file3.png
    ├── text_file1.txt
    ├── text_file2.txt
    └── text_file3.txt
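
The code for creating the subdirectories and the extra files is not shown in this post. Below is a minimal sketch of how it could be done, assuming small text placeholders instead of real images and CSV data. The helper name build_dummy_tree is hypothetical, and the file sizes it produces will not match the sizes printed later in this post.

```python
import os


def build_dummy_tree(root='dummy_directory'):
    """Create the example folder layout with small placeholder files."""
    os.makedirs(root, exist_ok=True)

    # Files that sit directly in the main directory
    for i in range(1, 4):
        for prefix, ext in (('csv_file', '.csv'), ('png_file', '.png'), ('text_file', '.txt')):
            path = os.path.join(root, f'{prefix}{i}{ext}')
            with open(path, 'w') as f:
                f.write(f'This is test file {i}.')

    # One placeholder .png inside each subdirectory
    for i in range(1, 4):
        sub_dir = os.path.join(root, f'sub_dir{i}')
        os.makedirs(sub_dir, exist_ok=True)
        with open(os.path.join(sub_dir, f'sub_dir{i}_png_file.png'), 'w') as f:
            f.write('placeholder')
```

Calling build_dummy_tree() with no argument creates the tree inside the current working directory. It is safe to call more than once because existing folders are reused and files are simply rewritten.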

List all files in a Directory

Now that we have some dummy files inside our example directory, let’s explore different ways to list them in Python.

Method 1: os.listdir()

You can use the listdir() function of Python's os module to get a list of the names of all files and subfolders in a specific directory. The Python code below does that.

import os

# get a list of all files and sub-folders in the directory
files = os.listdir('dummy_directory')

# print the list of files
print(files)
['csv_file1.csv', 'csv_file2.csv', 'csv_file3.csv', 'png_file1.png', 'png_file2.png', 'png_file3.png', 'sub_dir1', 'sub_dir2', 'sub_dir3', 'text_file1.txt', 'text_file2.txt', 'text_file3.txt']
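
Since os.listdir() returns folder names as well, you may want to keep only the regular files. Here is a small sketch of one way to do it. The helper name list_only_files is my own, and the sorted() call is there because os.listdir() does not promise any particular order.

```python
import os


def list_only_files(path):
    """Return the sorted names of regular files directly inside path."""
    return sorted(
        name for name in os.listdir(path)
        # listdir() gives bare names, so join them with the folder before testing
        if os.path.isfile(os.path.join(path, name))
    )


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    print(list_only_files('dummy_directory'))
```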

Method 2: os.scandir()

You can also use the scandir() function to achieve a similar result. The code is a little longer than with listdir(), but the is_file() check lets us keep only files, so the subdirectories are left out of the result.

import os

# create a list to store the file names
files = []

# iterate over the directory entries
for entry in os.scandir('dummy_directory'):
    if entry.is_file():
        files.append(entry.name)

# print the list of files
print(files)
['csv_file1.csv', 'csv_file2.csv', 'csv_file3.csv', 'png_file1.png', 'png_file2.png', 'png_file3.png', 'text_file1.txt', 'text_file2.txt', 'text_file3.txt']
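
os.scandir() can also be used as a context manager (Python 3.6 and later), which closes the directory iterator as soon as we are done with it. The sketch below uses that form and returns entry.path instead of entry.name, so each result includes the folder. The helper name scan_files is only for illustration.

```python
import os


def scan_files(path):
    """Return paths of regular files directly inside path, using os.scandir()."""
    # The with-statement closes the scandir iterator once the block is left
    with os.scandir(path) as entries:
        return sorted(entry.path for entry in entries if entry.is_file())


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    print(scan_files('dummy_directory'))
```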

Method 3: glob.glob()

Like os.listdir(), glob.glob() is a simple way to list the contents of a directory. It returns the paths of all files and folders that match a given pattern.

import glob

# get a list of all files and sub-folders in the directory
files = glob.glob('dummy_directory/*')

# print the list of files
print(files)
['dummy_directory\\csv_file1.csv', 'dummy_directory\\csv_file2.csv', 'dummy_directory\\csv_file3.csv', 'dummy_directory\\png_file1.png', 'dummy_directory\\png_file2.png', 'dummy_directory\\png_file3.png', 'dummy_directory\\sub_dir1', 'dummy_directory\\sub_dir2', 'dummy_directory\\sub_dir3', 'dummy_directory\\text_file1.txt', 'dummy_directory\\text_file2.txt', 'dummy_directory\\text_file3.txt']

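The above Python code prints the files in our dummy folder along with the subdirectory names (sub_dir1, sub_dir2, sub_dir3). Each returned path includes the directory part of the pattern (dummy_directory), so it is a path relative to the current working directory rather than a bare file name.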
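
If you only need the names, you can strip the folder part with os.path.basename(). The sketch below does that. The helper name glob_names is hypothetical, and glob.escape() is used so that special characters in the folder name are not treated as wildcards. One difference from os.listdir() worth knowing: the '*' pattern does not match names that start with a dot.

```python
import glob
import os


def glob_names(directory):
    """Return the sorted names (without the folder part) matched by '*'."""
    pattern = os.path.join(glob.escape(directory), '*')
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    print(glob_names('dummy_directory'))
```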


Method 4: os.walk()

os.walk() is another way to list all file names in a directory tree. It walks the tree either top-down or bottom-up and, unlike the methods above, also descends into subdirectories. Joining dirpath and filename, as in the code below, gives paths that include the directory, just like glob.

import os

# create a list to store the file names
files = []

# walk the directory tree
for dirpath, dirnames, filenames in os.walk('dummy_directory'):
    for filename in filenames:
        files.append(os.path.join(dirpath, filename))

# print the list of files
print(files)
['dummy_directory\\csv_file1.csv', 'dummy_directory\\csv_file2.csv', 'dummy_directory\\csv_file3.csv', 'dummy_directory\\png_file1.png', 'dummy_directory\\png_file2.png', 'dummy_directory\\png_file3.png', 'dummy_directory\\text_file1.txt', 'dummy_directory\\text_file2.txt', 'dummy_directory\\text_file3.txt', 'dummy_directory\\sub_dir1\\sub_dir1_png_file.png', 'dummy_directory\\sub_dir2\\sub_dir2_png_file.png', 'dummy_directory\\sub_dir3\\sub_dir3_png_file.png']
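
The top-down or bottom-up choice is controlled by the topdown argument of os.walk(). To make the difference visible, here is a small sketch that records the order in which folders are visited. The helper name visited_dirs is my own.

```python
import os


def visited_dirs(root, topdown=True):
    """Return directory paths in the order os.walk() visits them."""
    return [dirpath for dirpath, dirnames, filenames in os.walk(root, topdown=topdown)]


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    print(visited_dirs('dummy_directory'))                 # parent folder first
    print(visited_dirs('dummy_directory', topdown=False))  # parent folder last
```

With topdown=True (the default) a folder is reported before its subfolders, and with topdown=False it is reported after them. Bottom-up order is handy when you want to process or remove the contents of a folder before the folder itself.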

Other Methods

Let me now list some other commonly used techniques to list and print files and directories in Python.

List of Folders in a directory

To list only the folders in a directory, you can combine os.listdir() with os.path.isdir(). The Python code below prints only the subdirectory names (sub_dir1, sub_dir2, sub_dir3).

import os

# Specify the directory path
path = 'dummy_directory'

# Get a list of all files and directories in the directory
files_and_folders = os.listdir(path)

# Loop through the list and print only the directories
for item in files_and_folders:
    if os.path.isdir(os.path.join(path, item)):
        print(item)
sub_dir1
sub_dir2
sub_dir3

You can also use os.scandir() to list the subfolders of a particular directory, as in the Python code below.

import os

# create a list to store the folder names
folders = []

# iterate over the directory entries
for entry in os.scandir('dummy_directory'):
    if entry.is_dir():
        folders.append(entry.name)

# print the list of folders
print(folders)
['sub_dir1', 'sub_dir2', 'sub_dir3']

List files using directory matching pattern

glob.glob() lets you find all the files in a directory or subdirectory that match a specified pattern and returns their paths as a list.

For example, the Python code below uses glob.glob() to list all the files in a directory that have the .csv extension:

import glob

files = glob.glob('dummy_directory/*.csv')
print(files)
['dummy_directory\\csv_file1.csv', 'dummy_directory\\csv_file2.csv', 'dummy_directory\\csv_file3.csv']

Similarly, you can apply the same technique to list any other file type or extension, such as json, xml, png, jpg, html, or zip.
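
glob.glob() takes one pattern per call, so collecting several extensions at once means calling it once for each extension. A minimal sketch of that idea follows. The helper name list_by_extensions is hypothetical.

```python
import glob
import os


def list_by_extensions(directory, extensions):
    """Return sorted paths in directory whose names end with any given extension."""
    matches = []
    for ext in extensions:
        # Build a pattern such as dummy_directory/*.csv for each extension
        pattern = os.path.join(glob.escape(directory), '*' + ext)
        matches.extend(glob.glob(pattern))
    return sorted(matches)


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    print(list_by_extensions('dummy_directory', ['.csv', '.png']))
```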


List files in directory recursively

If you want to get all file paths in a directory and its subdirectories together (recursively), glob.glob() with recursive=True can help.

import glob

# Set the path of the directory you want to list files from
directory = 'dummy_directory'

# Use glob to get all file paths in the directory and its subdirectories recursively
file_paths = glob.glob(directory + '/**/*', recursive=True)

# Print the list of file paths
print(file_paths)
['dummy_directory\\csv_file1.csv', 'dummy_directory\\csv_file2.csv', 'dummy_directory\\csv_file3.csv', 'dummy_directory\\png_file1.png', 'dummy_directory\\png_file2.png', 'dummy_directory\\png_file3.png', 'dummy_directory\\sub_dir1', 'dummy_directory\\sub_dir2', 'dummy_directory\\sub_dir3', 'dummy_directory\\text_file1.txt', 'dummy_directory\\text_file2.txt', 'dummy_directory\\text_file3.txt', 'dummy_directory\\sub_dir1\\sub_dir1_png_file.png', 'dummy_directory\\sub_dir2\\sub_dir2_png_file.png', 'dummy_directory\\sub_dir3\\sub_dir3_png_file.png']
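
The recursive pattern matches folders as well as files, and it can be combined with an extension. The sketch below keeps only regular files and accepts an optional file pattern such as '*.png'. The helper name find_files_recursive is my own.

```python
import glob
import os


def find_files_recursive(directory, pattern='*'):
    """Return sorted paths of regular files under directory that match pattern."""
    # '**' together with recursive=True matches the folder itself and all subfolders
    search = os.path.join(glob.escape(directory), '**', pattern)
    return sorted(
        path for path in glob.glob(search, recursive=True)
        if os.path.isfile(path)
    )


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    print(find_files_recursive('dummy_directory', '*.png'))
```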

Display File Size

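Now let’s see how to list all files in a directory and its subdirectories along with their sizes. The Python code below does that, and the sizes are printed in bytes. Because the pattern also matches folders, the subdirectories appear in the output too (here with a size of 0).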

import glob
import os

# Set the path of the directory you want to list files from
directory = 'dummy_directory'

# Initialize an empty list to store file paths and sizes
file_info = []

# Iterate over all files in the directory and its subdirectories recursively using glob
for file_path in glob.glob(directory + '/**/*', recursive=True):
    # Get the size of the file in bytes
    size = os.path.getsize(file_path)
    # Add the path and size of the file to the list
    file_info.append((file_path, size))

# Print the list of file paths and sizes
for path, size in file_info:
    print(path, size)
dummy_directory\csv_file1.csv 61154
dummy_directory\csv_file2.csv 1538
dummy_directory\csv_file3.csv 558
dummy_directory\png_file1.png 14545
dummy_directory\png_file2.png 162655
dummy_directory\png_file3.png 225201
dummy_directory\sub_dir1 0
dummy_directory\sub_dir2 0
dummy_directory\sub_dir3 0
dummy_directory\text_file1.txt 20
dummy_directory\text_file2.txt 20
dummy_directory\text_file3.txt 20
dummy_directory\sub_dir1\sub_dir1_png_file.png 14545
dummy_directory\sub_dir2\sub_dir2_png_file.png 14545
dummy_directory\sub_dir3\sub_dir3_png_file.png 14545
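
Raw byte counts get hard to read for larger files. Below is a rough sketch that skips the folders and converts each size into a friendlier unit. The helper names format_size and file_sizes are hypothetical, and the conversion assumes 1 KB = 1024 bytes.

```python
import glob
import os


def format_size(num_bytes):
    """Convert a byte count into a short string such as '59.7 KB'."""
    size = float(num_bytes)
    for unit in ('B', 'KB', 'MB', 'GB'):
        if size < 1024:
            return f'{size:.1f} {unit}'
        size /= 1024
    return f'{size:.1f} TB'


def file_sizes(directory):
    """Return (path, size in bytes) for every regular file under directory."""
    pattern = os.path.join(glob.escape(directory), '**', '*')
    return [
        (path, os.path.getsize(path))
        for path in sorted(glob.glob(pattern, recursive=True))
        if os.path.isfile(path)  # leave the folders out
    ]


# Only runs when the example folder exists in the current working directory
if os.path.isdir('dummy_directory'):
    for path, size in file_sizes('dummy_directory'):
        print(path, format_size(size))
```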

End Note

There are various ways to get a list of all files in a directory using Python, but os and glob are two popular and widely used modules thanks to their flexibility and ease of use.

If you want to add anything to this article please let me know in the comment section below.
