In this tutorial we will explore how to unzip files using Python.
Table of Contents
- Introduction
- Create a sample ZIP file
- Extract all files from a ZIP file using Python
- Extract individual files from a ZIP file using Python
- Extract files based on condition from a ZIP file using Python
- Conclusion
Introduction
ZIP files are something we see very often. A ZIP file is an archive: a single file that bundles one or more other files, usually in compressed form.
It is useful for efficient data transfers as well as for storing larger files at a reduced size.
Working with many ZIP files by hand can be tedious, but Python lets us process multiple ZIP files and extract data from them quickly.
To follow this tutorial we will need only one Python library: zipfile, which is part of the Python standard library, so there is nothing to install.
Create a sample ZIP file
To follow along we will need a ZIP file to work with.
If you have one already, that’s great. If you don’t, then feel free to download a sample ZIP file (my_files.zip) that I created and uploaded to Google Drive.
This ZIP file contains three files:
- customers.csv
- products.csv
- code_snippet.png
Once downloaded, place it in the same directory as the Python code file.
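If downloading the sample is not an option, a stand-in archive with the same three file names can be built with zipfile itself. This is only a sketch: the file contents below are placeholders I made up (the PNG entry is not a real image), and create_sample_zip is a hypothetical helper name, not part of the library.

```python
from zipfile import ZipFile, ZIP_DEFLATED

# Placeholder contents (assumptions): the real sample files are not reproduced here.
SAMPLE_FILES = {
    'code_snippet.png': b'placeholder bytes, not a real image',
    'customers.csv': 'customer_id,name\n1,Alice\n2,Bob\n',
    'products.csv': 'product_id,name\n1,Widget\n2,Gadget\n',
}


def create_sample_zip(zip_path='my_files.zip'):
    """Write the placeholder files into a new ZIP archive at zip_path."""
    # 'w' mode creates the archive, replacing any existing file at that path.
    with ZipFile(zip_path, 'w', compression=ZIP_DEFLATED) as zip_object:
        for file_name, content in SAMPLE_FILES.items():
            # writestr() stores str or bytes data under the given archive name.
            zip_object.writestr(file_name, content)
    return zip_path


create_sample_zip()
```

Running this once in the same directory as your Python code file produces a my_files.zip that is good enough for the rest of the examples.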
Extract all files from a ZIP file using Python
One of the most common tasks we perform manually with ZIP files is extracting all the files from them.
Using the zipfile library in Python, we can do this in a few lines of code:
from zipfile import ZipFile
with ZipFile('my_files.zip', 'r') as zip_object:
    zip_object.extractall()
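All we need to do is create an instance of the ZipFile class, passing it the location of the ZIP file and the “read” mode ('r') as parameters, and then extract all the files using the .extractall() method. Called without arguments, .extractall() writes the files to the current working directory, and the with statement closes the archive automatically when the block ends.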
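The same extraction can be written without the with statement, in which case we should close the archive ourselves:
from zipfile import ZipFile
zip_object = ZipFile('my_files.zip', 'r')
zip_object.extractall()
zip_object.close()
In both cases, the three files will be extracted from the ZIP file into the current working directory.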
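Often we want the extracted files in a separate folder rather than next to the script. The extractall() method accepts a path argument for this. The sketch below is self-contained: it builds a small demo archive in a temporary directory (placeholder contents, not the real sample), and extract_all_to is a hypothetical helper name.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


def extract_all_to(zip_path, target_dir):
    """Extract every member of zip_path into target_dir; return the member names."""
    with ZipFile(zip_path, 'r') as zip_object:
        # Missing directories in target_dir are created during extraction.
        zip_object.extractall(path=target_dir)
        return zip_object.namelist()


with tempfile.TemporaryDirectory() as tmp:
    # Build a small stand-in archive so the example runs on its own.
    archive = Path(tmp) / 'demo.zip'
    with ZipFile(archive, 'w') as zip_object:
        zip_object.writestr('customers.csv', 'id,name\n1,Alice\n')
        zip_object.writestr('products.csv', 'id,product\n1,Widget\n')

    target = Path(tmp) / 'extracted'
    names = extract_all_to(archive, target)
    extracted = sorted(p.name for p in target.iterdir())
    print(extracted)  # ['customers.csv', 'products.csv']
```

With the tutorial's archive, the equivalent call would be extract_all_to('my_files.zip', 'extracted').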
Extract individual files from a ZIP file using Python
Another task we might have is to extract specific individual files from a ZIP file using Python.
First, let’s find the list of files that are archived in the ZIP file:
from zipfile import ZipFile
with ZipFile('my_files.zip', 'r') as zip_object:
    print(zip_object.namelist())
And you should get:
['code_snippet.png', 'customers.csv', 'products.csv']
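If the names alone are not enough, the infolist() method returns a ZipInfo object per archived file, which also carries the original and compressed sizes. A minimal sketch, using a throwaway demo archive with placeholder contents (describe_archive is a hypothetical helper name):

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile, ZIP_DEFLATED


def describe_archive(zip_path):
    """Return (name, original size, compressed size) for each archive member."""
    with ZipFile(zip_path, 'r') as zip_object:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zip_object.infolist()
        ]


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / 'demo.zip'
    with ZipFile(archive, 'w', compression=ZIP_DEFLATED) as zip_object:
        # 100 identical rows compress well, so the two sizes differ clearly.
        zip_object.writestr('customers.csv', 'id,name\n' + '1,Alice\n' * 100)

    details = describe_archive(archive)
    for name, size, compressed in details:
        print(f'{name}: {size} bytes, {compressed} bytes compressed')
```

This can be handy for checking how large the files are before extracting them.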
Let’s say we wanted to extract only the customers.csv and products.csv files.
Since we know the names of the files, we can use them as identifiers when extracting files from the ZIP file using Python:
from zipfile import ZipFile
with ZipFile('my_files.zip', 'r') as zip_object:
    zip_object.extract('customers.csv')
    zip_object.extract('products.csv')
And you should see the two .csv files extracted to the current working directory, which is the folder where your Python code is located if you run the script from there.
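Two details of extract() are worth knowing: it accepts an optional path argument for the destination folder, and it raises KeyError when the requested name is not in the archive. The sketch below handles both, again with a throwaway demo archive and placeholder contents; extract_selected and the missing orders.csv name are made up for illustration.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


def extract_selected(zip_path, wanted_names, target_dir):
    """Extract the wanted members that exist; return (extracted paths, missing names)."""
    extracted, missing = [], []
    with ZipFile(zip_path, 'r') as zip_object:
        available = set(zip_object.namelist())
        for name in wanted_names:
            if name in available:
                # extract() returns the path of the file it created.
                extracted.append(zip_object.extract(name, path=target_dir))
            else:
                # Calling extract() on an unknown name would raise KeyError.
                missing.append(name)
    return extracted, missing


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / 'demo.zip'
    with ZipFile(archive, 'w') as zip_object:
        zip_object.writestr('customers.csv', 'id,name\n1,Alice\n')
        zip_object.writestr('products.csv', 'id,product\n1,Widget\n')

    paths, missing = extract_selected(
        archive, ['customers.csv', 'orders.csv'], Path(tmp) / 'out'
    )
    extracted_names = [Path(p).name for p in paths]
    print(extracted_names, missing)  # ['customers.csv'] ['orders.csv']
```

Checking names against namelist() first is one design choice; wrapping extract() in a try/except KeyError block would work as well.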
Extract files based on condition from a ZIP file using Python
In the example above we extracted two .csv files from the ZIP file using Python.
Extracting files one by one is practical only when there are just a few of them.
Let’s say we now have a large ZIP file that contains hundreds of files and we want to extract only the CSV files.
We can select these files based on a condition on their names.
The names of CSV files end with “.csv”, so we can use that as a filter condition when extracting files from a ZIP file using Python:
from zipfile import ZipFile
with ZipFile('my_files.zip', 'r') as zip_object:
    file_names = zip_object.namelist()
    for file_name in file_names:
        if file_name.endswith('.csv'):
            zip_object.extract(file_name)
And you should see the two CSV files extracted from the ZIP file, which is the same result as in the previous section.
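The same filter can also be handed to extractall() through its members argument, which avoids the explicit loop. The sketch below additionally makes the comparison case-insensitive, so a file named REPORT.CSV would match too; that is my own assumption about what is wanted, and extract_by_suffix is a hypothetical helper name. The demo archive uses placeholder contents.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


def extract_by_suffix(zip_path, suffix, target_dir):
    """Extract members whose names end with suffix, ignoring letter case."""
    with ZipFile(zip_path, 'r') as zip_object:
        matching = [
            name for name in zip_object.namelist()
            if name.lower().endswith(suffix.lower())
        ]
        # members restricts extractall() to the listed archive names.
        zip_object.extractall(path=target_dir, members=matching)
    return matching


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / 'demo.zip'
    with ZipFile(archive, 'w') as zip_object:
        zip_object.writestr('customers.csv', 'id,name\n1,Alice\n')
        zip_object.writestr('REPORT.CSV', 'id,total\n1,10\n')
        zip_object.writestr('code_snippet.png', b'placeholder bytes')

    target = Path(tmp) / 'csv_only'
    matched = extract_by_suffix(archive, '.csv', target)
    extracted = sorted(p.name for p in target.iterdir())
    print(matched)  # ['customers.csv', 'REPORT.CSV']
```

Any other condition on the name (a prefix, a folder inside the archive, a date pattern) can replace the endswith() check in the list comprehension.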
Conclusion
In this article we explored how to extract files from a ZIP file using Python.
Feel free to leave a comment below if you have any questions or suggestions for edits, and check out more of my Python Programming tutorials.