Unzip S3 files using AWS Lambda
I developed a static web application on my local machine and attempted to deploy it using Amazon S3. I couldn’t find an option to upload the entire application folder directly, so I zipped the folder and uploaded it to the S3 bucket. However, I was unable to unzip the file inside the S3 bucket.
Proposal
After exploring S3 for a while, I realized that there was no direct option to unzip files in S3. Creating each folder and uploading each small file manually was troublesome since my application was quite large. So I decided to automate the process in a way that could be reused for other use cases, and came up with the following solution.
Steps Involved
Here are the steps I carried out:
- Upload the ZIP file: Upload a ZIP file (in my case, a zipped application folder) to a source S3 bucket.
- Trigger the Lambda function: The file upload (an S3 event) triggers a Lambda function.
- Unarchive the file: The logic in the Lambda function extracts all the files and folders inside the ZIP file.
- Upload into the destination bucket: As each file is extracted, the Lambda function uploads it to a separate destination S3 bucket.
Lambda code
import boto3
from io import BytesIO
import zipfile


def lambda_handler(event, context):
    s3_resource = boto3.resource('s3')
    source_bucket = 'upload-zip-folder'
    target_bucket = 'upload-extracted-folder'
    my_bucket = s3_resource.Bucket(source_bucket)

    # Scan the source bucket and process every ZIP archive found in it.
    for file in my_bucket.objects.all():
        if str(file.key).endswith('.zip'):
            zip_obj = s3_resource.Object(bucket_name=source_bucket, key=file.key)
            # Read the whole archive into memory so zipfile can seek within it.
            buffer = BytesIO(zip_obj.get()["Body"].read())
            with zipfile.ZipFile(buffer) as z:
                for filename in z.namelist():
                    try:
                        # Upload each archive member under its path inside the ZIP.
                        with z.open(filename) as member:
                            s3_resource.meta.client.upload_fileobj(
                                member,
                                Bucket=target_bucket,
                                Key=filename
                            )
                    except Exception as e:
                        # Log the failure and continue with the next member.
                        print(e)
        else:
            print(file.key + ' is not a zip file.')
The source file (ZIP file) is stored as bytes in an in-memory buffer and extracted using the Python zipfile package. Finally, after extraction, the files are uploaded to the new S3 bucket using the S3 client.
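To see the in-memory extraction step on its own, here is a minimal sketch that runs locally without any AWS access. The helper names `extract_members` and `build_sample_zip` are made up for this illustration and are not part of the Lambda code above. Unlike the Lambda code, this sketch skips directory entries, which is just one possible choice.

```python
import zipfile
from io import BytesIO


def extract_members(zip_bytes):
    """Yield (name, data) for each file stored in a ZIP archive held in memory."""
    # Wrapping the bytes in BytesIO gives zipfile the seekable file object it needs.
    with zipfile.ZipFile(BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Directory entries carry no data, so this sketch skips them.
            if info.is_dir():
                continue
            yield info.filename, archive.read(info)


def build_sample_zip():
    """Build a small ZIP archive in memory to stand in for the S3 object."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("index.html", "<h1>Hello</h1>")
        archive.writestr("css/site.css", "body { margin: 0; }")
    return buffer.getvalue()


if __name__ == "__main__":
    for name, data in extract_members(build_sample_zip()):
        print(name, len(data))
```

In the Lambda function, the bytes would come from the S3 object body instead of `build_sample_zip()`, and each member would be uploaded rather than printed.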
GitHub repo here
Automating the Process
To automate the process, simply add a trigger to the Lambda function on the S3 PUT event in the source bucket.
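Once the trigger is in place, the handler receives an event describing the uploaded object, so it does not have to scan the whole bucket. The sketch below shows one way to pull the bucket and key out of that event. The function name `zip_objects_from_event` and the sample event are assumptions made for illustration. The record layout follows the standard S3 notification format.

```python
from urllib.parse import unquote_plus


def zip_objects_from_event(event):
    """Return (bucket, key) pairs for every .zip object named in an S3 event."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications (a space becomes '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".zip"):
            targets.append((bucket, key))
    return targets


# Hypothetical event with one ZIP upload and one unrelated file.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "upload-zip-folder"},
                "object": {"key": "my+site.zip"}}},
        {"s3": {"bucket": {"name": "upload-zip-folder"},
                "object": {"key": "notes.txt"}}},
    ]
}

if __name__ == "__main__":
    print(zip_objects_from_event(sample_event))
```

Processing only the objects named in the event also avoids re-extracting older archives on every upload.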
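By default, a Lambda function times out after 3 seconds. For large ZIP files, you may need to increase this timeout (the maximum is 15 minutes). Because the whole archive is read into memory, you may need to increase the function's memory allocation as well.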
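The timeout and memory can be changed in the Lambda console, or from a script. The following is a rough sketch of the scripted route using boto3's `update_function_configuration` call. The function name `unzip-s3-files` and the values 120 seconds and 512 MB are placeholders chosen for illustration, not settings from my deployment.

```python
def build_config_update(function_name, timeout_seconds, memory_mb):
    """Validate the settings and return keyword arguments for the boto3 call."""
    # Lambda accepts timeouts from 1 to 900 seconds (15 minutes).
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("timeout must be between 1 and 900 seconds")
    # Lambda accepts memory sizes from 128 MB to 10240 MB.
    if not 128 <= memory_mb <= 10240:
        raise ValueError("memory must be between 128 and 10240 MB")
    return {
        "FunctionName": function_name,
        "Timeout": timeout_seconds,
        "MemorySize": memory_mb,
    }


def apply_config_update(params):
    """Send the update to AWS; requires boto3 and valid credentials."""
    import boto3  # imported here so the helper above runs without AWS access
    return boto3.client("lambda").update_function_configuration(**params)


if __name__ == "__main__":
    # 'unzip-s3-files' is a hypothetical function name.
    params = build_config_update("unzip-s3-files", 120, 512)
    print(params)
    # apply_config_update(params)  # uncomment to apply against a real function
```

Suitable values depend on the archive size, so it is worth testing with the largest ZIP file you expect to upload.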