
Uploading large objects into Amazon S3 using Multipart upload

  • Writer: Abhijith Nair
  • Jan 9, 2023
  • 2 min read

Updated: Jan 10, 2023


Amazon S3 is an object storage service provided by AWS, and depending on the size of the object you are uploading, there are a few limits you should know:

  1. Using the Amazon S3 console, you can upload a single object of up to 160 GB in size.

  2. To upload a file larger than 160 GB, you can use the AWS CLI, an AWS SDK, or the S3 REST API. However, a single PUT operation can upload an object of at most 5 GB.

  3. The maximum size of an individual object in Amazon S3 is 5 TB.

But if a single PUT operation can upload an object of at most 5 GB, how do you upload a larger file? This is where Amazon S3's Multipart Upload feature comes in. Using multipart upload, you upload a single data object into Amazon S3 as a set of parts. Each part is a contiguous portion of the object's data and can be uploaded independently, in any order. Once all the parts are uploaded, Amazon S3 assembles them into a single object. This feature also lets you resume the upload if the connection breaks partway through. AWS recommends using multipart upload for objects of 100 MB or larger.



What to expect?


In this blog, I will explain how to upload a video file into Amazon S3 using the S3 Multipart upload feature.


Prerequisites


For this walkthrough, I have a video file of size 232.3 MB which currently resides on my local machine. I also have an Amazon S3 bucket named my-s3-multipart-upload, created in the us-east-1 region with all the default settings. I will be using AWS CloudShell to perform the following tasks.


Procedure


1. Log in to AWS CloudShell and upload the video file.


2. Use the split command to split the original file into contiguous 100 MB parts. As you can see, there are three parts in total: file-aa, file-ab, and file-ac.
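The split step can be sketched as below. The file name video.mp4 is a stand-in for your actual file, and a small dummy file is created first so the commands run anywhere; against a real 232.3 MB file you would use -b 100M as in the blog.

```shell
# Stand-in input file (replace with your real video file in practice).
dd if=/dev/zero of=video.mp4 bs=1M count=5 2>/dev/null

# -b sets the part size; 100M in the blog, scaled down to 2M for this demo.
# The "file-" prefix produces parts named file-aa, file-ab, file-ac, ...
split -b 2M video.mp4 file-

# Verify the parts were created.
ls file-*
```

With a 5 MB input and 2 MB parts, split emits two full-size parts and one smaller final part, mirroring the three parts in the walkthrough.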


3. Install openssl to generate MD5 checksum values for the files.


4. Generate MD5 checksum values for the files. Copy and save them to your clipboard.
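Checksum generation might look like the following; demo.bin is a stand-in created here so the example is self-contained, and in practice you would run the same command against the original file (and each part, if you want per-part checksums).

```shell
# Stand-in file; run openssl md5 against your real file instead.
printf 'hello world' > demo.bin

# Prints a line like: MD5(demo.bin)= <32 hex digits>
openssl md5 demo.bin
```

If openssl is unavailable, md5sum (where installed) produces an equivalent digest.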


5. Start the multipart upload process using the following command. Copy and save the generated UploadId to your clipboard.

aws s3api create-multipart-upload --bucket <bucket_name> --key <original_file_name> --metadata md5=<original_file_checksum_value>


6. Upload each part using the following command, incrementing --part-number for each one. The upload-part command returns an ETag value for each part.

aws s3api upload-part --bucket <bucket_name> --key <original_file_name> --part-number 1 --body <file_name_1> --upload-id <upload_id_from_step5>


7. List all the file parts using the following command.

aws s3api list-parts --bucket <bucket_name> --key <original_file_name> --upload-id <upload_id_from_step5>


8. Copy and save the PartNumber and ETag values for all the parts into a JSON file. You can use the nano editor to create it.
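The JSON file expected by complete-multipart-upload has the shape below. The ETag values here are placeholders for illustration; use the actual values returned by list-parts (note that S3 returns ETags wrapped in double quotes, which is why they appear escaped inside the JSON strings).

```shell
# Placeholder ETags for illustration only; substitute your list-parts output.
cat > parts.json <<'EOF'
{
  "Parts": [
    { "PartNumber": 1, "ETag": "\"example-etag-part-1\"" },
    { "PartNumber": 2, "ETag": "\"example-etag-part-2\"" },
    { "PartNumber": 3, "ETag": "\"example-etag-part-3\"" }
  ]
}
EOF
cat parts.json
```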


9. Complete the upload process using the following command.

aws s3api complete-multipart-upload --multipart-upload file://<JSON_file> --bucket <bucket_name> --key <original_file_name> --upload-id <upload_id_from_step5>


10. The original file can now be fetched from the Amazon S3 bucket.


I hope this blog helped explain how to upload a large file into Amazon S3 using the S3 Multipart Upload feature. Please do check out my other blogs on this portfolio.


Until then, Happy Blogging!

