How to Securely Provide a Zip Download of a S3 File Bundle


Back in 2012, we added a “Download Multiple Files” option to Teamwork Projects. However, this option depended on browser support and dumped all the files to the browser’s “downloads” folder without keeping the categories’ directory structure.

For years, we have meant to find the time to add a better ZIP download option that would download all the files in one bundle while still maintaining the defined categories’ directory structure.

Here, I outline how we built an elegant file zipper in just one night thanks to the power of Go. Even if you don’t currently use Go (aka “Golang,” a language from Google that we are massive fans of), the mechanism we present here works with your server-side language of choice: you just run the file zipper as a microservice.

Impatient? Go grab github.com/Teamwork/s3zipper

A Streaming Solution

The standard way to provide a backup of S3 files would be to download all the files to a temp folder, zip them, and then serve up the zipped file. However, that method is slow to start for the user, takes a lot of server file space, and requires cleanup. That’s just slow, inelegant and messy.

What if we could stream the files to the user while zipping them on the fly? With this approach, called “piping,” we wouldn’t have to store files or perform cleanup, and the user wouldn’t be kept waiting for the download to start.

Well, that’s exactly what we did, in just a few hours thanks to the power of Go.

Just Show me The Bleedin’ Code

Enough of me ranting. If you’re reading this, and you are like me, you want working code to try. But first, let me briefly outline how the download process and security work:

  • Our main platform takes an API request for a zip file with a number of fileIds passed. E.g. download/zip?fileIds=83748,379473,93894
  • The platform then authenticates the user as normal and pulls the details about the files from our database.
  • It then creates a unique download reference string and puts an array with descriptions of the files into Redis with the reference string as a key, e.g. “zip:gdi63783hdhA73”. The file descriptions include the file name, folder path, and S3 file path. The key is set to time out after five minutes.
  • We simply redirect the user to the s3 file zipper passing along the reference string. E.g., zipper.teamwork.com?ref=gdi63783hdhA73

The s3 file zipper itself doesn’t have to perform security. If a key exists, it is happy to proceed. It just receives a request with a reference string, asks Redis for the corresponding files, and starts pulling them from S3 while simultaneously zipping blocks and sending them to the client. It’s just a dumb beautiful machine. Here’s the code:

View S3Zipper on GitHub

It’s extremely fast, uses little memory, and can handle thousands of simultaneous requests. It’s also secure (auth is done elsewhere and keys time out) and very simple.

After years of wanting to get this feature done, it was just one long night’s work thanks to the power of Go and some of its fantastic open source and internal libraries.

You’ll see some voodoo around line 211 – this was added to provide UTF character support for our many international customers.

Testing

If you want to quickly test this:

  1. Install Go, if you don’t have it yet
  2. Clone the s3Zipper repo
  3. Provide the config file (based on sample.config)
  4. Run “go run s3zipper.go” and browse to http://localhost:8000/?ref=test.
    (If you are new to Go, you need to run “go get” first to fetch the libraries)

Your files should download as a Zip file instantly.

Moving to Production

Now, you just need to get this running on a server, have your server-side language put the file definitions into Redis, and then redirect the user to the microservice.

Setting up the S3Zipper Microservice

  • Fire up a new EC2 Ubuntu server (I went with [S3Type and Ubuntu image]).
  • Install Go. Do not install Go via apt-get: at the time of writing, that package is an outdated version of Go. Install Go from source – tutorial
  • Create a new user to run the service under.
  • Clone our repo onto the server
  • Create your config file
  • Test the script with go run s3zipper.go (run “go get” first to get libraries)
  • Run as a service using the upstart script below

Upstart Config

Copy this upstart script to /etc/init to run this as a service:
s3zipper.conf

Replace USERX with your new user, set the GOPATH and GOROOT correctly and fix up full/path/to. If this is new to you, see Upstart – Getting Started

Serving up the Zip Download

You’ll need to make a server-side call in your language of choice that will:

  1. Authenticate the user (as normal)
  2. Generate a unique random reference code
  3. Put the details about the files to download into Redis with the key “zip:[ref]” (timeout 5 mins). Note that files must be in the format:
    [{"S3Path":"path", "FileName":"sample.txt", "Folder":"folder"}...]
  4. Redirect the user to the microservice

Sample (JavaScript-style pseudocode):

    function downloadZip(fileIds)
        Test_logged_in()
        files = Lookup_file_details_or_panic()

        // Encode files and save to redis
        json = JSON.encode(files)
        ref = Generate_Random_Ref()
        redis.set( key="zip:"+ref, value=json, expiry=300 )

        // Redirect the user to the S3Zipper
        RedirectUser( "https://zipperURL/?ref=" + ref )

I Hope that Works for You

If you have any questions, just let us know in the comments below. I hope somebody somewhere finds this code useful and please if you do, just say hello (or come work with us). Enjoy!