Eddy Hintze/S3 Concat

Created Fri, 05 Mar 2021 15:25:46 +0000 Modified Fri, 05 Mar 2021 20:32:53 +0000


A command-line tool and Python library that combines S3 files into fewer, larger files.

A typical use case: a microservice (such as AWS Lambda functions) generates many text files (CSV, JSON Lines, etc.) and saves them to an S3 bucket. Reading all of that data back is inefficient when it is spread across many small files. The s3-concat tool combines (concatenates) these files quickly, leaving a few larger files that can be read much faster.
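To make the idea concrete, here is a minimal sketch in plain Python of how many small files could be grouped into fewer, larger outputs. It is not s3-concat's actual implementation or API. The function names `plan_groups` and `concat_bodies`, the size threshold, and the example keys are all hypothetical, and no S3 calls are made.

```python
from typing import Iterable, List, Tuple


def plan_groups(files: Iterable[Tuple[str, int]], min_group_size: int) -> List[List[str]]:
    """Group (key, size_in_bytes) pairs so each group reaches min_group_size bytes.

    Hypothetical helper for illustration; not part of s3-concat.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for key, size in files:
        current.append(key)
        current_size += size
        if current_size >= min_group_size:
            # Threshold reached: close this group and start a new one.
            groups.append(current)
            current = []
            current_size = 0
    if current:
        # Leftover files that never reached the threshold form a final, smaller group.
        groups.append(current)
    return groups


def concat_bodies(bodies: Iterable[bytes]) -> bytes:
    """Join line-oriented file bodies, adding a trailing newline where one is
    missing so the last record of one file does not merge with the next."""
    parts = []
    for body in bodies:
        if body and not body.endswith(b"\n"):
            body += b"\n"
        parts.append(body)
    return b"".join(parts)


# Example listing of small objects as (key, size in bytes); values are made up.
listing = [("a.json", 40), ("b.json", 70), ("c.json", 30), ("d.json", 10)]
plan = plan_groups(listing, min_group_size=100)
combined = concat_bodies([b'{"x": 1}', b'{"x": 2}\n'])
```

In a real setup the listing would come from the bucket and each group would be written back as a single object. The newline handling matters for line-based formats such as CSV and JSON Lines, where each record must stay on its own line.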