Python 3 In-Memory Zip File

In Python, BytesIO is the way to store binary data in memory. Most examples of in-memory zip files store string data, and the most common snippet you’ll find online for the zipfile module is ZipFile.writestr(file_name, "Text Data"). But what if you want to store the binary data of a PDF or Excel spreadsheet that’s also in memory? ZipFile.write() doesn’t help here: it takes the path of a file that already exists on disk, and our in-memory file was never written to one. That matters because for a web request or a test case, you shouldn’t need to store any files on disk. Despite its name, writestr() accepts bytes as well as str, so it handles binary data too.
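As a quick check of that behaviour, here is a minimal standard-library sketch that writes one str and one bytes payload into an in-memory archive and reads them back. The member names notes.txt and blob.bin are made up for illustration.

```python
from io import BytesIO
import zipfile

buffer = BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr("notes.txt", "Text Data")         # str is encoded to UTF-8 first
    zf.writestr("blob.bin", b"\x00\x01\x02\xff")  # bytes are written as given

# Reopen the archive from its raw bytes to confirm both members survived
with zipfile.ZipFile(BytesIO(buffer.getvalue())) as zf:
    names = zf.namelist()
    text = zf.read("notes.txt")
    blob = zf.read("blob.bin")
```

Note that read() always returns bytes, so the text member comes back as b"Text Data" and has to be decoded if a str is needed.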

After a bit of experimentation, I finally was able to get in-memory binary files written to an in-memory zip file:

from io import BytesIO
import zipfile


def generate_zip(files):
    mem_zip = BytesIO()

    with zipfile.ZipFile(mem_zip, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        # each entry is a (file name inside the archive, bytes) tuple
        for name, data in files:
            zf.writestr(name, data)

    return mem_zip.getvalue()
 
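def generate_pdf():
    # reportlab is a third-party package (pip install reportlab)
    from reportlab.pdfgen.canvas import Canvas
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.units import cm

    buffer = BytesIO()
    canvas = Canvas(buffer, pagesize=A4)
    # The origin is the bottom-left corner, so measure down from the page height
    textobject = canvas.beginText(1.5 * cm, A4[1] - 2.5 * cm)
    textobject.textLine("Hello, world!")
    canvas.drawText(textobject)  # without this the text never reaches the page
    canvas.save()
    pdf = buffer.getvalue()
    buffer.close()
    return pdf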
	
def main():
    file_names = ["test1.pdf", "test2.pdf"]
    files = []

    for f in file_names:
        pdf = generate_pdf()  # your file generation method goes here
        files.append((f, pdf))

    # bytes of the finished archive, ready to return from a web request or check in a test
    full_zip_in_memory = generate_zip(files)


if __name__ == "__main__":
    main()
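
To confirm that the resulting bytes really are a valid archive, you can open them again through BytesIO without touching the disk. The following is a rough sketch rather than part of the original solution: read_zip is a hypothetical helper, and the payloads are placeholder bytes standing in for the output of generate_pdf() so the example runs without reportlab.

```python
from io import BytesIO
import zipfile


def read_zip(zip_bytes):
    """Return {archive name: contents} for a zip held in memory as bytes."""
    with zipfile.ZipFile(BytesIO(zip_bytes)) as zf:
        # testzip() returns the name of the first corrupt member, or None
        bad = zf.testzip()
        if bad is not None:
            raise ValueError(f"corrupt member: {bad}")
        return {name: zf.read(name) for name in zf.namelist()}


# Placeholder payloads (assumption); real code would use generated PDF bytes
files = [
    ("test1.pdf", b"%PDF-1.4 placeholder 1"),
    ("test2.pdf", b"%PDF-1.4 placeholder 2"),
]

# Same approach as generate_zip() above, inlined to keep the sketch standalone
mem_zip = BytesIO()
with zipfile.ZipFile(mem_zip, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, data in files:
        zf.writestr(name, data)

contents = read_zip(mem_zip.getvalue())
```

In a test case, comparing contents against the original (name, bytes) pairs is a cheap way to assert that nothing was lost or renamed on the way into the archive.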