How to extract a specific file from an archive downloaded from the internet using only memory




I'm looking for a way to extract a specific file, whose name I know, from an archive containing several files, without writing anything to the hard drive.



I tried combining StringIO and zipfile, but I either get the entire archive back or always hit the same error from zipfile (open requires an argument other than a StringIO object).



Needed behaviour:


archive.zip #containing ex_file1.ext, ex_file2.ext, target.ext
extracted_file #the targeted unzipped file

archive.zip = getFileFromUrl("file_url")
extracted_file = extractFromArchive(archive.zip, target.ext)



What I've tried so far:


import zipfile, requests

data = requests.get("file_url")
zfile = StringIO.StringIO(zipfile.ZipFile(data.content))
needed_file = zfile.open("Needed file name", "r").read()




2 Answers



Python's standard library includes a module, zipfile, made for working with zip archives.



https://docs.python.org/2/library/zipfile.html



You can list the files in an archive:


ZipFile.namelist()



and extract a subset:


ZipFile.extract(member[, path[, pwd]])
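As a rough illustration of these two calls, here is a minimal sketch assuming Python 3. The archive is built in memory only so the example is self-contained; the member names are the placeholders from the question. Note that extract() writes the member to a directory on disk (a temporary one here), so on its own it does not meet the "memory only" requirement.

```python
import io
import tempfile
import zipfile

# Build a small archive in memory so the example is self-contained.
# The member names are placeholders borrowed from the question.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ex_file1.ext", b"first")
    archive.writestr("target.ext", b"wanted content")

with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()  # names of all members, in archive order
    with tempfile.TemporaryDirectory() as out_dir:
        # extract() writes the member below out_dir and returns its path
        extracted_path = archive.extract("target.ext", path=out_dir)
        with open(extracted_path, "rb") as handle:
            extracted_bytes = handle.read()
```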



EDIT:
The question linked below covers in-memory zip handling. TL;DR: ZipFile does work with in-memory file-like objects.



Python in-memory zip library
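To make the in-memory point concrete, here is a small sketch assuming Python 3, where io.BytesIO plays the role that StringIO.StringIO plays on Python 2. The names read_member and make_sample_archive are hypothetical helpers for this example, not part of zipfile.

```python
import io
import zipfile


def read_member(archive_bytes, member_name):
    """Return the bytes of one archive member without touching the disk."""
    # Wrap the raw bytes in a file-like object, then hand that to ZipFile.
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(member_name)


def make_sample_archive():
    """Hypothetical helper: build a tiny zip archive entirely in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("ex_file1.ext", b"first")
        archive.writestr("target.ext", b"wanted content")
    return buffer.getvalue()
```

Calling read_member(make_sample_archive(), "target.ext") gives back only the content of that one member; the other members are never decompressed.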





Does it handle the StringIO type?
– Chris Prolls
Sep 5 '18 at 7:15





I'm not sure what the StringIO type is; it seems to be a library for reading a string buffer. I'm not sure whether zipfile supports in-memory decompression. You'd have to do some research, starting with the documentation.
– NateTheGrate
Sep 5 '18 at 12:07



After a few hours of testing, I finally found out why it wasn't working:



I was buffering the ZipFile object instead of buffering the downloaded data and then opening that buffer as a ZipFile object, which raised a type error.



Here is how to do it:


import StringIO  # Python 2 module; on Python 3, use io.BytesIO instead
import zipfile, requests

data = requests.get(url)  # Getting the archive from the url
zfile = zipfile.ZipFile(StringIO.StringIO(data.content))  # Opening it in an emulated file
filenames = zfile.namelist()  # Listing all files
for name in filenames:
    if name == "Needed file name":  # Verify the file is present
        needed_file = zfile.open(name, "r").read()  # Getting the needed file content
        break
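For Python 3, the same idea might look like the sketch below. It is only one possible variant: it uses urllib.request from the standard library in place of requests, it relies on ZipFile.read raising KeyError for a missing member instead of looping over namelist(), and the names download and extract_from_url are hypothetical. The fetch parameter is an assumption added so the download step can be swapped out, for example in a test.

```python
import io
import zipfile
from urllib.request import urlopen


def download(url):
    """Fetch the raw bytes behind url (stand-in for requests.get(url).content)."""
    with urlopen(url) as response:
        return response.read()


def extract_from_url(url, member_name, fetch=download):
    """Return the bytes of member_name from the zip at url, or None if absent."""
    archive_bytes = fetch(url)
    # Buffer the downloaded data, then open the buffer as a ZipFile.
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        try:
            return archive.read(member_name)
        except KeyError:
            # ZipFile raises KeyError when the archive has no such member.
            return None
```

Nothing is written to disk at any point: the archive lives in the bytes returned by fetch, and the extracted file is returned as a bytes object.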


