
Python Load CSV File from API and iterate over it in memory

I'm using the requests library in Python to hit an API endpoint that is supposed to return a CSV file. The API's documentation gives an example of how to get the file.

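import zipfile
import requests

# requestDownloadUrl and headers are defined earlier in the API's example.
requestDownload = requests.request("GET", requestDownloadUrl, headers=headers, stream=True)
# Write the raw bytes to disk in binary mode, then unpack the archive.
with open("RequestFile.zip", "wb") as f:
    for chunk in requestDownload.iter_content(chunk_size=1024):
        f.write(chunk)
zipfile.ZipFile("RequestFile.zip").extractall("MyDownload")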

I don't want to write out the file to a zip or anything else. I just want to iterate over each row. I've tried the following but it's returning binary instead of text.

from contextlib import closing
import csv
import requests

with closing(
    requests.get(
        'api_URL/csvfile',
        stream=True,
    )
) as r:
    reader = csv.reader(
        (line.replace('\0', '') for line in r.iter_lines()),
        delimiter=',',
        quotechar='"'
    )

    for row in reader:
        # Handle each row here...
        print row

The result of printing out row is a bunch of the following:

['\x13\xa4\xa3\xedr\xae\xe6\x0b\x9b\x08\x9c\xabX\xda\xa3d%\\+\xcd\xd5\xfat\x13\xf3']
['51W\x91o\xe2\xef(\x19\x18\xa9\xe2}\xe2\xbca\xd4]\x93\x1d@8:\x8d\xab\xa0\x08\xe6\xd4\xc7\xc5\xcdb\xaf\x8d\xf6\xa2\x8d~s\xb5\xea?\x04\x1c\xfb\xc5\xed9\x8c']

What do I need to do to see the actual text instead?

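The bytes you are seeing are most likely compressed data: the API's own example saves the response as a zip archive, so the CSV has to be extracted before it can be parsed. You can use the io module to wrap the response body in a file-like object and then use that to create an in-memory zipfile. In this example, I didn't use streaming, because zipfile needs a seekable file-like object and so the entire archive has to be in memory before anything can be extracted from it. At the point where the zipfile is created, there may be several copies of the data in memory, which could be a problem for large files. You could potentially build a file-like object that wraps resp.iter_content, but that was a bit much for this example.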

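from contextlib import closing
import csv
import io
import zipfile

import requests

# Download the whole archive; resp.content holds the raw zip bytes.
with closing(requests.get("http://localhost:8000/test.zip")) as resp:
    incore_zip = zipfile.ZipFile(io.BytesIO(resp.content))
try:
    # Open the CSV member as bytes and decode it to text for csv.reader.
    with incore_zip.open('test.csv') as fp:
        reader = csv.reader(io.TextIOWrapper(fp, encoding="utf-8", newline=""))
        for row in reader:
            print(row)
finally:
    # Drop the reference so the in-memory archive can be freed.
    del incore_zip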
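
As a rough variation on the same idea, the sketch below collects streamed chunks into a single io.BytesIO buffer instead of going through resp.content, then reads the CSV member from that buffer. It is not a true streaming reader: the whole archive still has to fit in memory, and whether it actually uses less memory than the version above has not been measured. The function name iter_zipped_csv_rows and its parameters are made up for illustration, and picking the first archive member when none is named is an assumption about how the archive is laid out.

```python
import csv
import io
import zipfile


def iter_zipped_csv_rows(chunks, member=None, encoding="utf-8"):
    """Yield CSV rows from a zip archive supplied as an iterable of byte chunks.

    Hypothetical helper: `chunks` can be any iterable of bytes, for example
    resp.iter_content(chunk_size=8192) on a requests response opened with
    stream=True.
    """
    # zipfile needs a seekable object, so gather the chunks in one buffer.
    buffer = io.BytesIO()
    for chunk in chunks:
        buffer.write(chunk)
    buffer.seek(0)

    with zipfile.ZipFile(buffer) as archive:
        # Assumption: if no member is named, the first entry is the CSV.
        name = member if member is not None else archive.namelist()[0]
        with archive.open(name) as fp:
            # Decode bytes to text; newline="" lets csv handle line endings.
            text = io.TextIOWrapper(fp, encoding=encoding, newline="")
            for row in csv.reader(text):
                yield row


if __name__ == "__main__":
    # Build a small archive in memory so the demo needs no network access.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as zf:
        zf.writestr("test.csv", "name,qty\r\nwidget,3\r\n")
    data = demo.getvalue()

    # Feed the archive in small pieces, much as iter_content would.
    pieces = (data[i:i + 16] for i in range(0, len(data), 16))
    for row in iter_zipped_csv_rows(pieces, member="test.csv"):
        print(row)
```

With requests, the call would look something like iter_zipped_csv_rows(resp.iter_content(chunk_size=8192), member="test.csv") on a response requested with stream=True; the chunk size here is arbitrary.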
