How can I create a pandas DataFrame from a CSV file that is compressed in a tar.gz archive? I found the code below, which does this for a zip file. What should I change to make it work with tar.gz, without downloading the tar.gz and CSV files to disk?
import io, zipfile
import pandas, requests
r = requests.get('http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip')
# wrap the downloaded bytes in a file-like object so ZipFile can read them
z = zipfile.ZipFile(io.BytesIO(r.content))
df = pandas.read_csv(z.open('sample_CSV.csv'))
My file is https://ghtstorage.blob.core.windows.net/downloads/mysql-2016-06-16.tar.gz
You can extract the tar.gz archive like this:
import tarfile
# fname is the path to a local .tar.gz file
with tarfile.open(fname, "r:gz") as tar:
    tar.extractall()  # extracts every member into the current directory
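If the goal is to avoid writing anything to disk, tarfile can also read from an in-memory buffer, much like the zip example in the question. The following is only a sketch under some assumptions: `read_csv_from_targz` is a hypothetical helper name, the archive bytes are already in memory (for example `r.content` from `requests.get`), and you know the name of the CSV member inside the archive.

```python
import io
import tarfile

import pandas as pd


def read_csv_from_targz(data, member_name, **read_csv_kwargs):
    """Return a DataFrame for one CSV member of an in-memory .tar.gz archive.

    data        -- the raw bytes of the .tar.gz archive
    member_name -- path of the CSV file inside the archive (assumed known)
    """
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        # extractfile returns a binary file-like object, or None when the
        # member is not a regular file (for example a directory)
        member = tar.extractfile(member_name)
        if member is None:
            raise ValueError(f"{member_name!r} is not a regular file")
        # read_csv consumes the member fully before the archive is closed
        return pd.read_csv(member, **read_csv_kwargs)


# Possible usage (the member name below is a placeholder, not a real path):
# import requests
# r = requests.get("https://ghtstorage.blob.core.windows.net/downloads/mysql-2016-06-16.tar.gz")
# df = read_csv_from_targz(r.content, "path/inside/archive.csv")
```

Note that this keeps the whole archive in memory. If you do not know the member names, `tar.getnames()` lists them.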
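Try simply passing your .tar.gz file as the file name to read_csv. It will automatically decompress and open it, since inferring the compression from the file extension is the default behavior. Note that tar archives are supported from pandas 1.5.0 onward, and the archive must contain exactly one file.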
Make sure the extension is in lower case.
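As a rough illustration of the above, assuming pandas 1.5.0 or newer and an archive that holds exactly one CSV file (`read_single_csv_targz` is a hypothetical helper name, used only for this sketch):

```python
import pandas as pd


def read_single_csv_targz(path, **read_csv_kwargs):
    """Read a .tar.gz that contains exactly one CSV file.

    Assumes pandas >= 1.5.0, where tar archives are recognised from the
    file extension. pandas raises ValueError if the archive holds zero
    or several files.
    """
    # compression="infer" is already the default; it is spelled out here
    # only to make the behaviour explicit
    return pd.read_csv(path, compression="infer", **read_csv_kwargs)


# Possible usage with a local file (placeholder name):
# df = read_single_csv_targz("data.tar.gz")
```

For an archive with several members, this approach does not apply and the members have to be opened individually with tarfile.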