Read a zipped Stata file from URL into pandas
Is it possible to read a .zip file that contains only a .dta file directly from a URL?
For example, https://www.federalreserve.gov/econres/files/scfp2016s.zip contains a single file, rscfp2016.dta, but pandas.read_stata doesn't work on it:
import pandas as pd
pd.read_stata('https://www.federalreserve.gov/econres/files/scfp2016s.zip')
ValueError: Version of given Stata file is not 104, 105, 108, 111 (Stata 7SE), 113 (Stata 8/9), 114 (Stata 10/11), 115 (Stata 12), 117 (Stata 13), or 118 (Stata 14)
read_csv supports reading zipped files if the zip contains only the CSV, via the compression argument, which defaults to inferring the compression. read_stata lacks this option.
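To illustrate the read_csv behaviour described above, here is a minimal sketch that builds a one-file zip in memory and reads it back. The member name example.csv and its contents are made up for the illustration. Because a buffer has no file name to infer from, the sketch passes compression="zip" explicitly rather than relying on the default.

```python
import io
import zipfile

import pandas as pd

# Build a zip archive in memory holding a single CSV file.
# "example.csv" and its contents are hypothetical, used only for illustration.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("example.csv", "a,b\n1,2\n3,4\n")
zip_buffer.seek(0)

# A buffer has no extension to infer from, so name the format explicitly.
csv_df = pd.read_csv(zip_buffer, compression="zip")
```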
I could download and unzip the file and then read it, but this is messy.
!wget https://www.federalreserve.gov/econres/files/scfp2016s.zip
!unzip scfp2016s.zip
df = pd.read_stata('rscfp2016.dta')
Any better way?
read_stata accepts file-like objects, so you can do this:
import pandas as pd
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
url = 'https://www.federalreserve.gov/econres/files/scfp2016s.zip'
with urlopen(url) as request:
    data = BytesIO(request.read())
with ZipFile(data) as archive:
    with archive.open(archive.namelist()[0]) as stata:
        df = pd.read_stata(stata)
You can try it with requests:
import io
import zipfile
import pandas as pd
import requests
response = requests.get('https://www.federalreserve.gov/econres/files/scfp2016s.zip')
a = zipfile.ZipFile(io.BytesIO(response.content))
b = a.read(a.namelist()[0])
pd.read_stata(io.BytesIO(b))
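Both answers take the first entry of namelist(), which is fine for this archive but silently picks the wrong file if a zip ever ships a readme alongside the data. Below is a sketch of a small helper that selects the .dta member explicitly. The function name read_stata_from_zip_bytes is hypothetical, and the demo builds a tiny archive in memory (assuming to_stata can write to a BytesIO buffer, as recent pandas versions allow) so it runs without network access.

```python
import io
import zipfile

import pandas as pd


def read_stata_from_zip_bytes(zip_bytes, member=None):
    """Read one .dta member of an in-memory zip archive into a DataFrame.

    Hypothetical helper, not part of pandas. If `member` is None, the archive
    must contain exactly one file whose name ends in .dta.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        if member is None:
            dta_names = [n for n in archive.namelist() if n.lower().endswith(".dta")]
            if len(dta_names) != 1:
                raise ValueError(f"expected exactly one .dta file, found {dta_names}")
            member = dta_names[0]
        # Copy the member into a BytesIO so pandas gets a plain seekable buffer.
        return pd.read_stata(io.BytesIO(archive.read(member)))


# Demo: write a small frame to Stata format, zip it in memory, read it back.
original = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})
dta_buffer = io.BytesIO()
original.to_stata(dta_buffer, write_index=False)

zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("readme.txt", "not data")
    archive.writestr("example.dta", dta_buffer.getvalue())

df = read_stata_from_zip_bytes(zip_buffer.getvalue())
```

For the real file you would pass response.content (requests) or the bytes returned by urlopen(url).read() as zip_bytes.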