Read SHP file from SFTP using pysftp
I am trying to use pysftp's getfo() to read a shapefile (without downloading it). However, the output I get does not seem workable, and I'm not sure whether it is possible to do this with a shapefile.
Ideally I would like to read in the file and convert it to a GeoPandas GeoDataFrame.
import pysftp
import io

with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword, cnopts=cnopts) as sftp:
    print("Connection established ... ")
    # check files in directory
    directory_structure = sftp.listdir('sites')
    # print data
    for attr in directory_structure:
        print(attr)
    flo = io.BytesIO()
    sites = sftp.getfo('sites/Sites.shp', flo)
    value = flo.getvalue()
From here I can't decode the value and am unsure of how to proceed.
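The bytes in value are binary shapefile data rather than encoded text, which is likely why decoding them fails. As a rough illustration, the sketch below parses the fixed 100-byte header of a .shp file from a file-like object using only the standard library. The helper name read_shp_header and the synthetic header bytes are made up for this example, and the header layout is taken from the ESRI shapefile specification; in real use, flo would hold the bytes fetched by getfo().

```python
import io
import struct


def read_shp_header(fileobj):
    """Parse the fixed 100-byte header of a .shp file from a binary file-like object."""
    header = fileobj.read(100)
    if len(header) < 100:
        raise ValueError("not enough data for a shapefile header")
    # File code and file length are big-endian; the remaining fields are little-endian.
    file_code, = struct.unpack(">i", header[0:4])
    file_length_words, = struct.unpack(">i", header[24:28])
    version, shape_type = struct.unpack("<ii", header[28:36])
    bbox = struct.unpack("<4d", header[36:68])  # xmin, ymin, xmax, ymax
    if file_code != 9994:
        raise ValueError("unexpected file code: %d" % file_code)
    return {
        "version": version,
        "shape_type": shape_type,
        "file_length_bytes": file_length_words * 2,  # length is stored in 16-bit words
        "bbox": bbox,
    }


# Synthetic header standing in for the bytes that sftp.getfo() would write into flo.
fake_header = (
    struct.pack(">i", 9994) + bytes(20) + struct.pack(">i", 50)
    + struct.pack("<ii", 1000, 1)
    + struct.pack("<4d", 0.0, 0.0, 10.0, 5.0)
    + bytes(32)  # Z and M ranges, not used in this sketch
)
flo = io.BytesIO(fake_header)
info = read_shp_header(flo)
print(info)
```

This only confirms that the buffer looks like a shapefile; reading the actual geometries and attributes is better left to a library such as pyshp or GeoPandas.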
Something like this should do:
import geopandas

flo.seek(0)
# read_file takes the path or file-like object as its first positional argument
df = geopandas.read_file(flo)
Though using Connection.getfo unnecessarily keeps the whole raw file in memory. More efficient would be:
with sftp.open('sites/Sites.shp', bufsize=32768) as f:
    df = geopandas.read_file(f)
(For the purpose of bufsize=32768, see "Reading file opened with Python Paramiko SFTPClient.open method is slow".)
Btw, note that the code downloads the file anyway. You cannot parse a remote file's contents without actually downloading those contents. The code just avoids storing the downloaded contents in a (temporary) local file.
Using Martin's answer, I was able to get to the outcome I needed. However, the data had to be zipped on the SFTP server, so this required a change to the data source.
The output here is a DataFrame, but this is easily adapted to be a GeoDataFrame.
import zipfile

import pandas as pd
import pysftp
import shapefile  # provided by the pyshp package

with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword, cnopts=cnopts) as sftp:
    print("Connection successfully established ... ")
    zipshape = zipfile.ZipFile(sftp.open('sites/sites_test.zip', bufsize=32768))
    r = shapefile.Reader(
        shp=zipshape.open('sites/Sites.shp'),
        shx=zipshape.open('sites/Sites.SHX'),
        dbf=zipshape.open('sites/Sites.DBF')
    )
    # check we have actually got the file
    print(r.bbox)
    print(r.numRecords)
    # get field names (the first entry is the deletion flag, so skip it)
    fields = [x[0] for x in r.fields][1:]
    records = r.records()
    # get coords and tidy up
    shps = [s.points for s in r.shapes()]

# write the records into a dataframe
gdf = pd.DataFrame(columns=fields, data=records)
# add the coordinate data to a column called "coords"
gdf = gdf.assign(coords=shps)
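One thing to watch with the zipped approach: member names inside a ZIP archive are case-sensitive, so a call like zipshape.open('sites/Sites.SHX') only works if the archive stores exactly that name. A minimal sketch of a case-insensitive lookup follows, using only the standard library. The helper find_shapefile_members is hypothetical, and the in-memory archive with placeholder contents merely stands in for the file opened over SFTP.

```python
import io
import zipfile


def find_shapefile_members(zf, stem):
    """Map 'shp', 'shx' and 'dbf' to the archive member names for `stem`, ignoring case."""
    wanted = {"shp", "shx", "dbf"}
    found = {}
    for name in zf.namelist():
        base, _, ext = name.rpartition(".")
        if base.lower() == stem.lower() and ext.lower() in wanted:
            found[ext.lower()] = name
    missing = wanted - found.keys()
    if missing:
        raise KeyError("missing shapefile parts: %s" % sorted(missing))
    return found


# Build a small in-memory archive standing in for sftp.open('sites/sites_test.zip').
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("sites/Sites.shp", "sites/Sites.SHX", "sites/Sites.DBF"):
        zf.writestr(name, b"placeholder")
buf.seek(0)

with zipfile.ZipFile(buf) as zf:
    members = find_shapefile_members(zf, "sites/sites")
print(members)
```

With the real archive, the resulting names could then be passed on, e.g. shapefile.Reader(shp=zipshape.open(members["shp"]), shx=zipshape.open(members["shx"]), dbf=zipshape.open(members["dbf"])).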