
Read Shapefile from Google Cloud Storage using Dataflow + Beam + Python

How can I read a Shapefile from Google Cloud Storage using Dataflow + Beam + Python?
I've found only beam.io.ReadFromText, but the Python shapefile reader requires either file-like objects, as in shp.Reader(shp=shp_file, dbf=dbf_file), or a path to a shapefile.
I'm using Python 2.7.

Open each part of the shapefile with GcsIO, which returns file-like objects, and pass those to the reader:

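from apache_beam.io.gcp import gcsio
import shapefile as shp  # pyshp
from osgeo import osr    # GDAL Python bindings

# filenamePRJ, filenameSHP and filenameDBF are the gs:// paths of the
# .prj, .shp and .dbf parts of the shapefile.

# Projection definition (WKT text)
prj_file = gcsio.GcsIO().open(
    filenamePRJ,
    mode='r',
    read_buffer_size=1677721600,  # 1600 MiB
    mime_type='application/octet-stream'
)

# Geometry
shp_file = gcsio.GcsIO().open(
    filenameSHP,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)

# Attribute table
dbf_file = gcsio.GcsIO().open(
    filenameDBF,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)

# pyshp accepts file-like objects in place of a path
sf = shp.Reader(shp=shp_file, dbf=dbf_file)

# Build a transformation from the shapefile's own projection to WGS84
euref = osr.SpatialReference()
euref.ImportFromWkt(str(prj_file.read()))
wgs84 = osr.SpatialReference()
wgs84.ImportFromEPSG(4326)
transformation = osr.CoordinateTransformation(euref, wgs84)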
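
The snippet above leaves filenameSHP, filenameDBF and filenamePRJ undefined. One way to fill them in is to derive all three from the path of the .shp file. This is only a sketch: the helper name shapefile_component_paths and the bucket path are made up for illustration, and it assumes the .dbf and .prj files sit next to the .shp file under the same base name with lowercase extensions.

```python
import posixpath


def shapefile_component_paths(shp_path):
    """Return the .shp, .dbf and .prj paths that share shp_path's base name.

    Hypothetical helper, not part of Beam or pyshp.
    """
    base, ext = posixpath.splitext(shp_path)
    if ext.lower() != '.shp':
        raise ValueError('expected a path ending in .shp, got %r' % shp_path)
    return {
        'shp': shp_path,
        'dbf': base + '.dbf',  # assumes a lowercase sidecar extension
        'prj': base + '.prj',
    }


# Example with a made-up bucket and object name
paths = shapefile_component_paths('gs://my-bucket/data/roads.shp')
filenameSHP = paths['shp']
filenameDBF = paths['dbf']
filenamePRJ = paths['prj']
```

Only string handling happens here, so it behaves the same on Python 2.7 and Python 3 and does not touch Cloud Storage; whether the derived objects actually exist is still up to the bucket.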

Here is a Python package that contains a custom Beam I/O connector for reading Shapefiles: https://github.com/GoogleCloudPlatform/dataflow-geobeam
