
python - read in csv to extract corresponding values

I have a GeoJSON file that was originally a CAD drawing and has since been converted. The individual features are now in the following format:

{
"type": "FeatureCollection",
"name": "entities",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature","id":"1","properties": { "Layer": "Ref02-Boundary-River Lagan", "SubClasses": "AcDbEntity:AcDbPolyline", "EntityHandle": "18D49" }, "geometry": { "type": "LineString", "coordinates": [ [ -5.912099758277697, 54.609878675075841, -5.5 ], [ -5.912101882049217, 54.609877129917869, -5.5 ], [ -5.912101882049217, 54.609877129917869, -5.5 ], [ -5.912385911796179, 54.609732322025998, -5.5 ], [ -5.912460771867132, 54.609694155565506, -5.5 ], [ -5.912509312980717, 54.609669407428946, -5.5 ], [ -5.912558290440344, 54.609644436775582, -5.5 ], [ -5.912593219883747, 54.609626628327483, -5.5 ], [ -5.912842999552883, 54.609451465147721, -5.5 ], [ -5.913303982987227, 54.609185055049799, -5.5 ], [ -5.913357621258934, 54.609154056244115, -5.5 ], [ -5.913475791249335, 54.609085762803012, -5.5 ], [ -5.91331045908047, 54.609008818748194, -5.5 ], [ -5.913567798505827, 54.608822686626787, -5.5 ], [ -5.913930556058254, 54.608559661895065, -5.5 ], [ -5.914450423684049, 54.608183772015252, -5.5 ], [ -5.914890158601113, 54.607859968684259, -5.5 ], [ -5.915538373511547, 54.607376600730291, -5.5 ], [ -5.916032865254835, 54.607009313468311, -5.5 ], [ -5.916697729452425, 54.606515616313651, -5.5 ], [ -5.917178407683183, 54.606158685978556, -5.5 ], [ -5.917766043503596, 54.605716418896513, -5.5 ], [ -5.918086236941109, 54.605476215232756, -5.5 ], [ -5.918640259978573, 54.605054489501633, -5.5 ], [ -5.918999711737682, 54.604774215636091, -5.5 ], [ -5.919357947073775, 54.604466516913604, -5.5 ], [ -5.919119477531201, 54.604364287495677, -5.5 ], [ -5.91911946365491, 54.604364281547021, -5.5 ], [ -5.919431266971372, 54.604107513356603, -5.5 ], [ -5.919746627591454, 54.603850932389285, -5.5 ], [ -5.920139467068778, 54.603531307996782, -5.5 ], [ -5.920248667309735, 54.603576372107611, -5.5 ], [ -5.920347471042307, 54.603606382368625, -5.5 ], [ -5.920402664320942, 54.603563838568519, -5.5 ], [ -5.920401738885977, 54.603563453056566, -5.5 ], [ -5.920344866066254, 54.60357812489795, -5.5 
], [ -5.920313678683745, 54.603565255922042, -5.5 ], [ -5.9203068464555, 54.603559486492991, -5.5 ], [ -5.920303789925165, 54.603552837726028, -5.5 ], [ -5.920305749751396, 54.603546999298253, -5.5 ], [ -5.920307430030534, 54.603541993654865, -5.5 ], [ -5.920331543924449, 54.603513953336815, -5.5 ], [ -5.920385677849353, 54.603467389662065, -5.5 ], [ -5.920482867206563, 54.603388806421506, -5.5 ], [ -5.919817541044252, 54.603159111715968, -5.5 ], [ -5.919360395554573, 54.60300128492522, -5.5 ], [ -5.91891893118377, 54.602848869166237, -5.5 ], [ -5.918698193844773, 54.603161730656176, -5.5 ], [ -5.918539363659911, 54.60337070372406, -5.5 ], [ -5.918499839799537, 54.603421756702978, -5.5 ], [ -5.918464898376052, 54.603466890427526, -5.5 ], [ -5.918225682242552, 54.603751339695293, -5.5 ], [ -5.918143444222999, 54.60384144161879, -5.5 ], [ -5.917978369447476, 54.604018308886261, -5.5 ], [ -5.917778937474949, 54.604228055853667, -5.5 ], [ -5.917517847307555, 54.604488421902253, -5.5 ], [ -5.917379353138514, 54.604618038030523, -5.5 ], [ -5.917365475232636, 54.604630431120313, -5.5 ], [ -5.917040197149845, 54.604908732611968, -5.5 ], [ -5.916978245474892, 54.60495804969085, -5.5 ], [ -5.916687343768886, 54.605183975261362, -5.5 ], [ -5.916486539190938, 54.605326465106707, -5.5 ], [ -5.916278267253799, 54.605462208806927, -5.5 ], [ -5.916206535865816, 54.605505889774236, -5.5 ], [ -5.916089228598353, 54.605430749771216, -5.5 ], [ -5.915062917485617, 54.606128298865329, -5.5 ], [ -5.915760690540097, 54.606488653530612, -5.5 ], [ -5.914815684195395, 54.607200189687191, -5.5 ], [ -5.914583029770388, 54.607363894821489, -5.5 ], [ -5.914122726280319, 54.607716003482558, -5.5 ], [ -5.91387658435291, 54.607889193166862, -5.5 ], [ -5.912160615340791, 54.6091048406185, -5.5 ], [ -5.911469071222665, 54.609592077745347, -5.5 ], [ -5.911469071222665, 54.609592077745347, -5.5 ], [ -5.911469071222665, 54.609592077745347, -5.5 ], [ -5.912099758063087, 54.609878675231968, -5.5 ], [ 
-5.911469071222665, 54.609592077745347, -5.5 ] ] } },
{ "type": "Feature","id":"2","properties": { "Layer": "Ref34-Herdman Channel", "SubClasses": "AcDbEntity:AcDbPolyline", "EntityHandle": "18D4A" }, "geometry": { "type": "LineString", "coordinates": [ [ -5.897789395249773, 54.623858842627953, -7.3 ], [ -5.902905537709284, 54.621963513299349, -7.3 ], [ -5.902980447128448, 54.621954733854786, -7.3 ], [ -5.906383586733336, 54.62068210571401, -7.3 ], [ -5.907394611057439, 54.620271860296093, -7.3 ], [ -5.908271700524285, 54.620438570481646, -7.3 ], [ -5.909237173007087, 54.619740109032655, -7.3 ], [ -5.909244065491171, 54.619670532212957, -7.3 ], [ -5.913579743708052, 54.616533508712052, -7.3 ], [ -5.912536954077766, 54.616201946358402, -7.3 ], [ -5.908610159222599, 54.619008315699411, -7.3 ], [ -5.908177623209888, 54.618817165973439, -7.3 ], [ -5.906296536124115, 54.619602975490572, -7.3 ], [ -5.906417907684791, 54.619696755525432, -7.3 ], [ -5.904505962012193, 54.620518375213322, -7.3 ], [ -5.904379303360807, 54.62042000830553, -7.3 ], [ -5.904178353077652, 54.620627706470792, -7.3 ], [ -5.903970927937185, 54.620842094302013, -7.3 ], [ -5.90145584424333, 54.621753037021421, -7.3 ], [ -5.900146085296913, 54.622228665744018, -7.3 ], [ -5.897978434406959, 54.623016291711536, -7.3 ], [ -5.896884714885545, 54.623410922201039, -7.3 ], [ -5.896807416440097, 54.623435588515115, -7.3 ], [ -5.896729976036048, 54.62345325358843, -7.3 ], [ -5.89661111577345, 54.623472342952041, -7.3 ], [ -5.896522858967185, 54.623480690555247, -7.3 ], [ -5.896428106251355, 54.623484457554135, -7.3 ], [ -5.896303166824568, 54.623481292811128, -7.3 ], [ -5.896197472248414, 54.623471252944306, -7.3 ], [ -5.896137759498506, 54.623462493994786, -7.3 ], [ -5.896052388054945, 54.623445836672367, -7.3 ], [ -5.895978571253911, 54.623427154229439, -7.3 ], [ -5.89590524094214, 54.623404208241219, -7.3 ], [ -5.895800416253622, 54.623362445424725, -7.3 ], [ -5.895757569679375, 54.62334176698927, -7.3 ], [ -5.895647923528012, 54.623292641829437, -7.3 ], [ 
-5.895577692064672, 54.623239431950907, -7.3 ], [ -5.895514910451845, 54.623170119742099, -7.3 ], [ -5.89538163536643, 54.623022980393102, -7.3 ], [ -5.895329415138476, 54.622808933902263, -7.3 ], [ -5.895351007802742, 54.622572974852275, -7.3 ], [ -5.894146142060189, 54.623536453956078, -7.3 ], [ -5.89412889154843, 54.623528837668523, -7.3 ], [ -5.893690605257398, 54.623885226178857, -7.3 ], [ -5.893623854708871, 54.625016224210214, -7.3 ], [ -5.893578643913801, 54.625782216489412, -7.3 ], [ -5.89347404142008, 54.625948034414961, -7.3 ], [ -5.893943848714391, 54.625901816579493, -7.3 ], [ -5.894893061951917, 54.625808430731169, -7.3 ], [ -5.894584457557488, 54.625586465495758, -7.3 ], [ -5.897239997520025, 54.624337565861637, -7.3 ], [ -5.897464132809517, 54.624498603197388, -7.3 ], [ -5.897738236705834, 54.623857397898696, -7.3 ], [ -5.897789395249773, 54.623858842627953, -7.3 ] ] } },
]
}

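The GeoJSON file contains over 45,000 different features, but I only need 60 of them.

I have a CSV containing the reference numbers. For example, in the CSV reference 2 is River Lagan, but in the GeoJSON it is "Layer": "Ref02-Boundary-River Lagan".

Is there a way to write a Python script that imports the CSV, reads out the references, and extracts the corresponding features from the GeoJSON?

So far I have only been able to check that the CSV can be read, using this code: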

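import csv

# Open the CSV in a context manager so the file is closed afterwards
with open("C:/test.csv", newline="") as file:
    csvreader = csv.reader(file)
    header = next(csvreader)  # first row holds the column names
    rows = list(csvreader)    # remaining rows

print(header)
print(rows)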

To read the GeoJSON and filter it down to only the features that have the appropriate reference in the Layer property:

import json

with open('geo.json') as f:
    geo_json = json.load(f)

reference = 2
geo_json['features'] = [
    feature for feature in geo_json['features']
    if int(feature['properties']['Layer'][3:5]) == reference
]

print(geo_json)

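This just loads the GeoJSON as regular JSON using the standard json module. It selects only one reference, 2 in this case, but of course you can easily extend it to include any of the references in your source data.

To filter down to all features that match any of the references in the CSV (some guesswork is needed here, since you did not share what the CSV looks like, but you can work from this):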

import json
import csv

with open('references.csv') as f:
    cr = csv.reader(f)
    header = next(cr)
    ref_column = header.index('reference')  # or whatever the column is called
    references = [int(row[ref_column]) for row in cr]

with open('geo.json') as f:
    geo_json = json.load(f)

geo_json['features'] = [
    feature for feature in geo_json['features']
    if int(feature['properties']['Layer'][3:5]) in references
]

print(geo_json)

This assumes that every row has a value in the reference column and that the CSV has a header, but both of those are trivial to change.
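If some rows might have an empty or non-numeric reference cell, the reading step could be made more tolerant. The following is only a sketch: the function name read_references, the column name 'reference', and the CSV layout are all assumptions, since the actual CSV was not shared.

```python
import csv


def read_references(path, column="reference"):
    """Collect integer references from one CSV column.

    Rows whose cell is missing, blank, or not a whole number are skipped.
    The column name "reference" is an assumption about the CSV layout.
    """
    references = set()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            value = (row.get(column) or "").strip()
            try:
                references.add(int(value))
            except ValueError:
                # Blank or non-numeric cell: ignore this row
                continue
    return references
```

Returning a set also makes the later `in references` test cheaper than scanning a list for each of the 45,000 features.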

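It works by replacing the contents of geo_json['features'] with a filtered list of features: a feature is kept when the part of its Layer property after 'Ref' (that is, starting from the character at index 3 and ending before index 5), converted to an integer, equals one of the values you are looking for.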
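To make the slicing concrete, here is the same expression applied to the Layer value of the first feature in the question:

```python
layer = "Ref02-Boundary-River Lagan"  # Layer value from the sample feature

prefix = layer[3:5]      # the two characters at index 3 and 4
reference = int(prefix)  # int() drops the leading zero

print(repr(prefix), reference)  # prints: '02' 2
```

Because the slice always takes exactly two characters, a hypothetical three-digit layer such as "Ref123-..." would be read as 12.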

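This does assume that all Layer properties are of the form Ref<nn>whatever. If that is not the case, you will need to write a safer check: a Layer value whose characters at index 3 and 4 are not digits will make int() raise a ValueError.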
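One possible shape for such a safer check is a regular expression that only accepts values starting with 'Ref' followed by digits. This is a sketch, not the only way to do it: the helper names layer_reference and filter_features are made up for illustration, and the pattern assumes the reference number always comes directly after 'Ref'.

```python
import re

# Assumed pattern: "Ref" at the start of the Layer value, then one or more digits
REF_PATTERN = re.compile(r"^Ref(\d+)")


def layer_reference(feature):
    """Return the reference number from a feature's Layer property, or None."""
    # "properties" may be missing or null in GeoJSON, so fall back to {}
    layer = (feature.get("properties") or {}).get("Layer")
    if not isinstance(layer, str):
        return None
    match = REF_PATTERN.match(layer)
    return int(match.group(1)) if match else None


def filter_features(geo_json, references):
    """Return a copy of geo_json keeping only features with a wanted reference."""
    wanted = set(references)
    kept = [
        feature for feature in geo_json.get("features", [])
        if layer_reference(feature) in wanted
    ]
    return {**geo_json, "features": kept}
```

Features whose Layer does not match the pattern are simply skipped instead of raising an error, and the original geo_json dictionary is left unmodified.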

