
python - read in csv to extract corresponding values

I have a GeoJSON file that was converted from a CAD drawing. The different features are now in the following format:

{
"type": "FeatureCollection",
"name": "entities",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature","id":"1","properties": { "Layer": "Ref02-Boundary-River Lagan", "SubClasses": "AcDbEntity:AcDbPolyline", "EntityHandle": "18D49" }, "geometry": { "type": "LineString", "coordinates": [ [ -5.912099758277697, 54.609878675075841, -5.5 ], [ -5.912101882049217, 54.609877129917869, -5.5 ], [ -5.912101882049217, 54.609877129917869, -5.5 ], [ -5.912385911796179, 54.609732322025998, -5.5 ], [ -5.912460771867132, 54.609694155565506, -5.5 ], [ -5.912509312980717, 54.609669407428946, -5.5 ], [ -5.912558290440344, 54.609644436775582, -5.5 ], [ -5.912593219883747, 54.609626628327483, -5.5 ], [ -5.912842999552883, 54.609451465147721, -5.5 ], [ -5.913303982987227, 54.609185055049799, -5.5 ], [ -5.913357621258934, 54.609154056244115, -5.5 ], [ -5.913475791249335, 54.609085762803012, -5.5 ], [ -5.91331045908047, 54.609008818748194, -5.5 ], [ -5.913567798505827, 54.608822686626787, -5.5 ], [ -5.913930556058254, 54.608559661895065, -5.5 ], [ -5.914450423684049, 54.608183772015252, -5.5 ], [ -5.914890158601113, 54.607859968684259, -5.5 ], [ -5.915538373511547, 54.607376600730291, -5.5 ], [ -5.916032865254835, 54.607009313468311, -5.5 ], [ -5.916697729452425, 54.606515616313651, -5.5 ], [ -5.917178407683183, 54.606158685978556, -5.5 ], [ -5.917766043503596, 54.605716418896513, -5.5 ], [ -5.918086236941109, 54.605476215232756, -5.5 ], [ -5.918640259978573, 54.605054489501633, -5.5 ], [ -5.918999711737682, 54.604774215636091, -5.5 ], [ -5.919357947073775, 54.604466516913604, -5.5 ], [ -5.919119477531201, 54.604364287495677, -5.5 ], [ -5.91911946365491, 54.604364281547021, -5.5 ], [ -5.919431266971372, 54.604107513356603, -5.5 ], [ -5.919746627591454, 54.603850932389285, -5.5 ], [ -5.920139467068778, 54.603531307996782, -5.5 ], [ -5.920248667309735, 54.603576372107611, -5.5 ], [ -5.920347471042307, 54.603606382368625, -5.5 ], [ -5.920402664320942, 54.603563838568519, -5.5 ], [ -5.920401738885977, 54.603563453056566, -5.5 ], [ -5.920344866066254, 54.60357812489795, -5.5 
], [ -5.920313678683745, 54.603565255922042, -5.5 ], [ -5.9203068464555, 54.603559486492991, -5.5 ], [ -5.920303789925165, 54.603552837726028, -5.5 ], [ -5.920305749751396, 54.603546999298253, -5.5 ], [ -5.920307430030534, 54.603541993654865, -5.5 ], [ -5.920331543924449, 54.603513953336815, -5.5 ], [ -5.920385677849353, 54.603467389662065, -5.5 ], [ -5.920482867206563, 54.603388806421506, -5.5 ], [ -5.919817541044252, 54.603159111715968, -5.5 ], [ -5.919360395554573, 54.60300128492522, -5.5 ], [ -5.91891893118377, 54.602848869166237, -5.5 ], [ -5.918698193844773, 54.603161730656176, -5.5 ], [ -5.918539363659911, 54.60337070372406, -5.5 ], [ -5.918499839799537, 54.603421756702978, -5.5 ], [ -5.918464898376052, 54.603466890427526, -5.5 ], [ -5.918225682242552, 54.603751339695293, -5.5 ], [ -5.918143444222999, 54.60384144161879, -5.5 ], [ -5.917978369447476, 54.604018308886261, -5.5 ], [ -5.917778937474949, 54.604228055853667, -5.5 ], [ -5.917517847307555, 54.604488421902253, -5.5 ], [ -5.917379353138514, 54.604618038030523, -5.5 ], [ -5.917365475232636, 54.604630431120313, -5.5 ], [ -5.917040197149845, 54.604908732611968, -5.5 ], [ -5.916978245474892, 54.60495804969085, -5.5 ], [ -5.916687343768886, 54.605183975261362, -5.5 ], [ -5.916486539190938, 54.605326465106707, -5.5 ], [ -5.916278267253799, 54.605462208806927, -5.5 ], [ -5.916206535865816, 54.605505889774236, -5.5 ], [ -5.916089228598353, 54.605430749771216, -5.5 ], [ -5.915062917485617, 54.606128298865329, -5.5 ], [ -5.915760690540097, 54.606488653530612, -5.5 ], [ -5.914815684195395, 54.607200189687191, -5.5 ], [ -5.914583029770388, 54.607363894821489, -5.5 ], [ -5.914122726280319, 54.607716003482558, -5.5 ], [ -5.91387658435291, 54.607889193166862, -5.5 ], [ -5.912160615340791, 54.6091048406185, -5.5 ], [ -5.911469071222665, 54.609592077745347, -5.5 ], [ -5.911469071222665, 54.609592077745347, -5.5 ], [ -5.911469071222665, 54.609592077745347, -5.5 ], [ -5.912099758063087, 54.609878675231968, -5.5 ], [ 
-5.911469071222665, 54.609592077745347, -5.5 ] ] } },
{ "type": "Feature","id":"2","properties": { "Layer": "Ref34-Herdman Channel", "SubClasses": "AcDbEntity:AcDbPolyline", "EntityHandle": "18D4A" }, "geometry": { "type": "LineString", "coordinates": [ [ -5.897789395249773, 54.623858842627953, -7.3 ], [ -5.902905537709284, 54.621963513299349, -7.3 ], [ -5.902980447128448, 54.621954733854786, -7.3 ], [ -5.906383586733336, 54.62068210571401, -7.3 ], [ -5.907394611057439, 54.620271860296093, -7.3 ], [ -5.908271700524285, 54.620438570481646, -7.3 ], [ -5.909237173007087, 54.619740109032655, -7.3 ], [ -5.909244065491171, 54.619670532212957, -7.3 ], [ -5.913579743708052, 54.616533508712052, -7.3 ], [ -5.912536954077766, 54.616201946358402, -7.3 ], [ -5.908610159222599, 54.619008315699411, -7.3 ], [ -5.908177623209888, 54.618817165973439, -7.3 ], [ -5.906296536124115, 54.619602975490572, -7.3 ], [ -5.906417907684791, 54.619696755525432, -7.3 ], [ -5.904505962012193, 54.620518375213322, -7.3 ], [ -5.904379303360807, 54.62042000830553, -7.3 ], [ -5.904178353077652, 54.620627706470792, -7.3 ], [ -5.903970927937185, 54.620842094302013, -7.3 ], [ -5.90145584424333, 54.621753037021421, -7.3 ], [ -5.900146085296913, 54.622228665744018, -7.3 ], [ -5.897978434406959, 54.623016291711536, -7.3 ], [ -5.896884714885545, 54.623410922201039, -7.3 ], [ -5.896807416440097, 54.623435588515115, -7.3 ], [ -5.896729976036048, 54.62345325358843, -7.3 ], [ -5.89661111577345, 54.623472342952041, -7.3 ], [ -5.896522858967185, 54.623480690555247, -7.3 ], [ -5.896428106251355, 54.623484457554135, -7.3 ], [ -5.896303166824568, 54.623481292811128, -7.3 ], [ -5.896197472248414, 54.623471252944306, -7.3 ], [ -5.896137759498506, 54.623462493994786, -7.3 ], [ -5.896052388054945, 54.623445836672367, -7.3 ], [ -5.895978571253911, 54.623427154229439, -7.3 ], [ -5.89590524094214, 54.623404208241219, -7.3 ], [ -5.895800416253622, 54.623362445424725, -7.3 ], [ -5.895757569679375, 54.62334176698927, -7.3 ], [ -5.895647923528012, 54.623292641829437, -7.3 ], [ 
-5.895577692064672, 54.623239431950907, -7.3 ], [ -5.895514910451845, 54.623170119742099, -7.3 ], [ -5.89538163536643, 54.623022980393102, -7.3 ], [ -5.895329415138476, 54.622808933902263, -7.3 ], [ -5.895351007802742, 54.622572974852275, -7.3 ], [ -5.894146142060189, 54.623536453956078, -7.3 ], [ -5.89412889154843, 54.623528837668523, -7.3 ], [ -5.893690605257398, 54.623885226178857, -7.3 ], [ -5.893623854708871, 54.625016224210214, -7.3 ], [ -5.893578643913801, 54.625782216489412, -7.3 ], [ -5.89347404142008, 54.625948034414961, -7.3 ], [ -5.893943848714391, 54.625901816579493, -7.3 ], [ -5.894893061951917, 54.625808430731169, -7.3 ], [ -5.894584457557488, 54.625586465495758, -7.3 ], [ -5.897239997520025, 54.624337565861637, -7.3 ], [ -5.897464132809517, 54.624498603197388, -7.3 ], [ -5.897738236705834, 54.623857397898696, -7.3 ], [ -5.897789395249773, 54.623858842627953, -7.3 ] ] } },
]
}

The GeoJSON file contains over 45,000 different features, but I only need 60 of them.

I have a CSV which contains reference numbers. For example, in the CSV reference 2 is River Lagan, but in the GeoJSON it is "Layer": "Ref02-Boundary-River Lagan".

Is there any way of writing a Python script to import the CSV, read out the references and extract the corresponding features from the GeoJSON?

So far I have only been able to check that the CSV is readable, with this code:

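import csv

# Open the file in a with-block so it is closed automatically
with open("C:/test.csv", newline="") as file:
    csvreader = csv.reader(file)
    header = next(csvreader)  # first row holds the column names
    rows = list(csvreader)    # all remaining rows

print(header)
print(rows)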

To read the GeoJSON and filter it down to only the features that have the appropriate reference in the Layer property:

import json

with open('geo.json') as f:
    geo_json = json.load(f)

reference = 2
# Keep only the features whose Layer has this number right after 'Ref'
geo_json['features'] = [
    feature for feature in geo_json['features']
    if int(feature['properties']['Layer'][3:5]) == reference
]

print(geo_json)

This just loads the GeoJSON as regular JSON, using the standard json module. It only selects a single reference, 2 in this case, but you could easily extend this to include any reference from your source data.
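Because the data is handled as plain JSON, the filtered result can also be written back out with the same module. A minimal sketch, assuming a small in-memory sample with the same structure as in the question and a hypothetical output name filtered.json (written to a temporary directory here so the example is self-contained):

```python
import json
import os
import tempfile

# Stand-in for the loaded GeoJSON (assumption: same structure as in the question)
geo_json = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"Layer": "Ref02-Boundary-River Lagan"}, "geometry": None},
        {"type": "Feature", "properties": {"Layer": "Ref34-Herdman Channel"}, "geometry": None},
    ],
}

reference = 2
geo_json["features"] = [
    feature for feature in geo_json["features"]
    if int(feature["properties"]["Layer"][3:5]) == reference
]

# 'filtered.json' is a hypothetical file name, not something from the question
out_path = os.path.join(tempfile.mkdtemp(), "filtered.json")
with open(out_path, "w") as f:
    json.dump(geo_json, f)

# Read it back to confirm that only the matching feature was saved
with open(out_path) as f:
    saved = json.load(f)
print(len(saved["features"]))
```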

To filter down to all the features matching any reference in the CSV (some guesswork is required here, because you haven't shared what your CSV looks like, but you can work from this):

import json
import csv

with open('references.csv', newline='') as f:
    cr = csv.reader(f)
    header = next(cr)
    ref_column = header.index('reference')  # or whatever the column is called
    # A set makes the membership test in the filter below fast
    references = {int(row[ref_column]) for row in cr}

with open('geo.json') as f:
    geo_json = json.load(f)

geo_json['features'] = [
    feature for feature in geo_json['features']
    if int(feature['properties']['Layer'][3:5]) in references
]

print(geo_json)

This assumes each row has a value in the reference column and that the CSV has a header, but both are trivial to change.
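If some rows may have an empty or non-numeric reference cell, the reading step can skip them instead of failing. A rough sketch, assuming the column is called reference (a guess, as above) and using an in-memory string in place of the real file:

```python
import csv
import io

# Stand-in for the CSV file; the column names 'reference' and 'name' are assumptions
csv_text = (
    "reference,name\n"
    "2,River Lagan\n"
    ",Unnamed\n"
    "34,Herdman Channel\n"
    "abc,Bad row\n"
)

references = set()
reader = csv.DictReader(io.StringIO(csv_text))
for row in reader:
    value = (row.get("reference") or "").strip()
    try:
        references.add(int(value))
    except ValueError:
        # Blank or non-numeric cell: skip this row
        continue

print(sorted(references))  # [2, 34]
```

For a file without a header row, a plain csv.reader with a fixed column index would do the same job.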

It works by replacing the contents of geo_json['features'] with a filtered list of features: those for which the part of the Layer property after 'Ref' (i.e. the characters from index 3 up to, but not including, index 5), converted to an integer, equals one of the values you're looking for.
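As a small worked example of that slice, using the two Layer values from the question:

```python
layer = "Ref02-Boundary-River Lagan"
digits = layer[3:5]      # characters at index 3 and 4
print(digits)            # 02
print(int(digits))       # 2

other = "Ref34-Herdman Channel"
print(int(other[3:5]))   # 34
```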

This does assume that all Layer properties are of the form Ref<nn>whatever. If that's not the case, you'll need to write a safer check.
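One possible shape for such a safer check is a regular expression that only accepts layers starting with Ref followed by digits, and returns None for everything else. This is a sketch, not the only way to do it; the helper name layer_reference and the sample features are made up for illustration:

```python
import re

# 'Ref' at the start of the string, followed by one or more digits
REF_PATTERN = re.compile(r"^Ref(\d+)")

def layer_reference(feature):
    """Return the reference number from the Layer property, or None if absent."""
    layer = (feature.get("properties") or {}).get("Layer")
    if not isinstance(layer, str):
        return None
    match = REF_PATTERN.match(layer)
    return int(match.group(1)) if match else None

# Hypothetical sample features, including ones that do not follow the pattern
features = [
    {"properties": {"Layer": "Ref02-Boundary-River Lagan"}},
    {"properties": {"Layer": "Ref34-Herdman Channel"}},
    {"properties": {"Layer": "0"}},
    {"properties": {"Layer": "Ref07-Something else"}},
    {"properties": None},
]

references = {2, 34}
kept = [f for f in features if layer_reference(f) in references]
print([f["properties"]["Layer"] for f in kept])
```

Note one difference from the slice: \d+ reads all the digits after Ref, so a layer such as Ref123-... would give 123, whereas [3:5] would give 12. Use \d{2} in the pattern if exactly two digits are expected.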


 