Extract values from JSON nested list and string array with Python
I am trying to pull the coordinates for multiple neighborhoods in Boston, MA from a JSON dataset, but I am stuck trying to get just the first coordinate pair for each neighborhood. Below is a small version of the Roslindale coordinates.
"features": [{
    "type": "Feature",
    "properties": {
        "Name": "Roslindale",
        "Acres": 1605.5682375,
        "SqMiles": 2.51,
    },
    "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
            [
                [
                    [-71.125927174853857, 42.272013107957406],
                    [-71.125927174853857, 42.272013107957406]
                ]
            ],
            [
                [
                    [-71.125830766767592, 42.272212845889705],
                    [-71.125830766767592, 42.272212845889705]
                ]
            ],
            [
                [
                    [-71.125767203228904, 42.272315958536389],
                    [-71.125767203228904, 42.272315958536389]
                ]
            ]
        ]
    }
},
Right now I pull the data I want using:
for data in boston_neighborhoods:
    neighborhood_name = data['properties']['Name']
    neighborhood_id = data['properties']['Neighborhood_ID']
    neighborhood_size = data['properties']['SqMiles']
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon
    neighborhood_lon = neighborhood_latlon
    neighborhoods = neighborhoods.append({'Neighborhood': neighborhood_name,
                                          'Neighborhood_ID': neighborhood_id,
                                          'SqMiles': neighborhood_size,
                                          'Latitude': neighborhood_lat,
                                          'Longitude': neighborhood_lon}, ignore_index=True)
This returns multiple coordinate pairs, but I only want the first pair. Below is an example of the output I am currently getting:
Latitude | Longitude
--------------------------------------------------------
[[[[-71.12592717485386, | [[[[-71.12592717485386,
42.272013107957406], [... | 42.272013107957406], [...
Might be overkill, but JMESPath makes it really easy to query nested JSON structures like that one.
Traversing down the document, you first need to get every element in the array ([*]), then for each element you select items into an object (a Python dictionary). You select the neighborhood name under properties and then Name (properties.Name). You do the same for similarly nested properties.
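As a rough picture of what that projection does, here is a plain-Python sketch, not JMESPath itself. The two-element features list is a made-up sample (the second entry is purely hypothetical), and .get() stands in for JMESPath's behavior of returning null for a missing key.

```python
# Hypothetical sample: only the "properties" part of each feature is filled in.
features = [
    {"properties": {"Name": "Roslindale", "SqMiles": 2.51}},
    {"properties": {"Name": "Example Neighborhood", "SqMiles": 1.0}},
]

# Roughly equivalent to the JMESPath expression
#   [*].{Neighborhood: properties.Name, ...}
# Visit every element and build one dict per element.
# .get() yields None for a missing key, like JMESPath's null.
projected = [
    {
        "Neighborhood": f.get("properties", {}).get("Name"),
        "Neighborhood_ID": f.get("properties", {}).get("Neighborhood_ID"),
        "SqMiles": f.get("properties", {}).get("SqMiles"),
    }
    for f in features
]

print(projected)
```

Neighborhood_ID comes out as None here because the sample has no such key, which is also why it shows as None for the trimmed-down data in the question.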
Coordinates live under geometry.coordinates, which is an array of arrays of arrays of coordinate pairs, so geometry.coordinates[0][0][0] is the first pair.
import jmespath
import pandas as pd

# GeoJSON positions are ordered [longitude, latitude], so index 0 of the
# first pair is the longitude and index 1 is the latitude.
query = """
[*].{
    Neighborhood: properties.Name,
    Neighborhood_ID: properties.Neighborhood_ID,
    SqMiles: properties.SqMiles,
    Latitude: geometry.coordinates[0][0][0][1],
    Longitude: geometry.coordinates[0][0][0][0]
}
"""
compiled = jmespath.compile(query)
# boston_neighborhoods is assumed to be the list stored under "features"
result = compiled.search(boston_neighborhoods)
df = pd.DataFrame.from_records(result)
#   Neighborhood Neighborhood_ID  SqMiles   Latitude  Longitude
# 0   Roslindale            None     2.51  42.272013 -71.125927
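If pulling in JMESPath feels like too much, the same result can probably be had with plain indexing. The sketch below is one possible approach under stated assumptions: first_position is a hypothetical helper (not part of any library), the sample data is a cut-down copy of the question's structure, and Neighborhood_ID is read with .get() because the sample does not include it.

```python
def first_position(coords):
    """Follow the first element of each nested list until reaching
    a flat [longitude, latitude] pair of numbers."""
    while isinstance(coords[0], list):
        coords = coords[0]
    return coords


# Cut-down sample shaped like the data in the question.
boston_neighborhoods = [
    {
        "properties": {"Name": "Roslindale", "SqMiles": 2.51},
        "geometry": {
            "type": "MultiPolygon",
            "coordinates": [
                [[[-71.125927174853857, 42.272013107957406],
                  [-71.125927174853857, 42.272013107957406]]],
                [[[-71.125830766767592, 42.272212845889705],
                  [-71.125830766767592, 42.272212845889705]]],
            ],
        },
    }
]

rows = []
for data in boston_neighborhoods:
    # GeoJSON order is [longitude, latitude]; [:2] ignores an optional altitude.
    lon, lat = first_position(data["geometry"]["coordinates"])[:2]
    rows.append({
        "Neighborhood": data["properties"]["Name"],
        "Neighborhood_ID": data["properties"].get("Neighborhood_ID"),
        "SqMiles": data["properties"]["SqMiles"],
        "Latitude": lat,
        "Longitude": lon,
    })

print(rows)
```

The rows list can then be handed to pd.DataFrame.from_records(rows), as in the answer above. Collecting rows in a list first also avoids DataFrame.append, which was removed in pandas 2.0.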