
Parsing a string into a list of dicts

I have a string that looks like this:

POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))

I can easily strip POLYGON out of the string to focus on the numbers, but I'm wondering what the easiest/best way would be to parse this string into a list of dicts.

The first parenthesis (right after POLYGON) indicates that multiple elements can be provided, separated by a comma (,).

Each pair of numbers is supposed to be x and y.

I'd like to parse this string to end up with the following data structure (using Python 2.7):

list [ //list of polygons
  list [ //polygon n°1
    dict { //polygon n°1's first point
      'x': 148210.445767647, //first number
      'y': 172418.761192525 //second number
    },
    dict { //polygon n°1's second point
      'x': 148183.930888667,
      'y': 172366.054787545
    },
    ... // rest of polygon n°1's points
  ], //end of polygon n°1
  list [ // polygon n°2
    dict { // polygon n°2's first point
      'x': 148221.97916844,
      'y': 172344.568316375
    },
    ... // rest of polygon n°2's points
  ] // end of polygon n°2
] // end of list of polygons

A polygon can have a virtually unlimited number of points.
Each point's two numbers are separated by a blank.

Do you know a way to do this with a loop or recursively?

PS: I'm a Python beginner (only a few months under my belt), so don't hesitate to explain in detail. Thank you!

The data structure defining your Polygon object looks very similar to a Python tuple declaration. One option, albeit a bit hacky, would be to use Python's AST parser.

You would have to strip off the POLYGON part, and this solution may not work for other, more complex declarations.

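import ast
import re

your_str = "POLYGON (...)"  # the full string from the question
# It may be better to use a regex to split off the class part
# if you have different types.
body = your_str.replace("POLYGON ", "")
# literal_eval needs valid Python syntax, so rewrite each "x y" pair as "(x, y)".
body = re.sub(r"([-\d.]+) ([-\d.]+)", r"(\1, \2)", body)
data = ast.literal_eval(body)
first_ring, second_ring = data
# Each ring is now a tuple of (x, y) pairs that you can turn into dictionaries.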
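To make that last comment concrete, here is a minimal sketch of the conversion step. It assumes the string has already been parsed into nested tuples of (x, y) pairs; `rings` below is a shortened, hand-written stand-in for that result, and `to_dicts` is a hypothetical helper name, not something from the answer.

```python
# Hand-written stand-in for the parsed data: a tuple of rings, each ring
# being a tuple of (x, y) pairs (first two points of each ring in the question).
rings = (
    ((148210.445767647, 172418.761192525), (148183.930888667, 172366.054787545)),
    ((148221.97916844, 172344.568316375), (148244.61381946, 172406.651932395)),
)


def to_dicts(rings):
    """Turn nested (x, y) tuples into a list of lists of {'x': ..., 'y': ...} dicts."""
    return [[{'x': x, 'y': y} for x, y in ring] for ring in rings]


polygons = to_dicts(rings)
```

The nested list comprehension mirrors the requested structure: the outer level is the list of polygons, the inner level the list of points.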

Let's say you have a string that looks like this:

my_str = 'POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'

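my_str = my_str.replace('POLYGON ', '')
coords_groups = my_str.split('), (')

polygons = []  # final result: one list of point dicts per group
for coords in coords_groups:
    # str.replace returns a new string, so the result must be reassigned
    coords = coords.replace('(', '').replace(')', '')
    coords_list = coords.split(', ')
    coords_list2 = []
    for item in coords_list:
        item_split = item.split(' ')
        # convert the text to floats so the dicts hold numbers, not strings
        coords_list2.append({'x': float(item_split[0]), 'y': float(item_split[1])})
    polygons.append(coords_list2)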

I think this should help a little.

All you need now is a way to get the text between the parentheses; this related question should help: "Regular expression to return text between parenthesis".
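As a rough sketch of that idea (not the linked answer's code), `re.findall` with a pattern that matches an opening parenthesis, a run of non-parenthesis characters, and a closing parenthesis picks out each innermost group. The sample string is shortened from the question, and the variable names are made up for illustration.

```python
import re

sample = ('POLYGON ((148210.445767647 172418.761192525, '
          '148183.930888667 172366.054787545), '
          '(148221.97916844 172344.568316375, '
          '148244.61381946 172406.651932395))')

# \( and \) match literal parentheses; [^()]* captures everything between
# them that is not itself a parenthesis, i.e. one innermost group.
groups = re.findall(r'\(([^()]*)\)', sample)

polygons = []
for group in groups:
    points = []
    for pair in group.split(','):
        # split() with no argument tolerates leading or repeated spaces
        x, y = pair.split()
        points.append({'x': float(x), 'y': float(y)})
    polygons.append(points)
```

Because the pattern only looks at the parentheses, this variant does not depend on the exact `', '` and `'), ('` separators.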

UPDATE: I updated the code above thanks to another answer by https://stackoverflow.com/users/2635860/mccakici, but it works only if your string has the structure you described in your question.

Can you try this?

import ast

POLYGON = '((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'
new_polygon = '(' + POLYGON.replace(', ', '),(').replace(' ', ',') + ')'


data = ast.literal_eval(new_polygon)
result_list = list()
for items in data:
    sub_list = list()
    for item in items:
        sub_list.append({
            'x': item[0],
            'y': item[1]
        })
    result_list.append(sub_list)

print(result_list)
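The code above starts from a string that already has the POLYGON keyword removed. A possible sketch that starts from the original string and wraps the same rewriting trick in a function follows; `parse_polygon` is a hypothetical name, the sample is shortened from the question, and the input is assumed to use exactly `', '` between points and a single space between x and y.

```python
import ast


def parse_polygon(text):
    """Parse 'POLYGON ((x y, x y), (x y, ...))' into a list of lists of dicts."""
    # Drop the leading keyword, keeping everything from the first parenthesis on.
    body = text[text.index('('):]
    # Turn "x y, x y" into "(x,y),(x,y)". Wrapping in [] rather than ()
    # means a string with a single group still evaluates to a list of groups.
    literal = '[' + body.replace(', ', '),(').replace(' ', ',') + ']'
    rings = ast.literal_eval(literal)
    return [[{'x': x, 'y': y} for x, y in ring] for ring in rings]


sample = ('POLYGON ((148210.445767647 172418.761192525, '
          '148183.930888667 172366.054787545), '
          '(148221.97916844 172344.568316375, '
          '148244.61381946 172406.651932395))')
polygons = parse_polygon(sample)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text that comes from outside the program.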
