Parse a string into a list of dicts

Question

I have a string that looks like this:

POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))

I can easily strip POLYGON out of the string to focus on the numbers, but I'm wondering what the easiest/best way would be to parse this string into a list of dicts.

The first parenthesis (right after POLYGON) indicates that multiple elements can be provided, separated by commas.

So each pair of numbers is supposed to be x and y.

I'd like to parse this string to end up with the following data structure (using Python 2.7):

list [ //list of polygons
  list [ //polygon n°1
    dict { //polygon n°1's first point
      'x': 148210.445767647, //first number
      'y': 172418.761192525 //second number
    },
    dict { //polygon n°1's second point
      'x': 148183.930888667,
      'y': 172366.054787545
    },
    ... // rest of polygon n°1's points
  ], //end of polygon n°1
  list [ // polygon n°2
    dict { // polygon n°2's first point
      'x': 148221.97916844,
      'y': 172344.568316375
    },
    ... // rest of polygon n°2's points
  ] // end of polygon n°2
] // end of list of polygons
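
For reference, the pseudo-notation above corresponds roughly to the nested Python literal below. This is only a sketch: `expected` is an illustrative name, and just the first two points of each ring from the string above are written out.

```python
# Target structure as a valid Python literal (list of lists of dicts).
# Only the first two points of each ring are shown; the rest are omitted.
expected = [
    [  # polygon 1
        {'x': 148210.445767647, 'y': 172418.761192525},
        {'x': 148183.930888667, 'y': 172366.054787545},
    ],
    [  # polygon 2
        {'x': 148221.97916844, 'y': 172344.568316375},
        {'x': 148244.61381946, 'y': 172406.651932395},
    ],
]
```

So `expected[0][1]['y']` would be the y value of the second point of the first polygon.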

A polygon can have virtually any number of points.
Each point's two numbers are separated by a space.

Do you know a way to do this in a loop, or recursively?

PS: I'm something of a Python beginner (only a few months under my belt), so don't hesitate to explain in detail. Thank you!

Answer 1

The data structure defining your Polygon object looks very similar to a Python tuple declaration. One option, albeit a bit hacky, would be to use Python's AST parser.

You would have to strip off the POLYGON part, and this solution may not work for other, more complex declarations.

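import ast
import re

your_str = "POLYGON (...)"  # placeholder for the full string from the question
# may be better to use a regex to split off the class part
# if you have different types
body = your_str.replace("POLYGON ", "")
# "x y" is not valid Python syntax, so rewrite each pair as "(x, y)" first
body = re.sub(r"(-?[\d.]+) (-?[\d.]+)", r"(\1, \2)", body)
data = ast.literal_eval(body)
# data is now a tuple of rings, each ring a tuple of (x, y) tuples
polygons = [[{'x': x, 'y': y} for x, y in ring] for ring in data]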
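
The comment about using a regex to split off the type name could be sketched as follows. The helper name `split_type_name` is hypothetical, and the pattern assumes an upper-case type name followed by a parenthesised body, as in the question's string.

```python
import re

def split_type_name(text):
    """Split 'POLYGON ((...))' into ('POLYGON', '((...))').

    Hypothetical helper: assumes an upper-case type name followed by
    a parenthesised body, which is all the question's string shows.
    """
    match = re.match(r'\s*([A-Z]+)\s*(\(.*\))\s*$', text, re.DOTALL)
    if match is None:
        raise ValueError('unrecognised geometry string: %r' % text)
    return match.group(1), match.group(2)

kind, body = split_type_name('POLYGON ((1.5 2.5, 3.5 4.5))')
```

Here `kind` holds the type name and `body` the parenthesised part, so the same code path could handle other type names without hard-coding `"POLYGON "`.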

Answer 2

Let's say you have a string that looks like this:

my_str = 'POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'

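my_str = my_str.replace('POLYGON ', '')
coords_groups = my_str.split('), (')

polygons = []  # final result: one list of point dicts per group
for coords in coords_groups:
    # str.replace returns a new string, so the result must be assigned
    coords = coords.replace('(', '').replace(')', '')
    coords_list = coords.split(', ')
    coords_list2 = []
    for item in coords_list:
        item_split = item.split(' ')
        # convert the text to numbers and build the point dict
        coords_list2.append({'x': float(item_split[0]), 'y': float(item_split[1])})
    polygons.append(coords_list2)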

I think this should help a little.

All you need now is a way to get the text between parentheses; this should help: Regular expression to return text between parenthesis

UPDATE: I updated the code above thanks to another answer by https://stackoverflow.com/users/2635860/mccakici, but this works only if your string has the structure you described in your question.
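
The "text between parentheses" idea mentioned above might look something like this with `re.findall`. It is a minimal sketch, not the linked answer's code: `parse_rings` is a made-up name, and it assumes every innermost pair of parentheses holds comma-separated "x y" pairs.

```python
import re

def parse_rings(text):
    """Return a list of polygons, each a list of {'x': ..., 'y': ...} dicts."""
    polygons = []
    # r'\(([^()]+)\)' matches only innermost groups: "(" then no parens then ")"
    for ring in re.findall(r'\(([^()]+)\)', text):
        points = []
        for pair in ring.split(','):
            x, y = pair.split()  # split on whitespace, ignoring leading blanks
            points.append({'x': float(x), 'y': float(y)})
        polygons.append(points)
    return polygons

result = parse_rings('POLYGON ((1.5 2.5, 3.5 4.5), (5.5 6.5, 7.5 8.5))')
```

Because the pattern ignores everything outside the innermost parentheses, the POLYGON prefix does not need to be stripped first.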

Answer 3

Can you try this?

import ast

POLYGON = '((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'
new_polygon = '(' + POLYGON.replace(', ', '),(').replace(' ', ',') + ')'


data = ast.literal_eval(new_polygon)
result_list = list()
for items in data:
    sub_list = list()
    for item in items:
        sub_list.append({
            'x': item[0],
            'y': item[1]
        })
    result_list.append(sub_list)

print result_list
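
Since the question asked for a detailed explanation, here is how the two `replace` calls above reshape the text, traced on a shortened string. The variable names `small`, `step1`, `step2` and `wrapped` are only for illustration.

```python
import ast

small = '((1.5 2.5, 3.5 4.5), (5.5 6.5, 7.5 8.5))'

# 1. Every ', ' becomes '),(' so each point gets its own parentheses.
step1 = small.replace(', ', '),(')
# 2. The remaining spaces sit between x and y, so they become commas.
step2 = step1.replace(' ', ',')
# 3. One extra pair of parentheses wraps the rings into a single tuple.
wrapped = '(' + step2 + ')'

# The text is now valid Python tuple syntax, which literal_eval can read.
data = ast.literal_eval(wrapped)
```

After these steps `data` is a tuple of rings, each ring a tuple of `(x, y)` tuples, which is what the nested loops then turn into dicts.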

3 solutions

Solution 1
2 2014-05-21 13:59:23

Solution 2
1 2014-05-21 13:53:40

Solution 3
1 accepted 2014-05-21 14:04:28
