
Parsing JavaScript arrays into Python dictionaries

So I have a public webpage that contains something like the following code:

var arrayA = new Array();
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);
arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);

What I want to do is have Python read this page and convert the data into two dictionaries, with name + description as the key.

i.e.,

dict1["Name1description1"] = 1.000
dict2["Name1description1"] = 2.000
dict1["Name2description2"] = 4.000
dict2["Name2description2"] = 8.000

Is there an easy way to do this, or do we pretty much have to parse it like any other string? Obviously, the array could be of any length.

Thanks!

Yes, this is possible using regular expressions.

import re

st = '''
var arrayA = new Array();
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);
'''

dict1, dict2 = {}, {}
# Raw string so the backslashes reach the regex engine unchanged;
# the dots are escaped so they match only a literal "."
matches = re.findall(r'"(\d+)","(.*?)","(.*?)",(\d+\.\d+),(\d+\.\d+)', st, re.DOTALL)
for m in matches:
    key = m[1] + m[2]  # name + description
    dict1[key] = float(m[3])
    dict2[key] = float(m[4])

print(dict1)
print(dict2)

# {'Name1description1': 1.0, 'Name2description2': 4.0}
# {'Name1description1': 2.0, 'Name2description2': 8.0}

The logic of the regular expression is:

" - Match a double quote
"(\d+)" - Match one or more digits between two double quotes
"(.*?)" - Match as few characters as possible (of any kind) between two double quotes
(\d+\.\d+) - Match one or more digits, a literal dot, then one or more digits
, - Match a comma
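
As a rough sketch of the same breakdown, the pattern can also be written with `re.VERBOSE` so that each piece sits on its own commented line. The function name `parse_custom_items` and the group names are made up for illustration, and the decimal point is escaped as `\.` so it matches only a literal dot.

```python
import re

# Same pieces as the breakdown, one per line. In VERBOSE mode the
# whitespace and the "#" comments inside the pattern are ignored.
ITEM_PATTERN = re.compile(r'''
    "(?P<id>\d+)"            ,   # quoted integer id
    "(?P<name>.*?)"          ,   # quoted name (non-greedy)
    "(?P<description>.*?)"   ,   # quoted description (non-greedy)
    (?P<first>\d+\.\d+)      ,   # first decimal value
    (?P<second>\d+\.\d+)         # second decimal value
''', re.VERBOSE)


def parse_custom_items(js_text):
    """Return (dict1, dict2), both keyed by name + description."""
    dict1, dict2 = {}, {}
    for match in ITEM_PATTERN.finditer(js_text):
        key = match.group('name') + match.group('description')
        dict1[key] = float(match.group('first'))
        dict2[key] = float(match.group('second'))
    return dict1, dict2
```

Named groups avoid the positional `m[3]` / `m[4]` indexing, which may be easier to keep straight if the constructor ever gains more arguments.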

So the regular expression matches the JS input wherever it follows this expected pattern. Note that it assumes there are no spaces around the commas; if the real page has them, you could strip the whitespace out first and then run it.
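
Instead of stripping whitespace beforehand, one option is to let the pattern itself tolerate it. The following is only a sketch under a few assumptions that go beyond the original answer: the constructor is always spelled `customItem`, the quoted strings contain no embedded double quotes, and the numbers are unsigned with an optional decimal part. The name `parse_items_flexible` is hypothetical.

```python
import re

# Allow optional whitespace around every comma and inside the parentheses.
FLEXIBLE_ITEM = re.compile(
    r'new\s+customItem\(\s*'
    r'"(\d+)"\s*,\s*'            # id
    r'"([^"]*)"\s*,\s*'          # name (no embedded quotes assumed)
    r'"([^"]*)"\s*,\s*'          # description (no embedded quotes assumed)
    r'(\d+(?:\.\d+)?)\s*,\s*'    # first number, decimal part optional
    r'(\d+(?:\.\d+)?)\s*\)'      # second number, then the closing parenthesis
)


def parse_items_flexible(js_text):
    """Return (dict1, dict2), both keyed by name + description."""
    dict1, dict2 = {}, {}
    for _item_id, name, description, first, second in FLEXIBLE_ITEM.findall(js_text):
        key = name + description
        dict1[key] = float(first)
        dict2[key] = float(second)
    return dict1, dict2
```

Anchoring on `new customItem(` also reduces the chance of matching unrelated quoted strings elsewhere on the page.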
