How to convert a raw JavaScript object to a Python dictionary?

When screen-scraping some websites, I extract data from <script> tags.
The data I get is not in standard JSON format, so I cannot use json.loads().

# from
js_obj = '{x:1, y:2, z:3}'

# to
py_obj = {'x':1, 'y':2, 'z':3}

Currently, I use regex to transform the raw data to JSON format.
But this gets painful when I encounter complicated data structures.

Do you have a better solution?

I faced the same problem this afternoon, and I finally found a quite good solution: JSON5.

The syntax of JSON5 is closer to native JavaScript, so it can help you parse non-standard JSON objects.

You might want to check out pyjson5.

This will likely not work everywhere, but as a start, here's a simple regex that converts the keys into quoted strings so you can pass the result to json.loads. Or is this what you're already doing?

In[70] : quote_keys_regex = r'([\{\s,])(\w+)(:)'

In[71] : re.sub(quote_keys_regex, r'\1"\2"\3', js_obj)
Out[71]: '{"x":1, "y":2, "z":3}'

In[72] : js_obj_2 = '{x:1, y:2, z:{k:3,j:2}}'

In[73] : re.sub(quote_keys_regex, r'\1"\2"\3', js_obj_2)
Out[73]: '{"x":1, "y":2, "z":{"k":3,"j":2}}'
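
The regex above can be wrapped in a small helper that quotes the keys and then hands the result to json.loads. This is only a minimal sketch: the name `js_obj_to_dict` is hypothetical, and it inherits the regex's limits. It assumes keys are plain identifiers and that the values are already valid JSON (double-quoted strings, no trailing commas), and it can misfire on string values that happen to contain text shaped like `, word:`.

```python
import json
import re

# Same pattern as above: a key is a run of word characters that follows
# '{', ',' or whitespace and is immediately followed by ':'.
QUOTE_KEYS_REGEX = re.compile(r'([\{\s,])(\w+)(:)')


def js_obj_to_dict(js_obj):
    """Quote bare keys, then parse the result as JSON (hypothetical helper)."""
    quoted = QUOTE_KEYS_REGEX.sub(r'\1"\2"\3', js_obj)
    return json.loads(quoted)


print(js_obj_to_dict('{x:1, y:2, z:{k:3,j:2}}'))
# {'x': 1, 'y': 2, 'z': {'k': 3, 'j': 2}}
```

If json.loads raises json.JSONDecodeError here, the input probably uses JavaScript syntax beyond unquoted keys, and a real parser is the safer route.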

Use json5

import json5

js_obj = '{x:1, y:2, z:3}'

py_obj = json5.loads(js_obj)

print(py_obj)

# output
# {'x': 1, 'y': 2, 'z': 3}

If you have node available on the system, you can ask it to evaluate the JavaScript expression for you and print the stringified result. The resulting JSON can then be fed to json.loads:

import json
from subprocess import Popen, PIPE

def evaluate_javascript(s):
    """Evaluate and stringify a javascript expression in node.js, and convert the
    resulting JSON to a Python object"""
    # 'node -' reads the script from stdin
    node = Popen(['node', '-'], stdin=PIPE, stdout=PIPE)
    stdout, _ = node.communicate(f'console.log(JSON.stringify({s}))'.encode('utf8'))
    return json.loads(stdout.decode('utf8'))
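
As a possible variation on evaluate_javascript, the sketch below uses subprocess.run with a timeout and check=True, so a JavaScript error or a hung script surfaces as a Python exception instead of an empty string reaching json.loads. The names `build_node_script` and `evaluate_javascript_checked` and the 10-second default are assumptions made for illustration. It also assumes node is on PATH.

```python
import json
import subprocess


def build_node_script(expression):
    """Wrap a JavaScript expression so node prints it as JSON text."""
    return f'console.log(JSON.stringify({expression}))'


def evaluate_javascript_checked(expression, timeout=10):
    """Run the expression in node and parse the printed JSON.

    Hypothetical variant of evaluate_javascript. Raises
    subprocess.CalledProcessError if node exits with a non-zero status
    and subprocess.TimeoutExpired if it runs longer than `timeout` seconds.
    """
    result = subprocess.run(
        ['node', '-'],  # '-' tells node to read the script from stdin
        input=build_node_script(expression).encode('utf8'),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=True,
    )
    return json.loads(result.stdout.decode('utf8'))
```

Either version executes whatever JavaScript it is given, so it is best kept to script content from sources you trust.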

Not including objects

json.loads()

Simply:

import json
py_obj = json.loads(js_obj_stringified)

Above is the Python portion of the code. In the JavaScript portion of the code:

js_obj_stringified = JSON.stringify(data);

JSON.stringify turns a JavaScript object into JSON text and stores that JSON text in a string. It is a safe way to pass (via POST/GET) a JavaScript object to Python to process.
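
On the Python side, the received text can be decoded and parsed in one place. The following is a minimal sketch that is not tied to any web framework: `parse_posted_object` is a hypothetical helper, and the byte string stands in for a request body as JSON.stringify would produce it.

```python
import json


def parse_posted_object(body):
    """Decode a body produced by JSON.stringify (hypothetical helper)."""
    text = body.decode('utf8') if isinstance(body, bytes) else body
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Surface malformed input as a single, predictable error type.
        raise ValueError(f'request body is not valid JSON: {exc}') from exc


# Simulated body, as JSON.stringify({x: 1, y: 2, z: 3}) would produce it.
print(parse_posted_object(b'{"x":1,"y":2,"z":3}'))
# {'x': 1, 'y': 2, 'z': 3}
```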
