Fastest way to convert JavaScript object/array to Python dict/list

I'm trying to parse the code of JavaScript objects that hold huge JavaScript arrays and convert it to a Python dictionary with lists.

At the moment I'm using PyYAML, but it doesn't work directly because it can't handle consecutive commas (e.g. it breaks on '[,,,0,]' with: expected the node content, but found ','). So I substitute these out with a regex first, but this is all very slow. Does anyone know of a better and faster way to do this? JSON decoding doesn't work either, because the JavaScript code isn't valid JSON.

This is the code I'm using, as described above, with js_obj as an example:

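import re
import yaml

js_obj = "{index: '37',data: [, 1, 2, 3,,,]}"

def repl(match):
    # Drop spaces so only brackets and commas remain, e.g. "[," or ",,,]".
    content = match.group(0).replace(" ", "")
    length = len(content) - 1
    result = ''
    if content[0] == '[':
        # The array starts with an empty slot: open it with a "" placeholder.
        result = '[""'
        length -= 1

    after = ','
    if content[-1] == ']':
        # The array ends on a comma: close it with a "" placeholder.
        length -= 1
        after += '""]'

    # Fill every remaining empty slot with "".
    return result + (',""' * length) + after

# safe_load only builds plain Python data; yaml.load without a Loader fails on PyYAML 6+.
py_dict = yaml.safe_load(re.sub(r'\[? *(, *)+\]?', repl, js_obj))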

You should probably write the data out from JavaScript as JSON and then read it into Python as JSON. YAML is OK, but I tend to prefer JSON over YAML; JSON is more consistent.
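A minimal sketch of the Python side, assuming you control the JavaScript side and can serialize the object there with JSON.stringify (which writes array holes as null). The payload string below is a hypothetical example modeled on js_obj from the question, not real output from your page:

```python
import json

# Hypothetical payload: what JSON.stringify emits for the question's js_obj.
payload = '{"index":"37","data":[null,1,2,3,null,null]}'

# json.loads maps JSON objects to dicts, arrays to lists, and null to None.
py_dict = json.loads(payload)
print(py_dict["data"])
```

The standard-library json module should also be a good deal faster than regex substitution followed by a YAML parse, though that is worth measuring on your own data.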

If you must parse the JavaScript, you might want to look into pyparsing or similar.
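To give an idea of the "or similar" route, here is a rough hand-written recursive-descent sketch using only the standard library. It is not pyparsing and not a full JavaScript parser: it assumes the input is a single object or array literal with unquoted or quoted keys, strings without escape sequences, plain numbers, and true/false/null/undefined. All names (tokenize, parse, and the helpers) are made up for this example, and mapping array holes to None is my choice rather than anything from the question:

```python
import re

# One token per match: punctuation, quoted string, number, or bare identifier.
TOKEN = re.compile(r"""\s*(?:
    (?P<punct>[{}\[\]:,])
  | '(?P<sq>[^']*)'
  | "(?P<dq>[^"]*)"
  | (?P<num>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
  | (?P<name>[A-Za-z_$][\w$]*)
)""", re.VERBOSE)

LITERALS = {"true": True, "false": False, "null": None, "undefined": None}

def tokenize(src):
    src, pos, tokens = src.rstrip(), 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def _peek(tokens, pos):
    return tokens[pos] if pos < len(tokens) else ("eof", "")

def _value(tokens, pos):
    kind, text = _peek(tokens, pos)
    if (kind, text) == ("punct", "{"):
        return _object(tokens, pos + 1)
    if (kind, text) == ("punct", "["):
        return _array(tokens, pos + 1)
    if kind in ("sq", "dq"):
        return text, pos + 1
    if kind == "num":
        is_float = any(c in text for c in ".eE")
        return (float(text) if is_float else int(text)), pos + 1
    if kind == "name" and text in LITERALS:
        return LITERALS[text], pos + 1
    raise ValueError(f"unexpected token {text!r}")

def _array(tokens, pos):
    items = []
    while True:
        tok = _peek(tokens, pos)
        if tok == ("punct", "]"):
            return items, pos + 1
        if tok == ("punct", ","):
            items.append(None)  # a comma with no value before it is a hole
            pos += 1
            continue
        value, pos = _value(tokens, pos)
        items.append(value)
        if _peek(tokens, pos) == ("punct", ","):
            pos += 1  # separator; a trailing comma adds no element
        elif _peek(tokens, pos) != ("punct", "]"):
            raise ValueError("expected ',' or ']' in array")

def _object(tokens, pos):
    result = {}
    while True:
        kind, key = _peek(tokens, pos)
        if (kind, key) == ("punct", "}"):
            return result, pos + 1
        if kind not in ("name", "sq", "dq", "num"):
            raise ValueError(f"bad object key {key!r}")
        if _peek(tokens, pos + 1) != ("punct", ":"):
            raise ValueError("expected ':' after object key")
        value, pos = _value(tokens, pos + 2)
        result[key] = value
        if _peek(tokens, pos) == ("punct", ","):
            pos += 1
        elif _peek(tokens, pos) != ("punct", "}"):
            raise ValueError("expected ',' or '}' in object")

def parse(src):
    tokens = tokenize(src)
    value, pos = _value(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after value")
    return value

print(parse("{index: '37',data: [, 1, 2, 3,,,]}"))
```

Because holes are handled in the parser itself, no regex pre-pass is needed, and a trailing comma does not add an extra element, which matches how JavaScript counts array length.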
