Python - Fastest and most efficient way to convert a list-like string to a list of lists

I'm writing a Python program that has to run a compiled C++ program in order to handle a very heavy operation.

I call this executable with subprocess.check_output().

Doing so, it returns a very long string like the following:

executable_output_1 = [0.011, 0.544, 2.314], [7.895, 6.477, 2.573]

executable_output_2 = [4.255, 6.235, 7.566], [9.522, 7.321, 1.234]

type(executable_output) >>> <type 'str'>
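For reference, the call described above can be sketched roughly as follows. This assumes Python 3, where check_output returns bytes that need decoding (the output shown above is Python 2's str). The real executable is not shown, so the command here is a stand-in that uses the Python interpreter to print one sample line, and run_executable is a hypothetical helper name.

```python
import subprocess
import sys


def run_executable(cmd):
    """Run a command and return its stdout as text without the trailing newline."""
    raw = subprocess.check_output(cmd)  # bytes on Python 3
    return raw.decode("utf-8").strip()


# Stand-in for the compiled C++ program (assumption: it prints one line
# of comma-separated bracketed lists).
fake_cmd = [
    sys.executable,
    "-c",
    "print('[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]')",
]
executable_output = run_executable(fake_cmd)
```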

In this example I wrote a fairly short string, but the real output is very long.

I'd like to do some operations with that data back in Python, so I need a list of lists (as I'll call that executable multiple times).

How could I convert that string into a list of lists?

Desired output:

executable_output_list = [[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]], [[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]]]

type(executable_output_list) >>> <type 'list'>

type(executable_output_list[0]) >>> <type 'list'>

type(executable_output_list[0][0][0]) >>> <type 'float'>

Use ast.literal_eval:

>>> from ast import literal_eval
>>> executable_output = '[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]'
>>> literal_eval(executable_output)
([0.011, 0.544, 2.314], [7.895, 6.477, 2.573])

This is a tuple. To convert it to a list:

>>> list(literal_eval(executable_output))
[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]]
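Since the executable is called several times, the same idea can be applied to each output in turn. The sketch below is one possible variation, not part of the original answer: it wraps the string in brackets before parsing, so literal_eval returns a list directly and an output containing only a single bracketed list still comes back nested. The name parse_output is hypothetical.

```python
from ast import literal_eval


def parse_output(output_string):
    """Parse '[...], [...]' into a list of lists by wrapping it in brackets first."""
    return literal_eval("[" + output_string + "]")


outputs = [
    "[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]",
    "[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]",
]

# One parsed list of lists per call to the executable.
executable_output_list = [parse_output(s) for s in outputs]
```

With this wrapping, type(executable_output_list[0][0][0]) is float, matching the desired output in the question.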

You can use Python's json package, which should be faster than ast.literal_eval.

# setup
from ast import literal_eval
import json

executable_output_1 = "[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"
executable_output_2 = "[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]"

outputs = (executable_output_1, executable_output_2)

The json approach has the following timing: 100000 loops, best of 3: 14.5 µs per loop

def extract(output_strings):
    template = '{{"values": [{}]}}'
    parse_func = lambda x: json.loads(template.format(x))
    return [parse_func(x)["values"] for x in output_strings]

extract(outputs)

>>> [[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]],
     [[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]]]
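To make the template step concrete, here is a small sketch of what it expands to. The doubled braces in the template are str.format escapes for literal braces, so the raw output ends up as the value of a "values" key in a valid JSON document. The last line shows a simpler variation (an assumption, not something the answer timed) that wraps the string in plain brackets instead.

```python
import json

template = '{{"values": [{}]}}'
output = "[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"

# '{{' and '}}' become literal braces; '{}' is replaced by the raw output.
document = template.format(output)
# document is now: {"values": [[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]]}
parsed = json.loads(document)["values"]

# Variation: wrapping in brackets gives the same nested list without the dict lookup.
parsed_direct = json.loads("[" + output + "]")
```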

The ast approach is about 3.5x slower on the dummy data: 10000 loops, best of 3: 51.6 µs per loop

def extract_ast(output_strings):
    return [list(literal_eval(x)) for x in output_strings]

extract_ast(outputs)

>>> [[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]],
     [[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]]]

The advantage of the json approach grows with the amount of data to be parsed. With the following setup, the json approach yields 1000 loops, best of 3: 291 µs per loop, compared to 100 loops, best of 3: 3.95 ms per loop for ast, which makes json about 13x faster.

executable_output_1 = ",".join(["[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"] * 100)
executable_output_2 = ",".join(["[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]"] * 100)

outputs = (executable_output_1, executable_output_2)
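The comparison can be reproduced with the standard timeit module, along the lines of the sketch below. The best_time helper and its repeat and number values are assumptions chosen for illustration; absolute timings will differ by machine and Python version, so no expected numbers are shown.

```python
import json
import timeit
from ast import literal_eval

executable_output_1 = ",".join(["[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"] * 100)
executable_output_2 = ",".join(["[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]"] * 100)
outputs = (executable_output_1, executable_output_2)


def extract(output_strings):
    """json approach: embed each string in a JSON document and parse it."""
    template = '{{"values": [{}]}}'
    return [json.loads(template.format(x))["values"] for x in output_strings]


def extract_ast(output_strings):
    """ast approach: evaluate each string as a Python literal (a tuple of lists)."""
    return [list(literal_eval(x)) for x in output_strings]


# Both approaches should produce identical nested lists.
assert extract(outputs) == extract_ast(outputs)


def best_time(func, repeat=3, number=100):
    """Best average time per call, in seconds, over `repeat` runs of `number` calls."""
    return min(timeit.repeat(lambda: func(outputs), repeat=repeat, number=number)) / number


json_time = best_time(extract)
ast_time = best_time(extract_ast)
print("json: {:.1f} us per call".format(json_time * 1e6))
print("ast:  {:.1f} us per call".format(ast_time * 1e6))
```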
