Python - Fastest and most efficient way to convert a list-like string to a list of lists
I'm writing a Python program that has to run a compiled C++ program to handle a very heavy operation.
I call the executable with subprocess.check_output().
It returns a very long string like the following:
executable_output_1 = [0.011, 0.544, 2.314], [7.895, 6.477, 2.573]
executable_output_2 = [4.255, 6.235, 7.566], [9.522, 7.321, 1.234]
type(executable_output) >>> <type 'str'>
The strings in this example are short, but the real output is very long.
I'd like to do some operations on that data back in Python, so I need a list of lists (one entry per call, as I'll call the executable multiple times).
How can I convert these strings into a list of lists?
Desired output:
executable_output_list = [[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]], [[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]]]
type(executable_output_list) >>> <type 'list'>
type(executable_output_list[0]) >>> <type 'list'>
type(executable_output_list[0][0][0]) >>> <type 'float'>
Use ast.literal_eval:
>>> from ast import literal_eval
>>> executable_output = '[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]'
>>> literal_eval(executable_output)
([0.011, 0.544, 2.314], [7.895, 6.477, 2.573])
This is a tuple. To convert it to a list:
>>> list(literal_eval(executable_output))
[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]]
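The question shows Python 2 output (`<type 'str'>`). On Python 3, subprocess.check_output() returns bytes unless text mode is requested, so the output has to be decoded before it is parsed. A minimal sketch, assuming the executable prints plain ASCII/UTF-8 text; `parse_output` is a hypothetical helper name and `raw_output` stands in for the real return value:

```python
from ast import literal_eval

def parse_output(raw_output):
    """Turn one output string into a list of lists of floats."""
    # check_output() yields bytes on Python 3; decode if needed.
    if isinstance(raw_output, bytes):
        raw_output = raw_output.decode("utf-8")
    # Wrapping in brackets makes literal_eval return a list directly,
    # and keeps the nesting consistent if only one inner list is printed.
    return literal_eval("[" + raw_output.strip() + "]")

# Stand-in for subprocess.check_output(...); the real command is not shown here.
raw_output = b"[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]\n"
rows = parse_output(raw_output)
```

Wrapping the string in brackets also avoids the separate list(...) call, since the parsed value is already a list.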
You can make use of Python's json package, which should be faster than ast.literal_eval.
# setup
from ast import literal_eval
import json
executable_output_1 = "[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"
executable_output_2 = "[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]"
outputs = (executable_output_1, executable_output_2)
The json approach has the following timing: 100000 loops, best of 3: 14.5 µs per loop:
def extract(output_strings):
    # Wrap each string in a JSON object so that json.loads can parse it.
    template = '{{"values": [{}]}}'
    parse_func = lambda x: json.loads(template.format(x))
    return [parse_func(x)["values"] for x in output_strings]
extract(outputs)
>>> [[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]],
[[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]]]
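The template above wraps each string in a JSON object and then looks up "values". A shorter variant that should give the same result is to wrap the string in square brackets so it becomes a JSON array. This is only a sketch; `extract_brackets` is a hypothetical name, not part of the original answer:

```python
import json

def extract_brackets(output_strings):
    """Parse each output string by wrapping it in a JSON array."""
    return [json.loads("[" + s + "]") for s in output_strings]

outputs = (
    "[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]",
    "[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]",
)
result = extract_brackets(outputs)
```

Either json variant only works while the executable prints valid JSON numbers. Values written like `.5` or `1.` are rejected by json.loads, whereas ast.literal_eval accepts them.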
The ast approach is about 3.5x slower on the dummy data, at 10000 loops, best of 3: 51.6 µs per loop:
def extract_ast(output_strings):
    return [list(literal_eval(x)) for x in output_strings]
extract_ast(outputs)
>>> [[[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]],
[[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]]]
The advantage of the json approach grows with the amount of data to be parsed. With the following setup, the json approach yields 1000 loops, best of 3: 291 µs per loop, compared to the ast approach with 100 loops, best of 3: 3.95 ms per loop, which makes json about 13x faster.
executable_output_1 = ",".join(["[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"] * 100)
executable_output_2 = ",".join(["[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]"] * 100)
outputs = (executable_output_1, executable_output_2)
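The timings quoted in this answer depend on the machine and the Python version, so it is worth re-measuring on your own data. A rough sketch using the standard timeit module, with the two functions and the larger setup copied from above; the repeat count `n` is an arbitrary choice:

```python
import json
import timeit
from ast import literal_eval

def extract(output_strings):
    template = '{{"values": [{}]}}'
    return [json.loads(template.format(x))["values"] for x in output_strings]

def extract_ast(output_strings):
    return [list(literal_eval(x)) for x in output_strings]

executable_output_1 = ",".join(["[0.011, 0.544, 2.314], [7.895, 6.477, 2.573]"] * 100)
executable_output_2 = ",".join(["[4.255, 6.235, 7.566], [9.522, 7.321, 1.234]"] * 100)
outputs = (executable_output_1, executable_output_2)

# Both parsers should agree before their speed is compared.
same = extract(outputs) == extract_ast(outputs)

n = 200  # number of calls per measurement (arbitrary)
t_json = timeit.timeit(lambda: extract(outputs), number=n) / n
t_ast = timeit.timeit(lambda: extract_ast(outputs), number=n) / n
print("json: %.1f us per call" % (t_json * 1e6))
print("ast:  %.1f us per call" % (t_ast * 1e6))
```

The absolute numbers will differ from those above; only the ratio between the two approaches is meaningful.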