
Convert string representation of colors back to list

I had data in which one column held a list of numbers, but after saving to a CSV, the lists were apparently stored as strings. I want to convert this list of strings back to a list of lists.

So here's what my data looks like now:

import pandas as pd 
from ast import literal_eval

colors = ["(120, 120, 80)", "(90, 10, 100)"]
names = ["name1", "name2"]
data = {
    "colors":colors,
    "names":names
}
df = pd.DataFrame(data)

From reading Stack Overflow, I tried the literal_eval method, but it did not work:

try:
  df['colors'] = literal_eval( df['colors'].tolist() )
except ValueError as e:
  print(e)

I get a malformed string error.

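You can do the following for each string in the column:

# colors is a list of strings, so strip the parentheses from each string and split on commas
col = [[int(val) for val in c.replace("(", "").replace(")", "").split(",")] for c in colors]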

Using literal_eval() is a good approach. The issue is that it needs to be applied to each sub-list (string) individually. A Pythonic approach would be to use a list comprehension as follows:

>>> from ast import literal_eval
>>> colors = ["(120, 120, 80)", "(90, 10, 100)"]
>>> colors = [literal_eval(x) for x in colors]
>>> colors
[(120, 120, 80), (90, 10, 100)]

To get a list of lists instead of a list of tuples, you could use:

>>> from ast import literal_eval
>>> colors = ["(120, 120, 80)", "(90, 10, 100)"]
>>> colors = [list(literal_eval(x)) for x in colors]
>>> colors
[[120, 120, 80], [90, 10, 100]]
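
Since the question mentions that the column turned into strings after a CSV round trip, here is a minimal sketch of that round trip using only the standard library csv module and an in-memory buffer. The helper name parse_color is hypothetical and not part of the original posts.

```python
import csv
import io
from ast import literal_eval


def parse_color(text):
    """Turn a string such as "(120, 120, 80)" into a list of ints."""
    return list(literal_eval(text))


# Write two rows to an in-memory CSV; the tuples are stored as text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["colors", "names"])
writer.writerow([(120, 120, 80), "name1"])
writer.writerow([(90, 10, 100), "name2"])

# Read the rows back: every field is now a plain string.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
raw_colors = [row["colors"] for row in rows]

# Parse each string individually to recover the nested lists.
colors = [parse_color(text) for text in raw_colors]
print(colors)
```

With the DataFrame from the question, the same helper could presumably be applied element by element with df['colors'].apply(parse_color).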

The Python documentation for ast.literal_eval(node_or_string) states:

Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.
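
To make the quoted restrictions concrete, here is a small sketch of what literal_eval accepts and rejects. The rejected inputs are illustrative choices of mine; the last one mirrors the attempt in the question, where a whole list of strings was passed instead of a single string. The behavior shown is what I would expect on current CPython 3 releases.

```python
from ast import literal_eval

# A single string holding a literal is accepted.
print(literal_eval("(120, 120, 80)"))

# Expressions with operators or indexing, and non-string inputs such as
# a whole list of strings, are rejected with ValueError.
rejected = ["2 * 3", "[1, 2][0]", ["(120, 120, 80)", "(90, 10, 100)"]]
errors = []
for candidate in rejected:
    try:
        literal_eval(candidate)
    except ValueError as exc:
        errors.append(type(exc).__name__)
print(errors)
```

This is why the call has to be made once per string rather than once on the whole column.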

Use re.findall to extract the digits and apply it over the Series:

import re
# use s rather than str as the argument name to avoid shadowing the built-in
df['colors'].apply(lambda s: [int(n) for n in re.findall(r'\d+', s)]).tolist()

#  [[120, 120, 80], [90, 10, 100]]
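
One caveat with the regex approach, sketched below: the pattern \d+ only matches runs of digits, so signs and decimal points are lost. For non-negative integer RGB values like those in the question this makes no difference. The second sample string is a hypothetical input added purely for comparison.

```python
import re
from ast import literal_eval

samples = ["(120, 120, 80)", "(-5, 1.5, 80)"]  # second sample is hypothetical

results = {}
for text in samples:
    # Regex: every run of digits becomes its own int.
    by_regex = [int(s) for s in re.findall(r"\d+", text)]
    # literal_eval: parses the string as a Python tuple literal.
    by_literal = list(literal_eval(text))
    results[text] = (by_regex, by_literal)
    print(text, by_regex, by_literal)
```

The two methods agree on the first sample and differ on the second, where the regex drops the minus sign and splits 1.5 into two numbers.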
