How to convert a string representation of a list with mixed values to a list?
How can I convert a string that represents a list containing both string and numeric values, given that the strings inside the list are not in quotes?
import pandas as pd
df = pd.DataFrame({'col_1': ['[2, A]', '[5, BC]']})
print(df)
col_1
0 [2, A]
1 [5, BC]
col_1 [2, A]
Name: 0, dtype: object
My aim is to use the list in another function, so I tried to transform the string with built-in functions such as eval() or ast.literal_eval(). In both cases, however, I need to add quotes around the strings, so that they read "A" and "BC".
You can first use a regex to add quotes around the potential strings (here I used letters + underscore), then use literal_eval (for some reason I get an error with pd.eval).
from ast import literal_eval
df['col_1'].str.replace(r'([a-zA-Z_]+)', r'"\1"', regex=True).apply(literal_eval)
Output (lists):
0 [2, A]
1 [5, BC]
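The quote-then-parse idea above can also be sketched without pandas, which makes it easier to test on single strings. The helper name parse_mixed_list is hypothetical (it does not appear in the answer); the regex is the same letters + underscore pattern, so tokens such as True or 1e5 would also get quoted, and that is an accepted limitation of this sketch.

```python
import re
from ast import literal_eval

# Same pattern as the answer above: runs of letters and underscores.
BARE_WORD = re.compile(r'([a-zA-Z_]+)')

def parse_mixed_list(text):
    """Wrap bare words in double quotes, then parse the result as a Python literal."""
    quoted = BARE_WORD.sub(r'"\1"', text)   # '[2, A]' becomes '[2, "A"]'
    return literal_eval(quoted)

rows = ['[2, A]', '[5, BC]']
parsed = [parse_mixed_list(row) for row in rows]
print(parsed)  # [[2, 'A'], [5, 'BC']]
```

With a DataFrame, the same function could be passed to df['col_1'].apply(parse_mixed_list).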
It is already a string, so if the data always follows a fixed format, you can extract the second element directly:
df['col_2'] = df['col_1'].apply(lambda x: x.split(',')[1].strip(' ]'))
If you want the output to be a list:
import pandas as pd
df = pd.DataFrame({'col_1': ['[2, A]', '[5, BC]']})
print(df)
a = df["col_1"].to_list()
actual_list = [[int(i.split(",")[0][1:]), str(i.split(",")[1][1:-1])] for i in a]
actual_list
Output:
[[2, 'A'], [5, 'BC']]
If you just need to convert the string representation of a list to a list of strings, you can use str.strip() together with str.split(), as follows:
df['col_1'].str.strip('[]').str.split(r',\s*')
Result:
print(df['col_1'].str.strip('[]').str.split(r',\s*').to_dict())
{0: ['2', 'A'], 1: ['5', 'BC']}
If you also want to convert the strings of numeric values to numbers, you can additionally use pd.to_numeric(), as follows:
df['col_1'].str.strip('[]').str.split(r',\s*').apply(lambda x: [pd.to_numeric(y, errors='ignore') for y in x])
Result:
print(df['col_1'].str.strip('[]').str.split(r',\s*').apply(lambda x: [pd.to_numeric(y, errors='ignore') for y in x]).to_dict())
{0: [2, 'A'], 1: [5, 'BC']} # 2, 5 are numbers instead of strings
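As far as I know, errors='ignore' is deprecated in recent pandas releases (2.2 and later), where the documentation suggests catching exceptions explicitly instead. Below is a minimal sketch of that idea in plain Python; the helper names to_number_if_possible and split_mixed_list are hypothetical and not part of pandas. Note that it returns Python int/float values rather than NumPy scalars.

```python
def to_number_if_possible(token):
    """Return an int or float when `token` parses as one, else the stripped string."""
    token = token.strip()
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            continue  # not this numeric type, try the next one
    return token

def split_mixed_list(text):
    """Split a string such as '[2, A]' into a list, converting numeric items."""
    inner = text.strip().strip('[]')
    if not inner:
        return []  # '[]' becomes an empty list
    return [to_number_if_possible(part) for part in inner.split(',')]

rows = ['[2, A]', '[5, BC]']
print([split_mixed_list(row) for row in rows])  # [[2, 'A'], [5, 'BC']]
```

With a DataFrame, this could be applied as df['col_1'].apply(split_mixed_list). One caveat: float() also accepts tokens such as 'nan' and 'inf', so those words would be converted to floats rather than kept as strings.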