How to convert string representation list with mixed values to a list?

How can I convert a string that represents a list containing both strings and numeric values, given that the strings inside the list are not in quotes?

import pandas as pd

df = pd.DataFrame({'col_1': ['[2, A]', '[5, BC]']})

print(df)

     col_1
0   [2, A]
1  [5, BC]

col_1    [2, A]
Name: 0, dtype: object

My aim is to use the list in another function, so I tried to convert the string with built-in functions such as eval() or ast.literal_eval(). However, in both cases I need to add quotes around the strings, so that they read "A" and "BC".

You can first use a regex to add quotes around the potential strings (here I matched letters and underscores), then apply literal_eval (for some reason I got an error with pd.eval):

from ast import literal_eval
df['col_1'].str.replace(r'([a-zA-Z_]+)', r'"\1"', regex=True).apply(literal_eval)

Output (lists):

0     [2, A]
1    [5, BC]
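
The same quote-then-parse idea can be sketched in plain Python, which makes it easier to test outside a DataFrame. The function name `parse_with_literal_eval` and the `BARE_WORD` constant are hypothetical, introduced only for this illustration; the regex is the one from the answer above.

```python
import re
from ast import literal_eval

# Matches a run of letters and underscores, as in the answer above.
BARE_WORD = re.compile(r'([a-zA-Z_]+)')

def parse_with_literal_eval(text):
    """Wrap bare words in double quotes, then let literal_eval build the list."""
    quoted = BARE_WORD.sub(r'"\1"', text)   # '[2, A]' becomes '[2, "A"]'
    return literal_eval(quoted)

parsed = [parse_with_literal_eval(s) for s in ['[2, A]', '[5, BC]']]
print(parsed)  # [[2, 'A'], [5, 'BC']]
```

Note that this pattern assumes string tokens contain only letters and underscores. A token that mixes letters and digits, such as A1, would be quoted only partially and literal_eval would raise an error. With pandas, the function could be applied as df['col_1'].apply(parse_with_literal_eval).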

The value is already a string, so if the data always follows this format, you can extract the second element directly:

# Take the part after the comma, then drop the closing bracket and surrounding spaces
df['col_2'] = df['col_1'].apply(lambda x: x.split(',')[1].rstrip(']').strip())

If you want the output to be a list:

import pandas as pd

df = pd.DataFrame({'col_1': ['[2, A]', '[5, BC]']})
print(df)

a = df["col_1"].to_list()
actual_list = [[int(i.split(",")[0][1:]), str(i.split(",")[1][1:-1])] for i in a]
actual_list

Output:

[[2, 'A'], [5, 'BC']]

If you only need to convert the string representation of a list to a list of strings, you can use str.strip() together with str.split(), as follows:

df['col_1'].str.strip('[]').str.split(r',\s*')

Result:

print(df['col_1'].str.strip('[]').str.split(r',\s*').to_dict())

{0: ['2', 'A'], 1: ['5', 'BC']}

If you also want to convert the numeric strings to numbers, you can additionally use pd.to_numeric(), as follows:

df['col_1'].str.strip('[]').str.split(r',\s*').apply(lambda x: [pd.to_numeric(y, errors='ignore') for y in x])

Result:

print(df['col_1'].str.strip('[]').str.split(r',\s*').apply(lambda x: [pd.to_numeric(y, errors='ignore') for y in x]).to_dict())

{0: [2, 'A'], 1: [5, 'BC']}           # 2, 5 are numbers instead of strings
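
The errors='ignore' option used above is deprecated in recent pandas releases (2.2 and later), so on newer versions it may help to do the conversion with explicit exception handling instead. The following is a minimal sketch in plain Python; the helper names `to_number_or_str` and `parse_mixed_list` are hypothetical and not part of pandas.

```python
def to_number_or_str(token):
    """Return token as int or float when it parses as one, else as a stripped string."""
    token = token.strip()
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            continue
    # Caveat: tokens such as "nan" or "inf" are accepted by float() and
    # would therefore come back as floats, not strings.
    return token

def parse_mixed_list(text):
    """Parse a string like '[2, A]' into [2, 'A'] without eval or quoting."""
    inner = text.strip().strip('[]')
    if not inner.strip():
        return []                      # '[]' gives an empty list
    return [to_number_or_str(t) for t in inner.split(',')]

print([parse_mixed_list(s) for s in ['[2, A]', '[5, BC]']])  # [[2, 'A'], [5, 'BC']]
```

With pandas, this could be used as df['col_1'].apply(parse_mixed_list). It assumes the elements themselves never contain commas or brackets.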
