pandas replace values of a list column
I have a dataframe like this:
ID | Feedback |
---|---|
T223 | [Good, Bad, Bad] |
T334 | [Average, Good, Good] |
feedback_dict = {'Good':1, 'Average':2, 'Bad':3}
Using this dictionary, I have to replace the values in the Feedback column:
ID | Feedback |
---|---|
T223 | [1, 3, 3] |
T334 | [2, 1, 1] |
I tried two ways, but neither worked. Any help will be appreciated.
method1:
df = df.assign(Feedback=[feedback_dict.get(i,i) for i in list(df['Feedback'])])
method2:
df['Feedback'] = df['Feedback'].apply(lambda x : [feedback_dict.get(i,i) for i in list(x)])
For me the second solution works, but the strings have to be converted to lists first:
import ast
df['Feedback'] = df['Feedback'].apply(ast.literal_eval)
#df['Feedback'] = df['Feedback'].str.strip('[]').str.split(',')
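A note on the conversion step: `ast.literal_eval` only parses valid Python literals, so it expects quoted elements such as "['Good', 'Bad', 'Bad']". If the cells hold unquoted text like "[Good, Bad, Bad]", the commented-out strip/split line is the one that applies. A minimal sketch of that variant, assuming the column holds such unquoted strings (the sample frame below is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: lists stored as plain, unquoted strings
df = pd.DataFrame({
    'ID': ['T223', 'T334'],
    'Feedback': ['[Good, Bad, Bad]', '[Average,Good,Good]'],
})

# Strip the surrounding brackets, split on commas,
# then trim the whitespace left around each item
df['Feedback'] = (
    df['Feedback']
    .str.strip('[]')
    .str.split(',')
    .apply(lambda items: [item.strip() for item in items])
)

print(df['Feedback'].tolist())
# [['Good', 'Bad', 'Bad'], ['Average', 'Good', 'Good']]
```

The extra `strip()` per item matters because splitting "Good, Bad" on commas leaves a leading space in " Bad", which would then fail to match the dictionary keys.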
The first solution works with a nested list comprehension:
df = df.assign(Feedback=[[feedback_dict.get(i,i) for i in x] for x in df['Feedback']])
df['Feedback'] = df['Feedback'].apply(lambda x : [feedback_dict.get(i,i) for i in list(x)])
print (df)
ID Feedback
0 T223 [1, 3, 3]
1 T334 [2, 1, 1]
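Both variants rely on `feedback_dict.get(i, i)`, which falls back to the original value when a label is not in the dictionary. A small self-contained sketch of that behaviour, assuming the column already holds real lists (the sample data and the 'Unknown' label are invented for illustration):

```python
import pandas as pd

feedback_dict = {'Good': 1, 'Average': 2, 'Bad': 3}

# Hypothetical sample data; 'Unknown' is deliberately not a key in feedback_dict
df = pd.DataFrame({
    'ID': ['T223', 'T334'],
    'Feedback': [['Good', 'Bad', 'Unknown'], ['Average', 'Good', 'Good']],
})

# dict.get(key, default) returns the default when the key is missing,
# so unmapped labels pass through unchanged instead of raising KeyError
df['Feedback'] = df['Feedback'].apply(
    lambda labels: [feedback_dict.get(label, label) for label in labels]
)

print(df['Feedback'].tolist())
# [[1, 3, 'Unknown'], [2, 1, 1]]
```

If unknown labels should raise an error instead, indexing with `feedback_dict[label]` would do that.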
EDIT: If some rows contain missing values instead of lists, use an if-else expression; non-list values are replaced with empty lists:
print (df)
ID Feedback
0 T223 [Good,Bad,Bad]
1 T334 [Average,Good,Good]
2 NaN NaN
feedback_dict = {'Good':1, 'Average':2, 'Bad':3}
df = df.assign(Feedback=[[feedback_dict.get(i,i) for i in x] if isinstance(x, list) else []
for x in df['Feedback']])
print (df)
ID Feedback
0 T223 [1, 3, 3]
1 T334 [2, 1, 1]
2 NaN []
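If the missing values should stay missing rather than become empty lists, the same `isinstance` check can return the value untouched. This is only a sketch of that variant, not part of the original answer; the helper name `map_labels` and the sample frame are assumptions:

```python
import numpy as np
import pandas as pd

feedback_dict = {'Good': 1, 'Average': 2, 'Bad': 3}

# Hypothetical sample data with one row that has no list at all
df = pd.DataFrame({
    'ID': ['T223', 'T334', np.nan],
    'Feedback': [['Good', 'Bad', 'Bad'], ['Average', 'Good', 'Good'], np.nan],
})

def map_labels(value):
    """Map each label in a list; return non-list values (e.g. NaN) unchanged."""
    if isinstance(value, list):
        return [feedback_dict.get(label, label) for label in value]
    return value

df['Feedback'] = df['Feedback'].apply(map_labels)
print(df)
```

Keeping NaN makes it possible to filter those rows later with `df['Feedback'].isna()`, whereas empty lists are easier to iterate over without special cases.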
If your use case is about as simple as this example, I wouldn't recommend this method. However, here's another option in case it makes other parts of your project easier.
1. df.explode() your column (assuming it is a list and not text; otherwise convert it to a list first)
2. Use df.replace() to perform the replacement
3. Use df.groupby() and df.agg() to group the rows together again
For the example, it would look like this (assuming the variables have been declared like in your question):
df = df.explode('Feedback')
df['Feedback'] = df['Feedback'].replace(feedback_dict)
df = df.groupby('ID').agg(list)
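One thing to be aware of with the snippet above: `groupby('ID')` turns ID into the index and sorts the groups by key. A self-contained sketch of the same three steps that keeps ID as a regular column and preserves the original row order; the sample frame and the `sort=False` / `reset_index()` additions are my own assumptions, not part of the answer:

```python
import pandas as pd

feedback_dict = {'Good': 1, 'Average': 2, 'Bad': 3}

# Hypothetical sample data matching the question
df = pd.DataFrame({
    'ID': ['T223', 'T334'],
    'Feedback': [['Good', 'Bad', 'Bad'], ['Average', 'Good', 'Good']],
})

# Step 1: one row per list element (the ID is repeated for each element)
exploded = df.explode('Feedback')

# Step 2: replace each label with its numeric code
exploded['Feedback'] = exploded['Feedback'].replace(feedback_dict)

# Step 3: collect the elements back into one list per ID;
# sort=False keeps first-seen order, reset_index() restores ID as a column
result = (
    exploded.groupby('ID', sort=False)['Feedback']
    .agg(list)
    .reset_index()
)
print(result)
```

Note that `groupby` drops rows whose key is NaN by default, so rows without an ID would disappear in step 3 unless `dropna=False` is passed.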
l, L = [], []  # two lists for collecting the new values
for lst in df.Feedback:  # iterate over the lists in the Feedback column
    for i in lst:  # iterate over each element of each list
        if i == 'Good':  # if the element is Good:
            l.append(feedback_dict['Good'])  # append the value 1 to the first list
        if i == 'Average':  # if the element is Average:
            l.append(feedback_dict['Average'])  # append the value 2 to the first list
        if i == 'Bad':  # if the element is Bad:
            l.append(feedback_dict['Bad'])  # append the value 3 to the first list
# split the flat list back into one list per row
# (hard-coded for this example: two rows of three elements each)
L.append(l[:3])
L.append(l[3:])
df['Feedback'] = L  # finally, assign the second list as the Feedback column
You were pretty close to the solution.
What I did was:
data.replace({'Good': '1', 'Average': '2', 'Bad': '3'}, regex=True)
and obtained the result that you were looking for.
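For context, `replace(..., regex=True)` does substring substitution on string cells, so it applies when the Feedback column holds text rather than real Python lists, and the result stays text. A minimal sketch under that assumption (the name `data` follows the answer; the sample values are made up):

```python
import pandas as pd

# Hypothetical sample data: the lists are stored as strings
data = pd.DataFrame({
    'ID': ['T223', 'T334'],
    'Feedback': ['[Good, Bad, Bad]', '[Average,Good,Good]'],
})

# With regex=True each key is treated as a pattern and every match
# inside a string cell is replaced; replace() returns a new frame
result = data.replace({'Good': '1', 'Average': '2', 'Bad': '3'}, regex=True)

print(result['Feedback'].tolist())
# ['[1, 3, 3]', '[2,1,1]']
```

The cells are still strings such as '[1, 3, 3]'; they would need to be parsed (for example with `ast.literal_eval`) to get actual lists of integers. Because the patterns match anywhere in any string cell, labels that are substrings of other values would also be rewritten.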