
pandas replace values of a list column

I have a dataframe like this:

ID    Feedback
T223  [Good, Bad, Bad]
T334  [Average, Good, Good]
feedback_dict = {'Good':1, 'Average':2, 'Bad':3}

Using this dictionary, I have to replace the values in the Feedback column:

ID    Feedback
T223  [1, 3, 3]
T334  [2, 1, 1]

I tried two ways, but neither worked. Any help will be appreciated.

method1:    
df = df.assign(Feedback=[feedback_dict.get(i,i)  for i in list(df['Feedback'])])

method2:
df['Feedback'] = df['Feedback'].apply(lambda x : [feedback_dict.get(i,i)  for i in list(x)])

For me the second solution works, but the strings must be converted to lists first:

import ast

df['Feedback'] = df['Feedback'].apply(ast.literal_eval)
#df['Feedback'] = df['Feedback'].str.strip('[]').str.split(',')
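Which of the two conversions fits depends on how the strings are stored. The following is only a sketch, assuming the column holds text such as "[Good, Bad, Bad]" with no quotes around the items (an assumption about the data; parse_list is a hypothetical helper name, not part of pandas). ast.literal_eval only parses valid Python literals, so unquoted items need the strip-and-split route, with whitespace trimmed from each item:

```python
import ast
import pandas as pd

def parse_list(text):
    """Turn a string such as "[Good, Bad, Bad]" into ['Good', 'Bad', 'Bad']."""
    try:
        # Works when the string is a valid Python literal, e.g. "['Good', 'Bad']"
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Fallback for unquoted items: drop the brackets, split on commas, trim spaces
        return [item.strip() for item in text.strip('[]').split(',')]

# Hypothetical sample data: one unquoted string, one valid Python literal
df = pd.DataFrame({'ID': ['T223', 'T334'],
                   'Feedback': ['[Good, Bad, Bad]', "['Average', 'Good', 'Good']"]})
df['Feedback'] = df['Feedback'].apply(parse_list)
print(df['Feedback'].tolist())
```

Trimming each item matters because the dictionary lookup is exact: ' Bad' with a leading space would not match the key 'Bad'.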

The first solution works with a nested list comprehension:

df = df.assign(Feedback=[[feedback_dict.get(i,i) for i in x] for x in df['Feedback']])


df['Feedback'] = df['Feedback'].apply(lambda x : [feedback_dict.get(i,i)  for i in list(x)])
print (df)
     ID    Feedback
0  T223  [1, 3, 3]
1  T334  [2, 1, 1]
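One detail worth noting is the fallback in feedback_dict.get(i, i): a label that is missing from the dictionary is kept unchanged instead of raising an error. A small self-contained sketch, where 'Excellent' is a made-up label used only for illustration:

```python
import pandas as pd

feedback_dict = {'Good': 1, 'Average': 2, 'Bad': 3}
df = pd.DataFrame({'ID': ['T223', 'T334'],
                   'Feedback': [['Good', 'Bad', 'Bad'],
                                ['Average', 'Good', 'Excellent']]})

# 'Excellent' is not a key of feedback_dict, so .get(i, i) returns it as-is
df['Feedback'] = df['Feedback'].apply(
    lambda x: [feedback_dict.get(i, i) for i in x])
print(df['Feedback'].tolist())
```

If unknown labels should become None instead, feedback_dict.get(i) without the second argument would do that.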

EDIT: If some rows contain missing values instead of lists, use an if-else expression; non-list values are replaced with empty lists:

print (df)
     ID             Feedback
0  T223       [Good,Bad,Bad]
1  T334  [Average,Good,Good]
2   NaN                  NaN


feedback_dict = {'Good':1, 'Average':2, 'Bad':3}
df = df.assign(Feedback=[[feedback_dict.get(i,i) for i in x] if isinstance(x, list) else [] 
                            for x in df['Feedback']])

print (df)
     ID   Feedback
0  T223  [1, 3, 3]
1  T334  [2, 1, 1]
2   NaN         []

If your use case is about as simple as this example, I wouldn't recommend this method. However, here's another option in case it makes other parts of your project easier.

  1. df.explode() your column (assuming it is a list and not text; otherwise convert it to a list first)
  2. Perform the replacements with df.replace()
  3. Group the rows back together again with df.groupby() and df.agg()

For the example, it would look like this (assuming the variables have been declared as in your question):

df = df.explode('Feedback')
df['Feedback'] = df['Feedback'].replace(feedback_dict)
df = df.groupby('ID').agg(list)
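The three explode/replace/groupby steps can also be run end to end as a self-contained sketch. The sample frame is rebuilt from the question; the names exploded and result are only illustrative. Because groupby('ID') moves ID into the index, reset_index() is added here to get it back as a regular column:

```python
import pandas as pd

feedback_dict = {'Good': 1, 'Average': 2, 'Bad': 3}
df = pd.DataFrame({'ID': ['T223', 'T334'],
                   'Feedback': [['Good', 'Bad', 'Bad'],
                                ['Average', 'Good', 'Good']]})

# Step 1: one row per list element; the ID is repeated for each element
exploded = df.explode('Feedback')

# Step 2: replace the labels with their numeric codes
exploded['Feedback'] = exploded['Feedback'].replace(feedback_dict)

# Step 3: collect the rows back into one list per ID
result = (exploded.groupby('ID', sort=False)['Feedback']
                  .agg(list)
                  .reset_index())
print(result)
```

Note that groupby drops rows whose ID is missing by default; passing dropna=False keeps them if that matters for your data.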

l, L = [], []  # l collects all converted values, L will hold one list per row

for lst in df['Feedback']:  # iterate over the lists in the Feedback column
    for i in lst:  # iterate over each element of the current list
        if i == 'Good':  # if the element is Good:
            l.append(feedback_dict['Good'])  # append the value 1 to the first list
        if i == 'Average':  # if the element is Average:
            l.append(feedback_dict['Average'])  # append the value 2 to the first list
        if i == 'Bad':  # if the element is Bad:
            l.append(feedback_dict['Bad'])  # append the value 3 to the first list
# each row in the example has three items, so split the flat list into two halves
L.append(l[:3])
L.append(l[3:])
df['Feedback'] = L  # finally, store the nested list as the Feedback column
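The loop above relies on every row holding exactly three items. A hedged variant of the same idea, sketched here with illustrative names (converted, row), builds one list per row directly so the slicing step is not needed:

```python
import pandas as pd

feedback_dict = {'Good': 1, 'Average': 2, 'Bad': 3}
df = pd.DataFrame({'ID': ['T223', 'T334'],
                   'Feedback': [['Good', 'Bad', 'Bad'],
                                ['Average', 'Good']]})  # rows may differ in length

converted = []  # one converted list per row
for lst in df['Feedback']:
    row = []  # codes for the current row only
    for item in lst:
        # Direct lookup: raises KeyError if a label is not in feedback_dict
        row.append(feedback_dict[item])
    converted.append(row)

df['Feedback'] = converted
print(df['Feedback'].tolist())
```

The direct lookup feedback_dict[item] fails loudly on unexpected labels, which can be preferable to silently passing them through; whether that is desirable depends on your data.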

You were pretty close to the solution.

What I did was:

data.replace({'Good': '1', 'Average': '2', 'Bad': '3'}, regex=True)

and obtained the result that you were looking for.
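For context, a regex-based replace works on text, so the sketch below assumes the Feedback column is stored as strings rather than Python lists (an assumption; the sample frame is rebuilt from the question). Each label is substituted inside the string, and the result is still a string:

```python
import pandas as pd

# Assumption: the lists are stored as plain text, not as Python list objects
data = pd.DataFrame({'ID': ['T223', 'T334'],
                     'Feedback': ['[Good, Bad, Bad]', '[Average,Good,Good]']})

# With regex=True the dictionary keys are treated as patterns and every
# match inside a cell is substituted, not only whole-cell matches
result = data.replace({'Good': '1', 'Average': '2', 'Bad': '3'}, regex=True)
print(result['Feedback'].tolist())
```

If real lists of integers are needed afterwards, the strings would still have to be parsed, for example with ast.literal_eval.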
