
pandas replace values of a list column

I have a dataframe like this:

ID Feedback
T223 [Good, Bad, Bad]
T334 [Average,Good,Good]

feedback_dict = {'Good':1, 'Average':2, 'Bad':3}

Using this dictionary, I have to replace the values in the Feedback column:

ID Feedback
T223 [1, 3, 3]
T334 [2,1,1]

I tried two ways, but neither worked. Any help will be appreciated.

method1:    
df = df.assign(Feedback=[feedback_dict.get(i,i)  for i in list(df['Feedback'])])

method2:
df['Feedback'] = df['Feedback'].apply(lambda x : [feedback_dict.get(i,i)  for i in list(x)])

For me the second solution works, but if the column holds strings they must first be converted to lists:

import ast

df['Feedback'] = df['Feedback'].apply(ast.literal_eval)
#df['Feedback'] = df['Feedback'].str.strip('[]').str.split(',')
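Which of the two conversions applies depends on how the strings look. The sketch below uses hypothetical sample strings (not taken from the question's real data) to show both cases: ast.literal_eval can only parse items that are quoted, while the strip/split route handles unquoted items but leaves whitespace around each item that has to be removed before the dictionary lookup.

```python
import ast

import pandas as pd

# Hypothetical sample data: the lists arrive as plain strings, e.g. after reading a CSV.
quoted = pd.Series(["['Good', 'Bad', 'Bad']", "['Average', 'Good', 'Good']"])
unquoted = pd.Series(["[Good, Bad, Bad]", "[Average,Good,Good]"])

# Quoted items are valid Python literals, so ast.literal_eval can parse them.
parsed_quoted = quoted.apply(ast.literal_eval)

# Unquoted items are not valid literals: strip the brackets, split on commas,
# then strip the whitespace left around each item.
parsed_unquoted = unquoted.str.strip("[]").str.split(",").apply(
    lambda items: [item.strip() for item in items]
)

print(parsed_quoted.tolist())
print(parsed_unquoted.tolist())
```

Both series end up holding real Python lists of labels, which the list comprehensions in this answer can then map through feedback_dict.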

The first solution works with a nested list comprehension:

df = df.assign(Feedback=[[feedback_dict.get(i,i) for i in x] for x in df['Feedback']])


df['Feedback'] = df['Feedback'].apply(lambda x : [feedback_dict.get(i,i)  for i in list(x)])
print (df)
     ID    Feedback
0  T223  [1, 3, 3]
1  T334  [2, 1, 1]
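For reference, here is a self-contained sketch of the apply-based version. The DataFrame is rebuilt by hand, and the label "Unknown" is a made-up value added only to show what the `feedback_dict.get(i, i)` fallback does: items without a mapping are left unchanged rather than raising a KeyError.

```python
import pandas as pd

feedback_dict = {"Good": 1, "Average": 2, "Bad": 3}

# Hypothetical data; "Unknown" is included only to demonstrate the fallback.
df = pd.DataFrame(
    {
        "ID": ["T223", "T334"],
        "Feedback": [["Good", "Bad", "Bad"], ["Average", "Good", "Unknown"]],
    }
)

# dict.get(i, i) returns the mapped value, or the item itself if it has no mapping.
df["Feedback"] = df["Feedback"].apply(
    lambda items: [feedback_dict.get(i, i) for i in items]
)

print(df["Feedback"].tolist())
```

If unmapped labels should be flagged instead of passed through, `feedback_dict[i]` could be used in place of `.get(i, i)` so that a missing key raises an error.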

EDIT: If some rows contain missing values instead of lists, use an if-else expression; non-list values are replaced with empty lists:

print (df)
     ID             Feedback
0  T223       [Good,Bad,Bad]
1  T334  [Average,Good,Good]
2   NaN                  NaN


feedback_dict = {'Good':1, 'Average':2, 'Bad':3}
df = df.assign(Feedback=[[feedback_dict.get(i,i) for i in x] if isinstance(x, list) else [] 
                            for x in df['Feedback']])

print (df)
     ID   Feedback
0  T223  [1, 3, 3]
1  T334  [2, 1, 1]
2   NaN         []
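The same logic can also be written as a small named function, which may be easier to read and test than the inline expression. This is only a sketch: the helper name encode_feedback is hypothetical, and the sample frame is rebuilt by hand with NaN in the last row.

```python
import numpy as np
import pandas as pd

feedback_dict = {"Good": 1, "Average": 2, "Bad": 3}


def encode_feedback(value):
    """Map a list of labels to codes; return [] for anything that is not a list."""
    if isinstance(value, list):
        return [feedback_dict.get(item, item) for item in value]
    return []


# Hypothetical data mirroring the example above, with a missing row.
df = pd.DataFrame(
    {
        "ID": ["T223", "T334", np.nan],
        "Feedback": [["Good", "Bad", "Bad"], ["Average", "Good", "Good"], np.nan],
    }
)

df["Feedback"] = df["Feedback"].apply(encode_feedback)
print(df["Feedback"].tolist())
```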

If your use case is about as simple as this example, I wouldn't recommend this method. However, here's another option in case it makes other parts of your project easier.

  1. df.explode() your column (assuming it is a list and not text; otherwise convert it to a list first)
  2. Perform the replacements with df.replace()
  3. Group the rows back together again with df.groupby() and df.agg()

For the example, it would look like this (assuming the variables have been declared like in your question):

df = df.explode('Feedback')
df['Feedback'] = df['Feedback'].replace(feedback_dict)
df = df.groupby('ID').agg(list)
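The three steps can be sketched end to end as follows. The sample frame is rebuilt by hand, and the `sort=False` and `reset_index()` calls are additions (not part of the snippet above) that keep the original row order and turn ID back into a regular column instead of the index.

```python
import pandas as pd

feedback_dict = {"Good": 1, "Average": 2, "Bad": 3}

# Hypothetical data matching the question; Feedback holds real lists.
df = pd.DataFrame(
    {
        "ID": ["T223", "T334"],
        "Feedback": [["Good", "Bad", "Bad"], ["Average", "Good", "Good"]],
    }
)

# 1. One row per label; the ID is repeated for each exploded item.
exploded = df.explode("Feedback")

# 2. Replace each label with its code.
exploded["Feedback"] = exploded["Feedback"].replace(feedback_dict)

# 3. Collect the codes back into one list per ID.
result = (
    exploded.groupby("ID", sort=False)["Feedback"]
    .agg(list)
    .reset_index()
)

print(result)
```

Note that groupby drops rows whose key is NaN by default, so this route assumes every row has an ID, and that IDs are unique per original row.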

# Alternative: an explicit loop that builds one mapped list per row
L = []  # collects one list of mapped values per row

for lst in df['Feedback']:  # iterate over the list in each row of the Feedback column
    l = []  # mapped values for the current row
    for i in lst:  # iterate over each element of the row's list
        if i == 'Good':
            l.append(feedback_dict['Good'])  # append 1
        elif i == 'Average':
            l.append(feedback_dict['Average'])  # append 2
        elif i == 'Bad':
            l.append(feedback_dict['Bad'])  # append 3
    L.append(l)  # one finished list per row, whatever its length
df['Feedback'] = L  # finally, assign the new lists as the Feedback column

You were pretty close to the solution.

What I did was:

data.replace({'Good': '1', 'Average': '2', 'Bad': '3'}, regex=True)

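and obtained the result you were looking for. Note that this only works if the Feedback column holds strings such as '[Good, Bad, Bad]'; the replaced values remain strings ('[1, 3, 3]'), not lists.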
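A minimal sketch of the regex approach, assuming the Feedback column holds strings rather than lists. The frame is named df here instead of data and is rebuilt by hand; the final ast.literal_eval step is an optional addition for turning the replaced strings into lists of integers.

```python
import ast

import pandas as pd

# Hypothetical data: Feedback is stored as text, not as Python lists.
df = pd.DataFrame(
    {
        "ID": ["T223", "T334"],
        "Feedback": ["[Good, Bad, Bad]", "[Average,Good,Good]"],
    }
)

# With regex=True each key is treated as a pattern and replaced inside the strings.
replaced = df.replace({"Good": "1", "Average": "2", "Bad": "3"}, regex=True)
print(replaced["Feedback"].tolist())

# Optional: the result now looks like a list literal, so it can be parsed.
as_lists = replaced["Feedback"].apply(ast.literal_eval)
print(as_lists.tolist())
```

Because the patterns match anywhere in a string, a label that occurs inside a longer word, or inside another column's text, would be replaced as well.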


 