
Assign values to columns based on conditions in a pandas dataframe

I have the following dataset:

device_id   A   B   C   Current Class   
1           70  35  40     C                
2           45  90  34     B

Each device has a score for each class (A, B, C) and currently belongs to one of those classes. A class change is recommended whenever the class with the highest score differs from the current class.

For example, device 1 is in class C but its highest score is in class A, so its recommended class is A.

Expected output:

device_id   A   B   C   Current Class   Class Change    Recommended
1           70  35  40  C                   Yes             A
2           45  90  34  B                   No              B

Can someone please help me with this?

I think you need idxmax with numpy.where:

import numpy as np

# Column label of the highest score in each row
a = df[['A','B','C']].idxmax(axis=1)
# A more general solution: select every column except the first and last
#a = df.iloc[:, 1:-1].idxmax(axis=1)
print (df.iloc[:, 1:-1])
    A   B   C
0  70  35  40
1  45  90  34

df['Class Change'] = np.where(df['Current Class'] == a, 'No', 'Yes')
df['Recommended'] = a
print (df)
   device_id   A   B   C Current Class Class Change Recommended
0          1  70  35  40             C          Yes           A
1          2  45  90  34             B           No           B

Detail:

print (a)
0    A
1    B
dtype: object
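
One detail worth knowing: when two scores tie for the row maximum, idxmax returns the label of the first matching column. A minimal sketch, assuming made-up scores (the third, tied row is hypothetical and not part of the question's data):

```python
import pandas as pd

# Hypothetical scores: the last device ties between A and C
scores = pd.DataFrame({'A': [70, 45, 50],
                       'B': [35, 90, 10],
                       'C': [40, 34, 50]})

# idxmax(axis=1) gives the label of the first column holding the row maximum
best = scores.idxmax(axis=1)
print(best.tolist())  # ['A', 'B', 'A']
```

If ties need different handling (for example, preferring the current class), that logic would have to be added separately.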

If the order of the new columns does not matter, you can create Recommended first and compare against it:

df['Recommended'] = df[['A','B','C']].idxmax(1)
df['Class Change'] = np.where(df['Current Class'] == df['Recommended'], 'No', 'Yes')
print (df)
   device_id   A   B   C Current Class Recommended Class Change
0          1  70  35  40             C           A          Yes
1          2  45  90  34             B           B           No
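
For anyone who wants to run this end to end, here is a self-contained sketch of the same idxmax plus numpy.where approach. The DataFrame is rebuilt from the question's sample, and the helper name `recommend_class` and its parameters are only illustrative assumptions, not part of pandas:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'device_id': [1, 2],
                   'A': [70, 45],
                   'B': [35, 90],
                   'C': [40, 34],
                   'Current Class': ['C', 'B']})

def recommend_class(frame, score_cols, current_col='Current Class'):
    """Hypothetical helper: return a copy with the two new columns added."""
    out = frame.copy()
    # Label of the highest-scoring column in each row
    best = out[score_cols].idxmax(axis=1)
    # 'No' when the device is already in its best class, otherwise 'Yes'
    out['Class Change'] = np.where(out[current_col] == best, 'No', 'Yes')
    out['Recommended'] = best
    return out

result = recommend_class(df, ['A', 'B', 'C'])
print(result)
```

Working on a copy leaves the original df untouched, which makes it easier to re-run with a different set of score columns.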

I would first find the column with the maximum score to get the Recommended column, and then compare it with Current Class to get the Class Change column, like this:

import pandas as pd

devices = pd.DataFrame({'A':[70, 45],
                       'B':[35, 90],
                       'C':[40, 34],
                       'Current Class':['C','B']})

devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(1)

# True when the current class differs from the recommended one
devices['Class Change'] = devices['Current Class'] != devices['Recommended']

print(devices)

output:

    A   B   C Current Class Recommended  Class Change
0  70  35  40             C           A          True
1  45  90  34             B           B         False
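
The question asks for Yes/No labels rather than booleans. A minimal sketch of one way to get them, assuming the same sample frame; it compares with != so that True means a change is recommended, then maps the booleans to strings (the variable name `needs_change` is just illustrative):

```python
import pandas as pd

devices = pd.DataFrame({'A': [70, 45],
                        'B': [35, 90],
                        'C': [40, 34],
                        'Current Class': ['C', 'B']})

devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(axis=1)

# A change is needed when the current class differs from the recommended one
needs_change = devices['Current Class'] != devices['Recommended']
devices['Class Change'] = needs_change.map({True: 'Yes', False: 'No'})
print(devices)
```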

A NumPy solution :-)

df['Recommended']=np.array(list('ABC'))[np.argmax(df[list('ABC')].values,1)]
df
Out[172]: 
   device_id   A   B   C CurrentClass Recommended
0          1  70  35  40            C           A
1          2  45  90  34            B           B
(df.CurrentClass!=df.Recommended).map({False:'no',True:'yes'})
Out[173]: 
0    yes
1     no
dtype: object
df['Class Change']=(df.CurrentClass!=df.Recommended).map({False:'no',True:'yes'})
df
Out[175]: 
   device_id   A   B   C CurrentClass Recommended Class Change
0          1  70  35  40            C           A          yes
1          2  45  90  34            B           B           no
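
The same idea as a standalone script, in case the session above is hard to follow. This is only a sketch: it assumes the column is named CurrentClass (without a space, as in this answer) and that the score columns are exactly A, B and C.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'device_id': [1, 2],
                   'A': [70, 45],
                   'B': [35, 90],
                   'C': [40, 34],
                   'CurrentClass': ['C', 'B']})

score_cols = ['A', 'B', 'C']  # assumed score columns
labels = np.array(score_cols)

# Position of the row-wise maximum (first position wins on ties)
positions = np.argmax(df[score_cols].to_numpy(), axis=1)

# Index the label array with those positions to get the class names
df['Recommended'] = labels[positions]
df['Class Change'] = np.where(df['CurrentClass'] != df['Recommended'], 'yes', 'no')
print(df)
```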

