I keep running into dead ends here, and it's killing me.
Dataframe:
accountid col2 col3
1 ['abc','def','xyz'] ['abc','mda','xyz','sdi']
2 ['abc','asd','xyz','dib'] ['nio','ouy','abc']
3 ['abc','def','xyz'] ['abc','mda','xyz']
Notes
*each field in col2 and col3 is a list
*the lists in col2 and col3 may not contain the same number of items
I'm trying to create a col4 that shows the items in col3 that are not in col2. The result should look like this:
accountid col2 col3 col4
1 ['abc','def','xyz'] ['abc','mda','xyz','sdi'] ['mda','sdi']
2 ['abc','asd','xyz','dib'] ['nio','ouy','abc'] ['nio','ouy']
3 ['abc','def','xyz'] ['abc','mda','xyz'] ['mda']
Let me know if this doesn't make sense. I appreciate any help at all on this.
Convert both columns to sets and subtract them row by row:
s=df.col3.apply(set)-df.col2.apply(set)
0 {sdi, mda}
1 {nio, ouy}
2 {mda}
dtype: object
df['New']=s.map(list)
Check the result
s.map(list)
0 [sdi, mda]
1 [nio, ouy]
2 [mda]
dtype: object
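For anyone who wants to reproduce this end to end, here is a minimal sketch of the set-difference approach. It assumes col2 and col3 already hold real Python lists, and it rebuilds the sample frame from the question by hand. Sets are unordered, so the sketch sorts each result to get a stable list; that sorting step is an addition, not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question (assumes real lists, not strings).
df = pd.DataFrame({
    "accountid": [1, 2, 3],
    "col2": [["abc", "def", "xyz"],
             ["abc", "asd", "xyz", "dib"],
             ["abc", "def", "xyz"]],
    "col3": [["abc", "mda", "xyz", "sdi"],
             ["nio", "ouy", "abc"],
             ["abc", "mda", "xyz"]],
})

# Element-wise set difference: items in col3 that are not in col2.
diff = df["col3"].apply(set) - df["col2"].apply(set)

# Sets have no guaranteed order, so sort each one into a list.
df["col4"] = diff.map(sorted)

print(df[["accountid", "col4"]])
```

If the original order of col3 matters, sorting will not give it back; a list comprehension that walks col3 in order is the safer choice in that case.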
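Your lists are not actually lists; they are strings. Parse them into lists first:
import ast
# Note: on pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.
df.iloc[:,1:]=df.iloc[:,1:].applymap(ast.literal_eval)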
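As a rough illustration of that conversion, the sketch below assumes the columns were loaded as strings such as "['abc','def','xyz']" (for example from a CSV file). It parses one column at a time with Series.apply rather than applymap, which is a substitution on my part to avoid the deprecation warning in newer pandas versions. The two-row frame is made up for the example.

```python
import ast
import pandas as pd

# Hypothetical frame where the "lists" are really strings.
df = pd.DataFrame({
    "accountid": [1, 2],
    "col2": ["['abc','def','xyz']", "['abc','asd','xyz','dib']"],
    "col3": ["['abc','mda','xyz','sdi']", "['nio','ouy','abc']"],
})

print(type(df.loc[0, "col2"]))  # str before parsing

# ast.literal_eval safely evaluates a Python literal without running code.
for col in ["col2", "col3"]:
    df[col] = df[col].apply(ast.literal_eval)

print(type(df.loc[0, "col2"]))  # list after parsing
```

Note that ast.literal_eval raises an error on malformed input, such as a string with an unclosed quote, so dirty data may need cleaning first.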
Try this: apply a lambda function to each row by passing axis=1.
df['col4'] = df.apply(lambda x : list(set(x['col3']).difference(set(x['col2']))), axis=1)
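One caveat with the set-based versions is that the order of items in col4 is not guaranteed. If the order from col3 should be kept, as in the expected output in the question, something like the following sketch may work. The helper name items_missing_from is hypothetical, and the sample frame is rebuilt from the question under the assumption that the columns hold real lists.

```python
import pandas as pd

df = pd.DataFrame({
    "accountid": [1, 2, 3],
    "col2": [["abc", "def", "xyz"],
             ["abc", "asd", "xyz", "dib"],
             ["abc", "def", "xyz"]],
    "col3": [["abc", "mda", "xyz", "sdi"],
             ["nio", "ouy", "abc"],
             ["abc", "mda", "xyz"]],
})

def items_missing_from(reference, candidates):
    """Return the candidates not present in reference, keeping their order."""
    seen = set(reference)  # set lookup keeps each membership test cheap
    return [item for item in candidates if item not in seen]

# Pair the two columns row by row; the Series reuses the frame's index.
df["col4"] = pd.Series(
    [items_missing_from(c2, c3) for c2, c3 in zip(df["col2"], df["col3"])],
    index=df.index,
)

print(df[["accountid", "col4"]])
```

Unlike the set difference, this keeps duplicates from col3 if there are any; wrap the result in dict.fromkeys(...) and convert back to a list if duplicates should be dropped.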