
Pandas: Compare the columns of a data frame and add a new column and value based on a condition

I have a data frame:

ip_df:
     name class    sec    details
0    tom  I        a      [{'class':'I','sec':'a','subjects':['numbers','ethics']},{'class':'I','sec':'b','subjects':['numbers','moral-science']},{'class':'I','sec':'c','subjects':['moral-science','ethics']},{'class':'I','subjects':['numbers','ethics1']}]
1    sam  I        d      [{'class':'I','sec':'a','subjects':['numbers','ethics']},{'class':'I','sec':'b','subjects':['numbers','moral-science']},{'class':'I','sec':'c','subjects':['moral-science','ethics']},{'class':'I','subjects':['numbers','ethics1']}] 
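
For anyone who wants to reproduce the input, here is a minimal sketch of how "ip_df" could be built. It assumes the "details" column holds real Python lists of dicts (not strings) and that both rows share the same list, as the printout suggests.

```python
import pandas as pd

# The same "details" list is attached to both rows, as in the question.
details = [
    {'class': 'I', 'sec': 'a', 'subjects': ['numbers', 'ethics']},
    {'class': 'I', 'sec': 'b', 'subjects': ['numbers', 'moral-science']},
    {'class': 'I', 'sec': 'c', 'subjects': ['moral-science', 'ethics']},
    {'class': 'I', 'subjects': ['numbers', 'ethics1']},  # entry without a 'sec' key
]

ip_df = pd.DataFrame({
    'name': ['tom', 'sam'],
    'class': ['I', 'I'],
    'sec': ['a', 'd'],
    'details': [details, details],  # one list of dicts per row
})

print(ip_df[['name', 'class', 'sec']])
```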

and the resulting data frame is supposed to be:

op_df:
      name  class  sec   subjects
0     tom   I      a     ['numbers','ethics']
1     sam   I      d     ['numbers','ethics1']

The "op_df" has to be built based on the following conditions:

  • Condition 1: Check whether a matching "class" and "sec" pair exists in the "details" column; if so, add a new column named "subjects" with the corresponding value.
  • Condition 2: If no matching "class" and "sec" pair exists in the "details" column, check whether the "class" matches; if so, add the new "subjects" column with the corresponding value.
  • If neither condition 1 nor condition 2 is met, add the default value [0,0] in the "subjects" column.
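
The three rules can be sketched for a single row as a plain function. The name `pick_subjects` and its parameters are hypothetical. The sketch assumes condition 2 refers to an entry that has a matching "class" and no "sec" key at all, which is what the expected "op_df" implies for "sam", and it checks condition 1 over the whole list before falling back to condition 2.

```python
def pick_subjects(cls, sec, details, default=(0, 0)):
    """Return the subjects for one row, trying the rules in priority order."""
    # Condition 1: an entry whose 'class' and 'sec' both match.
    for entry in details:
        if entry.get('class') == cls and entry.get('sec') == sec:
            return entry['subjects']
    # Condition 2 (assumed reading): same 'class', entry has no 'sec' key.
    for entry in details:
        if entry.get('class') == cls and 'sec' not in entry:
            return entry['subjects']
    # Neither condition matched: fall back to the default value.
    return list(default)


details = [
    {'class': 'I', 'sec': 'a', 'subjects': ['numbers', 'ethics']},
    {'class': 'I', 'sec': 'b', 'subjects': ['numbers', 'moral-science']},
    {'class': 'I', 'sec': 'c', 'subjects': ['moral-science', 'ethics']},
    {'class': 'I', 'subjects': ['numbers', 'ethics1']},
]

print(pick_subjects('I', 'a', details))   # row "tom"
print(pick_subjects('I', 'd', details))   # row "sam"
print(pick_subjects('II', 'a', details))  # no matching class
```

Running two separate passes means an exact "class" and "sec" match always wins, even if a class-only entry appears earlier in the list.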

Solution, if you need the first value matched by either condition, using the next and iter trick to add the default value [0, 0] when nothing matches:

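# df is the input data frame (ip_df in the question)
final = []
for a, b, c in zip(df['class'], df['sec'], df['details']):
    out = []
    for x in c:
        m1 = x['class'] == a  # class matches
        if m1 and x.get('sec') == b:
            # condition 1: class and sec both match
            out.append(x['subjects'])
        elif m1 and 'sec' not in x:
            # condition 2: class matches and the entry has no sec
            out.append(x['subjects'])
    # first match in list order, or the default [0, 0] if nothing matched
    final.append(next(iter(out), [0, 0]))

df['subjects'] = final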
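
The next with a default idiom can also be applied directly to a generator expression, which avoids building the intermediate list. The following is a sketch only: `first_match` is a hypothetical helper, the data is rebuilt from the question, and every entry is assumed to have "class" and "subjects" keys. Like the loop above, it returns the first match in list order.

```python
import pandas as pd

details = [
    {'class': 'I', 'sec': 'a', 'subjects': ['numbers', 'ethics']},
    {'class': 'I', 'sec': 'b', 'subjects': ['numbers', 'moral-science']},
    {'class': 'I', 'sec': 'c', 'subjects': ['moral-science', 'ethics']},
    {'class': 'I', 'subjects': ['numbers', 'ethics1']},
]
df = pd.DataFrame({
    'name': ['tom', 'sam'],
    'class': ['I', 'I'],
    'sec': ['a', 'd'],
    'details': [details, details],
})


def first_match(cls, sec, entries):
    # Lazily yield subjects of entries that satisfy either condition.
    matches = (
        x['subjects'] for x in entries
        if x['class'] == cls and ('sec' not in x or x['sec'] == sec)
    )
    # next() returns the first match, or the default when there is none.
    return next(matches, [0, 0])


df['subjects'] = [
    first_match(a, b, c)
    for a, b, c in zip(df['class'], df['sec'], df['details'])
]

# Drop the helper column to get the shape of op_df from the question.
op_df = df.drop(columns='details')
print(op_df)
```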
