繁体   English   中英

根据另一列中的值组合数据框的列

[英]Combining columns of dataframe based on value in another column

输入 df(示例)

Country     SubregionA      SubregionB
BRA         State of Acre   Brasiléia
BRA         State of Acre   Cruzeiro do Sul
USA         AL              Bibb County
USA         AL              Blount County
USA         AL              Bullock County

输出 df

Country     SubregionA      SubregionB
BRA         State of Acre   State of Acre - Brasiléia
BRA         State of Acre   State of Acre - Cruzeiro do Sul
USA         AL              AL Bibb County
USA         AL              AL Blount County
USA         AL              AL Bullock County

代码片段是不言自明的,但执行时似乎永远运行。 可能出什么问题了(数据框“ data ”也很大,大约有 250K+ 行)

for row in data.itertuples():
     region = data['Country']

     if region == 'ARG' :
          data['SubregionB'] = data[['SubregionA' 'SubregionB']].apply(lambda row: '-'.join(row.values.astype(str)), axis=1)
     elif region == 'BRA' :
          data['SubregionB'] = data[['SubregionA', 'SubregionB']].apply(lambda row: '-'.join(row.values.astype(str)), axis=1)
     elif region == 'USA':
          data['SubregionB'] = data[['SubregionA', 'SubregionB']].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
     else:
          pass

说明:尝试根据列名称“Country”中的值连接列 SubregionA 和 SubregionB。 分隔符不同,因此编写了多个 if-else 语句。 执行时间太长,我怎样才能让它更快?

您可以使用numpy.selectSeries.isin和联接列与+

print (df)
  Country     SubregionA       SubregionB
0     BRA  State of Acre         Brasilia
1     BRA  State of Acre  Cruzeiro do Sul
2     USA             AL      Bibb County
3     USA             AL    Blount County
4     USA             AL   Bullock County
5     JAP            AAA             BBBB

reg1 = ['ARG','BRA']
reg2 = ['USA']

a = np.select([df['Country'].isin(reg1), df['Country'].isin(reg2)], 
              [df['SubregionA'] + ' - ' + df['SubregionB'],
               df['SubregionA'] + ' ' + df['SubregionB']],
              default=df['SubregionB'])

df['SubregionB'] = a
print (df)
  Country     SubregionA                       SubregionB
0     BRA  State of Acre         State of Acre - Brasilia
1     BRA  State of Acre  State of Acre - Cruzeiro do Sul
2     USA             AL                   AL Bibb County
3     USA             AL                 AL Blount County
4     USA             AL                AL Bullock County
5     JAP            AAA                             BBBB

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM