
Combining columns of dataframe based on value in another column

Input df (example)

Country     SubregionA      SubregionB
BRA         State of Acre   Brasiléia
BRA         State of Acre   Cruzeiro do Sul
USA         AL              Bibb County
USA         AL              Blount County
USA         AL              Bullock County

Output df

Country     SubregionA      SubregionB
BRA         State of Acre   State of Acre - Brasiléia
BRA         State of Acre   State of Acre - Cruzeiro do Sul
USA         AL              AL Bibb County
USA         AL              AL Blount County
USA         AL              AL Bullock County

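The code snippet is self-explanatory, but it seems to run forever when executed. What could be going wrong? (The dataframe "data" is also large, about 250K+ rows.)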

for row in data.itertuples():
     region = data['Country']

     if region == 'ARG' :
          data['SubregionB'] = data[['SubregionA' 'SubregionB']].apply(lambda row: '-'.join(row.values.astype(str)), axis=1)
     elif region == 'BRA' :
          data['SubregionB'] = data[['SubregionA', 'SubregionB']].apply(lambda row: '-'.join(row.values.astype(str)), axis=1)
     elif region == 'USA':
          data['SubregionB'] = data[['SubregionA', 'SubregionB']].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
     else:
          pass

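Explanation: I am trying to concatenate the columns SubregionA and SubregionB based on the value in the column named "Country". The separator differs by country, so I wrote multiple if-else statements. Execution takes too long; how can I make it faster?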

You can use numpy.select with Series.isin, and join the columns with +:

print (df)
  Country     SubregionA       SubregionB
0     BRA  State of Acre         Brasilia
1     BRA  State of Acre  Cruzeiro do Sul
2     USA             AL      Bibb County
3     USA             AL    Blount County
4     USA             AL   Bullock County
5     JAP            AAA             BBBB

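import numpy as np

# Countries joined with ' - ' (reg1) and with a single space (reg2)
reg1 = ['ARG','BRA']
reg2 = ['USA']

# Pick the joined string per row by country; any other country keeps SubregionB unchanged
a = np.select([df['Country'].isin(reg1), df['Country'].isin(reg2)],
              [df['SubregionA'] + ' - ' + df['SubregionB'],
               df['SubregionA'] + ' ' + df['SubregionB']],
              default=df['SubregionB'])

df['SubregionB'] = a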
print (df)
  Country     SubregionA                       SubregionB
0     BRA  State of Acre         State of Acre - Brasilia
1     BRA  State of Acre  State of Acre - Cruzeiro do Sul
2     USA             AL                   AL Bibb County
3     USA             AL                 AL Blount County
4     USA             AL                AL Bullock County
5     JAP            AAA                             BBBB
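
For anyone who wants to try this without the rest of the page, here is a minimal self-contained sketch of the same numpy.select approach. The function name combine_subregions and its two parameters are made up for illustration and are not part of pandas or NumPy. The explicit conversion to object arrays is an extra precaution, assumed here so that np.select sees one common dtype whatever string dtype pandas uses. Each condition and each joined column is computed once for the whole frame, rather than once per row as in the loop from the question.

```python
import numpy as np
import pandas as pd


def combine_subregions(df, dash_countries=("ARG", "BRA"), space_countries=("USA",)):
    """Return a copy of df whose SubregionB is prefixed with SubregionA.

    Hypothetical helper: countries in dash_countries use ' - ' as the
    separator, countries in space_countries use a single space, and all
    other rows keep SubregionB as it was.
    """
    out = df.copy()

    # One boolean mask per separator rule, evaluated for all rows at once.
    conditions = [
        out["Country"].isin(dash_countries).to_numpy(),
        out["Country"].isin(space_countries).to_numpy(),
    ]

    # Vectorised string concatenation; converted to object arrays so that
    # np.select works on plain Python strings (an assumption for safety).
    choices = [
        (out["SubregionA"] + " - " + out["SubregionB"]).to_numpy(dtype=object),
        (out["SubregionA"] + " " + out["SubregionB"]).to_numpy(dtype=object),
    ]

    # Rows matching no condition fall back to the original SubregionB.
    default = out["SubregionB"].to_numpy(dtype=object)
    out["SubregionB"] = np.select(conditions, choices, default=default)
    return out


df = pd.DataFrame({
    "Country": ["BRA", "BRA", "USA", "JAP"],
    "SubregionA": ["State of Acre", "State of Acre", "AL", "AAA"],
    "SubregionB": ["Brasiléia", "Cruzeiro do Sul", "Bibb County", "BBBB"],
})

result = combine_subregions(df)
print(result)
```

Returning a copy leaves the input frame untouched, which makes it easy to compare before and after on a small sample before running on the full 250K+ rows.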

