
Pandas: split a column into several columns through string replacement or regex

I have a column in my DataFrame which, in the best case, looks like this:

Client: Stack Overflow   Order Num: 123456  Account From: 3656645654   Account to: 546546578

I would like to split this column into several columns, such as:

'Client','Order Num', 'Account From','Account to'

But in some cases the column does not contain the client, the order number, or the accounts.

This is how I am doing it:

# Walk the rows by index label (len(df.columns) is an int and cannot be iterated)
for x in df.index:
   if 'Client: ' in str(df.loc[x,'Column']):
      # Keep everything after the 'Client: ' label ...
      df.loc[x,'Client'] = str(df.loc[x,'Column']).split('Client: ')[1]
      # ... then cut the value off at whichever label follows it
      if 'Order Num' in df.loc[x,'Client']:
         df.loc[x,'Client'] = str(df.loc[x,'Client']).split('Order Num: ')[0]
      if 'Account From' in df.loc[x,'Client']:
         df.loc[x,'Client'] = str(df.loc[x,'Client']).split('Account From: ')[0]
      if 'Account to' in df.loc[x,'Client']:
         df.loc[x,'Client'] = str(df.loc[x,'Client']).split('Account to: ')[0]
   else:
      # No client in this row
      df.loc[x,'Client'] = ''

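And so on for every column I want to create.

This part of the script is almost 40 lines long and is very slow.

Do you have a more idiomatic pandas solution?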

Try the string accessor .str and use extract with regex named groups:

df['col1'].str.extract('Client: (?P<Client>.*) Order Num: (?P<OrderNum>.*) Account From: (?P<AccountFrom>.*) Account to: (?P<AccountTo>.*)')

Output:

             Client OrderNum   AccountFrom  AccountTo
0  Stack Overflow    123456   3656645654    546546578
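
The single pattern above only matches rows where all four labels are present, while the question mentions that some rows lack the client, the order number, or the accounts. One possible way to cover that case is to extract each label on its own, stopping each value at the next known label or at the end of the string. This is only a sketch under assumptions: the sample rows are made up for illustration, the source column is assumed to be named 'Column' as in the question's loop, and the labels list is taken from the column names the question asks for.

```python
import re

import pandas as pd

# Made-up sample rows: one complete, one with fields missing, one with no labels.
df = pd.DataFrame({
    "Column": [
        "Client: Stack Overflow   Order Num: 123456  Account From: 3656645654   Account to: 546546578",
        "Order Num: 42  Account to: 999",
        "no labels here",
    ]
})

# Labels from the question; each becomes its own output column.
labels = ["Client", "Order Num", "Account From", "Account to"]

# A value ends where the next known label starts, or at the end of the string.
stop = "|".join(re.escape(label) + ":" for label in labels)

for label in labels:
    # Lazy group plus lookahead, so surrounding whitespace is not captured.
    pattern = rf"{re.escape(label)}:\s*(.*?)\s*(?={stop}|$)"
    # Rows without this label yield NaN, replaced here by an empty string.
    df[label] = df["Column"].str.extract(pattern, expand=False).fillna("")

print(df[labels])
```

Because each label is matched independently, a missing field leaves an empty string in its column instead of making the whole row fail to match, and the order of the labels within the text no longer matters.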

