Pandas - How to split a string column into several columns, by the index of specific characters?
Pandas split column in several columns through string replacement or regex
I have a column in my dataframe that, in the best case, looks like this:
Client: Stack Overflow Order Num: 123456 Account From: 3656645654 Account to: 546546578
I would like to split this column into several columns, such as:
'Client','Order Num', 'Account From','Account to'
But in some cases the column does not contain the client, the order number, or the accounts.
This is how I am doing it:
# Loop over the rows (assumes the default 0..n-1 integer index)
for x in range(len(df)):
    if 'Client' in df.loc[x, 'Column']:
        # Keep everything after the 'Client: ' label ...
        df.loc[x, 'Client'] = str(df.loc[x, 'Column']).split('Client: ')[1]
        # ... then cut it off at whichever label comes next
        if 'Order Num' in df.loc[x, 'Client']:
            df.loc[x, 'Client'] = str(df.loc[x, 'Client']).split('Order Num: ')[0]
        if 'Account From' in df.loc[x, 'Client']:
            df.loc[x, 'Client'] = str(df.loc[x, 'Client']).split('Account From: ')[0]
        if 'Account to' in df.loc[x, 'Client']:
            df.loc[x, 'Client'] = str(df.loc[x, 'Client']).split('Account to: ')[0]
    else:
        df.loc[x, 'Client'] = ''
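And so on for every column I want to create.
This part of the script is almost 40 lines long, and it is slow.
Do you have a more idiomatic pandas solution?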
Try the string accessor .str together with the regex-based extract method and named groups:
df['col1'].str.extract('Client: (?P<Client>.*) Order Num: (?P<OrderNum>.*) Account From: (?P<AccountFrom>.*) Account to: (?P<AccountTo>.*)')
Output:
           Client OrderNum AccountFrom  AccountTo
0  Stack Overflow   123456  3656645654  546546578
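One caveat: the pattern above only matches when all four labels are present and in that order, so a row that is missing any of them comes back as all NaN. Since the question says some fields can be absent, here is a minimal sketch of one way to handle that by extracting each field on its own. It assumes every field is written as `Label: value` and that a value never contains another label; the sample rows and the `Column` name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one complete row, one partial row, one with no labels
df = pd.DataFrame({"Column": [
    "Client: Stack Overflow Order Num: 123456 Account From: 3656645654 Account to: 546546578",
    "Order Num: 987 Account to: 111",
    "no labels here",
]})

labels = ["Client", "Order Num", "Account From", "Account to"]

# Any label can mark the end of the previous value.
# These labels contain no regex special characters; use re.escape otherwise.
stop = "|".join(labels)

for label in labels:
    # Lazily capture the value up to the next " <label>: " or the end of the string
    pattern = rf"{label}: (.*?)(?= (?:{stop}): |$)"
    # One capture group + expand=False returns a Series; unmatched rows become NaN
    df[label] = df["Column"].str.extract(pattern, expand=False)

print(df[labels])
```

If you would rather have empty strings than NaN for the missing fields, as in the original loop, something like `df[labels] = df[labels].fillna("")` afterwards should do it.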