
How to split one column into multiple columns in Pandas using regular expression?

For example, if I have a home address like this:

71 Pilgrim Avenue, Chevy Chase, MD

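in a column named 'address', I would like to split it into separate 'street', 'city', and 'state' columns.

What is the best way to achieve this using Pandas?

I have tried df[['street', 'city', 'state']] = df['address'].findall(r"myregex")

but the error I get is Must have equal len keys and value when setting with an iterable

Thank you for your help :)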

You can use split with the regular expression ,\s+ (a comma followed by one or more whitespace characters):

#borrowing sample from `Allen`
df[['street', 'city', 'state']] = df['address'].str.split(r',\s+', expand=True)
print(df)
                              address id             street          city  \
0  71 Pilgrim Avenue, Chevy Chase, MD  a  71 Pilgrim Avenue   Chevy Chase   
1         72 Main St, Chevy Chase, MD  b         72 Main St   Chevy Chase   

  state  
0    MD  
1    MD  
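
Since the question asks specifically for a regular expression, here is a minimal sketch of another option, str.extract with named groups. The pattern below is only an assumption about the address layout ("street, city, two-letter state") and is not taken from the original post. Note also that .str.findall returns a single Series of lists rather than three columns, which is probably why the assignment in the question fails.

```python
import pandas as pd

df = pd.DataFrame({
    'address': ['71 Pilgrim Avenue, Chevy Chase, MD',
                '72 Main St, Chevy Chase, MD'],
    'id': ['a', 'b'],
})

# Named groups become the column names of the frame returned by str.extract.
# Assumed layout: "<street>, <city>, <two-letter state>".
pattern = r'^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})$'
parts = df['address'].str.extract(pattern)

# Rows that do not match the pattern come back as NaN in all three columns.
df = df.join(parts)
print(df)
```

Compared with split, extract does not raise when a row has a different number of fields; unmatched rows simply show up as missing values that can be inspected afterwards.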

And if you also need to remove the address column, add drop:

df[['street', 'city', 'state']] = df['address'].str.split(r',\s+', expand=True)
df = df.drop('address', axis=1)
print(df)
  id             street         city state
0  a  71 Pilgrim Avenue  Chevy Chase    MD
1  b         72 Main St  Chevy Chase    MD
import pandas as pd

df = pd.DataFrame({'address': {0: '71 Pilgrim Avenue, Chevy Chase, MD',
                               1: '72 Main St, Chevy Chase, MD'},
                   'id': {0: 'a', 1: 'b'}})
# If your address format is consistent, you can simply use a split function.
df2 = df.join(pd.DataFrame(df.address.str.split(',').tolist(), columns=['street', 'city', 'state']))
# Strip the space left after each comma (every column here holds strings).
df2 = df2.apply(lambda col: col.str.strip())
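
The plain split relies on every address containing exactly two commas. As a rough sketch of one way to loosen that assumption, str.rsplit with n=2 splits from the right, so the last two fields are always treated as city and state and any extra commas stay in the street. The second row below is a hypothetical example, not data from the original post.

```python
import pandas as pd

addresses = pd.Series([
    '71 Pilgrim Avenue, Chevy Chase, MD',
    'Apt 3, 72 Main St, Chevy Chase, MD',  # hypothetical row with an extra comma
])

# Split from the right at most twice: city and state are the last two fields.
parts = addresses.str.rsplit(',', n=2, expand=True)
parts.columns = ['street', 'city', 'state']

# Remove the space left after each comma.
parts = parts.apply(lambda col: col.str.strip())
print(parts)
```

This still assumes that city and state are always present and always come last; addresses with fewer than two commas would need separate handling.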


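Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you republish them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.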

 