
Split one column into multiple columns by multiple delimiters in Pandas

Given a dataframe as follows:

                                     player     score
0   Sergio Agüero Forward — Manchester City    209.98
1            Eden Hazard Midfield — Chelsea    274.04
2          Alexis Sánchez Forward — Arsenal    223.86
3     Yaya Touré Midfield — Manchester City    197.91
4  Angel María Midfield — Manchester United    132.23

How can I split player into three new columns: name, position and team?

                                     player     score   name    position  team
0   Sergio Agüero Forward — Manchester City    209.98   Sergio  Forward   Manchester City
1            Eden Hazard Midfield — Chelsea    274.04   Eden    Midfield  Chelsea
2          Alexis Sánchez Forward — Arsenal    223.86   Alexis  Forward   Arsenal
3     Yaya Touré Midfield — Manchester City    197.91   Yaya    Midfield  Manchester City
4  Angel María Midfield — Manchester United    132.23   Angel   Midfield  Manchester United

I have considered splitting it into two columns with df[['name_position', 'team']] = df['player'].str.split(pat= ' — ', expand=True), and then splitting name_position into name and position. But is there a better solution?
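For reference, the two-step approach described above might look like the sketch below. The DataFrame is rebuilt from the sample data in the question. Splitting name_position from the right with str.rsplit is an assumption about how the second step would be done; it is not stated in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "player": [
        "Sergio Agüero Forward — Manchester City",
        "Eden Hazard Midfield — Chelsea",
        "Alexis Sánchez Forward — Arsenal",
        "Yaya Touré Midfield — Manchester City",
        "Angel María Midfield — Manchester United",
    ],
    "score": [209.98, 274.04, 223.86, 197.91, 132.23],
})

# Step 1: split on the " — " delimiter into "name + position" and team.
df[["name_position", "team"]] = df["player"].str.split(pat=" — ", expand=True)

# Step 2 (assumed): split once from the right on whitespace, so the last
# word becomes the position and everything before it becomes the name.
df[["name", "position"]] = df["name_position"].str.rsplit(n=1, expand=True)

# The intermediate column is no longer needed.
df = df.drop(columns="name_position")
print(df[["name", "position", "team"]])
```

Note that this keeps the full name (for example "Sergio Agüero"), whereas the desired output above shows only the first name; df["name"].str.split().str[0] would reduce it to the first word.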

Many thanks.

If you want to do it in one go, you can also use str.extract:

print(df["player"].str.extract(r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"))

     name  position               team
0  Sergio   Forward    Manchester City
1    Eden  Midfield            Chelsea
2  Alexis   Forward            Arsenal
3    Yaya  Midfield    Manchester City
4   Angel  Midfield  Manchester United
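To attach the extracted columns to the original dataframe, one option is to join the result back on the index. This is a minimal sketch that rebuilds only the player column from the question and reuses the same pattern. The variable names pattern and parts are illustrative.

```python
import pandas as pd

# Only the "player" column from the question is needed for this sketch.
df = pd.DataFrame({
    "player": [
        "Sergio Agüero Forward — Manchester City",
        "Eden Hazard Midfield — Chelsea",
        "Alexis Sánchez Forward — Arsenal",
        "Yaya Touré Midfield — Manchester City",
        "Angel María Midfield — Manchester United",
    ],
})

# Named groups become the column names of the extracted frame.
pattern = r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"
parts = df["player"].str.extract(pattern)

# join aligns on the index, so each row keeps its own name/position/team.
df = df.join(parts)
print(df)
```

Rows that do not match the pattern get NaN in all three new columns, so it may be worth checking for missing values afterwards.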

You can split a Python string on whitespace with str.split(). This breaks the text into words, and you can then pick out the ones you need by index, like this:

string = "Sergio Agüero Forward — Manchester City"
words = string.split()       # split on whitespace
name = words[0]              # first word; assumes a two-word player name
position = words[2]
team = " ".join(words[4:])   # everything after the "—", so multi-word teams work

For more complex patterns, you can use regular expressions, a powerful tool for matching string patterns.
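As a rough illustration of the regex route on a single string, here is a sketch using the standard re module. The pattern and the helper name parse_player are hypothetical. It assumes the position is the single word right before " — " and treats everything before that word as the name.

```python
import re

# Hypothetical pattern: name (one or more words), then a single-word
# position, then the " — " delimiter, then the team.
PLAYER_RE = re.compile(r"(?P<name>.+)\s(?P<position>\S+)\s—\s(?P<team>.+)")


def parse_player(text):
    """Return a dict with name, position and team, or None if no match."""
    match = PLAYER_RE.fullmatch(text)
    return match.groupdict() if match else None


print(parse_player("Sergio Agüero Forward — Manchester City"))
```

Unlike indexing into split(), this does not depend on the name having exactly two words.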

Hope this helps :)


 