Split one column into multiple columns by multiple delimiters in Pandas
Given a dataframe as follows:
   player                                      score
0  Sergio Agüero Forward — Manchester City     209.98
1  Eden Hazard Midfield — Chelsea              274.04
2  Alexis Sánchez Forward — Arsenal            223.86
3  Yaya Touré Midfield — Manchester City       197.91
4  Angel María Midfield — Manchester United    132.23
How can I split player into three new columns: name, position and team?
   player                                      score   name    position  team
0  Sergio Agüero Forward — Manchester City     209.98  Sergio  Forward   Manchester City
1  Eden Hazard Midfield — Chelsea              274.04  Eden    Midfield  Chelsea
2  Alexis Sánchez Forward — Arsenal            223.86  Alexis  Forward   Arsenal
3  Yaya Touré Midfield — Manchester City       197.91  Yaya    Midfield  Manchester City
4  Angel María Midfield — Manchester United    132.23  Angel   Midfield  Manchester United
I have considered splitting it into two columns with df[['name_position', 'team']] = df['player'].str.split(pat=' — ', expand=True), then splitting name_position into name and position. Is there a better solution?
Many thanks.
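For reference, the two-step approach described in the question could look roughly like the sketch below. The DataFrame is rebuilt from the sample data shown above. Taking the first word as the name and the last word before the separator as the position is an assumption based on the expected output, not something the question states as a rule.

```python
import pandas as pd

df = pd.DataFrame({
    "player": [
        "Sergio Agüero Forward — Manchester City",
        "Eden Hazard Midfield — Chelsea",
        "Alexis Sánchez Forward — Arsenal",
        "Yaya Touré Midfield — Manchester City",
        "Angel María Midfield — Manchester United",
    ],
    "score": [209.98, 274.04, 223.86, 197.91, 132.23],
})

# Step 1: split on the " — " separator into a name/position part and a team part.
df[["name_position", "team"]] = df["player"].str.split(" — ", expand=True)

# Step 2 (assumption): the first word is the name, the last word is the position.
parts = df["name_position"].str.split()
df["name"] = parts.str[0]
df["position"] = parts.str[-1]

# Drop the helper column and order the new columns as in the expected output.
df = df.drop(columns="name_position")
df = df[["player", "score", "name", "position", "team"]]
print(df)
```

This works, but it needs a temporary column and two passes over the strings, which is what motivates looking for a one-step alternative.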
You can also use str.extract if you want to do it in one go:
print(df["player"].str.extract(r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"))
   name    position  team
0  Sergio  Forward   Manchester City
1  Eden    Midfield  Chelsea
2  Alexis  Forward   Arsenal
3  Yaya    Midfield  Manchester City
4  Angel   Midfield  Manchester United
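The call above only prints the extracted columns. To get the layout asked for in the question, one option is to join the result back onto the original frame. The sketch below reuses the same pattern; the DataFrame is rebuilt from the sample data, and the variable names pattern and extracted are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "player": [
        "Sergio Agüero Forward — Manchester City",
        "Eden Hazard Midfield — Chelsea",
        "Alexis Sánchez Forward — Arsenal",
        "Yaya Touré Midfield — Manchester City",
        "Angel María Midfield — Manchester United",
    ],
    "score": [209.98, 274.04, 223.86, 197.91, 132.23],
})

# Same pattern as in the answer: the named groups become the column names.
pattern = r"(?P<name>.*?)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"
extracted = df["player"].str.extract(pattern)

# Join on the index so each row keeps its player and score.
df = df.join(extracted)
print(df)
```

Rows that do not match the pattern end up with NaN in all three new columns, so it is worth checking for missing values if the real data is less regular than the sample.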
You can split a Python string on whitespace with string.split(). This breaks your text up into 'words', and you can then simply pick the ones you need, like this:
string = "Sergio Agüero Forward — Manchester City"
# Split once on whitespace and reuse the resulting list of words.
words = string.split()
name = words[0]
position = words[2]
# The team may be one or more words, so join everything after the "—".
team = " ".join(words[4:])
For more complex patterns, you can use regular expressions (regex), a powerful tool for matching string patterns.
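As a rough illustration of the regex route on a single string, the sketch below uses Python's built-in re module. The helper name parse_player and the exact pattern are assumptions made for this example. It expects a first name, at least one further word, a position made of ASCII letters, and then the " — " separator.

```python
import re

# Hypothetical pattern: first word = name, last word before " — " = position,
# everything after the separator = team.
PLAYER_RE = re.compile(
    r"(?P<name>\S+)\s.*?\s(?P<position>[A-Za-z]+)\s—\s(?P<team>.*)"
)

def parse_player(text):
    """Return a dict with name, position and team, or None if there is no match."""
    match = PLAYER_RE.match(text)
    if match is None:
        return None
    return match.groupdict()

parsed = parse_player("Sergio Agüero Forward — Manchester City")
print(parsed)
```

Unlike indexing into the result of split(), this does not depend on the team being a fixed number of words, and it returns None rather than raising an error when a string does not follow the expected layout.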
Hope this helped :)