Splitting a Pandas DataFrame column into two columns
I'm working on a simple web-scraping DataFrame project. I have a simple 8x1 DataFrame, and I'm trying to split it into an 8x2 DataFrame. So far this is what my DataFrame looks like:
from pandas import DataFrame

dframe = DataFrame(data, columns=['Active NPGL Teams'], index=[1, 2, 3, 4, 5, 6, 7, 8])
Active NPGL Teams
1 Baltimore Anthem (2015–present)
2 Boston Iron (2014–present)
3 DC Brawlers (2014–present)
4 Los Angeles Reign (2014–present)
5 Miami Surge (2014–present)
6 New York Rhinos (2014–present)
7 Phoenix Rise (2014–present)
8 San Francisco Fire (2014–present)
I would like to add a column, "Years Active", and move the "(2014–present)" and "(2015–present)" parts into it. How do I split my data?
You can use Series.str.split with a regex and expand=True:
dframe['Active NPGL Teams'].str.split(r' (?=\()', expand=True)
0 1
1 Baltimore Anthem (2015–present)
2 Boston Iron (2014–present)
3 DC Brawlers (2014–present)
4 Los Angeles Reign (2014–present)
5 Miami Surge (2014–present)
6 New York Rhinos (2014–present)
7 Phoenix Rise (2014–present)
8 San Francisco Fire (2014–present)
The key is the regex r' (?=\()', which matches a space only if it is followed by an open parenthesis (a lookahead assertion). The parenthesis itself is not consumed, so it stays at the start of the second column.
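To see what the lookahead does in isolation, here is a minimal sketch using only the standard-library re module on a single string taken from the question. The variable names are just for illustration.

```python
import re

# Same pattern as in the answer: a space, but only when the next
# character is "(". The lookahead checks for "(" without consuming it.
pattern = re.compile(r" (?=\()")

name = "Los Angeles Reign (2014–present)"

# The spaces inside the team name are not followed by "(", so they are
# left alone; only the space before the parenthesis is a split point.
team, years = pattern.split(name)

print(team)
print(years)
```

Because the parenthesis is only looked at and not matched, it remains part of the second piece instead of being dropped by the split.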
Another approach (which is about 5% slower but more flexible) is to use Series.str.extract.
dframe['Active NPGL Teams'].str.extract(r'^(?P<Team>.+) (?P<YearsActive>\(.+\))$',
expand=True)
Team YearsActive
1 Baltimore Anthem (2015–present)
2 Boston Iron (2014–present)
3 DC Brawlers (2014–present)
4 Los Angeles Reign (2014–present)
5 Miami Surge (2014–present)
6 New York Rhinos (2014–present)
7 Phoenix Rise (2014–present)
8 San Francisco Fire (2014–present)
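Putting it together, the following is a rough sketch of how the extracted columns could be renamed to the headers asked for in the question. It assumes a shortened three-row version of the data. The rename step and the optional stripping of the parentheses are additions for illustration and not part of the original answer.

```python
import pandas as pd

# Three of the eight rows from the question, enough to show the shape.
data = [
    "Baltimore Anthem (2015–present)",
    "Boston Iron (2014–present)",
    "DC Brawlers (2014–present)",
]
dframe = pd.DataFrame(data, columns=["Active NPGL Teams"], index=[1, 2, 3])

# Named groups become the column names of the extracted frame.
extracted = dframe["Active NPGL Teams"].str.extract(
    r"^(?P<Team>.+) (?P<YearsActive>\(.+\))$", expand=True
)

# Rename to the headers the question asks for (illustrative choice).
result = extracted.rename(
    columns={"Team": "Active NPGL Teams", "YearsActive": "Years Active"}
)

# Optional: drop the surrounding parentheses from the years.
result["Years Active"] = result["Years Active"].str.strip("()")

print(result)
```

Rows that do not match the pattern would come back as NaN in both columns, so it may be worth checking for missing values when the scraped text is less regular than in this example.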