How do I split a string into several columns in a dataframe with pandas Python?
I am aware of the following questions:
1.) How to split a column based on several string indices using pandas?
2.) How do I split text in a column into multiple rows?
I want to split these into several new columns though. Suppose I have a dataframe that looks like this:
id | string
-----------------------------
1 | astring, isa, string
2 | another, string, la
3 | 123, 232, another
I know that using:
df['string'].str.split(',')
I can split a string. But as a next step, I want to efficiently put the split string into new columns like so:
id | string_1 | string_2 | string_3
-----------------|---------------------
1 | astring | isa | string
2 | another | string | la
3 | 123 | 232 | another
---------------------------------------
I could for example do this:
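for index, row in df.iterrows():
    # Split on ", " and number the new columns from 1 (string_1, string_2, ...)
    for i, item in enumerate(row['string'].split(', '), start=1):
        # df.set_value was removed in pandas 1.0; .loc creates missing columns as needed
        df.loc[index, 'string_{0}'.format(i)] = item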
But how could one achieve the same result more elegantly?
The str.split method has an expand argument:
>>> df['string'].str.split(',', expand=True)
0 1 2
0 astring isa string
1 another string la
2 123 232 another
>>>
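One detail worth checking with the sample data: the values are separated by a comma followed by a space, so splitting on "," alone leaves a leading space on every piece after the first. The sketch below rebuilds the question's dataframe (the construction itself is an assumption, since the question only shows the table) and compares the two delimiters:

```python
import pandas as pd

# Assumed reconstruction of the dataframe shown in the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "string": ["astring, isa, string", "another, string, la", "123, 232, another"],
})

# Splitting on "," alone keeps the space that follows each comma.
raw = df["string"].str.split(",", expand=True)

# Splitting on ", " (comma plus space) drops it.
clean = df["string"].str.split(", ", expand=True)

print(raw.loc[0].tolist())    # ['astring', ' isa', ' string']
print(clean.loc[0].tolist())  # ['astring', 'isa', 'string']
```

If the spacing after the commas is inconsistent, stripping each resulting column with .str.strip() is another option.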
With column names:
>>> df['string'].str.split(',', expand=True).rename(columns = lambda x: "string"+str(x+1))
string1 string2 string3
0 astring isa string
1 another string la
2 123 232 another
Much neater with Python >= 3.6 f-strings:
>>> (df['string'].str.split(',', expand=True)
... .rename(columns=lambda x: f"string_{x+1}"))
string_1 string_2 string_3
0 astring isa string
1 another string la
2 123 232 another
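The snippets above return only the new columns. To get the layout asked for in the question (id next to string_1, string_2, string_3), the split result can be joined back onto the original frame. This is a minimal sketch; the dataframe construction and the names parts and result are illustrative assumptions:

```python
import pandas as pd

# Assumed reconstruction of the dataframe shown in the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "string": ["astring, isa, string", "another, string, la", "123, 232, another"],
})

# Split into columns 0, 1, 2 and rename them to string_1, string_2, string_3.
parts = (
    df["string"]
    .str.split(", ", expand=True)
    .rename(columns=lambda x: f"string_{x + 1}")
)

# Drop the original column and attach the new ones; both share df's index.
result = df.drop(columns="string").join(parts)
print(result)
```

The join works here because str.split preserves the index of the original column, so rows line up without an explicit key.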
Slightly less concise than the expand option, but here is an alternative way:
In [29]: cols = ['string_1', 'string_2', 'string_3']
In [30]: pandas.DataFrame(df.string.str.split(', ').tolist(), columns=cols)
Out[30]:
string_1 string_2 string_3
0 astring isa string
1 another string la
2 123 232 another
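Both approaches shown so far assume that every row splits into the same number of pieces. As a hedged sketch of what happens otherwise, the example below uses hypothetical data in which one row is shorter; with expand=True the missing positions are filled with missing values, and the column names are derived from the result rather than hard-coded:

```python
import pandas as pd

# Hypothetical data: the second row has only two comma-separated pieces.
df = pd.DataFrame({"string": ["astring, isa, string", "another, string"]})

parts = df["string"].str.split(", ", expand=True)

# Name the columns from however many pieces the longest row produced.
parts.columns = [f"string_{i + 1}" for i in range(parts.shape[1])]

print(parts)
print(parts["string_3"].isna().tolist())  # [False, True]
```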