How to split a string into multiple columns in a dataframe using pandas (Python)?

Question

I am aware of the following questions:

1.) How to split a column based on several string indices using pandas?
2.) How do I split text in a column into multiple rows?

I want to split these into several new columns though. Suppose I have a dataframe that looks like this:

id    | string
-----------------------------
1     | astring, isa, string
2     | another, string, la
3     | 123, 232, another

I know that using:

df['string'].str.split(',')

I can split a string. But as a next step, I want to efficiently put the split string into new columns like so:

id    | string_1 | string_2 | string_3
---------------------------------------
1     | astring  | isa      | string
2     | another  | string   | la
3     | 123      | 232      | another

I could for example do this:

for index, row in df.iterrows():
    # split on the comma-plus-space delimiter and number the pieces from 1
    for i, item in enumerate(row['string'].split(', '), start=1):
        # .loc creates the column string_<i> on first use
        df.loc[index, 'string_{0}'.format(i)] = item

But how could one achieve the same result more elegantly?

Answer 1

The str.split method has an expand argument:

>>> df['string'].str.split(',', expand=True)
         0        1         2
0  astring      isa    string
1  another   string        la
2      123      232   another
>>>
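Note that splitting on a bare ',' keeps the space that follows each comma, so the second and third pieces start with a blank (visible in the column widths above). A minimal sketch of one way to clean that up, assuming the sample data from the question; the variable name parts is only illustrative:

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "string": ["astring, isa, string", "another, string, la", "123, 232, another"],
})

# Split on the bare comma, then strip surrounding whitespace from every piece.
parts = df["string"].str.split(",", expand=True)
parts = parts.apply(lambda col: col.str.strip())

print(parts)
```

Splitting on ', ' directly (as Answer 2 does) avoids the extra step when the delimiter is always a comma followed by exactly one space.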

With column names:

>>> df['string'].str.split(',', expand=True).rename(columns = lambda x: "string"+str(x+1))
   string1  string2   string3
0  astring      isa    string
1  another   string        la
2      123      232   another

Much neater with Python >= 3.6 f-strings:

>>> (df['string'].str.split(',', expand=True)
...              .rename(columns=lambda x: f"string_{x+1}"))
  string_1 string_2  string_3
0  astring      isa    string
1  another   string        la
2      123      232   another
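The target table in the question also carries the id column. One possible way to get there, sketched under the assumption that df has the id and string columns shown in the question (pieces and result are illustrative names), is to join the expanded pieces back onto id by index:

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "string": ["astring, isa, string", "another, string, la", "123, 232, another"],
})

# Expand into columns 0, 1, 2 and rename them to string_1, string_2, string_3.
pieces = (df["string"].str.split(", ", expand=True)
                      .rename(columns=lambda x: f"string_{x + 1}"))

# Both frames share the original index, so a plain join lines the rows up.
result = df[["id"]].join(pieces)

print(result)
```

The join relies on str.split preserving the index of df, so no explicit key is needed.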

Answer 2

Slightly less concise than the expand option, but here is an alternative way:

In [29]: cols = ['string_1', 'string_2', 'string_3']   

In [30]: pandas.DataFrame(df.string.str.split(', ').tolist(), columns=cols)
Out[30]: 
  string_1 string_2 string_3
0  astring      isa   string
1  another   string       la
2      123      232  another
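A fixed list of column names presumes that every row splits into the same number of pieces. If some rows might contain extra delimiters, one option is to cap the number of splits with the n argument of str.split. The sketch below uses made-up data (not from the question) purely to illustrate that behaviour:

```python
import pandas as pd

# Hypothetical data: the second row has one more piece than the first.
s = pd.Series(["a, b, c", "d, e, f, g"])

# n=2 allows at most two splits, so every row yields at most three pieces;
# anything after the second delimiter stays together in the last piece.
capped = s.str.split(", ", n=2, expand=True)
capped.columns = ["string_1", "string_2", "string_3"]

print(capped)
```

Whether the leftover text should stay in the last column or be dropped depends on the data, so treat this as a starting point rather than a rule.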

2 answers

Answer 1: 3 (accepted), 2018-02-24 01:04:25

Answer 2: 1, 2018-02-24 01:13:31
