Split column values into 2 new columns - Python Pandas

Question

I have a dataframe with a column 'name' that holds values like 'James Cameron'. I'd like to split it into 2 new columns, 'First_Name' and 'Last_Name', but there is no delimiter in the data, so I am not quite sure how. I realize that 'James' is in position [0] and 'Cameron' is in position [1], but I am not sure that can be recognized without a delimiter.

import pandas as pd

df = pd.DataFrame({'name': ['James Cameron', 'Martin Sheen'],
                   'Id': [1, 2]})
df

EDIT:

Vaishali's answer below worked perfectly for the dataframe I had provided. I created that dataframe as an example, though. My real code looks like this:

data[['First_Name','Last_Name']] = data.director_name.str.split(' ', expand = True)

and that, unfortunately, is throwing an error:

'Columns must be same length as key'

The column holds the same values as my example, though. Any suggestions?

Thanks

Answer 1

You can split on the space:

df[['Name', 'Lastname']] = df.name.str.split(' ', expand = True)

    Id  name            Name    Lastname
0   1   James Cameron   James   Cameron
1   2   Martin Sheen    Martin  Sheen

EDIT: Handling the error 'Columns must be same length as key'. The data might have some names with more than one space, e.g. George Martin Jr. In that case, one way is to split on the space and use the first and second strings, ignoring the third if it exists:

df['First_Name'] = df.name.str.split(' ', expand = True)[0]
df['Last_Name'] = df.name.str.split(' ', expand = True)[1]
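
To see which rows trigger the error, it can help to count the whitespace-separated tokens in each name. The following is only a sketch on made-up sample data (the three-token and single-token names are assumptions, not values from the question). It also shows a variant that passes `n=1` so the split happens on the first space only; note that this keeps 'Martin Jr.' together as the last name instead of dropping the third token as the approach above does.

```python
import pandas as pd

# Hypothetical sample data: one name has three tokens, one has a single token.
df = pd.DataFrame({'name': ['James Cameron', 'George Martin Jr.', 'Madonna'],
                   'Id': [1, 2, 3]})

# Count the whitespace-separated tokens in each name.
token_counts = df['name'].str.split().str.len()

# Rows without exactly two tokens are the ones that break
# a plain two-column assignment.
problem_rows = df[token_counts != 2]

# n=1 splits on the first space only, so at most two columns come back.
# Names with no space get a missing value in the second column.
parts = df['name'].str.split(' ', n=1, expand=True)
df['First_Name'] = parts[0]
df['Last_Name'] = parts[1]
```

Inspecting `problem_rows` on the real data should make it clear whether the extra tokens are suffixes, middle names, or empty values, which in turn decides whether dropping or keeping them is the better choice.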

Answer 2

A slightly different way of doing this:

# result_type='expand' makes apply return one column per list element
df[['first_name', 'last_name']] = df.apply(lambda row: row['name'].split(), axis=1, result_type='expand')

df
   Id           name first_name last_name
0   1  James Cameron      James   Cameron
1   2   Martin Sheen     Martin     Sheen

Answer 3

I like this method. It is not as quick as simply splitting, but it drops in the column names in a very convenient way.

df.join(df.name.str.extract(r'(?P<First>\S+)\s+(?P<Last>\S+)', expand=True))

   Id           name   First     Last
0   1  James Cameron   James  Cameron
1   2   Martin Sheen  Martin    Sheen
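
For names that do not have exactly two tokens, the same pattern behaves differently from a plain split. A small sketch, again on made-up sample data (the extra names are assumptions): a three-token name yields its first two tokens, and a single-token name does not match, so both new columns are left missing rather than raising an error.

```python
import pandas as pd

# Hypothetical sample data with uneven token counts.
df = pd.DataFrame({'name': ['James Cameron', 'George Martin Jr.', 'Madonna'],
                   'Id': [1, 2, 3]})

# Named groups become the column names of the extracted frame.
pattern = r'(?P<First>\S+)\s+(?P<Last>\S+)'
extracted = df['name'].str.extract(pattern, expand=True)

# join returns a new DataFrame; df itself is left unchanged.
result = df.join(extracted)
```

Because `join` does not modify `df` in place, the result has to be assigned to a name, as done with `result` here.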

3 Answers

Answer 1
10 Accepted 2017-05-26 17:21:05

Answer 2
1 2017-05-26 17:28:39

Answer 3
1 2017-05-26 17:31:57
