How can I split a pandas column and append the new results to the dataframe? I also want there to be no white space.
Example of my desired output:
col1         col2   col3
Smith, John  Smith  John
Smith, John  Smith  John
I have been trying this, but the lambda function is not appending the results the way I want it to.
df_split = df1['col1'].apply(lambda x: pd.Series(x.split(',')))
df1['col2']= df_split.apply(lambda x: x[0])
df1['col3']= df_split.apply(lambda x: x[1])
I end up getting
col2 col3
Smith Smith
John John
Use Series.str.split(..., expand=True):
df[['col2', 'col3']] = df.col1.str.split(r',\s+', expand=True); df
          col1   col2  col3
0  Smith, John  Smith  John
1  Smith, John  Smith  John
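For reference, here is a self-contained sketch of this approach. The sample frame, including the third row with no space after the comma, is made up for illustration. The pattern uses `\s*` rather than `\s+` (my assumption, not part of the answer above) so the split also works when nothing follows the comma.

```python
import pandas as pd

# Hypothetical sample data; the last row has no space after the comma.
df = pd.DataFrame({"col1": ["Smith, John", "Smith, John", "Last,First"]})

# Split on a comma plus any following whitespace. A multi-character
# pattern is treated as a regular expression by default, and
# expand=True returns one DataFrame column per piece.
parts = df["col1"].str.split(r",\s*", expand=True)

# Assign the two resulting columns back onto the original frame.
df[["col2", "col3"]] = parts
print(df)
```

Because the whitespace is consumed by the separator itself, no extra strip step should be needed for input shaped like this.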
We can use Series.str.extract() method:
In [157]: df[['col2','col3']] = df['col1'].str.extract(r'(\w+),\s*(\w+)', expand=True)
In [158]: df
Out[158]:
              col1        col2   col3
0      Smith, John       Smith   John
1      Smith, John       Smith   John
2  Mustermann, Max  Mustermann    Max
3       Last,First        Last  First
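(\w+),\s*(\w+) is a RegEx (Regular Expression): the first group captures a run of word characters, then comes a comma with optional whitespace, and the second group captures the next run of word characters.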
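A variation that may be convenient: if the capture groups are named, str.extract uses the group names as column names, so the result can be joined straight onto the frame. This is only a sketch; the sample rows (including one that does not match the pattern) are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; the last row contains no comma at all.
df = pd.DataFrame(
    {"col1": ["Smith, John", "Mustermann, Max", "Last,First", "no comma"]}
)

# Named groups (?P<name>...) become the column names of the result.
pattern = r"(?P<col2>\w+),\s*(?P<col3>\w+)"
extracted = df["col1"].str.extract(pattern)

# Rows that do not match the pattern come back as NaN in both columns.
result = df.join(extracted)
print(result)
```

One difference from splitting is worth keeping in mind: extract only returns what the pattern matches, so names containing hyphens or spaces would need a wider character class than `\w+`.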
If you only want to keep the first string after the split, use the following (n is passed by keyword because recent pandas versions no longer accept it positionally):
df['col2'] = df['col1'].str.split(',', n=1).str[0]
          col1   col2
0  Smith, John  Smith
1  Smith, John  Smith
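The same idea can also fill the second column. Below is a rough sketch with invented sample data. With n=1 only the first comma is used as a separator, so anything after it stays together, and .str.strip() removes the leftover space.

```python
import pandas as pd

# Hypothetical sample data; the second row contains two commas.
df = pd.DataFrame({"col1": ["Smith, John", "Doe, Jane, Jr."]})

# n=1 limits the split to the first comma, giving at most two pieces.
pieces = df["col1"].str.split(",", n=1)

# .str[i] picks the i-th piece of each list; strip trims whitespace.
df["col2"] = pieces.str[0].str.strip()
df["col3"] = pieces.str[1].str.strip()
print(df)
```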