How can I replace a DataFrame column with the columns produced by splitting it? I know how to split the column, but I don't know how to replace it with the resulting columns.
Input:
import pandas as pd
df = pd.DataFrame({'id': [101, 102],
'full_name': ['John Brown', 'Bob Smith'],
'birth_year': [1960, 1970]})
df_new = df['full_name'].str.split(" ", expand=True)
print(df)
print(df_new)
Output:
    id   full_name  birth_year
0  101  John Brown        1960
1  102   Bob Smith        1970
      0      1
0  John  Brown
1   Bob  Smith
Expected Output:
    id  first_name  last_name  birth_year
0  101        John      Brown        1960
1  102         Bob      Smith        1970
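# Split on whitespace (raw string avoids an invalid-escape warning) and name the new columns
new_cols = (df['full_name'].str.split(r'\s', expand=True)
            .set_axis(['first_name', 'last_name'], axis=1))
# Join the new columns to the original frame, then select them in the desired order
df.join(new_cols)[['id', 'first_name', 'last_name', 'birth_year']]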
Output:
    id  first_name  last_name  birth_year
0  101        John      Brown        1960
1  102         Bob      Smith        1970
The strategy is to get the position of the column you wish to replace, create the new columns, and then concatenate the old and new DataFrames around that position:
# get the position of the column to be replaced
col_position = df.columns.get_loc('full_name')
# create a new dataframe that holds the new columns
# (note: pop removes 'full_name' from df in place)
insert_df = (df
             .pop('full_name')
             .str.split(expand=True)
             .set_axis(['first_name', 'last_name'], axis='columns')
             )
df_by_positions = [
    # the columns before col_position
    df.iloc[:, :col_position],
    # the new columns being inserted
    insert_df,
    # the columns after col_position ('full_name' is already gone, so no offset is needed)
    df.iloc[:, col_position:],
]
pd.concat(df_by_positions, axis=1)
    id  first_name  last_name  birth_year
0  101        John      Brown        1960
1  102         Bob      Smith        1970
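If you need this more than once, the same position-based strategy could be wrapped in a small helper. The following is only a sketch: the function name replace_with_split and its parameters are made up for illustration and are not part of pandas. It assumes the column name is unique and that at least one row splits into as many parts as there are new names. Unlike the pop-based version, it leaves the original DataFrame untouched.

```python
import pandas as pd


def replace_with_split(df, column, new_names, sep=None):
    """Hypothetical helper: return a copy of df in which `column` is
    replaced, at the same position, by the columns obtained from splitting it."""
    # Position of the column to be replaced (assumes a unique column name)
    pos = df.columns.get_loc(column)
    # Limit the number of splits so that at most len(new_names) parts are produced
    parts = df[column].str.split(sep, n=len(new_names) - 1, expand=True)
    parts.columns = new_names
    # Columns before the position, the new columns, then the columns after it
    return pd.concat([df.iloc[:, :pos], parts, df.iloc[:, pos + 1:]], axis=1)


df = pd.DataFrame({'id': [101, 102],
                   'full_name': ['John Brown', 'Bob Smith'],
                   'birth_year': [1960, 1970]})
result = replace_with_split(df, 'full_name', ['first_name', 'last_name'])
print(result)
```

Because the number of splits is capped, a name with more than two words would keep everything after the first space in last_name instead of raising a length mismatch when the columns are renamed.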