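Replace dataframe column with split columns
How can I replace a dataframe column with the columns produced by splitting it? I know how to split the column, but not how to put the split-value columns in its place.
Input: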
import pandas as pd

df = pd.DataFrame({'id': [101, 102],
                   'full_name': ['John Brown', 'Bob Smith'],
                   'birth_year': [1960, 1970]})
df_new = df['full_name'].str.split(" ", expand=True)
print(df)
print(df_new)
Output:
    id   full_name  birth_year
0  101  John Brown        1960
1  102   Bob Smith        1970
      0      1
0  John  Brown
1   Bob  Smith
Expected output:
    id first_name last_name  birth_year
0  101       John     Brown        1960
1  102        Bob     Smith        1970
# r'\s' is a raw string, so the regex escape is not treated as a Python string escape
(df.join(df.full_name.str.split(r'\s', expand=True)
           .set_axis(['first_name', 'last_name'], axis=1))
   [['id', 'first_name', 'last_name', 'birth_year']])
Output:
    id first_name last_name  birth_year
0  101       John     Brown        1960
1  102        Bob     Smith        1970
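If you would rather not type out the final column list by hand, the same join approach can derive the order from the existing columns. This is only a sketch: the names `names`, `order` and `result` are made up for illustration, and it assumes every `full_name` value splits into exactly two parts.

```python
import pandas as pd

df = pd.DataFrame({'id': [101, 102],
                   'full_name': ['John Brown', 'Bob Smith'],
                   'birth_year': [1960, 1970]})

# Split on whitespace and name the two resulting columns.
names = df['full_name'].str.split(r'\s', expand=True)
names.columns = ['first_name', 'last_name']

# Build the column order: every original column stays where it is,
# except 'full_name', which is swapped for the new columns.
order = []
for col in df.columns:
    if col == 'full_name':
        order.extend(names.columns)
    else:
        order.append(col)

# join aligns on the index; selecting `order` drops 'full_name'.
result = df.join(names)[order]
print(result)
```

Because `order` is built from `df.columns`, the new columns land wherever `full_name` used to be, whatever other columns the dataframe has.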
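The strategy is to get the position of the column you want to replace, create the new columns, and then concatenate the old and new dataframes around that position: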
# get the position of the column to be replaced
col_position = df.columns.get_loc('full_name')

# create a new dataframe that holds the new columns
# (note: pop removes 'full_name' from df in place)
insert_df = (df
             .pop('full_name')
             .str.split(expand=True)
             .set_axis(['first_name', 'last_name'], axis='columns')
             )

df_by_positions = [
    # this is the dataframe before col_position
    df.iloc[:, :col_position],
    # this is the dataframe we are inserting
    insert_df,
    # this is the dataframe after col_position
    # ('full_name' is already gone, so this starts at the next column)
    df.iloc[:, col_position:],
]

pd.concat(df_by_positions, axis=1)
    id first_name last_name  birth_year
0  101       John     Brown        1960
1  102        Bob     Smith        1970
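The position-based strategy can also be wrapped in a small helper that leaves the original dataframe untouched. This is a rough sketch rather than a definitive implementation: `replace_with_split` and its parameters are hypothetical names, and it assumes the column name is unique and that the number of split parts matches the number of names supplied (otherwise assigning the column names raises an error).

```python
import pandas as pd


def replace_with_split(df, column, new_names, sep=None):
    """Return a new dataframe where `column` is replaced by its split parts."""
    # Position of the column to replace (an int when the name is unique).
    pos = df.columns.get_loc(column)
    # One column per split part; sep=None splits on whitespace.
    parts = df[column].str.split(pat=sep, expand=True)
    parts.columns = new_names
    # Columns before the position, the new columns, then the columns after it.
    return pd.concat(
        [df.iloc[:, :pos], parts, df.iloc[:, pos + 1:]],
        axis=1,
    )


df = pd.DataFrame({'id': [101, 102],
                   'full_name': ['John Brown', 'Bob Smith'],
                   'birth_year': [1960, 1970]})
result = replace_with_split(df, 'full_name', ['first_name', 'last_name'])
print(result)
```

Unlike the `pop` version, this reads the column with `df[column]` and skips it with `pos + 1`, so `df` still contains `full_name` afterwards.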