Python Pandas: remove special characters from a column and arrange the values
I have a DataFrame that looks like this.
columnA columnB columnC
[['Beauty & Wellness/Beauty Mavens', '21', '17', '22'], ['Beauty & Wellness/Frequently Visits Salons', '22', '21', '25']] GA_All_B2B_Visitors_Jan20 2020-01-10 to 2020-01-15
[['Banking & Finance/Avid Investors', '585', '455', '700'], ['Beauty & Wellness/Beauty Mavens', '414', '339', '467']] GA_Oven_Page_Visitors_Nov2019 2020-01-10 to 2020-01-15
I am trying to arrange it as shown below, but I am stuck on where to start:
columnA cola colb colc columnB columnC
Beauty & Wellness/Beauty Mavens 21 17 22 GA_All_B2B_Visitors_Jan20 2020-01-10 to 2020-01-15
Beauty & Wellness/Frequently Visits Salons 22 21 25 GA_All_B2B_Visitors_Jan20 2020-01-10 to 2020-01-15
Banking & Finance/Avid Investors 585 455 700 GA_Oven_Page_Visitors_Nov2019 2020-01-10 to 2020-01-15
Beauty & Wellness/Beauty Mavens 414 339 467 GA_Oven_Page_Visitors_Nov2019 2020-01-10 to 2020-01-15
My attempt is shown below. I first tried to split the values of the first column, but it does not work.
df_seg = pd.concat([df_seg[['columnB', 'columnC']], df_seg['columnA'].str.split(', ', expand=True)], axis=1)
Can anyone help?
Use DataFrame.explode so that each inner list gets its own row, create a DataFrame from the contents of those lists, and then attach the other two columns with DataFrame.join:
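# One row per inner list; reset the index so the join below aligns row by row
new_df = df.explode('columnA').reset_index(drop=True)
# Spread each inner list into four columns, then attach columnB and columnC
new_df = (pd.DataFrame(new_df['columnA'].tolist(),
                       columns=['columnA', 'cola', 'colb', 'colc'])
            .join(new_df[['columnB', 'columnC']]))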
print(new_df)
columnA cola colb colc \
0 Beauty & Wellness/Beauty Mavens 21 17 22
1 Beauty & Wellness/Frequently Visits Salons 22 21 25
2 Banking & Finance/Avid Investors 585 455 700
3 Beauty & Wellness/Beauty Mavens 414 339 467
columnB columnC
0 GA_All_B2B_Visitors_Jan20 2020-01-10 to 2020-01-15
1 GA_All_B2B_Visitors_Jan20 2020-01-10 to 2020-01-15
2 GA_Oven_Page_Visitors_Nov2019 2020-01-10 to 2020-01-15
3 GA_Oven_Page_Visitors_Nov2019 2020-01-10 to 2020-01-15
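For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same explode-then-join approach. It rebuilds the sample data from the question by hand and assumes columnA holds real Python lists. `.str.split` only operates on string values, which is probably why the attempt in the question did not split a column of lists. If your column actually contains the string representation of the lists (for example after reading a CSV), the hypothetical helper `to_list` below parses it first with `ast.literal_eval`. The final integer conversion is optional and not part of the original answer.

```python
import ast

import pandas as pd

# Sample data rebuilt from the question; columnA holds a list of lists per row.
df = pd.DataFrame({
    "columnA": [
        [["Beauty & Wellness/Beauty Mavens", "21", "17", "22"],
         ["Beauty & Wellness/Frequently Visits Salons", "22", "21", "25"]],
        [["Banking & Finance/Avid Investors", "585", "455", "700"],
         ["Beauty & Wellness/Beauty Mavens", "414", "339", "467"]],
    ],
    "columnB": ["GA_All_B2B_Visitors_Jan20", "GA_Oven_Page_Visitors_Nov2019"],
    "columnC": ["2020-01-10 to 2020-01-15", "2020-01-10 to 2020-01-15"],
})


def to_list(value):
    """Hypothetical helper: parse a stringified list, pass real lists through."""
    return ast.literal_eval(value) if isinstance(value, str) else value


# Harmless if the column already holds lists; needed if it holds strings.
df["columnA"] = df["columnA"].map(to_list)

# One row per inner list, with a fresh 0..n-1 index.
exploded = df.explode("columnA").reset_index(drop=True)

# Turn each inner list into four separate columns.
parts = pd.DataFrame(
    exploded["columnA"].tolist(),
    columns=["columnA", "cola", "colb", "colc"],
)

# Optional: the counts arrive as strings, so convert them to integers.
num_cols = ["cola", "colb", "colc"]
parts[num_cols] = parts[num_cols].astype(int)

# Attach the untouched columns; both frames share the same index.
result = parts.join(exploded[["columnB", "columnC"]])
print(result)
```

The `reset_index(drop=True)` call matters here: `explode` repeats the original index for every new row, and `join` aligns on the index, so without the reset the two frames would not line up one to one.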