[英]Creating new column and rows based on multiple conditions
I have the following dataframe:-我有以下 dataframe:-
import pandas as pd
df = pd.read_csv('filename.csv')
print(df)
date organic paid source_type
4/1/2018 39911909.19 38575924.75 Search
4/1/2018 5085939.952 882.608927 Social
4/1/2018 16227439.73 0 Mail
4/1/2018 0 5671871.24 Display Ads
4/1/2018 91215520.23 0 Direct
4/1/2018 15743479.56 0 Referrals
I want to add a column total_sum for all the source types except when source type is "Search".我想为所有源类型添加一个列 total_sum,除非源类型为“搜索”。 If the source_type is search I want to break down the single row into two and source type becomes organic search and paid search.
如果 source_type 是搜索,我想将单行分成两行,源类型变为自然搜索和付费搜索。 Inshort a df like below.
简而言之,如下所示。 The summing part is easy to handle i am just stuck with the breaking of rows and conditional column prefix part.
求和部分很容易处理我只是被行和条件列前缀部分的破坏所困扰。 Dataframe I need:-
Dataframe 我需要:-
date source_type Total Sum
4/1/2018 Organic Search 39911909.19
4/1/2018 Paid Search 38575924.75
4/1/2018 Social 5086822.561
4/1/2018 Mail 16227439.73
4/1/2018 Display Ads 5671871.24
4/1/2018 Direct 91215520.23
4/1/2018 Referrals 15743479.56
You can split DataFrame by boolean indexing
with Series.eq
for ==
, then reshape first by DataFrame.melt
with new column with Series.str.capitalize
, filter second by invert mask by ~
, sum values with DataFrame.pop
for remove column after and last use concat
: You can split DataFrame by
boolean indexing
with Series.eq
for ==
, then reshape first by DataFrame.melt
with new column with Series.str.capitalize
, filter second by invert mask by ~
, sum values with DataFrame.pop
for remove column after最后使用concat
:
mask = df['source_type'].eq('Search')
df1 = df[mask].melt(['date','source_type'], value_name='Total Sum')
df1['source_type'] = df1.pop('variable').str.capitalize() + ' Search'
df2 = df[~mask].copy()
df2['Total Sum'] = df2.pop('organic').add(df2.pop('paid'))
df = pd.concat([df1, df2], ignore_index=True)
print (df)
date source_type Total Sum
0 4/1/2018 Organic Search 3.991191e+07
1 4/1/2018 Paid Search 3.857592e+07
2 4/1/2018 Social 5.086823e+06
3 4/1/2018 Mail 1.622744e+07
4 4/1/2018 Display Ads 5.671871e+06
5 4/1/2018 Direct 9.121552e+07
6 4/1/2018 Referrals 1.574348e+07
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.