Add the difference of two pandas column values to a new row if the difference is greater than a column value in Pandas
Here is a sample of the data I have:
PERIOD GROUP  USER_COUNT REGION
    50     A          55     AX
    25     A          20     AX
    30     B          33     BY
    40     C          10     CZ
Expected output:
PERIOD GROUP  USER_COUNT REGION
    50     A          50     AX
    50     A           5     AX
    25     A          20     AX
    30     B          30     BY
    30     B           3     BY
    40     C          10     CZ
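
For anyone who wants to reproduce the problem, the two tables above can be built roughly like this. This is only a sketch: the names `df` and `expected` are chosen here for illustration, and the column dtypes (integers and strings) are assumed from how the sample looks.

```python
import pandas as pd

# Sample input, transcribed from the table in the question.
df = pd.DataFrame({
    "PERIOD": [50, 25, 30, 40],
    "GROUP": ["A", "A", "B", "C"],
    "USER_COUNT": [55, 20, 33, 10],
    "REGION": ["AX", "AX", "BY", "CZ"],
})

# Expected output: wherever USER_COUNT exceeds PERIOD, the row is capped
# at PERIOD and the excess goes into a new row directly below it.
expected = pd.DataFrame({
    "PERIOD": [50, 50, 25, 30, 30, 40],
    "GROUP": ["A", "A", "A", "B", "B", "C"],
    "USER_COUNT": [50, 5, 20, 30, 3, 10],
    "REGION": ["AX", "AX", "AX", "BY", "BY", "CZ"],
})
```

Note that the total of USER_COUNT is the same in both frames; the excess is moved to a new row, not dropped.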
Use:
import pandas as pd

# difference between the two columns
s = df['USER_COUNT'].sub(df['PERIOD'])
# mask for rows where USER_COUNT exceeds PERIOD
m = s > 0
# subtract the excess from USER_COUNT in the matched rows only; other rows are unchanged
df1 = df.assign(USER_COUNT = lambda x: x['USER_COUNT'].sub(s[m], fill_value=0))
# new rows: copies of the matched rows with USER_COUNT set to the excess
df2 = df[m].assign(USER_COUNT = s[m])
# join together and sort by index with a stable sort (mergesort),
# so each new row lands directly after its original; astype restores the original dtypes
df3 = (pd.concat([df1, df2])
         .sort_index(kind='mergesort')
         .reset_index(drop=True)
         .astype(df.dtypes))
print(df3)
   PERIOD GROUP  USER_COUNT REGION
0      50     A          50     AX
1      50     A           5     AX
2      25     A          20     AX
3      30     B          30     BY
4      30     B           3     BY
5      40     C          10     CZ
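
If this needs to be reused, the same idea can be wrapped in a small function. The following is a sketch, not part of the original answer: the function name `split_excess` and its parameters `value_col` and `cap_col` are hypothetical, and it caps the value with `Series.where` instead of the `sub(..., fill_value=0)` step, which keeps the integer dtype so no final `astype` should be needed.

```python
import pandas as pd

def split_excess(df, value_col="USER_COUNT", cap_col="PERIOD"):
    """Cap value_col at cap_col and move any excess into a new row below."""
    excess = df[value_col] - df[cap_col]
    over = excess > 0
    # Original rows: keep the value where it fits, otherwise use the cap.
    capped = df.assign(**{value_col: df[value_col].where(~over, df[cap_col])})
    # Extra rows: copies of the over-limit rows that carry only the excess.
    extra = df.loc[over].assign(**{value_col: excess[over]})
    # A stable sort on the duplicated index keeps each extra row
    # right after the row it came from.
    return (pd.concat([capped, extra])
              .sort_index(kind="mergesort")
              .reset_index(drop=True))

# Sample data transcribed from the question.
df = pd.DataFrame({
    "PERIOD": [50, 25, 30, 40],
    "GROUP": ["A", "A", "B", "C"],
    "USER_COUNT": [55, 20, 33, 10],
    "REGION": ["AX", "AX", "BY", "CZ"],
})

result = split_excess(df)
print(result)
```

Like the answer above, this relies on the input having a unique index, since the index is what pairs each extra row with its source row.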