Sum of all columns except the ones specified in a list in Python
I have the data below for each ID:
id ---- Base AE Val LT RO+ Prem AM TN T3 AR
05 0 34.34 9.42 70.68 0 0 0 0 0 0 0
108 0 43.77 0 28 0 0 0 0 0 0 0
205 0 77.64 0 32.2 0 0 0 0 0 0 0
320 0 66.24 0 59.628 0 0 0 0 0 0 0
313 0 21.66 0 21.442 0 0 0 0 0 0 0
324 0 72.37 0 701.12 0 0 0 0 0 0 0
505 0 76.057 0 43.87 0 0 0 0 0 0 0
Now I want to keep a few columns that I specify, sum all the remaining columns into a separate column, and add a row total, like below:
id Base Val Others Total
05 34.34 70.68 9.42 114.441387
108 43.77 28 0 71.77
205 77.64 32.2 0 109.84
320 66.24 59.628 0 125.868
313 21.66 21.442 0 43.102
324 72.37 701.12 0 773.49
505 76.057 43.87 0 119.927
So if my list of columns to keep is:
cols_to_keep = ['Base','Val']
The other channels, which are not part of this list, are to be summed up in the Others column, and all the values in each row sum to Total. id is the index of the records.
I am able to keep the columns I declare in the list, but how do I sum the columns that are not in the list into the Others column? Can someone please help me with this? The data is in a pandas df.
Use assign, and use Index.difference to select the columns that are not in the list:
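cols_to_keep = ['Base', 'Val']
# columns that are not in cols_to_keep
c = df.columns.difference(cols_to_keep)
# Others: row-wise sum of the remaining columns; Total: row-wise sum of all columns
df = df[cols_to_keep].assign(Others=df[c].sum(axis=1), Total=df.sum(axis=1))
print(df)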
Base Val Others Total
id
5 34.340 70.680 9.42 114.440
108 43.770 28.000 0.00 71.770
205 77.640 32.200 0.00 109.840
320 66.240 59.628 0.00 125.868
313 21.660 21.442 0.00 43.102
324 72.370 701.120 0.00 773.490
505 76.057 43.870 0.00 119.927
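The approach above can be wrapped in a small reusable function. This is only a minimal sketch and not part of the original answer: the function name summarize_columns is hypothetical, and the sample frame uses three rows from the question with the all-zero columns trimmed down to AE and LT for brevity.

```python
import pandas as pd


def summarize_columns(df, cols_to_keep):
    """Keep cols_to_keep, sum every other column into 'Others', add a row 'Total'."""
    # Columns not named in cols_to_keep (Index.difference returns them sorted).
    other_cols = df.columns.difference(cols_to_keep)
    return df[cols_to_keep].assign(
        Others=df[other_cols].sum(axis=1),
        Total=df.sum(axis=1),  # row total over all original columns
    )


# Hypothetical sample: three rows from the question, with id as the index.
sample = pd.DataFrame(
    {
        "Base": [34.34, 43.77, 77.64],
        "AE": [9.42, 0.0, 0.0],
        "Val": [70.68, 28.0, 32.2],
        "LT": [0.0, 0.0, 0.0],
    },
    index=pd.Index(["05", "108", "205"], name="id"),
)

result = summarize_columns(sample, ["Base", "Val"])
print(result)
```

This version assumes every column in the frame is numeric, which holds here because id is the index rather than a column.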
In [47]: !cat b.txt | tr -s ' ' > data.txt
...: df = pd.read_csv("data.txt",sep=" ", dtype={'id':str})
...: # in this sample AE is the only non-zero column outside Base and Val
...: df['Others'] = df['AE']
...: df['Total'] = df['Base'] + df['Others'] + df['Val']
...:
...: cols_to_keep=['id', 'Base', 'Val','Others','Total']
...: c = df.columns.difference(cols_to_keep)
...: newDf = df.drop(c, axis=1)
...:
In [48]: newDf
Out[48]:
id Base Val Others Total
0 05 34.340 70.680 9.42 114.440
1 108 43.770 28.000 0.00 71.770
2 205 77.640 32.200 0.00 109.840
3 320 66.240 59.628 0.00 125.868
4 313 21.660 21.442 0.00 43.102
5 324 72.370 701.120 0.00 773.490
6 505 76.057 43.870 0.00 119.927
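The session above reads id as an ordinary string column and sets Others from AE alone, which works for this sample but would miss any other non-zero column. A possible generalization, sketched here under the assumption that id is a regular column and every other column is numeric, excludes id from the sums explicitly. The sample frame and the names other_cols and new_df are illustrative only.

```python
import pandas as pd

# Hypothetical sample where id is an ordinary string column, not the index.
df = pd.DataFrame(
    {
        "id": ["05", "108", "205"],
        "Base": [34.34, 43.77, 77.64],
        "AE": [9.42, 0.0, 0.0],
        "Val": [70.68, 28.0, 32.2],
        "LT": [0.0, 0.0, 0.0],
    }
)

cols_to_keep = ["Base", "Val"]

# Every column that is neither the id nor one of the kept columns.
other_cols = df.columns.difference(["id"] + cols_to_keep)

new_df = df[["id"] + cols_to_keep].assign(
    Others=df[other_cols].sum(axis=1),
    # Total = kept columns + all other numeric columns; id is left out.
    Total=df[cols_to_keep].sum(axis=1) + df[other_cols].sum(axis=1),
)
print(new_df)
```

Leaving id out of the column selection avoids summing a string column together with the numeric ones.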