Set columns of DataFrame to sum of columns of another in pandas
I have a DataFrame that looks like the below, call this "values":
I would like to create another, call it "sums", in which each column holds the row-wise sum of "values" from that column through the last column. It would look like the below:
I would like to create this without looping through the entire DataFrame, data point by data point. I have been trying with .apply() as seen below, but I keep getting the error: unsupported operand type(s) for +: 'int' and 'datetime.date'
In [26]: values = pandas.DataFrame({0:[96,54,27,28],
    ...:     1:[55,75,32,37],2:[54,99,36,46],3:[35,77,0,10],4:[62,25,0,25],
    ...:     5:[0,66,0,89],6:[0,66,0,89],7:[0,0,0,0],8:[0,0,0,0]})
In [28]: sums = values.copy()
In [29]: sums.iloc[:,:] = ''
In [31]: for column in sums:
    ...:     sums[column].apply(sum(values.loc[:,column:]))
    ...:
Traceback (most recent call last):
  File "<ipython-input-31-030442e5005e>", line 2, in <module>
    sums[column].apply(sum(values.loc[:,column:]))
  File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\series.py", line 2220, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\src\inference.pyx", line 1088, in pandas.lib.map_infer (pandas\lib.c:63043)
TypeError: 'numpy.int64' object is not callable
In [32]: for column in sums:
    ...:     sums[column] = sum(values.loc[:,column:])
In [33]: sums
Out[33]:
    0   1   2   3   4   5   6   7  8
0  36  36  35  33  30  26  21  15  8
1  36  36  35  33  30  26  21  15  8
2  36  36  35  33  30  26  21  15  8
3  36  36  35  33  30  26  21  15  8
Is there a way to do this without looping over each point individually?
Without looping, you can reverse the column order of your DataFrame, take the cumsum along each row, and then reverse it back:
>>> values.iloc[:,::-1].cumsum(axis=1).iloc[:,::-1]
     0    1    2    3    4    5   6  7  8
0  302  206  151   97   62    0   0  0  0
1  462  408  333  234  157  132  66  0  0
2   95   68   36    0    0    0   0  0  0
3  324  296  259  213  203  178  89  0  0
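As a side note on why the second loop in the question printed 36, 36, 35, ... in every row: iterating over a DataFrame yields its column labels, so the built-in sum() adds up the labels 0 through 8 rather than the cell values. The sketch below shows that, plus a corrected column-by-column loop to compare against the reversed cumsum. The names label_sum, looped and vectorized are only illustrative.

```python
import pandas as pd

values = pd.DataFrame({0: [96, 54, 27, 28], 1: [55, 75, 32, 37],
                       2: [54, 99, 36, 46], 3: [35, 77, 0, 10],
                       4: [62, 25, 0, 25], 5: [0, 66, 0, 89],
                       6: [0, 66, 0, 89], 7: [0, 0, 0, 0],
                       8: [0, 0, 0, 0]})

# Built-in sum() iterates over the column labels of the slice,
# so this is 0 + 1 + ... + 8, not a sum of the data.
label_sum = sum(values.loc[:, 0:])  # 36

# Corrected loop: for each column, sum every row from that column to the end.
looped = pd.DataFrame({column: values.loc[:, column:].sum(axis=1)
                       for column in values.columns})

# Loop-free version: reverse the columns, cumsum per row, reverse back.
vectorized = values.iloc[:, ::-1].cumsum(axis=1).iloc[:, ::-1]

# True if both approaches produce the same numbers.
print((looped.values == vectorized.values).all())
```

The loop still works column by column rather than point by point, but the cumsum version does the whole thing in one expression.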
You can use the .cumsum() method to get the cumulative sum. The problem is that it operates from left to right, whereas you need it from right to left. So we will reverse your data frame, use cumsum(), then set the axes back into the proper order.
import pandas as pd
values = pd.DataFrame({0:[96,54,27,28],
    1:[55,75,32,37],2:[54,99,36,46],3:[35,77,0,10],4:[62,25,0,25],
    5:[0,66,0,89],6:[0,66,0,89],7:[0,0,0,0],8:[0,0,0,0]})
# reindex_axis was removed in pandas 1.0; reindex(columns=...) restores the original column order
values[values.columns[::-1]].cumsum(axis=1).reindex(columns=values.columns)
# returns:
     0    1    2    3    4    5   6  7  8
0  302  206  151   97   62    0   0  0  0
1  462  408  333  234  157  132  66  0  0
2   95   68   36    0    0    0   0  0  0
3  324  296  259  213  203  178  89  0  0
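If you need this in more than one place, the same reverse, cumsum, reverse idea can be wrapped in a small helper that works on the underlying NumPy array. This is only a sketch: reverse_cumsum is a made-up name, not a pandas function, and it assumes every column is numeric. Because it relies on column position rather than labels, it should behave the same whatever the columns are called.

```python
import pandas as pd

def reverse_cumsum(df):
    """Row-wise cumulative sum running from the last column back to the first.

    Hypothetical helper, assumes all columns are numeric.
    """
    arr = df.to_numpy()
    # Flip the column order, accumulate left to right, then flip back.
    out = arr[:, ::-1].cumsum(axis=1)[:, ::-1]
    return pd.DataFrame(out, index=df.index, columns=df.columns)

values = pd.DataFrame({0: [96, 54, 27, 28], 1: [55, 75, 32, 37],
                       2: [54, 99, 36, 46], 3: [35, 77, 0, 10],
                       4: [62, 25, 0, 25], 5: [0, 66, 0, 89],
                       6: [0, 66, 0, 89], 7: [0, 0, 0, 0],
                       8: [0, 0, 0, 0]})

sums = reverse_cumsum(values)
print(sums)
```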