
Sum of specific rows in a dataframe (Pandas)

I'm given the following set of data:

week  A      B      C      D      E
1     243    857    393    621    194
2     644    576    534    792    207
3     946    252    453    547    436
4     560    100    864    663    949
5     712    734    308    385    303

I'm asked to find the sum of each column for specified rows (a specified number of weeks), and then plot those numbers on a bar chart to compare A through E.

Assuming I have the rows I need (e.g. df.iloc[2:4,:]), what should I do next? My assumption is that I need to create a mask with a single row that includes the sum of each column, but I'm not sure how to go about doing that.

I know how to do the final step (i.e. .plot(kind='bar')); I just need to know the middle step that produces the sums I need.

To select by position, you can use iloc, then sum and Series.plot.bar:

df.iloc[2:4].sum().plot.bar()


Or, if you want to select by index labels (here, weeks), use loc:

df.loc[2:4].sum().plot.bar()


The difference is that loc selects by label and includes the end label, while iloc selects by zero-based position and excludes the end position:

print (df.loc[2:4])
        A    B    C    D    E
week                         
2     644  576  534  792  207
3     946  252  453  547  436
4     560  100  864  663  949

print (df.iloc[2:4])
        A    B    C    D    E
week                         
3     946  252  453  547  436
4     560  100  864  663  949
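
As a minimal sketch of the difference, the snippet below rebuilds the sample data from the question and compares the two sums. It assumes week is the index, as in the printed output above; the variable names by_position and by_label are only illustrative.

```python
import pandas as pd

# Sample data from the question, with week as the index
# (an assumption that matches the printed frames above).
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)

# Zero-based positions 2 and 3 (end excluded), i.e. weeks 3 and 4.
by_position = df.iloc[2:4].sum()

# Index labels 2 through 4 (end included), i.e. weeks 2, 3 and 4.
by_label = df.loc[2:4].sum()

print(by_position)
print(by_label)
```

Either result is a Series indexed by column name, so .plot.bar() can be chained onto it directly.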

If you also need to filter columns by position:

df.iloc[2:4, :4].sum().plot.bar()  

And by label (weeks for rows, names for columns):

df.loc[2:4, list('ABCD')].sum().plot.bar()
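
Note that the two one-liners above do not pick the same rows. The sketch below, again assuming week is the index, shows a label-based selection that matches the positional one, plus a list of labels for weeks that are not contiguous. The choice of weeks 1, 3 and 5 is only an illustration.

```python
import pandas as pd

# Sample data from the question, with week as the index (assumed).
df = pd.DataFrame(
    {
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="week"),
)

# Positions 2-3 and the first four columns: weeks 3 and 4, columns A to D.
pos_sums = df.iloc[2:4, :4].sum()

# The same rows and columns selected by label (both ends included).
label_sums = df.loc[3:4, list("ABCD")].sum()

# Weeks that are not contiguous can be passed to loc as a list of labels.
picked_sums = df.loc[[1, 3, 5], list("ABCD")].sum()

print(pos_sums)
print(picked_sums)
```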

All you need to do is call .sum() on your subset of the data:

df.iloc[2:4,:].sum()

Returns:

week       7
A       1506
B        352
C       1317
D       1210
E       1385
dtype: int64

Furthermore, for plotting, you can probably drop the week column (the sum of week numbers is unlikely to mean anything):

df.iloc[2:4,1:].sum().plot(kind='bar')
# or
df[list('ABCDE')].iloc[2:4].sum().plot(kind='bar')
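
Put together, a sketch of this approach might look like the following. It assumes week is an ordinary column, as in the output above. The set_index alternative and the plot_sums helper are illustrative additions, not part of the original answer, and plotting needs matplotlib installed.

```python
import pandas as pd

# Sample data from the question, with week as a regular column (assumed).
df = pd.DataFrame(
    {
        "week": [1, 2, 3, 4, 5],
        "A": [243, 644, 946, 560, 712],
        "B": [857, 576, 252, 100, 734],
        "C": [393, 534, 453, 864, 308],
        "D": [621, 792, 547, 663, 385],
        "E": [194, 207, 436, 949, 303],
    }
)

# Rows at positions 2-3, skipping column 0 (week) so it is not summed.
sums = df.iloc[2:4, 1:].sum()

# Alternative: move week into the index, then select weeks 3-4 by label.
sums_by_week = df.set_index("week").loc[3:4].sum()

print(sums)


def plot_sums(series):
    """Hypothetical helper: draw the column sums as a bar chart."""
    return series.plot(kind="bar")
```

Calling plot_sums(sums) would then draw one bar per column A to E.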
