Is there an easy way to sum the values of all the rows above the current row in an adjacent column? It's easier to see it than to explain it, so I've included a text version of the example below.
Text explanation: I'm trying to create a chart where column B is either the sum or the percent of total of all the rows in A that are above it. That way I can quickly see where the quartiles, thirds, etc. fall in the DataFrame. I'm familiar with the percentile function, but I'm not sure I can get it to do exactly what I want.
Text Version
1--1%
1--2%
4--6%
4--10%
2--12%
... and so on to 100 percent.
Do I need to write a for loop to do this?
You can use cumsum for this:
import pandas as pd

df = pd.DataFrame(data=dict(x=[13,22,34,21,33,41,87,24,41,22,18,12,13]))
# Running total of x divided by the grand total, as a percentage rounded to one decimal
df["percent"] = (100*df.x.cumsum()/df.x.sum()).round(1)
print(df)
Output:
     x  percent
0   13      3.4
1   22      9.2
2   34     18.1
3   21     23.6
4   33     32.3
5   41     43.0
6   87     65.9
7   24     72.2
8   41     82.9
9   22     88.7
10  18     93.4
11  12     96.6
12  13    100.0
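Since the goal was to see where the quartiles fall, here is a rough sketch of one way to read them off the cumulative percentage. It reuses the same sample data as above. The `running_total` column, the `thresholds` list, and the `crossings` dict are names made up for this illustration, and the 25/50/75 cut points are only an assumption about which breakpoints are wanted.

```python
import pandas as pd

df = pd.DataFrame({"x": [13, 22, 34, 21, 33, 41, 87, 24, 41, 22, 18, 12, 13]})

# Plain running sum, in case the total is wanted rather than the percentage
df["running_total"] = df["x"].cumsum()

# Unrounded cumulative percent, so rounding cannot shift a threshold crossing
cum_pct = 100 * df["running_total"] / df["x"].sum()

# Assumed breakpoints (quartiles); change these for thirds, deciles, etc.
thresholds = [25, 50, 75]

# For each threshold, idxmax() on the boolean mask returns the label of the
# first row whose cumulative percent is at or above that threshold
crossings = {t: int((cum_pct >= t).idxmax()) for t in thresholds}
print(crossings)
```

Note that cumsum includes the current row. If you want strictly the rows above the current one, `df["x"].cumsum() - df["x"]` gives that instead.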