I have a dataframe that looks like this:
Year Salary Amount
0 2019 1200 53
1 2020 3443 455
2 2021 6777 123
3 2019 5466 313
4 2020 4656 545
5 2021 4565 775
6 2019 4654 567
7 2020 7867 657
8 2021 6766 567
Python script to build the dataframe above:
import pandas as pd
import numpy as np
d = pd.DataFrame({
'Year': [
2019,
2020,
2021,
] * 3,
'Salary': [
1200,
3443,
6777,
5466,
4656,
4565,
4654,
7867,
6766
],
'Amount': [
53,
455,
123,
313,
545,
775,
567,
657,
567
]
})
I want to calculate certain percentile values for all the columns grouped by 'Year'. Desired output should look like -
I am running the Python script below to calculate the percentile values:
df_percentile = pd.DataFrame()
p_list = [0.05, 0.10, 0.25, 0.50, 0.75, 0.95, 0.99]
c_list = []
p_values = []
for cols in d.columns[1:]:
    for p in p_list:
        c_list.append(cols + '_' + str(p))
        p_values.append(np.percentile(d[cols], p))
print(len(c_list), len(p_values))
df_percentile['Name'] = pd.Series(c_list)
df_percentile['Value'] = pd.Series(p_values)
print(df_percentile)
Output -
Name Value
0 Salary_0.05 1208.9720
1 Salary_0.1 1217.9440
2 Salary_0.25 1244.8600
3 Salary_0.5 1289.7200
4 Salary_0.75 1334.5800
5 Salary_0.95 1370.4680
6 Salary_0.99 1377.6456
7 Amount_0.05 53.2800
8 Amount_0.1 53.5600
9 Amount_0.25 54.4000
10 Amount_0.5 55.8000
11 Amount_0.75 57.2000
12 Amount_0.95 58.3200
13 Amount_0.99 58.5440
How can I get the output in the required format without having to do extra data manipulation/formatting or in fewer lines of code?
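A side note on the script in the question: `np.percentile` takes percentages on a 0 to 100 scale, while `p_list` holds fractions between 0 and 1, which would explain why every value in the output sits just above the column minimum. A minimal sketch of the difference, using only the `Salary` values from the question (the variable names are illustrative):

```python
import numpy as np

# Salary values copied from the question's dataframe
salary = np.array([1200, 3443, 6777, 5466, 4656, 4565, 4654, 7867, 6766])

# np.percentile reads 0.05 as "the 0.05th percentile", not the 5th
tiny_percentile = np.percentile(salary, 0.05)

# Two equivalent ways to ask for the 5th percentile
fifth_percentile = np.percentile(salary, 5)
fifth_quantile = np.quantile(salary, 0.05)

print(tiny_percentile, fifth_percentile, fifth_quantile)
```

So either multiply the fractions by 100 before passing them to `np.percentile`, or use `np.quantile` (or the pandas `quantile` method), which expect fractions.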
You can try pivot followed by quantile:
(d.pivot(columns='Year')
  .quantile([0.01, 0.05, 0.75, 0.95, 0.99])
  .stack('Year')
)
Output:
            Salary  Amount
     Year
0.01 2019  1269.08   58.20
     2020  3467.26  456.80
     2021  4609.02  131.88
0.05 2019  1545.40   79.00
     2020  3564.30  464.00
     2021  4785.10  167.40
0.75 2019  5060.00  440.00
     2020  6261.50  601.00
     2021  6771.50  671.00
0.95 2019  5384.80  541.60
     2020  7545.90  645.80
     2021  6775.90  754.20
0.99 2019  5449.76  561.92
     2020  7802.78  654.76
     2021  6776.78  770.84
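As an alternative, grouping by 'Year' and calling `quantile` with a list should give the same numbers without the pivot step. This is a sketch rather than a drop-in answer: it assumes a pandas version in which `GroupBy.quantile` accepts a list of quantiles, and the names `quantiles` and `by_year` are only illustrative.

```python
import pandas as pd

# Same data as in the question
d = pd.DataFrame({
    'Year': [2019, 2020, 2021] * 3,
    'Salary': [1200, 3443, 6777, 5466, 4656, 4565, 4654, 7867, 6766],
    'Amount': [53, 455, 123, 313, 545, 775, 567, 657, 567],
})

# Quantiles as fractions between 0 and 1 (the list from the question)
quantiles = [0.05, 0.10, 0.25, 0.50, 0.75, 0.95, 0.99]

# One row per (Year, quantile) pair, one column per numeric column
by_year = d.groupby('Year').quantile(quantiles)
by_year.index.names = ['Year', 'Quantile']

print(by_year)
```

Here the index is ordered by year first and quantile second, which is the reverse of the pivot version; `by_year.swaplevel().sort_index()` would put the quantile level first if that layout is preferred.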