Adding new calculated columns in pandas data frame

Question

Assume I have a small data frame:

import pandas as pd
df = pd.DataFrame(
    [
        ["A", 28, 726, 120],
        ["B", 28, 1746, 250],
        ["C", 543, 15307, 4500],
    ],
    columns=["case", "x", "y", "z"],
)

I know how to calculate a total column as (for example):

cols = ["x", "y", "z"]  # numeric columns only; "case" holds strings
df['total'] = df.loc[:, cols].sum(axis=1)

Now I would like to append three more columns to df, x_pct, y_pct and z_pct, containing the percentage of x, y and z relative to total, that is: x_pct = 100 * (x / total), etc.

After that, I would like to append three further columns, x_pctr, y_pctr and z_pctr, containing the percentages rounded to whole numbers: round(x_pct), etc.

Although I know how to calculate x_pct, x_pctr and so on individually, I could not find a simple way to compute the three percentage columns in one step (and likewise the three rounded columns in one step), nor how to build a single data frame containing both the original columns and the new ones.

I am a little confused: I guess apply(lambda ...) would do the job, but I don't know how to use it here. Could you help me out?

Answer 1

Starting from the original df (before any total column is added), try:

df[["x_pctr", "y_pctr", "z_pctr"]] = (
    df.loc[:, "x":].div(df.sum(axis=1, numeric_only=True), axis=0) * 100
).round()
print(df)

Prints:

  case    x      y     z  x_pctr  y_pctr  z_pctr
0    A   28    726   120     3.0    83.0    14.0
1    B   28   1746   250     1.0    86.0    12.0
2    C  543  15307  4500     3.0    75.0    22.0
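
The output above contains only the rounded columns. If you also want to keep total and the unrounded percentages, as the question describes, something along these lines should work. This is a minimal sketch, assuming the numeric columns are exactly x, y and z; the lists cols, pct_cols and pctr_cols are names introduced here for illustration, and the apply(lambda ...) line is included only to show the row-wise equivalent the question asks about.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["A", 28, 726, 120],
        ["B", 28, 1746, 250],
        ["C", 543, 15307, 4500],
    ],
    columns=["case", "x", "y", "z"],
)

# Assumption: these are the only numeric columns; "case" holds strings.
cols = ["x", "y", "z"]
pct_cols = [f"{c}_pct" for c in cols]    # illustrative names for the new columns
pctr_cols = [f"{c}_pctr" for c in cols]

# Row totals over the numeric columns only.
df["total"] = df[cols].sum(axis=1)

# All three percentage columns in one step: divide each row by its total.
# The right-hand columns (x, y, z) are matched to pct_cols by position.
df[pct_cols] = df[cols].div(df["total"], axis=0).mul(100)

# All three rounded columns in one step, stored as integers.
df[pctr_cols] = df[pct_cols].round().astype(int)

# Row-wise equivalent with apply(lambda ...); typically slower than div().
pct_apply = df[cols].apply(lambda row: 100 * row / row.sum(), axis=1)

print(df)
```

The vectorised div(..., axis=0) form divides every row by its own total without a Python-level loop, so it is usually preferable to apply on larger frames; both give the same percentages here.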
