
Adding new calculated columns to a pandas data frame

Assume I have a small data frame:

import pandas as pd
df = pd.DataFrame(
    [
        ["A", 28, 726, 120],
        ["B", 28, 1746, 250],
        ["C", 543, 15307, 4500],
    ],
    columns=["case", "x", "y", "z"],
)

I know how to calculate a total column as (for example):

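# Sum only the numeric columns; including the string column "case"
# makes sum(axis=1) fail on recent pandas versions.
cols = ["x", "y", "z"]
df['total'] = df.loc[:, cols].sum(axis=1)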

Now I would like to append three more columns to df, x_pct, y_pct and z_pct, containing the percentage of x, y and z relative to total, that is, x_pct = 100 * (x / total), and so on.

After that, I would like to append three further columns, x_pctr, y_pctr and z_pctr, containing the percentages rounded to whole numbers: round(x_pct), and so on.

Although I know how to calculate x_pct, x_pctr and so on individually, I could not find a simple way to compute the three percentage columns in one step (and likewise the three rounded columns), nor how to build a single data frame containing both the original columns and the new ones.

I am a little confused, because I guess apply(lambda ...) would do the job, if only I knew how to use it. Could you help me out?

Try:

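# Starts from the original df (case, x, y, z), without a "total" column.
# numeric_only=True keeps the string column "case" out of the row sum.
df[["x_pctr", "y_pctr", "z_pctr"]] = (
    df.loc[:, "x":].div(df.sum(axis=1, numeric_only=True), axis=0) * 100
).round()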
print(df)

Prints:

  case    x      y     z  x_pctr  y_pctr  z_pctr
0    A   28    726   120     3.0    83.0    14.0
1    B   28   1746   250     1.0    86.0    12.0
2    C  543  15307  4500     3.0    75.0    22.0
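
The snippet above only produces the rounded columns. If you also want to keep total and the unrounded percentages, as the question asks, something along these lines should work. This is only a sketch: the names value_cols, pct, pctr and result are my own, and it assumes x, y and z are the only columns that make up the total.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["A", 28, 726, 120],
        ["B", 28, 1746, 250],
        ["C", 543, 15307, 4500],
    ],
    columns=["case", "x", "y", "z"],
)

# Assumption: these are the columns that contribute to the total.
value_cols = ["x", "y", "z"]

# Row totals over the numeric columns only.
df["total"] = df[value_cols].sum(axis=1)

# Divide every value column by the row total (aligned on the index),
# scale to percent, and rename x -> x_pct, y -> y_pct, z -> z_pct.
pct = df[value_cols].div(df["total"], axis=0).mul(100).add_suffix("_pct")

# Round to whole numbers and rename x_pct -> x_pctr, and so on.
pctr = pct.round().astype(int).add_suffix("r")

# One "global" frame: original columns, total, percentages, rounded percentages.
result = pd.concat([df, pct, pctr], axis=1)
print(result)
```

No apply(lambda ...) is needed here: div with axis=0 divides each row by its own total in one vectorized step, and add_suffix takes care of the column names.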
