Let's say I have a DataFrame:
import pandas as pd
data = [['1', 10], ['2', 15], ['3', 14]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['id', '# of Wagons'])
The output looks like:
id # of Wagons
0 1 10
1 2 15
2 3 14
How do I create percentages of the total while also keeping the total? If I use the .apply() function, the percentage is applied to every value in the column, including the total, which I want to avoid. My preferred output is:
id # of Wagons new_column
0 1 10 25.64%
1 2 15 38.46%
2 3 14 35.89%
Total 39
You can add the percentage based on the '# of Wagons' like so:
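import numpy as np
import pandas as pd
# Grand total of every value from the '# of Wagons' column onward
total = np.sum(df.loc[:, '# of Wagons':].values)
# Row-wise sum divided by the grand total, expressed as a percentage
df['percent'] = df.loc[:, '# of Wagons':].sum(axis=1) / total * 100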
df
And if you want to add a 'Total' row you can use this (DataFrame.append was removed in pandas 2.0, so pd.concat is used here):
pd.concat([df, df.sum(numeric_only=True).to_frame().T], ignore_index=True)
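To get the layout asked for in the question, one option is to compute the percentages before the total row exists and only then attach the total. The following is a minimal sketch under that assumption; the names total_row and result are made up for illustration, and the percent cell of the total row is left as an empty string on purpose.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame([['1', 10], ['2', 15], ['3', 14]], columns=['id', '# of Wagons'])

# Compute the percentages first, so the total row is never part of the calculation.
total = df['# of Wagons'].sum()
df['percent'] = (df['# of Wagons'] / total).map('{:.2%}'.format)

# Build a one-row frame for the total (hypothetical helper name: total_row)
# and leave its percent cell empty.
total_row = pd.DataFrame([{'id': 'Total', '# of Wagons': total, 'percent': ''}])
result = pd.concat([df, total_row], ignore_index=True)
print(result)
```

Because the percent column holds formatted strings, it is meant for display only; keep the numeric ratio in a separate column if further arithmetic is needed.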
You can use pd.Series.div, then use '{:.precision%}'.format to get the values as percentages.
df.assign(new_col = df['# of Wagons'].div(df['# of Wagons'].sum()).map('{:.2%}'.format))
id # of Wagons new_col
0 1 10 25.64%
1 2 15 38.46%
2 3 14 35.90%
Note: '{:.precision%}' is part of Python's format specification mini-language.
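As a small illustration of that format specifier on its own, outside pandas: the "%" presentation type multiplies the number by 100 and appends a percent sign, and the precision sets the digits after the decimal point. The variable names below are only for illustration.

```python
# One share from the question's data: 14 wagons out of 39.
share = 14 / 39

# ".2%" keeps two digits after the decimal point, ".1%" keeps one.
two_digits = '{:.2%}'.format(share)
one_digit = '{:.1%}'.format(share)

# The same specifier also works inside an f-string.
as_fstring = f'{share:.2%}'

print(two_digits, one_digit, as_fstring)
```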
We can do
df['New']=df['# of Wagons']/df['# of Wagons'].sum()
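# DataFrame.append was removed in pandas 2.0; pd.concat adds the same 'Total' row
df = pd.concat([df, pd.DataFrame([['Total', df['# of Wagons'].sum(), 1.0]], columns=df.columns)], ignore_index=True)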
df
Out[158]:
id # of Wagons New
0 1 10 0.256410
1 2 15 0.384615
2 3 14 0.358974
3 Total 39 1.000000
You can do something like this:
total = sum(df['# of Wagons'].values)
df["percentage"] = df['# of Wagons'].apply(lambda x: "{:.2f}%".format((x/total)*100))
print(df)
# id # of Wagons percentage
#0 1 10 25.64%
#1 2 15 38.46%
#2 3 14 35.90%
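If the total row is already in the DataFrame, which is the situation the question describes with .apply(), a boolean mask can restrict the calculation to the data rows. This is only a sketch: it assumes the total row is identified by the label 'Total' in the id column, and the names is_data and wagons are made up for illustration.

```python
import pandas as pd

# Hypothetical starting point: the total row is already part of the frame.
df = pd.DataFrame(
    [['1', 10], ['2', 15], ['3', 14], ['Total', 39]],
    columns=['id', '# of Wagons'],
)

# Select every row except the one labelled 'Total'.
is_data = df['id'] != 'Total'
wagons = df.loc[is_data, '# of Wagons']

# Start with an empty string everywhere, then fill in only the data rows.
df['percentage'] = ''
df.loc[is_data, 'percentage'] = (wagons / wagons.sum()).map('{:.2%}'.format)
print(df)
```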