
How to store values in a Pandas DataFrame as a percentage and not a string

I'm using pandas to create data frames which will then be imported into PowerBI for visualization. One of the columns in the data frame is a percentage calculation.

I have no issues calculating the values. However, these values appear without the '%' sign at the end, e.g. 55.2 as opposed to 55.2%.

An example of my initial dataframe:

df1 = 

year_per    pass    fail    total
---------------------------------
201901      300     700     1000
201902      400     600     1000
201903      200     800     1000
201904      500     500     1000

I then calculate two new columns giving the percentage of the total that each column represents, so that the new data frame is:

df2 = 

year_per    pass    fail    total    pass%    fail%
---------------------------------------------------
201901      300     700     1000     30.0     70.0
201902      400     600     1000     40.0     60.0
201903      200     800     1000     20.0     80.0
201904      500     500     1000     50.0     50.0

These new % columns are created using the following code:

df2['pass%'] = round((df1['pass'] / df1['total']) * 100,1)
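For reference, a minimal self-contained sketch of this step, assuming df2 starts out as a copy of df1 (the question does not show how df2 is created) and using the sample values from the tables above:

```python
import pandas as pd

# Sample data copied from the df1 table above.
df1 = pd.DataFrame({
    "year_per": [201901, 201902, 201903, 201904],
    "pass": [300, 400, 200, 500],
    "fail": [700, 600, 800, 500],
    "total": [1000, 1000, 1000, 1000],
})

# Assumption: df2 begins as a copy of df1.
df2 = df1.copy()

# Add a numeric percentage column for each of 'pass' and 'fail'.
for col in ("pass", "fail"):
    df2[col + "%"] = (df1[col] / df1["total"] * 100).round(1)

print(df2)
```

Both new columns keep a float dtype here, which is why PowerBI can treat them as numbers.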

Which works. PowerBI is happy to use those values. However, I'd like it to display the '%' sign at the end for clarity. Therefore, I updated the calculation code to:

df2['pass%'] = (round((df1['pass'] / df1['total']) * 100,1).astype(str))+'%'

This also produces the right output, visually. However, as the values are now strings, PowerBI can't process the new values as the visualization is expecting a number format, not a string.

I've also tried using the following formatting (as mentioned here: how to show Percentage in python ):

"{0:.1f}%".format()

ie:

df2['pass%'] = '{0:.1f}%'.format(round((df1['pass'] / df1['total']) * 100,1))

but get the error:

'TypeError: unsupported format string passed to Series.__format__'

Therefore, I was wondering if there is a way to store the values as a number format with the % sign following the numbers? Otherwise I'll just have to live with the values without the % sign.

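This happens because format receives a whole Series rather than a scalar number. A Series does not support a format spec such as .1f, which is what the TypeError from Series.__format__ is telling you. (round is not the culprit: it works element-wise on a Series, as your first snippet shows.) You can apply the formatting to each element instead: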

df2['pass%'] = (df1['pass'] / df1['total']).map(lambda num: '{0:.1f}%'.format(round(num * 100, 1)))
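As a rough, self-contained illustration of the difference, the sketch below first reproduces the TypeError on a small Series and then formats element by element with map. The names ratio, raised and labels are made up for this example.

```python
import pandas as pd

# Pass ratios taken from the sample data in the question.
ratio = pd.Series([0.30, 0.40, 0.20, 0.50])

# Formatting the whole Series with a float spec fails,
# because a Series does not understand '.1f'.
try:
    "{0:.1f}%".format(ratio * 100)
    raised = False
except TypeError:
    raised = True

# Formatting each element works, but yields strings, not numbers.
labels = (ratio * 100).map("{0:.1f}%".format)

print(raised)
print(labels.tolist())
```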

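Note, however, that contrary to the title of your question, this still stores the percentage as a string. A numeric column holds only the number itself, so the % sign has to come from formatting at display time.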
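One possible way to keep the data numeric is to store the value as a plain fraction and add the % sign only when displaying it. The sketch below is just that idea under some assumptions: the column name pass_frac is hypothetical, and it assumes the percentage formatting is then applied on the PowerBI side to a column holding fractions such as 0.30, which is worth verifying in your own report.

```python
import pandas as pd

# Sample data copied from the df1 table in the question.
df1 = pd.DataFrame({
    "year_per": [201901, 201902, 201903, 201904],
    "pass": [300, 400, 200, 500],
    "total": [1000, 1000, 1000, 1000],
})

df2 = df1.copy()

# Hypothetical column: the pass rate stored as a numeric fraction (0.30, not '30.0%').
df2["pass_frac"] = (df1["pass"] / df1["total"]).round(3)

# Display-only view with a % sign; the stored column stays numeric.
preview = df2["pass_frac"].map("{:.1%}".format)

print(df2.dtypes)
print(preview.tolist())
```

The stored column can still be summed, averaged or plotted, while the string version exists only for printing.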
