Modify and round numbers in a pandas dataframe in Python

Long story short, I have a csv file which I read as a pandas dataframe. The file contains a weather report, but all of the measurements for temperature are in Fahrenheit. I've figured out how to convert them:

import pandas as pd

df = pd.read_csv('report.csv')
df['average temperature'] = (df['average temperature'] - 32) * 5/9

But now the values in this column have up to six decimal places. I've found code that rounds all the data in the dataframe, but I only need to round this column.

df.round(2)

I don't like how it has to be a separate piece of code on a separate line and how it modifies all of my data. Is there a way to go about this problem more elegantly? Is there a way to apply this to other columns in my dataframe, such as maximum temperature and minimum temperature without having to copy the above piece of code?

To round only some columns, select them as a subset:

cols = ['maximum temperature','minimum temperature','average temperature']
df[cols] = df[cols].round(2)
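
As an alternative sketch, `DataFrame.round` also accepts a dict mapping column names to a number of decimals, so only the listed columns are rounded and the rest are returned unchanged. The data and the `humidity` column below are made up for illustration; only `average temperature` comes from the question.

```python
import pandas as pd

# Hypothetical data; 'humidity' is an invented column used to show
# that unlisted columns are left alone.
df = pd.DataFrame({
    'average temperature': [20.123456, 15.987654],
    'humidity': [0.456789, 0.654321],
})

# Round only 'average temperature' to 2 decimals.
# round() returns a new DataFrame, so the result is assigned to a name.
rounded = df.round({'average temperature': 2})
print(rounded)
```

This keeps the rounding in one call without selecting a subset first, which may be closer to what the question asks for.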

To convert and round only the columns in a list:

cols = ['maximum temperature','minimum temperature','average temperature']
df[cols] = ((df[cols] - 32) * 5/9).round(2)

To round each column separately:

df['average temperature'] = df['average temperature'].round(2)
df['maximum temperature'] = df['maximum temperature'].round(2)
df['minimum temperature'] = df['minimum temperature'].round(2)

Sample:

import numpy as np
import pandas as pd

df = (pd.DataFrame(np.random.randint(30, 100, (10, 3)),
                   columns=['maximum temperature','minimum temperature','average temperature'])
        .assign(a='m', b=range(10)))
print (df)
   maximum temperature  minimum temperature  average temperature  a  b
0                   97                   60                   98  m  0
1                   64                   86                   64  m  1
2                   32                   64                   95  m  2
3                   60                   56                   93  m  3
4                   43                   89                   64  m  4
5                   40                   62                   86  m  5
6                   37                   40                   70  m  6
7                   61                   33                   46  m  7
8                   36                   44                   46  m  8
9                   63                   30                   33  m  9

cols = ['maximum temperature','minimum temperature','average temperature']
df[cols] = ((df[cols] - 32) * 5/9).round(2)
print (df)
   maximum temperature  minimum temperature  average temperature  a  b
0                36.11                15.56                36.67  m  0
1                17.78                30.00                17.78  m  1
2                 0.00                17.78                35.00  m  2
3                15.56                13.33                33.89  m  3
4                 6.11                31.67                17.78  m  4
5                 4.44                16.67                30.00  m  5
6                 2.78                 4.44                21.11  m  6
7                16.11                 0.56                 7.78  m  7
8                 2.22                 6.67                 7.78  m  8
9                17.22                -1.11                 0.56  m  9

Here's a single-line solution with apply and a conversion function.

def convert_to_celsius (f):
    return 5.0/9.0*(f-32)

df[['Column A','Column B']] = df[['Column A','Column B']].apply(convert_to_celsius).round(2)
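
For a quick check of this approach, here is a minimal self-contained sketch using well-known reference temperatures. The data and the `station` column are invented for illustration; `Column A` and `Column B` are the placeholder names from the line above.

```python
import pandas as pd

def convert_to_celsius(f):
    # Works on a scalar or on a whole Series, since the arithmetic is vectorized.
    return 5.0 / 9.0 * (f - 32)

# Hypothetical Fahrenheit readings plus a non-numeric column that must stay untouched.
df = pd.DataFrame({
    'Column A': [32.0, 212.0],
    'Column B': [50.0, 98.6],
    'station': ['x', 'y'],
})

cols = ['Column A', 'Column B']
# apply() passes each selected column to the function as a Series.
df[cols] = df[cols].apply(convert_to_celsius).round(2)
print(df)
```

32 °F and 212 °F should come out as 0.0 and 100.0. Because the function is plain arithmetic, calling `convert_to_celsius(df[cols])` directly gives the same result without `apply`.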
