
How to round a DataFrame column to the same number of decimals as another column

I have a pandas dataframe like this:

import numpy as np
import pandas as pd

df = pd.DataFrame([
        {'A': 'aaa',  'B': 0.01,    'C': 0.00001,  'D': 0.00999999999476131,   'E': 0.00023191546403037534},
        {'A': 'bbb',  'B': 0.01,    'C': 0.0001,   'D': 0.010000000000218279,  'E': 0.002981781316158273},
        {'A': 'ccc',  'B': 0.1,     'C': 0.001,    'D': 0.0999999999999659,    'E': 0.020048115477145148},
        {'A': 'ddd',  'B': 0.01,    'C': 0.01,     'D': 0.019999999999999574,  'E': 0.397456279809221},
        {'A': 'eee',  'B': 0.00001, 'C': 0.000001, 'D': 0.09500000009999432,   'E': 0.06821282401091405},
    ])

     A          B            C                       D                         E
0  aaa       0.01      0.00001     0.00999999999476131    0.00023191546403037534
1  bbb       0.01       0.0001    0.010000000000218279      0.002981781316158273
2  ccc        0.1        0.001      0.0999999999999659      0.020048115477145148
3  ddd       0.01         0.01    0.019999999999999574         0.397456279809221
4  eee    0.00001     0.000001     0.09500000009999432       0.06821282401091405 

I have tried to round columns D and E to the same number of decimal places as the values in columns B and C without success.

I tried this:

    df['b_decimals'] = df['B'].astype(str).str.split('.').str[1].str.len()
    df['c_decimals'] = df['C'].astype(str).str.split('.').str[1].str.len()

    df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'])]
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]

but I get this error:

Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 56, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
AttributeError: 'float' object has no attribute 'round'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/_GITHUB/python/test.py", line 30, in <module>
    main()
  File "D:/_GITHUB/python/test.py", line 24, in main
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
  File "D:/_GITHUB/python/test.py", line 24, in <listcomp>
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 3007, in around
    return _wrapfunc(a, 'round', decimals=decimals, out=out)
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 66, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 46, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
TypeError: integer argument expected, got float

The problem is that the b_decimals and c_decimals columns contain NaN values when they are created:

     A        B          C                      D                        E  b_decimals   c_decimals
0  aaa     0.01    0.00001    0.00999999999476131   0.00023191546403037534           2          NaN
1  bbb     0.01     0.0001   0.010000000000218279     0.002981781316158273           2            4
2  ccc      0.1      0.001     0.0999999999999659     0.020048115477145148           1            3
3  ddd     0.01       0.01   0.019999999999999574        0.397456279809221           2            2
4  eee  0.00001   0.000001    0.09500000009999432      0.06821282401091405         NaN          NaN

What is the reason for this to happen when creating the columns? Is there another way to get the desired transformation like below?

        A          B           C           D           E
0     aaa       0.01     0.00001        0.01     0.00023
1     bbb       0.01      0.0001        0.01      0.0030
2     ccc        0.1       0.001         0.1       0.020
3     ddd       0.01        0.01        0.02        0.40
4     eee    0.00001    0.000001     0.09500    0.068213

I look forward to your answers, thanks!

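You can use -floor(log10(x)) to get the number of decimal places of a value such as 0.01 or 0.00001, that is, the position of its first non-zero digit after the decimal point (credit goes to @Willem van Onsem's answer). Note that this matches the decimal count only when the reference value has a single significant digit, as in columns B and C; for 0.25 it would return 1, not 2.

Then you can incorporate this into a lambda function that you apply row-wise: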

import numpy as np

# Round D to the decimals implied by B, and E to the decimals implied by C.
df['D'] = df.apply(lambda row: round(row['D'], int(-np.floor(np.log10(row['B'])))), axis=1)
df['E'] = df.apply(lambda row: round(row['E'], int(-np.floor(np.log10(row['C'])))), axis=1)

Result:

>>> df
     A       B         C      D         E
0  aaa  0.0100  0.000010  0.010  0.000230
1  bbb  0.0100  0.000100  0.010  0.003000
2  ccc  0.1000  0.001000  0.100  0.020000
3  ddd  0.0100  0.010000  0.020  0.400000
4  eee  0.0001  0.000001  0.095  0.068213

>>> df.values
array([['aaa', 0.01, 1e-05, 0.01, 0.00023],
       ['bbb', 0.01, 0.0001, 0.01, 0.003],
       ['ccc', 0.1, 0.001, 0.1, 0.02],
       ['ddd', 0.01, 0.01, 0.02, 0.4],
       ['eee', 0.0001, 1e-06, 0.095, 0.068213]], dtype=object)
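
As for why the original attempt produced NaN: converting a small float to text switches to scientific notation (0.00001 becomes '1e-05'), so there is no '.' to split on and `.str[1]` has nothing to return. A minimal sketch of that behaviour follows; the Series `s` is just an illustrative stand-in for columns B and C.

```python
import pandas as pd

# Illustrative values only: two that print in fixed notation, two that do not.
s = pd.Series([0.01, 0.0001, 0.00001, 0.000001])

# Floats below 1e-4 are rendered in scientific notation when converted to text.
as_text = s.astype(str)
print(as_text.tolist())

# Splitting on '.' yields a single piece for '1e-05', so element [1] is missing
# and the resulting length is NaN for those rows.
decimals = as_text.str.split('.').str[1].str.len()
print(decimals)
```

Because of the NaN, the helper column no longer holds plain integers in the pandas versions I have used, and `np.around` refuses a non-integer `decimals` argument, which is the `TypeError` in the traceback.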

I used part of Derek's solution above to build my own:

df['b_decimals'] = -np.floor(np.log10(df['B']))
df['c_decimals'] = -np.floor(np.log10(df['C']))

df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'].astype(int))]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'].astype(int))]

getting the following:

     A          B           C       D           E    b_decimals    c_decimals
0  aaa       0.01     0.00001    0.01     0.00023             2             5
1  bbb       0.01      0.0001    0.01       0.003             2             4
2  ccc        0.1       0.001     0.1        0.02             1             3
3  ddd       0.01        0.01    0.02         0.4             2             2
4  eee    0.00001    0.000001     0.1    0.068213             5             6
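
If the reference column may contain values that are not exact powers of ten (0.25, say), counting decimals through the standard-library `decimal` module is one possible alternative to log10. This is only a sketch under assumptions: `count_decimals` and `round_like` are hypothetical helper names, the sample frame is made up for illustration, and the reference values are assumed to be finite (no NaN).

```python
from decimal import Decimal

import pandas as pd


def count_decimals(value):
    """Number of digits after the decimal point in the float's shortest text form."""
    # str() gives text such as '0.25' or '1e-05'; Decimal parses both, and the
    # exponent of the parsed number is minus the count of decimal digits.
    exponent = Decimal(str(value)).as_tuple().exponent
    return max(0, -exponent)


def round_like(values, reference):
    """Round each value to the decimal count of the matching reference entry."""
    return [round(v, count_decimals(r)) for v, r in zip(values, reference)]


# Made-up sample: the last row uses a reference that is not a power of ten.
df = pd.DataFrame({
    'B': [0.01, 0.1, 0.00001, 0.25],
    'D': [0.00999999999476131, 0.0999999999999659, 0.09500000009999432, 0.123456],
})
df['D'] = round_like(df['D'], df['B'])
print(df)
```

Note that rounding changes the stored value only; whether trailing zeros such as 0.09500 are shown depends on how the frame is formatted for display.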
