I have a pandas dataframe like this:
df = pd.DataFrame([
{'A': 'aaa', 'B': 0.01, 'C': 0.00001, 'D': 0.00999999999476131, 'E': 0.00023191546403037534},
{'A': 'bbb', 'B': 0.01, 'C': 0.0001, 'D': 0.010000000000218279, 'E': 0.002981781316158273},
{'A': 'ccc', 'B': 0.1, 'C': 0.001, 'D': 0.0999999999999659, 'E': 0.020048115477145148},
{'A': 'ddd', 'B': 0.01, 'C': 0.01, 'D': 0.019999999999999574, 'E': 0.397456279809221},
{'A': 'eee', 'B': 0.00001, 'C': 0.000001, 'D': 0.09500000009999432, 'E': 0.06821282401091405},
])
A B C D E
0 aaa 0.01 0.00001 0.00999999999476131 0.00023191546403037534
1 bbb 0.01 0.0001 0.010000000000218279 0.002981781316158273
2 ccc 0.1 0.001 0.0999999999999659 0.020048115477145148
3 ddd 0.01 0.01 0.019999999999999574 0.397456279809221
4 eee 0.00001 0.000001 0.09500000009999432 0.06821282401091405
I tried to round columns D and E to the same number of decimal places as the values in columns B and C, without success.
I tried this:
df['b_decimals'] = df['B'].astype(str).str.split('.').str[1].str.len()
df['c_decimals'] = df['C'].astype(str).str.split('.').str[1].str.len()
df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'])]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
But I get this error:
Traceback (most recent call last):
File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 56, in _wrapfunc
return getattr(obj, method)(*args, **kwds)
AttributeError: 'float' object has no attribute 'round'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/_GITHUB/python/test.py", line 30, in <module>
main()
File "D:/_GITHUB/python/test.py", line 24, in main
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
File "D:/_GITHUB/python/test.py", line 24, in <listcomp>
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 3007, in around
return _wrapfunc(a, 'round', decimals=decimals, out=out)
File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 66, in _wrapfunc
return _wrapit(obj, method, *args, **kwds)
File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 46, in _wrapit
result = getattr(asarray(obj), method)(*args, **kwds)
TypeError: integer argument expected, got float
The problem is that the columns b_decimals and c_decimals end up containing NaN values when they are created:
A B C D E b_decimals c_decimals
0 aaa 0.01 0.00001 0.00999999999476131 0.00023191546403037534 2 NaN
1 bbb 0.01 0.0001 0.010000000000218279 0.002981781316158273 2 4
2 ccc 0.1 0.001 0.0999999999999659 0.020048115477145148 1 3
3 ddd 0.01 0.01 0.019999999999999574 0.397456279809221 2 2
4 eee 0.00001 0.000001 0.09500000009999432 0.06821282401091405 NaN NaN
Why does this happen when creating the columns? Is there another way to get the desired result, shown below?
A B C D E
0 aaa 0.01 0.00001 0.01 0.00023
1 bbb 0.01 0.0001 0.01 0.0030
2 ccc 0.1 0.001 0.1 0.020
3 ddd 0.01 0.01 0.02 0.40
4 eee 0.00001 0.000001 0.09600 0.068212
I read them.... thanks!
You can use a -log10 operation to obtain the number of decimal places up to the first nonzero digit (credit goes to @Willem van Onsem's answer here).
Then you can incorporate this into a lambda function that you apply row-wise:
import numpy as np
df['D'] = df.apply(lambda row: round(row['D'], int(-np.floor(np.log10(row['B'])))), axis=1)
df['E'] = df.apply(lambda row: round(row['E'], int(-np.floor(np.log10(row['C'])))), axis=1)
Result:
>>> df
A B C D E
0 aaa 0.0100 0.000010 0.010 0.000230
1 bbb 0.0100 0.000100 0.010 0.003000
2 ccc 0.1000 0.001000 0.100 0.020000
3 ddd 0.0100 0.010000 0.020 0.400000
4 eee 0.0001 0.000001 0.095 0.068213
>>> df.values
array([['aaa', 0.01, 1e-05, 0.01, 0.00023],
['bbb', 0.01, 0.0001, 0.01, 0.003],
['ccc', 0.1, 0.001, 0.1, 0.02],
['ddd', 0.01, 0.01, 0.02, 0.4],
['eee', 0.0001, 1e-06, 0.095, 0.068213]], dtype=object)
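As for why b_decimals and c_decimals contain NaN in the question: the likely cause is that Python renders small floats such as 0.00001 in scientific notation ('1e-05') when they are converted to strings. That string has no '.', so splitting on '.' yields a single piece and .str[1] has nothing to return. The NaN in turn makes the column float, which would explain the "integer argument expected, got float" error from np.around. A minimal sketch of this, using a small standalone Series rather than the original dataframe:

```python
import pandas as pd

# Three of the values that appear in columns B and C of the question.
s = pd.Series([0.01, 0.0001, 0.00001])

# Converting to text: the smallest value is written in scientific notation.
as_text = s.astype(str)
print(as_text.tolist())

# Same chain as in the question: no '.' in '1e-05', so there is no second piece.
decimals = as_text.str.split('.').str[1].str.len()
print(decimals.tolist())
```

The last entry of decimals comes out as NaN, matching the rows where the question's table shows NaN.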
I used part of Derek's solution above to build my own:
df['b_decimals'] = -np.floor(np.log10(df['B']))
df['c_decimals'] = -np.floor(np.log10(df['C']))
df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'].astype(int))]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'].astype(int))]
This gives the following:
A B C D E b_decimals c_decimals
0 aaa 0.01 0.00001 0.01 0.00023 2 5
1 bbb 0.01 0.0001 0.01 0.003 2 4
2 ccc 0.1 0.001 0.1 0.02 1 3
3 ddd 0.01 0.01 0.02 0.4 2 2
4 eee 0.00001 0.000001 0.1 0.068213 5 6
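One caveat with the -log10 approach: it returns the position of the first nonzero digit, which matches the number of decimal places only when the reference value has a single significant digit (0.01, 0.001, and so on). For a value such as 0.25 it gives 1 rather than 2. If that matters, a possible alternative is to count the decimals with the standard-library decimal module, which also parses scientific notation. The sketch below is only an illustration: count_decimals is a hypothetical helper, and the 0.25 and 0.123456 rows are made-up values added to show the difference.

```python
from decimal import Decimal

import numpy as np
import pandas as pd


def count_decimals(value):
    """Hypothetical helper: digits after the decimal point in str(value)."""
    # Decimal accepts both '0.01' and '1e-05', so scientific notation is fine.
    exponent = Decimal(str(value)).as_tuple().exponent
    return max(0, -exponent)


# First three rows reuse values from the question; the last row is illustrative.
df = pd.DataFrame({
    'B': [0.01, 0.1, 0.00001, 0.25],
    'D': [0.019999999999999574, 0.0999999999999659, 0.09500000009999432, 0.123456],
})

df['b_decimals'] = df['B'].map(count_decimals)
# The -log10 approach from the answers above, kept for comparison.
df['b_log10'] = (-np.floor(np.log10(df['B']))).astype(int)
df['D_rounded'] = [round(x, n) for x, n in zip(df['D'], df['b_decimals'])]

print(df)
```

Both columns agree for the values taken from the question; they differ only for the 0.25 row. Note that count_decimals relies on the shortest string representation of the float, so a value like 1.0 counts as one decimal place.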