[英]Pandas apply/map not working at some rows when custom function is used
I have the following DataFrame where the CMO _ ($ / MWH) is a finantial value with needs to be rounded using the old rounding method (eg: 754.275 must be rounded to 754.28):我有以下 DataFrame 其中CMO _ ($ / MWH)是一个需要使用旧舍入方法四舍五入的最终值(例如:754.275 必须四舍五入为 754.28):
INIT_DATE INIT_HOUR CMO_($/MWH)
0 2020-12-01 00:00:00 754.275
1 2020-12-01 01:00:00 728.130
2 2020-12-01 02:00:00 722.575
3 2020-12-01 03:00:00 722.045
4 2020-12-01 04:00:00 721.950
5 2020-12-01 05:00:00 721.035
6 2020-12-01 06:00:00 722.100
7 2020-12-01 07:00:00 739.925
8 2020-12-01 08:00:00 771.390
9 2020-12-01 09:00:00 797.415
10 2020-12-01 10:00:00 796.585
11 2020-12-01 11:00:00 791.875
12 2020-12-01 12:00:00 782.225
13 2020-12-01 13:00:00 783.540
14 2020-12-01 14:00:00 790.980
15 2020-12-01 15:00:00 815.555
16 2020-12-01 16:00:00 824.760
17 2020-12-01 17:00:00 782.265
18 2020-12-01 18:00:00 779.970
19 2020-12-01 19:00:00 784.640
20 2020-12-01 20:00:00 785.380
21 2020-12-01 21:00:00 785.840
22 2020-12-01 22:00:00 779.775
23 2020-12-01 23:00:00 763.775
Since Python 3+ uses the Bank's Rounding rule, I implemented a custom round function that makes the old rounding method:由于 Python 3+ 使用银行的舍入规则,我实现了一个自定义舍入 function 使旧舍入方法:
def _old_round(n):
"""
Since Python 3+ the rounding is made through the Banker's Round Rule,
this function uses the old rounding method (round up if 0.5).
Args:
n (float): A number with 3 thousands values after decimal point
Returns:
(float) : The rounded number now with 2 decimal places, but using
the old round method
Example:
>>> _old_round(782.225)
>>> 782.23
Additional:
For more information about rounding at Python 3, go to:
https://stackoverflow.com/questions/10825926/python-3-x-rounding-behavior
"""
# * Multiply ou number by 100 since we want to round the 2 decimal place
new_n = n * 100
# * Round using the Decimal lib, but using the old method
n_dec = Decimal(new_n).quantize(Decimal('1'), rounding=ROUND_HALF_UP)
# * Roll back to original format
return float(n_dec / 100)
The function works normally when I use only numbers (see function example above), but when I use the Pandas apply or map to perform the calculation in the CMO _ ($ / MWH) column, some values are not rounded correctly ( index 12 ), does anyone know why? The function works normally when I use only numbers (see function example above), but when I use the Pandas apply or map to perform the calculation in the CMO _ ($ / MWH) column, some values are not rounded correctly ( index 12 ) ,有人知道为什么吗? Find the DF after calculation bellow:在下面计算后找到 DF:
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].apply(_old_round)
df
Results:结果:
INIT_DATE INIT_HOUR CMO_($/MWH) NEW_CMO
0 2020-12-01 00:00:00 754.275 754.28
1 2020-12-01 01:00:00 728.130 728.13
2 2020-12-01 02:00:00 722.575 722.58
3 2020-12-01 03:00:00 722.045 722.05
4 2020-12-01 04:00:00 721.950 721.95
5 2020-12-01 05:00:00 721.035 721.04
6 2020-12-01 06:00:00 722.100 722.10
7 2020-12-01 07:00:00 739.925 739.93
8 2020-12-01 08:00:00 771.390 771.39
9 2020-12-01 09:00:00 797.415 797.42
10 2020-12-01 10:00:00 796.585 796.59
11 2020-12-01 11:00:00 791.875 791.88
12 2020-12-01 12:00:00 782.225 782.22 <
13 2020-12-01 13:00:00 783.540 783.54
14 2020-12-01 14:00:00 790.980 790.98
15 2020-12-01 15:00:00 815.555 815.56
16 2020-12-01 16:00:00 824.760 824.76
17 2020-12-01 17:00:00 782.265 782.27
18 2020-12-01 18:00:00 779.970 779.97
19 2020-12-01 19:00:00 784.640 784.64
20 2020-12-01 20:00:00 785.380 785.38
21 2020-12-01 21:00:00 785.840 785.84
22 2020-12-01 22:00:00 779.775 779.78
23 2020-12-01 23:00:00 763.775 763.78
The value at index 12 should be: 782.23 and not 782.22 , the function works for all the data but not for this specific index for some reason... I'm using a Miniconda Environment (Windows 10) with pandas 1.1.3, Python 3.6.12, jupyter-notebook 6.1.4 (I'm running the code here). The value at index 12 should be: 782.23 and not 782.22 , the function works for all the data but not for this specific index for some reason... I'm using a Miniconda Environment (Windows 10) with pandas 1.1.3, Python 3.6.12,jupyter-notebook 6.1.4(我在这里运行代码)。
Obs: I already tried to use Math.Ceil, Numpy.Ceil, Decimal Package Round Options, and the solutions at: Question 1 , Question 2 , Question 3 , none of them work properly. Obs:我已经尝试过使用 Math.Ceil、Numpy.Ceil、Decimal Package Round Options,以及以下解决方案: 问题 1 、 问题 2 、 问题 3 ,它们都不能正常工作。 I also already tried (the index 12 still not rounding):我也已经尝试过(索引 12 仍未四舍五入):
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].map(lambda x : _old_round(x))
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].apply(lambda x : _old_round(x))
For me your code is working perfectly, maybe you have a version problem in your packages.对我来说,您的代码运行良好,也许您的包中有版本问题。 I will show you:我给你看:
pandas
version: '1.0.5'.我的pandas
版本:'1.0.5'。decimal
version: '1.70'我的decimal
版本:'1.70'First I create your dataframe:首先,我创建您的 dataframe:
import pandas as pd
from decimal import Decimal, ROUND_HALF_UP
df = pd.DataFrame({
"CMO_($/MWH)" :
[
754.275,
728.130,
722.575,
722.045,
721.950,
721.035,
722.100,
739.925,
771.390,
797.415,
796.585,
791.875,
782.225,
783.540,
790.980,
815.555,
824.760,
782.265,
779.970,
784.640,
785.380,
785.840,
779.775,
763.775
]
})
Then I apply your function:然后我申请你的 function:
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].map(lambda x : _old_round(x))
The output is: output 是:
CMO_($/MWH) ROUNDED_CMO
0 754.275 754.28
1 728.130 728.13
2 722.575 722.58
3 722.045 722.05
4 721.950 721.95
5 721.035 721.04
6 722.100 722.10
7 739.925 739.93
8 771.390 771.39
9 797.415 797.42
10 796.585 796.59
11 791.875 791.88
12 782.225 782.23
13 783.540 783.54
14 790.980 790.98
15 815.555 815.56
16 824.760 824.76
17 782.265 782.27
18 779.970 779.97
19 784.640 784.64
20 785.380 785.38
21 785.840 785.84
22 779.775 779.78
23 763.775 763.78
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.