Applying function to a data frame in Pandas returns UnboundLocalError
My dataframe looks like this:
       Name             Overall Rating   Value in Millions
0      Neymar Jr        92               €105.5M
1      L. Messi         94               €95.5M
2      K. Mbappé        89               €93.5M
3      V. van Dijk      91               €90M
4      K. De Bruyne     91               €90M
...    ...              ...              ...
19692  I. Isa           63               €0
19693  I. Fetfatzidis   74               €0
19694  M. Mohsen        66               €0
19695  B. Jokič         72               €0
19696  B. Sigurðarson   73               €0
I am trying to apply a function to the 3rd column "Value in Millions" to convert values from string format to floats:
# A function to convert the values in the third column from strings to floats
def value_to_float(value_as_string):  # e.g. '€95.5M'
    value_as_string = value_as_string.strip('€')
    if 'M' in value_as_string:  # '95.5M' - string
        value_as_string = value_as_string.strip('M')  # '95.5' - string
        multiplier = float(value_as_string)  # 95.5 - float
        value_as_float = multiplier * 1000000  # 95500000.0 - float
    if 'K' in value_as_string:
        value_as_string = value_as_string.strip('K')
        multiplier = float(value_as_string)
        value_as_float = multiplier * 1000  # Same as above, in case of K (thousands)
    return value_as_float
The function works correctly when given an explicit parameter:
value_to_float('€95.5M')
95500000.0
However, when I try the following:
players["Value in Millions"].apply(value_to_float)
I get this error:
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
<ipython-input-80-3d3345f9405d> in <module>
----> 1 players["Value in Millions"].apply(value_to_float)
~/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3846 else:
3847 values = self.astype(object).values
-> 3848 mapped = lib.map_infer(values, f, convert=convert_dtype)
3849
3850 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-61-158745b17930> in value_to_float(value_as_string)
15 value_as_float = multiplier * 1000 #Same as above, in case of K(Thousands)
16
---> 17 return value_as_float
UnboundLocalError: local variable 'value_as_float' referenced before assignment
I tried several other methods (series.map(), old-school looping), but I always get the same error, so I'm inclined to think there's a gap in the logic somewhere.
Your issue stems from the fact that your function assumes that every value in "Value in Millions" contains an "M" or a "K". Your own dataframe sample above shows rows where the value is "€0". For those rows neither if branch runs, so value_as_float is never assigned, and the return statement raises the UnboundLocalError you posted. That is also why the explicit call with '€95.5M' works while applying the function to the whole column fails.
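To see this without pandas, here is a minimal sketch that runs a condensed copy of the question's function over three strings taken from the sample dataframe. The names samples and failing are made up for this illustration.

```python
def value_to_float(value_as_string):
    # Condensed version of the function from the question (same logic)
    value_as_string = value_as_string.strip('€')
    if 'M' in value_as_string:
        value_as_float = float(value_as_string.strip('M')) * 1000000
    if 'K' in value_as_string:
        value_as_float = float(value_as_string.strip('K')) * 1000
    return value_as_float

# Sample strings copied from the dataframe shown in the question
samples = ['€105.5M', '€90M', '€0']

# Collect the inputs for which neither branch assigns value_as_float
failing = []
for sample in samples:
    try:
        value_to_float(sample)
    except UnboundLocalError:
        failing.append(sample)

print(failing)
```

Only the value without an "M" or "K" suffix ends up in failing, which matches the traceback.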
Adjusting your function to set value_as_float to 0 by default fixes this issue:
def value_to_float(value_as_string):  # e.g. '€95.5M'
    value_as_string = value_as_string.strip('€')
    value_as_float = 0.0  # default for values without 'M' or 'K', e.g. '€0'
    if 'M' in value_as_string:  # '95.5M' - string
        value_as_string = value_as_string.strip('M')  # '95.5' - string
        multiplier = float(value_as_string)  # 95.5 - float
        value_as_float = multiplier * 1000000  # 95500000.0 - float
    if 'K' in value_as_string:
        value_as_string = value_as_string.strip('K')
        multiplier = float(value_as_string)
        value_as_float = multiplier * 1000  # Same as above, in case of K (thousands)
    return value_as_float
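One caveat with a default of 0: any value with no suffix at all (a hypothetical '€750', say) would also come back as 0. If that matters for your data, a variant that looks up the suffix and otherwise parses the plain number might look like the sketch below. The names MULTIPLIERS and value_to_float_by_suffix are invented for this example, and it assumes every value is a '€'-prefixed number with an optional single 'M' or 'K' at the end.

```python
# Hypothetical lookup table: trailing suffix -> multiplier
MULTIPLIERS = {'M': 1_000_000, 'K': 1_000}

def value_to_float_by_suffix(value_as_string):
    # Assumes input like '€95.5M', '€500K', '€0' or '€750'
    number = value_as_string.strip('€')
    if number and number[-1] in MULTIPLIERS:
        # Drop the suffix and scale by the matching multiplier
        return float(number[:-1]) * MULTIPLIERS[number[-1]]
    # No recognised suffix: parse the number as-is
    # (float() raises ValueError on anything unexpected)
    return float(number)

print(value_to_float_by_suffix('€95.5M'))
print(value_to_float_by_suffix('€0'))

# With the dataframe from the question, this would be applied the same way:
# players["Value in Millions"].apply(value_to_float_by_suffix)
```

Because unexpected input raises a ValueError instead of silently becoming 0, malformed rows are easier to spot.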