Creating a variable conditional on the value of another variable in Python
I'm trying to generate a variable whose value depends on the value of another variable. My dataset is urban_classification and I am trying to create the variable URBRUR based on the value of the variable prc_urbain. This is my code:
if urban_classification.prc_urbain>0.5 :
    urban_classification['URBRUR'] = "urban"
else:
    urban_classification['URBRUR'] = "rural"
and I get this error message:
Traceback (most recent call last):
File "C:\Users\Utilisateur\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-3-a94aadb86c32>", line 31, in <module>
if urban_classification.prc_urbain>0.5 :
File "C:\Users\Utilisateur\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\generic.py", line 1555, in __nonzero__
self.__class__.__name__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Can you tell me what I am doing wrong?
Thanks!
The error message:
The truth value of a Series is ambiguous.
comes from
if urban_classification.prc_urbain>0.5 :
because urban_classification.prc_urbain is a pd.Series, hence urban_classification.prc_urbain>0.5 is also a pd.Series made of True/False values, and Python cannot decide whether that whole collection of booleans should evaluate to True or False.
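The point above can be checked with a small sketch. The sample values below are made up for illustration and merely stand in for urban_classification.prc_urbain:

```python
import pandas as pd

# Hypothetical sample values standing in for urban_classification.prc_urbain
prc_urbain = pd.Series([0.2, 0.7, 0.5])

# The comparison is element-wise, so the result is a boolean Series
mask = prc_urbain > 0.5
print(mask.tolist())

# An `if` statement asks for a single truth value, which a Series refuses to give
try:
    if mask:
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Explicit reductions collapse the Series into one boolean
any_urban = bool(mask.any())  # is at least one value above 0.5?
all_urban = bool(mask.all())  # are all values above 0.5?
```

Neither reduction is what the question needs, since the goal is one label per row rather than one decision for the whole column.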
To achieve what you want, you can use pd.cut:
urban_classification["URBRUR"] = pd.cut(urban_classification.prc_urbain, [0, 0.5, 1], labels=["rural", "urban"], include_lowest=True)
Example:
import pandas as pd
s = pd.Series([0, 0.1, 0.45, 0.6, 0.8, 1])
pd.cut(s, [0, 0.5, 1], labels=("rural", "urban"), include_lowest=True)
0 rural
1 rural
2 rural
3 urban
4 urban
5 urban
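As an alternative to pd.cut (not part of the answer above), the same row-by-row labelling can be sketched with numpy.where, which stays closer to the original if/else. The data frame below is hypothetical; only the column names come from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical data frame; only the column names come from the question
urban_classification = pd.DataFrame({"prc_urbain": [0, 0.1, 0.45, 0.5, 0.6, 1]})

# np.where picks "urban" where the condition holds and "rural" elsewhere
urban_classification["URBRUR"] = np.where(
    urban_classification["prc_urbain"] > 0.5, "urban", "rural"
)

labels = urban_classification["URBRUR"].tolist()
print(labels)
```

One difference to keep in mind: pd.cut yields a categorical column and leaves values outside the bins as missing, whereas this sketch labels everything that is not greater than 0.5, including missing values, as "rural".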
Your variable urban_classification.prc_urbain is not a number you can directly compare to 0.5, but a pandas.Series object (basically a one-dimensional array).
The error you see is asking you to be more specific: do you want all the values in the array to be >0.5, any one of them, etc.
If you believe the array is composed of just one element, you only need to append [0] to the Series object, e.g.:
if urban_classification.prc_urbain[0] > 0.5:
    urban_classification['URBRUR'] = "urban"
else:
    urban_classification['URBRUR'] = "rural"
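A caveat on the single-element approach: on a Series with an integer index, [0] looks up the index label 0 rather than the first position, so it can fail when the index does not start at 0. A minimal sketch using .iloc[0] instead, with a hypothetical one-row data set:

```python
import pandas as pd

# Hypothetical one-row data set whose index label is 10, not 0
urban_classification = pd.DataFrame({"prc_urbain": [0.7]}, index=[10])

# .iloc[0] always takes the first row by position, whatever its label
first_value = urban_classification["prc_urbain"].iloc[0]

# A scalar comparison is unambiguous, so a plain conditional works
urban_classification["URBRUR"] = "urban" if first_value > 0.5 else "rural"
print(urban_classification)
```

This only makes sense when the column really holds a single value; with several rows, the one label would be assigned to every row.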