
Creating a variable conditional on the value of another variable in Python

I'm trying to generate a variable whose value depends on the value of another variable. My dataset is urban_classification and I am trying to create the variable URBRUR based on the value of the variable prc_urbain. This is my code:

if urban_classification.prc_urbain>0.5 :
    urban_classification['URBRUR'] = "urban"
else:
    urban_classification['URBRUR'] = "rural"

and I get this error message:

    Traceback (most recent call last):
  File "C:\Users\Utilisateur\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-a94aadb86c32>", line 31, in <module>
    if urban_classification.prc_urbain>0.5 :
  File "C:\Users\Utilisateur\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\generic.py", line 1555, in __nonzero__
    self.__class__.__name__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can you tell me what I am doing wrong?

Thanks!

The error message:

The truth value of a Series is ambiguous.

comes from

if urban_classification.prc_urbain>0.5 :

because urban_classification.prc_urbain is a pd.Series, so urban_classification.prc_urbain>0.5 is also a pd.Series of True/False values, and Python cannot determine whether this collection of booleans should evaluate to True or False.
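A minimal sketch of the behaviour described above. The sample values are hypothetical stand-ins for urban_classification.prc_urbain, not the asker's data:

```python
import pandas as pd

# Hypothetical sample values standing in for urban_classification.prc_urbain
prc_urbain = pd.Series([0.2, 0.7, 0.5])

# The comparison is element-wise, so the result is a boolean Series
mask = prc_urbain > 0.5
print(mask.tolist())

error_message = None
try:
    if mask:  # asking for a single truth value of a multi-element Series
        pass
except ValueError as err:
    error_message = str(err)

print(error_message)
```

The `if` statement needs one True/False answer, but the comparison produced one answer per row, which is why pandas refuses to guess.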

To achieve what you want, you can use pd.cut:

urban_classification["URBRUR"] = pd.cut(urban_classification.prc_urbain, [0, 0.5, 1], labels=["rural", "urban"], include_lowest=True)

Example:

import pandas as pd                                                     
s = pd.Series([0, 0.1, 0.45, 0.6, 0.8, 1])                              
pd.cut(s, [0, 0.5, 1], labels=("rural", "urban"), include_lowest=True)                       


0    rural
1    rural
2    rural
3    urban
4    urban
5    urban
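As an alternative that mirrors the original if/else more literally, the same labelling can probably be done with numpy.where. This is a sketch rather than part of the answer above; the DataFrame contents are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame
urban_classification = pd.DataFrame(
    {"prc_urbain": [0, 0.1, 0.45, 0.5, 0.6, 1]}
)

# Row by row: "urban" where the condition holds, "rural" otherwise
urban_classification["URBRUR"] = np.where(
    urban_classification["prc_urbain"] > 0.5, "urban", "rural"
)

print(urban_classification)
```

One difference to keep in mind: a missing value compares as False here and would be labelled "rural", whereas pd.cut leaves it missing.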

Your variable urban_classification.prc_urbain is not a number you can directly compare to 0.5, but a pandas.Series object (basically a one-dimensional array).

The error you see is asking you to be more specific: do you want all the values in the array to be >0.5, at least one of them, etc.
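For instance, the any() and all() methods named in the error message reduce the comparison to a single boolean. A small sketch with hypothetical values:

```python
import pandas as pd

# Hypothetical sample values standing in for urban_classification.prc_urbain
prc_urbain = pd.Series([0.2, 0.7, 0.5])
mask = prc_urbain > 0.5

any_urban = bool(mask.any())  # is at least one value above 0.5?
all_urban = bool(mask.all())  # is every value above 0.5?

print(any_urban, all_urban)
```

Either result can be used in an `if` statement, but note that it describes the column as a whole, not each row.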

If you believe the array is composed of just one element, you only need to append [0] to the Series object (this selects the element whose index label is 0), e.g.:

if urban_classification.prc_urbain[0] > 0.5:
    urban_classification['URBRUR'] = "urban"
else:
    urban_classification['URBRUR'] = "rural"
