
Creating a variable conditional on the value of another variable in Python

I'm trying to create a variable whose value depends on the value of another variable. My dataset is urban_classification, and I am trying to create the variable URBRUR based on the value of the variable prc_urbain. This is my code:

if urban_classification.prc_urbain>0.5 :
    urban_classification['URBRUR'] = "urban"
else:
    urban_classification['URBRUR'] = "rural"

and I get this error message:

Traceback (most recent call last):
  File "C:\Users\Utilisateur\AppData\Roaming\Python\Python37\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-a94aadb86c32>", line 31, in <module>
    if urban_classification.prc_urbain>0.5 :
  File "C:\Users\Utilisateur\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\generic.py", line 1555, in __nonzero__
    self.__class__.__name__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can you tell me what I am doing wrong?

Thanks!

The error message:

The truth value of a Series is ambiguous.

comes from

if urban_classification.prc_urbain>0.5 :

because urban_classification.prc_urbain is a pd.Series, so urban_classification.prc_urbain>0.5 is also a pd.Series of True/False values, and Python cannot decide whether this collection of booleans as a whole should evaluate to True or False.
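To see this in isolation, here is a minimal sketch. The sample values are made up and only stand in for urban_classification.prc_urbain; the comparison itself works fine, and it is the `if` that raises:

```python
import pandas as pd

# Hypothetical sample values standing in for urban_classification.prc_urbain
prc_urbain = pd.Series([0.2, 0.7, 0.5])

mask = prc_urbain > 0.5      # element-wise comparison, returns a boolean Series
print(mask.tolist())         # [False, True, False]

message = ""
try:
    if mask:                 # Python calls bool(mask) here, which pandas refuses
        pass
except ValueError as err:
    message = str(err)
    print(message)
```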

To achieve what you want, you can use pd.cut :

urban_classification["URBRUR"] = pd.cut(urban_classification.prc_urbain, [0, 0.5, 1], labels=["rural", "urban"], include_lowest=True)

Example:

import pandas as pd                                                     
s = pd.Series([0, 0.1, 0.45, 0.6, 0.8, 1])                              
pd.cut(s, [0, 0.5, 1], labels=("rural", "urban"), include_lowest=True)                       


0    rural
1    rural
2    rural
3    urban
4    urban
5    urban
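If you would rather keep the shape of your original if/else, one alternative (not pd.cut, but a different technique) is numpy.where, which applies the condition row by row. This is only a sketch; the DataFrame below is a made-up stand-in for your dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the urban_classification dataset
urban_classification = pd.DataFrame({"prc_urbain": [0, 0.1, 0.45, 0.5, 0.6, 1]})

# Row-wise equivalent of: "urban" if prc_urbain > 0.5 else "rural"
urban_classification["URBRUR"] = np.where(
    urban_classification["prc_urbain"] > 0.5, "urban", "rural"
)
print(urban_classification)
```

The two approaches differ at the edges: pd.cut returns a categorical column and leaves values outside the bins as NaN, while np.where labels everything that is not strictly greater than 0.5 (missing values included) as "rural".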

Your variable urban_classification.prc_urbain is not a single number whose comparison with 0.5 gives one True or False, but a pandas.Series object (basically a one-dimensional array).

The error you see is asking you to be more specific: do you want all the values in the array to be > 0.5, at least one of them, and so on?
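As a small illustration of those two readings, with made-up values in place of your column:

```python
import pandas as pd

prc_urbain = pd.Series([0.2, 0.7, 0.5])  # hypothetical values
mask = prc_urbain > 0.5

any_above = mask.any()   # True if at least one value exceeds 0.5
all_above = mask.all()   # True only if every value exceeds 0.5
print(any_above, all_above)  # True False
```

Either result is a single boolean, so it can be used in an `if` without raising the error.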

If you believe the array consists of just one element, you only need to append [0] to the Series object, e.g.:

if urban_classification.prc_urbain[0] > 0.5:
    urban_classification['URBRUR'] = "urban"
else:
    urban_classification['URBRUR'] = "rural"
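Keep in mind that this pattern assigns one label to the whole column, so it only makes sense for a single-row dataset. A quick sketch with a made-up three-row DataFrame shows the effect. I use `.iloc[0]` here, which picks the first element by position, whereas `[0]` looks up the index label 0; that swap is my assumption about what is intended:

```python
import pandas as pd

# Hypothetical data: only the first row is above 0.5
urban_classification = pd.DataFrame({"prc_urbain": [0.7, 0.2, 0.4]})

if urban_classification["prc_urbain"].iloc[0] > 0.5:
    urban_classification["URBRUR"] = "urban"   # scalar is broadcast to every row
else:
    urban_classification["URBRUR"] = "rural"

labels = urban_classification["URBRUR"].tolist()
print(labels)  # ['urban', 'urban', 'urban']
```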

