
Create a Boolean column based on a condition

I have a dataframe of 11 columns and I want to create a new 0,1 column based on values in two of those columns.

I have already used np.where to create other columns, but it doesn't work for this one.

train["location"] = np.where(3750901.5068 <= train["x"] <= 3770901.5068 
and -19268905.6133 <= train['y'] <= -19208905.6133, 1, 0)

I get this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I'm not sure you even need np.where here. To combine two Series with an element-wise logical AND, use & instead of and. See: Logical operators for boolean indexing in Pandas

Also, 3750901.5068 <= train["x"] <= 3770901.5068 is a chained comparison, which Python evaluates as (3750901.5068 <= train["x"]) and (train["x"] <= 3770901.5068). That again contains and, so it won't work. You'll need to either split it up explicitly, e.g. (3750901.5068 <= train["x"]) & (train["x"] <= 3770901.5068), or use Series.between, e.g. train["x"].between(3750901.5068, 3770901.5068). Both endpoints are inclusive by default; recent pandas versions spell this inclusive="both", and the older inclusive=True form has been deprecated. See: How to select rows in a DataFrame between two values, in Python Pandas?
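A small sketch of the point above, using a made-up three-row frame (only the column name "x" and the bounds come from the question; the row values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column name "x" comes from the question.
train = pd.DataFrame({"x": [3740000.0, 3760000.0, 3780000.0]})
lo, hi = 3750901.5068, 3770901.5068

# The chained comparison expands to `(lo <= x) and (x <= hi)`, and `and`
# asks for a single truth value of a whole Series, which raises ValueError.
try:
    lo <= train["x"] <= hi
    chained_failed = False
except ValueError:
    chained_failed = True

# Either of these evaluates element-wise instead.
explicit = (lo <= train["x"]) & (train["x"] <= hi)
with_between = train["x"].between(lo, hi)  # both endpoints inclusive by default
```

Both explicit and with_between are boolean Series with one entry per row.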

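If you write the comparisons out, you'll also need parentheses around each operand of &, because & binds more tightly than <=. The between calls don't need extra parentheses.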
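To see why the parentheses matter, here is a minimal sketch on a made-up Series (the values and bounds are illustrative only):

```python
import pandas as pd

s = pd.Series([1, 5, 10])

# Without parentheses, `s >= 2 & s <= 8` is parsed as `s >= (2 & s) <= 8`,
# which is a chained comparison and fails the same way `and` does.
try:
    s >= 2 & s <= 8
    unparenthesized_failed = False
except ValueError:
    unparenthesized_failed = True

# With parentheses, each comparison is evaluated first, then combined.
in_range = (s >= 2) & (s <= 8)
```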

So the end result should look like

train["location"] = train["x"].between(3750901.5068, 3770901.5068) & train["y"].between(-19268905.6133, -19208905.6133)

This will give you a Series of bools (True and False values), which already behave like 1s and 0s in arithmetic. If you really want integer 0s and 1s, convert the column, for example: train["location"] = train["location"].astype(int)
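Putting the pieces together, a runnable sketch might look like the following. The three rows are invented for illustration; only the column names and the bounds are taken from the question, and the extra location_np column is just there to show the np.where equivalent.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; only the column names and bounds come from the question.
train = pd.DataFrame({
    "x": [3760000.0, 3760000.0, 3700000.0],
    "y": [-19250000.0, -19100000.0, -19250000.0],
})

# Element-wise AND of two inclusive range checks.
in_box = (
    train["x"].between(3750901.5068, 3770901.5068)
    & train["y"].between(-19268905.6133, -19208905.6133)
)

train["location"] = in_box.astype(int)         # boolean mask -> 0/1
train["location_np"] = np.where(in_box, 1, 0)  # same result via np.where
```

Only the first row falls inside both ranges, so it is the only one flagged with 1.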

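You could also use isin, but note that it tests whether each value equals one of the listed values; it does not test a range, so it only helps when you are matching specific values. Also, yes, you need parentheses and & instead of and. Documentation for pandas.DataFrame.isin (the examples below call the Series version): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html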

For example:

import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [100, 110, 120, 111, 109], 'b': [120, 345, 124, 119, 127]})
df['c'] = np.where(df['a'].isin([100, 111]) & df['b'].isin([120, 128]), 1, 0)  # 1 where both columns match a listed value

In your case it would be:

train["location"] = np.where((train["x"].isin([3750901.5068, 3770901.5069])) & (train["y"].isin([-19268905.6133, -19268905.6132])), 1, 0)
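To see the difference between exact matching and a range check, here is a quick comparison on made-up numbers (not the question's data):

```python
import pandas as pd

s = pd.Series([100, 105, 111])

exact = s.isin([100, 111])    # True only where the value equals 100 or 111
ranged = s.between(100, 111)  # True for every value from 100 to 111 inclusive
```

The middle value, 105, is matched by between but not by isin, so for a bounding-box test like the one in the question a range check is likely what is wanted.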
