
Filling several NaN values, based on 2 conditions, with a certain numeric value

So basically, I've been trying to fill a column's NaN values based on another column.

Let's say I have a column called "accommodates" (how many people a given house can accommodate) and another column called "bedrooms".

To fill these NaN values, I looked up, for example, the most common value of accommodates when a house has 1 bedroom. The most common value turned out to be 2. What I want to do now is fill the NaN values in the accommodates column that correspond to a 1-bedroom house with 2.

An example of the data is below:

 accommodates bathrooms  bedrooms
    nan         2.0       1.0
    nan         2.0       1.0
    nan         2.0       1.0
    nan         2.0       1.0
    nan         2.0       1.0
    nan         2.0       1.0
    ...         ...       ...

I've done similar things for other attributes, so I tried the following code:

accom_cond=((house.bedrooms==1) & (house.accommodates.isna()))
accom_val= [2,2,2,2,2,2,2,2,2,2,2,2,2,2]

house.accommodates= np.select(accom_cond,accom_val,house.accommodates)

This assumes that there are 14 NaN values under these conditions (also, if you know a better way than repeating the value 2 fourteen times, I'd appreciate it :D)

However, it does not work. It returns the error:

ValueError: list of cases must be same length as list of conditions

I tried to print accom_cond to see what is going on and it returned this:

accom_cond
Out[156]: 
0       False
1       False
2       False
3       False
4       False
5       False
6       False
7       False
8       False
9       False
10      False
11      False
12      False
13      False
14      False
15      False
16      False
17      False
18      False
19      False
20      False
21      False
22      False
23      False
24      False
25      False
26      False
27      False
28      False
29      False
        ...

I don't get why it's not returning just the 14 null values that follow the conditions I defined.

Can anyone help me with this? (Thank you in advance for taking the time to read this!!)

Wrap the condition in a list and pass a single value as the only choice:

accom_cond = [(house.bedrooms == 1) & (house.accommodates.isna())]
accom_val = [2]

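As per the numpy.select documentation, the first parameter is the list of conditions and the second parameter is the list of choices, one choice per condition.
This means that, element by element, if the first condition is fulfilled the first choice is returned, else if the second condition is fulfilled the second choice is returned, and so on. Where no condition is fulfilled, the third parameter (the default) is returned.
In the question, the boolean Series itself was passed as the condition list, so its length (the number of rows) did not match the 14 choices, which is what raises the ValueError.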
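
To make this concrete, here is a minimal sketch of the fix. It assumes a small made-up DataFrame standing in for the asker's `house` data (the values below are invented for illustration only). It also shows why printing the mask gives one True/False per row rather than just the matching rows, and a `.loc` assignment that, as far as I can tell, does the same job without np.select.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's `house` DataFrame.
house = pd.DataFrame({
    "accommodates": [np.nan, 4.0, np.nan, 2.0, np.nan],
    "bathrooms":    [2.0, 1.0, 2.0, 1.0, 1.0],
    "bedrooms":     [1.0, 2.0, 1.0, 1.0, 3.0],
})

# The mask holds one True/False entry per row of the DataFrame,
# not one entry per matching row.
accom_cond = (house["bedrooms"] == 1) & (house["accommodates"].isna())

# Summing the mask counts the rows that satisfy both conditions.
n_matches = int(accom_cond.sum())

# np.select takes a list of conditions and a list of choices of equal length;
# a single scalar choice is broadcast to every row where the condition holds.
filled = np.select([accom_cond], [2], default=house["accommodates"])

# Alternative: assign through the mask with .loc, no list of repeated 2s needed.
house.loc[accom_cond, "accommodates"] = 2
```

Either way the value 2 only has to be written once, so there is no need to know in advance how many NaN rows match.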

