
How to count the rows across two DataFrames that match some conditions

count_0 = (X_train['school_state']== 'nc' & y_train['project_is_approved'] == 0).apply(len)

X_train is one numpy array and y_train is another numpy array.

X_train has a column school_state containing 51 state names in total, one of which is 'nc'. y_train has a single column, project_is_approved, whose value is either 0 or 1.

I want to count the rows where the state name is 'nc' and project_is_approved is 0.

With the code above I get this error:

IndexError: only integers, slices ( : ), ellipsis ( ... ), numpy.newaxis ( None ) and integer or boolean arrays are valid indices

Sample y_train: array([0, 1, 1, ..., 1, 1, 0], dtype=int64)

Sample X_train['school_state']:

47418 nc

49054 ca

35919 wi

34248 ca

15492 sd

31525 ks

36090 fl

43569 ny

9290 pa

12848 me

46189 la

33364 dc

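You need to add parentheses around each comparison, because & binds more tightly than ==. Since y_train is a NumPy array, compare it directly instead of indexing it with a column name (the string index is what raises the IndexError). Then use the sum method: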

((X_train['school_state'] == 'nc') & (y_train == 0)).sum()
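As a minimal sketch of the one-liner above, the following runs end to end on toy data. The five rows and their values are made up for illustration and are not the asker's real data; it assumes X_train is a pandas DataFrame and y_train is a 1-D NumPy array of the same length, in the same row order.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real X_train / y_train.
X_train = pd.DataFrame(
    {"school_state": ["nc", "ca", "nc", "wi", "nc"]},
    index=[47418, 49054, 35919, 34248, 15492],
)
y_train = np.array([0, 1, 1, 0, 0], dtype=np.int64)

# Each comparison is parenthesized so that & combines two boolean masks.
# The Series and the array are combined position by position.
mask = (X_train["school_state"] == "nc") & (y_train == 0)

# True counts as 1, so summing the mask counts the matching rows.
count_0 = int(mask.sum())
print(count_0)
```

Because the array carries no index, the two masks are matched by position rather than by label, so this only gives the right count if the rows of X_train and y_train are still in the same order.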
