How to count rows from two data frames based on some conditions

Question

count_0 = (X_train['school_state']== 'nc' & y_train['project_is_approved'] == 0).apply(len)

X_train is one NumPy array and y_train is another NumPy array.

X_train has a column school_state containing 51 state names in total, one of which is 'nc'. y_train has a single column, project_is_approved, whose values are either 0 or 1.

I want to count the rows where the state name is 'nc' and project_is_approved is 0.

With the code above I get this error:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

Sample y_train: array([0, 1, 1, ..., 1, 1, 0], dtype=int64)

Sample X_train['school_state']:

47418 nc
49054 ca
35919 wi
34248 ca
15492 sd
31525 ks
36090 fl
43569 ny
9290 pa
12848 me
46189 la
33364 dc

Answer 1

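You need to add parentheses around each comparison, because & binds more tightly than ==, and then you can use the sum method to count the matching rows. Since y_train is a plain NumPy array, compare it directly instead of indexing it by column name: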

((X_train['school_state'] == 'nc') & (y_train == 0)).sum()
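As a rough illustration of the expression above, here is a minimal sketch on made-up data. The six sample rows and the helper names is_nc and is_rejected are assumptions for demonstration only and do not come from the asker's dataset. It assumes X_train is a pandas DataFrame (the labelled sample output in the question suggests this) and y_train is a one-dimensional NumPy array of the same length.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the asker's X_train and y_train (not real data).
X_train = pd.DataFrame(
    {"school_state": ["nc", "ca", "wi", "nc", "sd", "nc"]},
    index=[47418, 49054, 35919, 34248, 15492, 31525],
)
y_train = np.array([0, 1, 1, 0, 1, 1], dtype=np.int64)

# Parenthesize each comparison so it is evaluated before `&`.
is_nc = X_train["school_state"] == "nc"  # boolean Series
is_rejected = y_train == 0               # boolean ndarray of the same length

# The Series and the array are combined position by position;
# summing the boolean result counts the True entries.
count_0 = int((is_nc & is_rejected).sum())
print(count_0)  # 2
```

Because y_train carries no index, the two masks are matched by position, so this only gives a meaningful count if the rows of X_train and y_train are in the same order.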

Solution 1
0 2019-12-30 12:52:43
