
Search through two 3-column numpy arrays and find where a criterion is met in Python

I have two numpy arrays like the following:

Population A

Score A Score B Answer
  1       0.3      1    
  0       0.5      0
  1       0.6      1     
  0       0.7      1
  1       0.9      1

Population B

Score A Score B Answer
  1       0.3      1    
  0       0.5      0
  1       0.6      1     
  1       0.7      1
  0       0.9      1

Sample results are
 Score B    Ratio
  0.3        1
  0.5        1
  0.6        1
  0.7        1

I have to find a threshold value of Score B in each population: any Score B value at or above the threshold becomes 1, otherwise 0. For example, if you pick 0.5 in population A, the first value is 0 and the rest are 1; similarly, if you pick 0.6 in population B, the first two values are 0 and the rest are 1.

I have to do this iteratively/algorithmically, possibly in a while loop I guess, and without creating or replacing Score B, such that

Ratio = (count(ScoreA == 1 & ScoreB == 1 & Answer == 1) in population A / count(ScoreA == 1 & ScoreB == 1 & Answer == 1) in population B) == 1

Note: Score B is sorted, so there is no need to worry about that.

You can use np.logical_and, or simply the * operator on numpy arrays, which performs element-wise multiplication.

import numpy as np

def count_match(v, A, B, C):
  # rows where Score A is 1, Score B is at least the threshold v, and Answer is 1
  cond = np.logical_and(A == 1, B >= v)
  cond = np.logical_and(cond, C == 1)
  # alternatively, use an element-wise product of the boolean masks:
  # cond = (A == 1) * (B >= v) * (C == 1)

  # count the number of True entries
  return np.sum(cond)

A_a = np.array([1, 0, 1, 0, 1])
B_a = np.array([0.3, 0.5, 0.6, 0.7, 0.9])
C_a = np.array([1, 0, 1, 1, 1])

A_b = np.array([1, 0, 1, 1, 0])
B_b = np.array([0.3, 0.5, 0.6, 0.7, 0.9])
C_b = np.array([1, 0, 1, 1, 1])
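for v in [0.3, 0.5, 0.6, 0.7, 0.9]:
  count_a = count_match(v, A_a, B_a, C_a)
  count_b = count_match(v, A_b, B_b, C_b)
  # population B has no match at v = 0.9, so guard against dividing by zero
  print(v, count_a / count_b if count_b else float("nan"))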
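To get from the printed ratios to the sample results in the question, one option is to keep only the thresholds where both counts are equal and nonzero, which is the same as the ratio being exactly 1. The following is a minimal sketch under that assumption; the helper name thresholds_with_equal_counts and the pop_a / pop_b tuples are made up for illustration and are not part of the original answer.

```python
import numpy as np

def count_match(v, score_a, score_b, answer):
    """Count rows with score_a == 1, score_b >= v and answer == 1."""
    return int(np.sum((score_a == 1) & (score_b >= v) & (answer == 1)))

def thresholds_with_equal_counts(pop_a, pop_b, candidates):
    """Return the candidate thresholds where both populations have the same nonzero count."""
    selected = []
    for v in candidates:
        n_a = count_match(v, *pop_a)
        n_b = count_match(v, *pop_b)
        # a ratio of exactly 1 means equal counts; skip thresholds where B has no match
        if n_b > 0 and n_a == n_b:
            selected.append(float(v))
    return selected

# (Score A, Score B, Answer) for each population, taken from the question
pop_a = (np.array([1, 0, 1, 0, 1]),
         np.array([0.3, 0.5, 0.6, 0.7, 0.9]),
         np.array([1, 0, 1, 1, 1]))
pop_b = (np.array([1, 0, 1, 1, 0]),
         np.array([0.3, 0.5, 0.6, 0.7, 0.9]),
         np.array([1, 0, 1, 1, 1]))

# use the Score B values themselves as candidate thresholds
result = thresholds_with_equal_counts(pop_a, pop_b, pop_a[1])
print(result)  # [0.3, 0.5, 0.6, 0.7]
```

Comparing the integer counts directly avoids both the division by zero and any floating-point comparison against 1.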

Note, by the way, that for both popA and popB the conditions on arrays A and C do not depend on v, so you can precompute them (although I doubt it changes the running time much).

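def count_match2(v, pre_cond, B):
  return np.sum(pre_cond * (B >= v))

# the conditions on A and C do not depend on v, so compute them once
pre_a = (A_a == 1) * (C_a == 1)
pre_b = (A_b == 1) * (C_b == 1)
for v in [0.3, 0.5, 0.6, 0.7, 0.9]:
  n_a, n_b = count_match2(v, pre_a, B_a), count_match2(v, pre_b, B_b)
  print(v, n_a / n_b if n_b else float("nan"))  # n_b is 0 at v = 0.9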
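If the explicit loop over thresholds is not a hard requirement, the same counts can probably be obtained for all thresholds at once with broadcasting. This is only a sketch of that idea, not the method from the answer above; the function name counts_for_all_thresholds is hypothetical, and it assumes the candidate thresholds are simply the Score B values.

```python
import numpy as np

def counts_for_all_thresholds(pre_cond, score_b, thresholds):
    """For each threshold, count rows where pre_cond holds and score_b >= threshold."""
    # mask[i, j] is True when row j passes threshold i and the precomputed condition
    mask = (score_b[None, :] >= thresholds[:, None]) & pre_cond[None, :]
    return mask.sum(axis=1)

A_a = np.array([1, 0, 1, 0, 1])
B_a = np.array([0.3, 0.5, 0.6, 0.7, 0.9])
C_a = np.array([1, 0, 1, 1, 1])

A_b = np.array([1, 0, 1, 1, 0])
B_b = np.array([0.3, 0.5, 0.6, 0.7, 0.9])
C_b = np.array([1, 0, 1, 1, 1])

thresholds = B_a  # assumption: candidate thresholds are the Score B values
counts_a = counts_for_all_thresholds((A_a == 1) & (C_a == 1), B_a, thresholds)
counts_b = counts_for_all_thresholds((A_b == 1) & (C_b == 1), B_b, thresholds)

# divide only where population B has at least one match; other entries stay nan
ratio = np.full(thresholds.shape, np.nan)
np.divide(counts_a, counts_b, out=ratio, where=counts_b > 0)
print(ratio)
```

The intermediate mask has one row per threshold and one column per data row, so memory grows with their product; for very long arrays the loop version may still be the better choice.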
