
Python: How to find N or more rows with non-zero cells in M or more common columns

I have an n x m matrix and I want to programmatically find N or more rows that contain non-zero cells in at least M common columns.

For example, here is the matrix:

matrix([[ 0.,  0.,  1.,  1.,  1.,  0.,  1.,  0.],
        [ 1.,  0.,  1.,  0.,  1.,  1.,  0.,  1.],
        [ 1.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  1.,  0.,  1.,  0.],
        [ 0.,  1.,  0.,  1.,  1.,  0.,  1.,  0.],
        [ 0.,  1.,  0.,  1.,  0.,  0.,  0.,  0.]])

I am looking for 2 or more rows that contain non-zero cells in 2 or more common columns. There are several possible results; one of them is:

row1: [ 1.,  0.,  1.,  0.,  1.,  1.,  0.,  1.],
row2: [ 1.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
      col0                     col5

Is it possible to find all row combinations that satisfy this condition?
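
To make the condition concrete: a group of rows qualifies when the number of columns that are non-zero in every row of the group reaches the threshold. The sketch below is an illustration only; the helper name `common_nonzero_columns` is an assumption and not part of the question, and the matrix is written as a plain list of lists rather than a NumPy matrix.

```python
def common_nonzero_columns(rows):
    """Return the indices of the columns that are non-zero in every given row."""
    # zip(*rows) yields one tuple per column; all() is True when no entry is zero.
    return [j for j, column in enumerate(zip(*rows)) if all(column)]


matrix = [[0., 0., 1., 1., 1., 0., 1., 0.],
          [1., 0., 1., 0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 0., 1., 0., 0.],
          [0., 0., 0., 0., 1., 0., 1., 0.],
          [0., 1., 0., 1., 1., 0., 1., 0.],
          [0., 1., 0., 1., 0., 0., 0., 0.]]

# Rows 1 and 2 (0-based) are the pair shown in the example above.
print(common_nonzero_columns([matrix[1], matrix[2]]))  # [0, 5]
```

Rows 1 and 2 share two non-zero columns, so they satisfy the "2 or more common columns" requirement.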

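from itertools import combinations

# Sample data: the question's matrix with a modified sixth row and an extra seventh row.
matrix = [[ 0.,  0.,  1.,  1.,  1.,  0.,  1.,  0.],
          [ 1.,  0.,  1.,  0.,  1.,  1.,  0.,  1.],
          [ 1.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
          [ 0.,  0.,  0.,  0.,  1.,  0.,  1.,  0.],
          [ 0.,  1.,  0.,  1.,  1.,  0.,  1.,  0.],
          [ 0.,  1.,  0.,  1.,  1.,  0.,  0.,  0.],
          [ 0.,  1.,  0.,  1.,  1.,  0.,  0.,  1.]]

# m is used both as the number of rows per combination and as the
# minimum number of common non-zero columns, so only groups of exactly m rows are checked.
m = 2

for x in combinations(matrix, m):
    # zip(*x) walks the columns; a column counts when none of the chosen rows has a zero in it.
    if sum(1 for col in zip(*x) if 0 not in col) >= m:
        print(x)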

Output:

([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0])
([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0])
([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0])
([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0])
([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0])
([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
([0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0])
([0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0])
([0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
([0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
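
The output above prints whole rows, which makes it hard to tell which rows matched and on which columns. As a hedged variation on the same pairwise loop, the sketch below reports row indices and the shared column indices instead. The function name `qualifying_pairs` and the parameter `min_cols` are hypothetical, and the matrix is the seven-row sample from the code above.

```python
from itertools import combinations

matrix = [[0., 0., 1., 1., 1., 0., 1., 0.],
          [1., 0., 1., 0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 0., 1., 0., 0.],
          [0., 0., 0., 0., 1., 0., 1., 0.],
          [0., 1., 0., 1., 1., 0., 1., 0.],
          [0., 1., 0., 1., 1., 0., 0., 0.],
          [0., 1., 0., 1., 1., 0., 0., 1.]]


def qualifying_pairs(rows, min_cols):
    """Yield (i, j, shared) for row pairs sharing at least min_cols non-zero columns."""
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        # Column k is shared when both rows hold a non-zero value there.
        shared = [k for k, (x, y) in enumerate(zip(a, b)) if x and y]
        if len(shared) >= min_cols:
            yield i, j, shared


for i, j, shared in qualifying_pairs(matrix, 2):
    print(i, j, shared)
```

The first line printed is `0 1 [2, 4]`: rows 0 and 1 are both non-zero in columns 2 and 4.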
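
from pprint import pprint
from itertools import combinations

# The question's matrix, written with integers.
matrix = [[0, 0, 1, 1, 1, 0, 1, 0],
          [1, 0, 1, 0, 1, 1, 0, 1],
          [1, 0, 0, 0, 0, 1, 0, 0],
          [0, 0, 0, 0, 1, 0, 1, 0],
          [0, 1, 0, 1, 1, 0, 1, 0],
          [0, 1, 0, 1, 0, 0, 0, 0]]

def solve(lst, m):
    # m is both the minimum number of rows and the minimum number of common columns.
    n = len(lst)
    # col[i] lists the indices of the non-zero columns in row i.
    col = {i: [j for j, y in enumerate(x) if y] for i, x in enumerate(lst)}

    # Try the largest row subsets first, down to subsets of m rows.
    for s in range(n, m - 1, -1):
        for c in combinations(range(n), s):
            values = set(col[c[0]]).intersection(*(col[k] for k in c[1:]))
            if len(values) >= m:
                yield [lst[k] for k in c]

for x in solve(matrix, 2):
    pprint(x)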

Output:

[[0, 0, 1, 1, 1, 0, 1, 0],
 [0, 0, 0, 0, 1, 0, 1, 0],
 [0, 1, 0, 1, 1, 0, 1, 0]]
[[0, 0, 1, 1, 1, 0, 1, 0], [1, 0, 1, 0, 1, 1, 0, 1]]
[[0, 0, 1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 1, 0, 1, 0]]
[[0, 0, 1, 1, 1, 0, 1, 0], [0, 1, 0, 1, 1, 0, 1, 0]]
[[1, 0, 1, 0, 1, 1, 0, 1], [1, 0, 0, 0, 0, 1, 0, 0]]
[[0, 0, 0, 0, 1, 0, 1, 0], [0, 1, 0, 1, 1, 0, 1, 0]]
[[0, 1, 0, 1, 1, 0, 1, 0], [0, 1, 0, 1, 0, 0, 0, 0]]
