Find all sets of matches in two numpy arrays
I have 2 numpy arrays as follows:
#[ 3 5 6 8 8 9 9 9 10 10 10 11 11 12 13 14] #rows
#[11 7 11 4 7 2 4 7 2 4 7 4 7 7 11 11] #cols
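I want to find all sets of matches, for example:
3 6 13 14 from rows match 11 in cols
5 8 9 10 11 12 from rows match 2 4 7 in cols
Is there a direct numpy way to do this? There are no blank values, and rows and cols will always be the same size.
What I have tried (it uses loops and is not the most efficient):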
#first get array of indices, sorted by unique element
idx_sort = np.argsort(cols)
# sorts records array so all unique elements are together
sorted_records_array = cols[idx_sort]
# returns the unique values, the index of the first occurrence of a value, and the count for each element
vals, idx_start, count = np.unique(sorted_records_array, return_counts=True, return_index=True)
# splits the indices into separate arrays
res = np.split(idx_sort, idx_start[1:])
# Using loops, intersect the row groups and concatenate those that overlap:
for cntr, itm in enumerate(res):
    idx = rows[itm]
    for cntr2, itm2 in enumerate(res):
        if cntr != cntr2:
            intersectItems = np.intersect1d(rows[itm], rows[itm2])
            if intersectItems.size > 0:
                # print('intersectItems', intersectItems)
                res[cntr] = np.unique(np.concatenate((res[cntr], res[cntr2]), axis=0))
I would still need to find and remove duplicates afterwards, since my output is [3 6 13 14],[11 11 11 11] ...
IIUC, you can do this:
import numpy as np
rows = np.array([3, 5, 6, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 12, 13, 14]) # rows
cols = np.array([11, 7, 11, 4, 7, 2, 4, 7, 2, 4, 7, 4, 7, 7, 11, 11]) # cols
matches = {row: col for col, row in zip(cols[::-1], rows[::-1])}
print(matches)
Output
{14: 11, 13: 11, 12: 7, 11: 4, 10: 2, 9: 2, 8: 4, 6: 11, 5: 7, 3: 11}
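The reversed slices matter here: a dict comprehension keeps the last value written for a repeated key, so iterating backwards leaves each row paired with the column from its first occurrence. A minimal sketch of that behaviour on a shortened input (plain lists and the names `forward` and `backward` are used only for illustration):

```python
# Row 8 appears twice, first with column 4 and then with column 7.
rows = [8, 8, 9]
cols = [4, 7, 2]

# Forward iteration: the later pair (8, 7) overwrites (8, 4).
forward = {r: c for r, c in zip(rows, cols)}

# Backward iteration: (8, 4) is written last, so the first occurrence wins.
backward = {r: c for r, c in zip(rows[::-1], cols[::-1])}

print(forward)   # {8: 7, 9: 2}
print(backward)  # {9: 2, 8: 4}
```

Either way only one column survives per row, which is why the dictionary above shows a single column for rows such as 8, 9 and 10.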
It may be easier to understand by looking at the inverse dictionary:
from collections import defaultdict
d = defaultdict(list)
for key, value in matches.items():
    d[value].append(key)
d = dict(d)
print(d)
Output (inverse dictionary)
{11: [14, 13, 6, 3], 7: [12, 5], 4: [11, 8], 2: [10, 9]}
From the above you can see that 14, 13, 6, 3 match 11, and that 12, 5, 11, 8, 10, 9 match 7, 4, 2.
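That last grouping is read off by eye: rows 8 to 11 each pair with several column values, and that is what ties 2, 4 and 7 together. One way to automate the step is a small union-find over the (row, col) pairs. The following is only a sketch under that assumption; `match_sets`, `find`, `union` and the `"r"`/`"c"` node tags are illustrative names and not part of the answer above.

```python
from collections import defaultdict

rows = [3, 5, 6, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 12, 13, 14]
cols = [11, 7, 11, 4, 7, 2, 4, 7, 2, 4, 7, 4, 7, 7, 11, 11]


def match_sets(rows, cols):
    """Group row and column values linked by at least one (row, col) pair."""
    parent = {}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Tag each node so a row value and a column value that happen to be
    # equal (11 occurs in both arrays) are treated as different nodes.
    for r, c in zip(rows, cols):
        union(("r", r), ("c", c))

    # Collect the row values and column values under each root.
    groups = defaultdict(lambda: (set(), set()))
    for kind, value in list(parent):
        row_set, col_set = groups[find((kind, value))]
        (row_set if kind == "r" else col_set).add(value)

    return sorted((sorted(r), sorted(c)) for r, c in groups.values())


for row_group, col_group in match_sets(rows, cols):
    print(row_group, "match", col_group)
# [3, 6, 13, 14] match [11]
# [5, 8, 9, 10, 11, 12] match [2, 4, 7]
```

This works on plain Python lists; numpy arrays can be passed as well, since the function only iterates over the pairs.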
You can use a dict comprehension (with a nested list comprehension):
rows = [3, 5, 6, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 12, 13, 14]
cols = [11, 7, 11, 4, 7, 2, 4, 7, 2, 4, 7, 4, 7, 7, 11, 11]
>>> {c: [r for i, r in enumerate(rows) if cols[i]==c] for c in cols}
{11: [3, 6, 13, 14], 7: [5, 8, 9, 10, 11, 12], 4: [8, 9, 10, 11], 2: [9, 10]}
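Note that the row lists in this result overlap (8, 9, 10 and 11 appear under several columns), so it does not yet give the two sets asked for in the question. A possible follow-up, sketched here with a hypothetical helper `merge_overlapping`, merges every pair of columns whose row lists share a row:

```python
rows = [3, 5, 6, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 12, 13, 14]
cols = [11, 7, 11, 4, 7, 2, 4, 7, 2, 4, 7, 4, 7, 7, 11, 11]

by_col = {c: [r for i, r in enumerate(rows) if cols[i] == c] for c in cols}


def merge_overlapping(by_col):
    """Merge column keys whose row lists share at least one row."""
    merged = []  # (col_set, row_set) pairs whose row sets never overlap
    for col, row_list in by_col.items():
        col_set, row_set = {col}, set(row_list)
        disjoint = []
        for other_cols, other_rows in merged:
            if row_set & other_rows:
                # A shared row links the groups: absorb the existing one.
                col_set |= other_cols
                row_set |= other_rows
            else:
                disjoint.append((other_cols, other_rows))
        merged = disjoint + [(col_set, row_set)]
    return [(sorted(c), sorted(r)) for c, r in merged]


print(merge_overlapping(by_col))
# [([11], [3, 6, 13, 14]), ([2, 4, 7], [5, 8, 9, 10, 11, 12])]
```

Both the comprehension and this merge are quadratic in the worst case, which is fine for inputs of this size but may be slow for large arrays.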