
Shuffle array by group in Python

Let's assume I have two arrays:

values = [1,2,3,4,5,6,7,8,9]
groups = [0,0,0,1,1,2,2,3,4]

Is it possible to shuffle "values" only within groups? E.g., elements in group 0 (1,2,3) are shuffled only with each other, elements in group 1 (4,5) are shuffled only with each other, and so on.

I have huge NumPy arrays; is there an efficient way to do this?

You can do it this way:

import numpy as np
np.random.seed(133)

values = np.array([1,2,3,4,5,6,7,8,9])
groups = np.array([0,0,0,1,1,2,2,3,4])

for index in np.unique(groups):
    mask = groups==index
    values[mask] = np.random.permutation(values[mask])

print(values)

Output:

[3 1 2 5 4 6 7 8 9]
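
For very large arrays with many groups, the Python-level loop over np.unique(groups) can become the bottleneck. The following is a minimal vectorized sketch, not part of the answer above: it sorts by group with a random tie-breaker via np.lexsort and writes the result back to each group's original positions. The function name shuffle_within_groups and its rng parameter are made up for illustration, and the seed is arbitrary.

```python
import numpy as np

def shuffle_within_groups(values, groups, rng=None):
    """Return a copy of `values` shuffled only among equal `groups` labels.

    Hypothetical helper: the name and signature are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    groups = np.asarray(groups)
    # Positions of every element, gathered group by group (original order kept).
    slots = np.argsort(groups, kind="stable")
    # The same grouping, but each group's members are ordered by a random key.
    # np.lexsort uses the LAST key as the primary sort key.
    picks = np.lexsort((rng.random(len(groups)), groups))
    out = np.empty_like(values)
    out[slots] = values[picks]  # fill each group's slots with its own values
    return out

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
groups = np.array([0, 1, 0, 1, 0, 2, 3, 3, 2])  # need not be sorted
shuffled = shuffle_within_groups(values, groups, rng=np.random.default_rng(133))
print(shuffled)
```

This does two O(n log n) sorts regardless of how many groups there are, so it should pay off mainly when the number of distinct groups is large. With only a handful of groups, the masked loop may be just as fast.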

Assuming that your group numbers are always in ascending order, you can leverage the fact that Python's sort is stable: shuffle the values/groups as a whole, then sort the result by group only. Combine the group numbers and values into a single list of tuples and shuffle it. Then sort that list of tuples using only the group as the sort key, and extract only the value part:

values = [1,2,3,4,5,6,7,8,9]
groups = [0,0,0,1,1,2,2,3,4]

import random

shuffled = random.sample([*zip(groups,values)],len(values))
values   = [v for g,v in sorted(shuffled,key=lambda gs:gs[0])]

print(values)
print(groups)

[3, 1, 2, 5, 4, 7, 6, 8, 9]
[0, 0, 0, 1, 1, 2, 2, 3, 4]
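
The approach above depends on sort stability: items that compare equal under the sort key keep the relative order they had before sorting, so the random order produced by the shuffle survives inside each group. A small deterministic illustration (the sample data here is made up):

```python
# (group, value) pairs in some already "shuffled" order
pairs = [(1, "b"), (0, "x"), (1, "a"), (0, "y")]

# Sort by group only; ties keep their incoming order because sorted() is stable.
by_group = sorted(pairs, key=lambda gv: gv[0])
print(by_group)  # [(0, 'x'), (0, 'y'), (1, 'b'), (1, 'a')]
```

Note that "b" still precedes "a" within group 1: the sort never compares the values, only the group numbers.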

If your group identifiers are not ordered (or not consecutive), you will need to form groups (of indexes), shuffle them group by group, and place the shuffled values at the specific subset of positions corresponding to each group:

values = [1,2,3,4,5,6,7,8,9]
groups = [0,1,0,1,0,2,3,3,2]

import random

gIndex = dict() # grouping dictionary {groupId:[indexes]}
for i,g in enumerate(groups): 
    gIndex.setdefault(g,[]).append(i) # value indexes by group id
shuffled = [None]*len(groups)         # resulting shuffled value list
for indexes in gIndex.values():       # shuffle indexes by group
    for i,j in zip(indexes,random.sample(indexes,len(indexes))):
        shuffled[i] = values[j]       # map old positions to new position
        
print(values)
print(groups)
print(shuffled)

[1, 2, 3, 4, 5, 6, 7, 8, 9] # original order
[0, 1, 0, 1, 0, 2, 3, 3, 2] # group identifiers
[3, 4, 1, 2, 5, 9, 8, 7, 6] # shuffled order (within groups)

You could also do it using the first sorting technique, but it wouldn't be as efficient as using a dictionary:

indexes    = sorted((g,i) for i,g in enumerate(groups))
newIndexes = sorted(random.sample(indexes,len(indexes)),key=lambda gi:gi[0])
shuffled   = [None]*len(values)
for (_,i),(_,j) in zip(indexes,newIndexes):
    shuffled[i] = values[j]
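
Whichever approach is used, the result can be sanity-checked by confirming that each group holds the same multiset of values before and after shuffling. A minimal sketch; the helper name stays_within_groups is hypothetical:

```python
from collections import Counter, defaultdict

def stays_within_groups(original, shuffled, groups):
    """True if every group contains the same values before and after."""
    if not (len(original) == len(shuffled) == len(groups)):
        return False
    before = defaultdict(Counter)  # group id -> multiset of original values
    after = defaultdict(Counter)   # group id -> multiset of shuffled values
    for v, s, g in zip(original, shuffled, groups):
        before[g][v] += 1
        after[g][s] += 1
    return before == after

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
groups = [0, 1, 0, 1, 0, 2, 3, 3, 2]

ok = stays_within_groups(values, [3, 4, 1, 2, 5, 9, 8, 7, 6], groups)
bad = stays_within_groups(values, [2, 1, 3, 4, 5, 6, 7, 8, 9], groups)
print(ok)   # True
print(bad)  # False (values 1 and 2 crossed between groups 0 and 1)
```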
