
Randomizing a list of zeros and ones with constraints

I'm currently trying to randomize a list of 0s and 1s which should give a random order of zeros and ones with the following constraints:

  1. 1/3 of the items have to be 1s (respectively 2/3 are 0s)

  2. No more than two 1s should occur consecutively

  3. No more than four zeros should occur consecutively
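
Stated as code, the three constraints could be checked with a small helper like the one below. This is only a sketch: the name `is_valid` and its parameters are made up for illustration and do not appear elsewhere in this thread.

```python
from itertools import groupby

def is_valid(seq, max_ones=2, max_zeros=4):
    """Return True if seq meets all three constraints listed above."""
    # Constraint 1: exactly one third of the items are 1s.
    if 3 * sum(seq) != len(seq):
        return False
    # Constraints 2 and 3: no run of 1s longer than max_ones,
    # no run of 0s longer than max_zeros.
    for value, run in groupby(seq):
        length = len(list(run))
        if value == 1 and length > max_ones:
            return False
        if value == 0 and length > max_zeros:
            return False
    return True

print(is_valid([0, 0, 1, 0, 1, 0]))           # True
print(is_valid([1, 1, 1, 0, 0, 0, 0, 0, 0]))  # False: three 1s in a row
```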

I have worked on an option, but it did not exactly turn out to be what I need. Here's my option:

for prevItem, nextItem in enumerate(WordV[:-1]):
    if nextItem == WordV[prevItem+1] and WordV[prevItem+1] == WordV[prevItem+2] and nextItem == 1:
        WordV[prevItem+2] = 0
    if nextItem == WordV[prevItem+1] and WordV[prevItem+1] == WordV[prevItem+2] and WordV[prevItem+2] == WordV[prevItem+3] and WordV[prevItem+3] == WordV[prevItem+4] and nextItem == 0:
        WordV[prevItem+2] = 1

# Check the number of ones & zeros
print(WordV)
ones= WordV.count(1)
zeros= WordV.count(0)
print(ones, zeros)

Currently, the number of ones and zeros no longer adds up to a proportion of 1/3 to 2/3, because enforcing the constraints overwrites values. WordV is a list of 24 ones and 48 zeros that is shuffled randomly (with random.shuffle(WordV)).
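
To make the effect concrete, here is a tiny sketch on a made-up nine-item list (the list and the variable names `overwritten` and `swapped` are illustrative only). Overwriting an item changes the number of ones, whereas exchanging two items never does.

```python
word_v = [1, 1, 1, 0, 0, 0, 0, 0, 0]   # 3 ones, 6 zeros

# Overwriting the third 1, as the loop above does, loses a one.
overwritten = word_v.copy()
overwritten[2] = 0

# Swapping it with a zero elsewhere keeps the counts intact.
swapped = word_v.copy()
swapped[2], swapped[5] = swapped[5], swapped[2]

print(overwritten.count(1), swapped.count(1))  # 2 3
```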

Is there a smarter (and more correct) way to integrate the constraints into the code?

import numpy as np

def consecutive(data, stepsize=0):
    return np.split(data, np.where(np.diff(data) != stepsize)[0]+1)

def check(list_to_check):
    # Returns True if a constraint is violated, i.e. the list must be reshuffled
    groups = consecutive(list_to_check)
    for group in groups:
        if group[0] == 1 and group.size > 2:
            return True
        if group[0] == 0 and group.size > 4:
            return True
    return False

wordv = np.array([1]*24+[0]*48)


while check(wordv):
    np.random.shuffle(wordv)

wordv will contain something like:

array([0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1,
       0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
       0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0,
       0, 0, 1, 0, 1, 0])

The consecutive function splits the data into groups of identical adjacent elements:

[ins] In [32]: consecutive([1,1,1,0,0,1])
Out[32]: [array([1, 1, 1]), array([0, 0]), array([1])]

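check returns True as long as either run-length condition is violated, so the loop keeps reshuffling until both conditions are met. Shuffling never changes the counts, so the 1/3 to 2/3 ratio is preserved automatically.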
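
For anyone who wants to see how many shuffles this takes, the same reshuffle-until-valid idea can be sketched without NumPy. The names `violates` and `shuffle_until_valid`, and the `max_attempts` cap, are assumptions added for this illustration, not part of the code above.

```python
import random
from itertools import groupby

def violates(seq, max_ones=2, max_zeros=4):
    """True if any run of 1s or 0s is longer than allowed."""
    limits = {1: max_ones, 0: max_zeros}
    return any(len(list(run)) > limits[value] for value, run in groupby(seq))

def shuffle_until_valid(seq, rng=random, max_attempts=100_000):
    """Reshuffle seq in place until it is valid; return the number of shuffles."""
    for attempt in range(1, max_attempts + 1):
        rng.shuffle(seq)
        if not violates(seq):
            return attempt
    raise RuntimeError("no valid order found within max_attempts")

wordv = [1] * 24 + [0] * 48
attempts = shuffle_until_valid(wordv)
print(attempts, wordv)
```

Because every shuffle is drawn uniformly and invalid ones are simply discarded, each valid ordering is equally likely to be returned. The price is that the number of attempts varies from run to run.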

You could try an optimization approach: Start with the list holding the elements in the right proportion, then keep swapping random elements until you get the desired results. In each turn, check the number of too-long streaks of 0s or 1s and always keep the better one of the original or the mutated list.

import itertools, random

def penalty(lst):
    return sum(1 for k, g in itertools.groupby(lst)
               if k == 0 and len(list(g)) > 4 or k == 1 and len(list(g)) > 2)

def constrained_shuffle(lst):
    # penalty of original list
    p = penalty(lst)
    while p > 0:
        # randomly swap two elements, get new penalty
        a, b = random.randrange(len(lst)), random.randrange(len(lst))
        lst[a], lst[b] = lst[b], lst[a]
        p2 = penalty(lst)
        if p2 > p:
            # worse than before, swap back
            lst[a], lst[b] = lst[b], lst[a]
        else:
            p = p2

lst = [0] * 20 + [1] * 10
random.shuffle(lst)
constrained_shuffle(lst)
print(lst)

For 200 0s and 100 1s this will take a few hundred to a few thousand iterations until it finds a valid list, which is okay. For lists with thousands of elements this is rather too slow, but it could probably be improved by remembering the positions of the too-long streaks and preferably swapping elements within those.
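
A rough sketch of that improvement, under a few assumptions: the helper names `bad_positions` and `targeted_shuffle` and the `max_steps` cap are invented here, and the positions are recomputed on every step rather than cached, so this only reduces wasted swaps and not the cost per step.

```python
import random
from itertools import groupby

LIMITS = {0: 4, 1: 2}  # longest allowed run per value

def bad_positions(lst):
    """Indices of all elements that sit inside a too-long run."""
    bad, start = [], 0
    for value, run in groupby(lst):
        length = len(list(run))
        if length > LIMITS[value]:
            bad.extend(range(start, start + length))
        start += length
    return bad

def penalty(lst):
    """Number of too-long runs, same measure as the penalty function above."""
    return sum(1 for v, run in groupby(lst) if len(list(run)) > LIMITS[v])

def targeted_shuffle(lst, rng=random, max_steps=1_000_000):
    """Like constrained_shuffle, but one swap partner always comes from a bad run."""
    p = penalty(lst)
    for _ in range(max_steps):
        if p == 0:
            return lst
        a = rng.choice(bad_positions(lst))   # element inside a too-long run
        b = rng.randrange(len(lst))          # arbitrary partner
        lst[a], lst[b] = lst[b], lst[a]
        p2 = penalty(lst)
        if p2 > p:
            lst[a], lst[b] = lst[b], lst[a]  # worse than before, swap back
        else:
            p = p2
    raise RuntimeError("no valid list found within max_steps")

lst = [0] * 20 + [1] * 10
random.shuffle(lst)
print(targeted_shuffle(lst))
```

Whether this changes how evenly the valid lists are sampled has not been checked here.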


About the "randomness" of the approach: Of course, it is less random than just repeatedly generating a new shuffled list until one fits the constraints, but I don't see how this will create a bias for or against certain lists, as long as those satisfy the constraints. I did a short test, repeatedly generating shuffled lists and counting how often each variant appears:

import collections

counts = collections.Counter()
for _ in range(10000):
    lst = [0] * 10 + [1] * 5
    random.shuffle(lst)
    constrained_shuffle(lst)
    counts[tuple(lst)] += 1
print(collections.Counter(counts.values()).most_common())
[(7, 197), (6, 168), (8, 158), (9, 157), (5, 150), (10, 98), (4, 92), 
 (11, 81), (12, 49), (3, 49), (13, 43), (14, 23), (2, 20), (15, 10),
 (1, 8), (16, 4), (17, 3), (18, 1)]

So, yes, maybe there are a few lists that are more likely than others (one appeared 18 times, three 17 times, and most others 5-9 times). For 100,000 iterations, the "more likely" lists appear ~50% more often than the others, but still only about 120 times out of those 100,000 iterations, so I'd think that this is not too much of a problem.

Without the initial random.shuffle(lst) there are more lists that appear much more often than the average, so this step should not be skipped.

I don't really know Python, so I'll give you pseudocode:

int length;
int[] onesAndZeros = new int[length];

for(int i = 0; i < length; i++) { // generate a random list
    onesAndZeros[i] = random(0, 1);
}

int zeroCount() { // correct the ratio
    int c = 0;
    for(int i: onesAndZeros) {
        if(i == 0) {
            c++;
        }
    }
    return c;
}

int wantedZeros = 2 * length / 3; // two thirds of the items must be zeros (length should be divisible by 3)
while(zeroCount() != wantedZeros) {
    boolean isLess = zeroCount() < wantedZeros;
    if(isLess) {
        onesAndZeros[random(0, length - 1)] = 0; // too few zeros: turn a random item into 0
    } else {
        onesAndZeros[random(0, length - 1)] = 1; // too many zeros: turn a random item into 1
    }
}

string isCorrect() { // find the first run of five 0s or three 1s
    for(int i = 0; i < length; i++) {
        if(i + 4 < length && // stay in bounds
           onesAndZeros[i] == 0 &&
           onesAndZeros[i + 1] == 0 &&
           onesAndZeros[i + 2] == 0 &&
           onesAndZeros[i + 3] == 0 &&
           onesAndZeros[i + 4] == 0) {
            return "0" + i;
        } else
        if(i + 2 < length && // stay in bounds
           onesAndZeros[i] == 1 &&
           onesAndZeros[i + 1] == 1 &&
           onesAndZeros[i + 2] == 1) {
            return "1" + i;
        }
    }
    return "a"; // no violation anywhere in the list
}

void fix(int type, int idx) {
    if(type == 0) {
        onesAndZeros[idx + 4] = 1;
    } else {
        onesAndZeros[idx + 2] = 0;
    }
}

string corr = isCorrect();
while(length(corr) >= 2) { // note: this step will screw up the ones/zeros ratio a bit, if you want to restore it, consider running the last 2 steps again
    if(corr[0] == '0') {
        fix(0, toInt(removeFirstChar(corr)));
    } else {
        fix(1, toInt(removeFirstChar(corr)));
    }
    corr = isCorrect(); // look for the next violation
}

// done!

I'm well aware that this can be greatly optimized and cleaned up, depending on the language. But this is more of a solid base to build upon.
