Dividing a Python list into lists containing only unique values

I am building a scheduling tool, which has to make a schedule with the numbers 0-16. It must adhere to the following rules:

  1. The schedule has shape (10, 7).
  2. No number may occur twice in a column.
  3. All numbers must occur the same number of times (if the schedule size is not divisible by the count of numbers to choose from, some numbers may occur one more time than the others).

To make sure I can adhere to the third rule, I created a pool in which each number occurs with the same frequency, while the leftover slots are filled by random choice.

import math
import random

import numpy as np

schedule = np.zeros((10, 7))
IDs = list(range(17))

numberOfTasks = schedule.size

rounds = math.floor(numberOfTasks / len(IDs))
leftover = numberOfTasks % len(IDs)

pool = [person for person in IDs for _ in range(rounds)]
pool += random.sample(IDs, k=leftover)

This works: I now have a list of IDs of the same size as the schedule, and all I need to do is put them in. Now, to adhere to the second rule, I should pick each number only once in each day/column. I tried this:

for i in range(schedule.shape[1]):
    
    daySchedule = schedule[:, i]
    plannablePeople = list(np.unique(pool))

    for j in range(len(daySchedule)):

        pickedPerson = random.choice(plannablePeople)
        plannablePeople.remove(pickedPerson)
        pool.remove(pickedPerson)
        daySchedule[j] = pickedPerson
    
    # Checking the scheduled day and the pool
    print(daySchedule)
    print(pool)

    schedule[:, i] = daySchedule

However, with this method I end up with an error in the last column, because some IDs are left in the pool multiple times and therefore the plannablePeople list becomes too short. Does anyone know if there is a more efficient way of solving this?

I thought there should be a way to split a list into lists with only unique items, but I have yet to find out how.

It sounds like you want to generate random samples from your IDs, without replacement, for each day in your schedule.

To do this you can use numpy.random.choice. You will see from the docs that it takes a keyword argument, size, which is the number of samples to take, and another keyword argument, replace, whose default value is True.

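So something like:

numpy.random.choice(IDs, size=schedule.shape[0], replace=False)

will generate one day's worth of scheduling for you. Note that size here is the number of slots in a single day (10 in your case), not numberOfTasks: without replacement you cannot draw more samples than there are IDs.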

A more complete, but simple, example is as follows:

import numpy

ndays = 7
njobs = 10
people = range(17)

# One row per day: njobs distinct people drawn without replacement.
days = [numpy.random.choice(people, size=njobs, replace=False) for d in range(ndays)]
schedule = numpy.array(days)

which gives the example schedule:

array([[11, 12, 14,  2,  0,  3, 10,  1,  6, 13],
       [ 8, 15,  7,  0, 12,  3,  1,  6, 10, 13],
       [ 2,  9, 16,  4,  5, 15,  0,  8,  7, 11],
       [ 1,  4, 10, 16,  6, 12,  2, 15, 13,  9],
       [ 8,  1,  7, 13, 12,  0,  3, 15,  4,  9],
       [ 2,  5,  7,  3,  9, 10, 13, 15,  0,  8],
       [ 7, 13, 14,  6,  8, 16,  3, 11,  1,  9]])
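
To see that this simple version satisfies the uniqueness rule but does nothing about fairness, one can count how often each person is scheduled. The following is only a small sketch: the helper name check_schedule is made up for illustration, and it assumes one row per day, as in the example above.

```python
import numpy

def check_schedule(schedule, npeople):
    """Return (rows_unique, counts) for a schedule with one row per day.

    check_schedule is a hypothetical helper, not part of numpy.
    """
    schedule = numpy.asarray(schedule)
    # Rule 2: no person may appear twice on the same day (row).
    rows_unique = all(len(numpy.unique(day)) == len(day) for day in schedule)
    # Rule 3: how many times each person 0..npeople-1 was scheduled.
    counts = numpy.bincount(schedule.ravel(), minlength=npeople)
    return rows_unique, counts

days = [numpy.random.choice(17, size=10, replace=False) for _ in range(7)]
rows_unique, counts = check_schedule(numpy.array(days), npeople=17)
print(rows_unique)                  # uniqueness within each day
print(counts.max() - counts.min())  # gap between busiest and least busy person
```

The first value is always True because each day is sampled without replacement; the second varies from run to run, and nothing keeps it at 0 or 1.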

A 'fair' schedule

Your requirement for some sort of fairness is more difficult to enforce. I'm not totally convinced that your strategy of using a kind of worker pool works in general, though it may work reasonably well most of the time. Here is a short example which uses a pool. (Note that the extra work of finding the remainder and supplementing the pool with randomly sampled workers is, in my opinion, not necessary, since you'll be randomly sampling from the pool anyway.)

import numpy

ndays = 7
njobs = 10
people = [p for p in range(17)]
pool = people * ndays

schedule = numpy.zeros((ndays, njobs), dtype=int)  # numpy.int was removed in NumPy 1.24

for day in range(ndays):
    # Sample njobs distinct people from those still present in the pool.
    schedule[day, :] = numpy.random.choice(numpy.unique(pool), size=njobs, replace=False)
    for person in schedule[day, :]:
        pool.remove(person)

which gives the example schedule:

array([[12, 13,  0,  1,  2, 11,  6,  8, 16,  9],
       [15,  8,  3, 10,  5, 12,  7,  0, 11,  4],
       [12,  7, 13,  4,  0,  3, 15,  9, 14, 10],
       [14,  6, 16,  9,  4, 15, 11,  5, 10,  3],
       [ 0, 13,  6,  1, 12,  5, 15,  4,  7,  9],
       [13, 15, 16,  3,  5,  2,  8,  4,  6,  7],
       [ 2,  3, 15,  5,  4, 10,  0,  8,  9,  1]])

(you can get a (10, 7) shape schedule with schedule.T)
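
If strict fairness matters, one possible approach is sketched below. It is a different technique from the pool example: each person gets a quota (the same idea as the pool in your question), and on each day the people with the most assignments still outstanding are scheduled, with ties broken at random. The function name fair_schedule and the use of numpy.random.default_rng are my own choices for illustration.

```python
import numpy

def fair_schedule(ndays, njobs, npeople, rng=None):
    """Hypothetical helper: unique people per day, counts differing by at most 1."""
    rng = numpy.random.default_rng() if rng is None else rng
    if njobs > npeople:
        raise ValueError("need at least as many people as jobs per day")

    # Quotas: everyone gets the same base count, and the remainder goes to
    # randomly chosen people.
    rounds, leftover = divmod(ndays * njobs, npeople)
    quota = numpy.full(npeople, rounds)
    quota[rng.choice(npeople, size=leftover, replace=False)] += 1

    schedule = numpy.zeros((ndays, njobs), dtype=int)
    for day in range(ndays):
        # Order people by remaining quota, largest first; a random key
        # breaks ties (lexsort treats the last key as the primary one).
        tie_break = rng.random(npeople)
        order = numpy.lexsort((tie_break, -quota))
        picked = order[:njobs]
        quota[picked] -= 1
        # Shuffle so the slot within the day does not depend on the quota.
        schedule[day, :] = rng.permutation(picked)
    return schedule

schedule = fair_schedule(ndays=7, njobs=10, npeople=17)
```

Picking the largest remaining quotas first means nobody is left with more outstanding assignments than there are days remaining, which is exactly the situation that breaks the last column in the question's loop. The result is still not a uniform sample over all valid schedules.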

With regard to your original example, the line pool.remove(pickedPerson) looks suspicious to me, and is more likely intended as plannablePeople.remove(pickedPerson). There is also a small mistake elsewhere, such as the indexing in daySchedule[i] = pickedPerson, which probably should be daySchedule[j] = pickedPerson. After correcting these, the example code in your question works well for me.

Notice also that your problem is almost identical to the problem of generating Latin squares (actually a Latin rectangle in your case, which you could obtain from any sufficiently large Latin square). Although generating a single Latin square is easy enough, sampling uniformly (i.e. fairly) from all Latin squares is much harder; as far as I know it is NP-complete. This hints (though it's certainly not a proof) that it might be very hard to enforce the fairness requirements in your problem too.
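
For what it is worth, a valid (though far from uniformly random) schedule can always be written down directly with a cyclic construction, in the spirit of the Latin rectangle remark: deal a shuffled list of people out round-robin, day after day. Each day receives njobs consecutive entries of a cycle of length len(people), so nobody repeats within a day as long as njobs does not exceed the number of people, and the overall counts differ by at most one. The name cyclic_schedule is hypothetical, and this is only a sketch.

```python
import random

def cyclic_schedule(ndays, njobs, people):
    """Hypothetical helper: round-robin dealing of a shuffled list of people."""
    people = list(people)
    if njobs > len(people):
        raise ValueError("need at least as many people as jobs per day")
    random.shuffle(people)  # random starting order of the cycle
    schedule = []
    position = 0
    for _ in range(ndays):
        # Take the next njobs people from the cycle, wrapping around.
        day = [people[(position + k) % len(people)] for k in range(njobs)]
        position += njobs
        random.shuffle(day)  # randomize who gets which slot within the day
        schedule.append(day)
    return schedule

schedule = cyclic_schedule(ndays=7, njobs=10, people=range(17))
```

The result is a list of 7 days with 10 distinct people each; transpose it if you need the (10, 7) layout.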
