
Extract random values from a list that fulfil criteria? Python

Is it possible to use the random module to extract strings from a list, but only if the string has a length greater than x?

For example:

list_of_strings = ['Hello', 'Hello1', 'Hello2']

If you set x = 5 and call random.choice(), the code would be 'choosing' between only list_of_strings[1] and list_of_strings[2].

I realise you could make a second list which contains only values of len > x, but I would like to know if it is possible without this step.

random.choice([s for s in list_of_strings if len(s) > x])

Or you could do something like this:

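# Keep drawing until a string longer than x comes up.
while True:
    choice = random.choice(list_of_strings)
    if len(choice) > x:
        break  # use `return choice` instead when this loop lives inside a function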

You should first check that the list contains at least one string longer than x; otherwise that code will never terminate.

Another possible solution is to use reservoir sampling, which has the additional benefit of a bounded running time.
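The answer above names reservoir sampling without showing it. A minimal sketch of the size-one variant follows, assuming a helper called `reservoir_choice` (a name made up here, not from the original answer) that returns None when nothing matches. It makes a single pass and builds no second list.

```python
import random

def reservoir_choice(items, predicate):
    """Pick one item satisfying predicate, uniformly at random, in one pass.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    chosen = None
    seen = 0  # number of matching items encountered so far
    for item in items:
        if predicate(item):
            seen += 1
            # Replace the current pick with probability 1/seen, which
            # leaves every matching item equally likely at the end.
            if random.randrange(seen) == 0:
                chosen = item
    return chosen

list_of_strings = ['Hello', 'Hello1', 'Hello2']
x = 5
print(reservoir_choice(list_of_strings, lambda s: len(s) > x))
```

The predicate is applied exactly once per element, so the running time is linear in the length of the list regardless of how many elements qualify.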

Another solution that doesn't create an additional list:

from itertools import islice
from random import randrange

def choose_if(f, s):
  return next(islice(filter(f, s), randrange(sum(map(f, s))), None))

choose_if(lambda x: len(x) > 5, list_of_strings)

It turns out to be almost two times slower than Christian's solution. That's because it iterates over s twice, applying f to every element. This is expensive enough to outweigh the gain from not creating a second list.

Francisco's solution, on the other hand, can be 10 to 100 times faster than that, because it applies f only as many times as it takes to pick a suitable element. Here's a complete version of that function:

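from random import choice

def choose_if(f, s):
  # Only loop if at least one element satisfies f; otherwise fall through and return None.
  # any(map(f, s)) tests the predicate results, so falsy elements (0, '') are handled too.
  if any(map(f, s)):
    while True:
      x = choice(s)
      if f(x): return x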

Bear in mind that it starts to get worse when few elements (less than 1%) satisfy the condition. When only 1 element in 5000 was good, it was 5 times slower than using a list comprehension.
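The timings quoted above depend on the machine and on what fraction of elements qualify, so it may be worth measuring them yourself. Below is a rough sketch of how the three approaches could be compared with `timeit`. The function names, the test data, and the parameters `n`, `good` and `repeats` are all assumptions made for illustration, not the setup used for the figures above.

```python
import random
import timeit
from itertools import islice

def by_comprehension(f, s):
    # Build a filtered list, then pick from it.
    return random.choice([e for e in s if f(e)])

def by_islice(f, s):
    # Count the matches, then step to a random one without building a list.
    return next(islice(filter(f, s), random.randrange(sum(map(f, s))), None))

def by_rejection(f, s):
    # Draw until a match comes up; assumes at least one element satisfies f.
    while True:
        e = random.choice(s)
        if f(e):
            return e

def benchmark(n=5000, good=50, repeats=200):
    """Time each approach on n strings, `good` of which are longer than 5."""
    data = ['x' * 6] * good + ['x'] * (n - good)
    random.shuffle(data)

    def is_long(e):
        return len(e) > 5

    timings = {}
    for fn in (by_comprehension, by_islice, by_rejection):
        timings[fn.__name__] = timeit.timeit(lambda: fn(is_long, data), number=repeats)
    return timings

if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Lowering `good` relative to `n` should show the rejection loop slowing down, since it needs more draws on average before hitting a match.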

You can do it like this:

random.choice([i for i in list_of_strings if len(i) > x])
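One thing to watch with the list-comprehension form: `random.choice` raises `IndexError` when given an empty sequence, which happens if no string is longer than x. A small sketch of a guard follows; the wrapper name `choice_longer_than` and its `default` parameter are made up here for illustration.

```python
import random

def choice_longer_than(strings, x, default=None):
    """Return a random string longer than x, or `default` if none qualifies.

    Hypothetical wrapper around the one-liner above.
    """
    candidates = [s for s in strings if len(s) > x]
    # random.choice raises IndexError on an empty sequence, so guard first.
    return random.choice(candidates) if candidates else default

list_of_strings = ['Hello', 'Hello1', 'Hello2']
print(choice_longer_than(list_of_strings, 5))   # 'Hello1' or 'Hello2'
print(choice_longer_than(list_of_strings, 10))  # None
```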

