
Find index of first occurrence in sorted list

I have a sorted list that looks like this:

sortedlist = ['0','0','0','1','1','1','2','2','3']

I also have a count variable:

count = '1'

*Note: sometimes count can be an integer greater than the max value in the list, for example count = '4'.

What I want to do is find the first occurrence of count in the list and print its index. If the value is greater than the max value in the list, then assign a string instead. Here is what I have tried:

maxvalue = max(sortedlist)
for i in sortedlist:
    if int(count) < int(sortedlist[int(i)]):
        indexval = i
        break
        OutputFile.write(''+str(indexval)+'\n')
if int(count) > int(maxvalue):
    indexval = "over"
    OutputFile.write(''+str(indexval)+'\n')

I thought the break would end the for loop, but I'm only getting results from the last if statement. Am I doing something incorrectly?

Your logic is wrong. You have a so-called sorted list of strings which, unless compared as integers, would not be sorted correctly. You should use integers from the get-go and bisect_left to find the index:

from bisect import bisect_left

sortedlist = sorted(map(int, ['0', '0', '0', '1', '1', '1', '2', '2', '3']))

count = 0

def get_val(lst, cn):
    if lst[-1] < cn:
        return "whatever"
    return bisect_left(lst, cn, hi=len(lst) - 1)

If the value falls between two values in the list, you get the first index of the higher value, as per your requirement; if there is an exact match, you get the index of its first occurrence:

In [13]: lst = [0,0,2,2]

In [14]: get_val(lst, 1)
Out[14]: 2

In [15]: lst = [0,0,1,1,2,2,2,3]

In [16]: get_val(lst, 2)
Out[16]: 4

In [17]: get_val(lst, 9)
Out[17]: 'whatever'
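
To illustrate the point about string ordering made at the start of this answer, here is a minimal sketch. The sample values are made up for the demonstration and are not taken from the question; they only show that strings compare character by character, so both sorted() and max() can disagree with numeric order once a value has more than one digit.

```python
# Hypothetical sample data, chosen to include a two-digit value.
values = ['0', '2', '10', '9']

# Strings are compared lexicographically: '10' sorts before '2'.
lexicographic = sorted(values)
print(lexicographic)        # ['0', '10', '2', '9']

# Converting to int first gives numeric order.
numeric = sorted(map(int, values))
print(numeric)              # [0, 2, 9, 10]

# max() has the same problem, which affects max(sortedlist) in the question.
print(max(values))          # '9'
print(max(map(int, values)))  # 10
```

With single-digit strings, as in the question, both orderings happen to agree, which is why the problem can go unnoticed.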

As there are some over-complicated solutions here, it's worth posting how straightforwardly this can be done:

def get_index(a, L):
    for i, b in enumerate(L):
        if b >= a:
            return i
    return "over"

get_index('1', ['0','0','2','2','3'])
>>> 2
get_index('1', ['0','0','0','1','2','3'])
>>> 3
get_index('4', ['0','0','0','1','2','3'])
>>> 'over'

But, use bisect.
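
As a rough sketch of what that recommendation could look like for the string input used above: the function name bisect_index is made up for this example, and it assumes the strings are already in numeric order. It converts them to integers and lets bisect_left do the search.

```python
from bisect import bisect_left

def bisect_index(count, sorted_strings):
    """Index of the first item >= count, or "over" if there is none.

    Assumes sorted_strings holds integer strings already in numeric order.
    """
    numbers = [int(s) for s in sorted_strings]
    pos = bisect_left(numbers, int(count))
    # bisect_left returns len(numbers) when every item is smaller than count.
    return pos if pos < len(numbers) else "over"

print(bisect_index('1', ['0', '0', '2', '2', '3']))       # 2
print(bisect_index('1', ['0', '0', '0', '1', '2', '3']))  # 3
print(bisect_index('4', ['0', '0', '0', '1', '2', '3']))  # over
```

The conversion itself is linear, so the binary search only pays off if the list is converted once and then queried many times.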

You could use a function (using the EAFP principle) to find the first occurrence that is equal to or greater than the count:

In [239]: l = ['0','0','0','1','1','1','2','2','3']

In [240]: def get_index(count, sorted_list):
     ...:     try:
     ...:         return next(x[0] for x in enumerate(sorted_list) if int(x[1]) >= int(count))
     ...:     except StopIteration:
     ...:         return "over"
     ...:     

In [241]: get_index('3', l)
Out[241]: 8

In [242]: get_index('7', l)
Out[242]: 'over'

Since your list is already sorted, the maximum value is the last element of the list, i.e. maxval = sortedlist[-1]. Secondly, there is an error in your for loop: for i in sortedlist: gives you each element of the list, not its index. To get the index, loop over range(len(sortedlist)) instead. You should also break after writing to the file, not before. Below is the fixed code:

maxvalue = sortedlist[-1]
if int(count) > int(maxvalue):
    indexval = "over"
    OutputFile.write(''+str(indexval)+'\n')
else:
    for i in range(len(sortedlist)):
        if int(count) <= int(sortedlist[i]):
            indexval = i
            OutputFile.write(''+str(indexval)+'\n')
            break

Using itertools.dropwhile():

from itertools import dropwhile

sortedlist = [0, 0, 0, 1, 1, 1, 2, 2, 3]

def getindex(count):
    index = len(sortedlist) - len(list(dropwhile(lambda x: x < count, sortedlist)))
    return "some_string" if index >= len(sortedlist) else index

The test:

print(getindex(5))
> some_string

and:

print(getindex(3))
> 8

Explanation

dropwhile() drops items from the list until it reaches the first one for which item < count returns False. Subtracting the number of remaining items from the length of the original list gives the index.

"an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element."
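
The arithmetic in the explanation can be traced step by step. This is only an illustrative walk-through using the same sortedlist as above, with count = 2 picked arbitrarily for the example:

```python
from itertools import dropwhile

sortedlist = [0, 0, 0, 1, 1, 1, 2, 2, 3]
count = 2  # arbitrary value chosen for this walk-through

# Everything before the first item that is not < count is dropped.
remaining = list(dropwhile(lambda x: x < count, sortedlist))
print(remaining)  # [2, 2, 3]

# Length of the original list minus what is left = index of the first match.
index = len(sortedlist) - len(remaining)
print(index)      # 6

# When count exceeds every item, nothing remains and index == len(sortedlist).
nothing_left = list(dropwhile(lambda x: x < 5, sortedlist))
print(nothing_left)  # []
```

Note that this approach builds a list of all remaining items, so it always does work proportional to the length of the input.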

First of all:

for i in range(1, 100):
  if i >= 3:
    break
    destroyTheInterwebz()
  print(i)

This will never execute that last function. It will only print 1 and 2, because break leaves the loop immediately; it does not wait for the current iteration to finish.

In my opinion, the code would be nicer if you used a function indexOf and return instead of break.
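
One possible shape for such a function, as a sketch only: it is written here as index_of to follow Python naming conventions, and io.StringIO stands in for the OutputFile from the question, which is not shown there. Returning from the function makes the lookup end at the first match, and the write happens once, in the caller.

```python
import io

def index_of(count, sorted_list):
    """Return the index of the first item >= count, or "over" if none exists."""
    target = int(count)
    for i, item in enumerate(sorted_list):
        if int(item) >= target:
            return i  # leaves the function, and therefore the loop, at once
    return "over"

sortedlist = ['0', '0', '0', '1', '1', '1', '2', '2', '3']
output_file = io.StringIO()  # stand-in for the question's OutputFile

for count in ('1', '4'):
    output_file.write(str(index_of(count, sortedlist)) + '\n')

print(output_file.getvalue())  # "3" and "over", one per line
```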

Last but not least: the data structures here are pretty expensive. You may want to use integers instead of strings, and numpy arrays. You could then use the very fast numpy.searchsorted function.
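
A minimal sketch of the numpy variant, assuming numpy is installed and the data has already been converted to integers. The helper name searchsorted_index and the "over" fallback are choices made for this example, mirroring the question.

```python
import numpy as np

sorted_values = np.array([0, 0, 0, 1, 1, 1, 2, 2, 3])

def searchsorted_index(count, arr):
    """Index of the first element >= count, or "over" if count exceeds them all."""
    # side="left" returns the leftmost insertion point, i.e. the first occurrence.
    pos = int(np.searchsorted(arr, count, side="left"))
    return pos if pos < len(arr) else "over"

print(searchsorted_index(1, sorted_values))  # 3
print(searchsorted_index(4, sorted_values))  # over
```

numpy.searchsorted also accepts an array of query values, so many counts can be looked up in a single call.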
