Python - get index from nested list where condition is met

Please could someone help me with getting an index of an item in a nested list where a certain condition is met using Python 2.7? I know that there are similar questions on StackOverflow about this but I can't seem to find good examples that deal with "nested" lists.

I have a list of data which is hundreds of thousands of lines long, in the format below:

data =[
["","","","28.04.2015 09:34:38",1.52411,1.52428,17],
["","","","28.04.2015 09:34:40",1.52415,1.52433,18],
["","","","28.04.2015 09:34:42",1.52425,1.52444,19],
["","","","28.04.2015 09:34:44",1.52417,1.52435,18],
["","","","28.04.2015 09:34:46",1.52421,1.52440,19],
["","","","28.04.2015 09:34:48",1.52426,1.52446,20],
["","","","28.04.2015 09:34:50",1.52429,1.52444,15],
["","","","28.04.2015 09:34:58",1.52423,1.52441,18],
["","","","28.04.2015 09:35:00",1.52416,1.52434,18],
["","","","28.04.2015 09:35:02",1.52416,1.52433,17],
["","","","28.04.2015 09:35:04",1.52416,1.52434,18],
["","","","28.04.2015 09:35:06",1.52406,1.52422,16],
["","","","28.04.2015 09:35:10",1.52406,1.52421,15],
["","","","28.04.2015 09:35:14",1.52427,1.52444,17],
["","","","28.04.2015 09:35:16",1.52424,1.52443,19],
["","","","28.04.2015 09:35:18",1.52434,1.52453,19],
["","","","28.04.2015 09:35:20",1.52434,1.52451,17],
["","","","28.04.2015 09:35:22",1.52438,1.52456,18],
["","","","28.04.2015 09:35:24",1.52432,1.52451,19],
["","","","28.04.2015 09:35:28",1.52445,1.52464,19],
["","","","28.04.2015 09:35:34",1.52435,1.52451,16],
["","","","28.04.2015 09:35:36",1.52432,1.52449,17],
["","","","28.04.2015 09:35:38",1.52429,1.52448,19]]

For each row I want to compare the data in "column 5" (the first column of decimal numbers, i.e. index 4) to a certain value (let's use 1.52440 as an example) and return the index of the first row where the data is greater than that value.

I have made code that does this the 'traditional' way using a for-row-in-data type loop, but I would like to use a better (faster) method if possible and cannot seem to produce the expected result.

The rather poor attempt that I have made so far is:

pricedata = [n[4] for n in data]
myindex = (x for x in enumerate(pricedata) if x > 1.5440).next()

The first row extracts the price data col as a new list. I am not sure this is really necessary but as my understanding of list comprehensions is poor I was trying to break things into steps I understand.

I don't really understand what the second line is doing, but it seems to return (0, 1.52411), the first item in the list, regardless of what comparison value I enter.

I have also tried:

myindex = [x for x in enumerate(pricedata) if x > 1.5440][0]

and it seems to produce the same result.

I thought the comprehension was saying:

"Make a list of prices for each price you look at in the list-of-indexed-prices if the price-you-are-looking-at is greater than 1.5440", but it seems I am mistaken!

Please could someone point out the error of my ways and help me out? Thank you for any assistance!

The problem is that you're comparing a tuple with a float: enumerate yields (index, item) tuples, so x is the whole tuple rather than the price. In Python 2, a tuple always compares as greater than a float:

>>> () > 4.
True

Therefore, the first tuple produced by enumerate always yields a match.
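To make the cause concrete, here is a minimal sketch using three prices from the question (the shortened pricedata list and the outcome variable are illustrative assumptions). It shows what enumerate actually hands to the condition. On Python 2 the comparison silently evaluates to True; on Python 3 the same comparison raises TypeError, so the sketch accepts either.

```python
pricedata = [1.52411, 1.52415, 1.52425]

# enumerate yields (index, value) tuples, so x in the question's
# comprehension is a tuple such as (0, 1.52411), never a bare price.
first_pair = next(enumerate(pricedata))

# Python 2 orders mismatched types by a fixed, arbitrary rule, which is
# why the tuple always compared as greater. Python 3 raises instead.
try:
    outcome = first_pair > 1.5440
except TypeError:
    outcome = "TypeError"
```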


To solve this, unpack the tuple first, then return the first matching index from your generator expression using next:

next(i for i, x in enumerate(data) if x[4] > 1.52415)

You are misusing enumerate. It iterates over the sequence or iterator you give it and yields (index, value) pairs.

Try this instead:

myindex = next(index for index, row in enumerate(data) if row[4] > 1.52440)

The

(i for i, row in enumerate(data) if row[4] > 1.52440)

part is a generator expression. It yields the indexes of the rows that meet the condition.

next advances this generator until it yields the first matching index, then returns that index.

Because this uses a generator rather than an intermediate list, you don't have to go through the whole list. The search stops as soon as the first row matching the condition is found. This can matter when the table has many rows, like yours.
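The early exit can be checked with a small sketch. The toy data, the inspected list and the price_at helper are all hypothetical, added only to record which rows the generator actually looks at.

```python
# Five toy rows with the price at index 4, as in the question's layout.
data = [[None, None, None, None, price] for price in (1.0, 2.0, 3.0, 4.0, 5.0)]

inspected = []  # records every index the generator actually examines


def price_at(index, row):
    """Return the price in column 4, noting that this row was examined."""
    inspected.append(index)
    return row[4]


# next() pulls from the generator only until the first match is produced.
first = next(i for i, row in enumerate(data) if price_at(i, row) > 2.5)
```

After this runs, inspected holds only the indexes up to and including the match, so the last two rows are never touched.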

Note that you'll get a StopIteration exception if no matching row is found. If you would rather get a specific value in that case (e.g. None), pass it as a second argument to next:

myindex = next((index for index, row in enumerate(data) if row[4] > 1.52440), None)
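As a rough sketch of how the default behaves, the snippet below wraps the lookup in a helper and runs it on three rows copied from the question. The function name first_index_above and its column parameter are assumptions for illustration, with the price taken to sit at index 4 as in the sample data.

```python
data = [
    ["", "", "", "28.04.2015 09:34:38", 1.52411, 1.52428, 17],
    ["", "", "", "28.04.2015 09:34:40", 1.52415, 1.52433, 18],
    ["", "", "", "28.04.2015 09:35:28", 1.52445, 1.52464, 19],
]


def first_index_above(rows, threshold, column=4):
    """Return the index of the first row whose column value exceeds
    threshold, or None if no row does."""
    return next((i for i, row in enumerate(rows) if row[column] > threshold), None)


found = first_index_above(data, 1.52440)  # third row matches
missing = first_index_above(data, 2.0)    # no row matches, so the default is used
```

Returning None keeps "not found" distinct from a real index, including index 0, so test the result with "is None" rather than relying on truthiness.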

This can be achieved by unpacking each item from enumerate into an index and a value:

try:
    # Unpack each (index, row) pair; next() stops at the first match.
    first_index = next(index for index, row in enumerate(data)
                       if row[4] > 1.52415)
except StopIteration:
    # No row satisfied the condition.
    first_index = -1

The except StopIteration block runs when no item in the list matches the predicate.

data=[[0,0,0,0,0,0],[1,0,0,0,0,0],[0,0,0,0,0,0],[1,0,0,0,0,0],[0,0,0,0,0,0],[1,0,0,0,0,0],[0,0,0,0,0,0]]
for index, value in enumerate(data):
    if value[0] > 0:
        print(index)
        break
