
How to match part of a list and return the other parts in Python

We stored the line

['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000'],

in an array called data. We know it's in there because:

>>> print data
[['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000'], ['worker42', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'HANDLE', '2016-10-28T09:29:07.970000'], ['frontend7', '2ef630e2-64fb-4100-8a04-07c4d25887b7', 'GET', '2016-10-28T09:29:07.970000'], ['frontend9', 'a9af2495-f2f0-42e3-81fa-d99d4bac5b9c', 'GET', '2016-10-28T09:29:07.990000'], ['frontend19', '0336af66-edff-48e0-958c-42d09d0efd7a', 'GET', '2016-10-28T09:29:08.010000'], ['frontend14', 'ebc80de2-3708-4aa5-88e4-d3c08a018961', 'GET', '2016-10-28T09:29:08.030000'], ['frontend16', '14fd9242-7a0c-4f42-ab0c-f8e6de21f948', 'GET', '2016-10-28T09:29:08.040000'], ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000'], ['frontend5', '8b3e6d9f-abbc-46c0-a458-05e6fd3bbe6c', 'POST', '2016-10-28T09:29:08.060000'], ['frontend3', 'd8389212-c91e-450b-8745-2cb121cb9623', 'POST', '2016-10-28T09:29:08.090000']]

I can even pull out the whole line:

>>> print data[0]
['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000']

I can pull out any part of the line:

>>> print data[0][0]
frontend2

PROBLEM: I need to FIND a line that begins with frontend2 and contains RESPOND, and pull out the OTHER parts of that line.

One might think index would at least find it, but no:

>>> data.index("frontend2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'frontend2' is not in list
>>>

What's the pythonic way to do this?
Ideally I could do something like data[frontend2][2] and it would scan the list, find the first line that matches, and return the 2-indexed item. (Or, for another part of the script, overwrite the 2-indexed item without touching the rest of the line.)

filter gets me ALL the lines that contain frontend2; presumably I could filter further to get only the line I want? I haven't found good docs on that; any explanation is appreciated. A list comprehension produces the same result.

>>> print filter(lambda x: 'frontend2' in x, data)
[['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000'], ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000']]

One potential solution might be to use regular expressions and filter on that, but it definitely seems like there should be a better way.

To use a list comprehension, put every condition that the required sublist must match into the comprehension's if clause:

>>> [lst for lst in data if lst[0] == 'frontend2' and 'RESPOND' in lst]
[['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000']]
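To pull out the other parts of the first matching line, as the question asks, one option is to pass the same conditions to next() with a generator expression. This is only a sketch: it assumes a shortened copy of data, and the names match, request_id and timestamp are illustrative, not from the question.

```python
# Shortened copy of the data from the question.
data = [
    ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000'],
    ['worker42', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'HANDLE', '2016-10-28T09:29:07.970000'],
    ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000'],
]

# next() stops scanning at the first match; the second argument (None)
# is returned instead of raising StopIteration when nothing matches.
match = next((lst for lst in data
              if lst[0] == 'frontend2' and 'RESPOND' in lst), None)

if match is not None:
    # The "other" parts of the line (illustrative names).
    request_id, timestamp = match[1], match[3]
    print(request_id)
    print(timestamp)
```

Unlike the list comprehension, this does not build a list of every match, so it stops as soon as the first matching line is found.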

Alternative 1: the obvious

The most obvious way is to scan each line of your list and look for the two tokens you need:

for line in data:
    if 'frontend2' in line and 'RESPOND' in line:
        print line

Alternative 2: a bit more efficient

A slightly more efficient alternative, if you know that the line has to begin with frontend2:

for line in data:
    if line[0] == 'frontend2' and 'RESPOND' in line:
        print line
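The same loop can also cover the "overwrite the 2-indexed item" part of the question. The inner lists are mutable and the loop variable refers to the very same list object that sits inside data, so assigning to one of its items changes data in place. A minimal sketch, assuming a shortened copy of data; the helper find_line and the replacement value 'DONE' are hypothetical:

```python
def find_line(rows, first, token):
    """Return the first row that starts with `first` and contains `token`, else None."""
    for row in rows:
        if row[0] == first and token in row:
            return row
    return None


# Shortened copy of the data from the question.
data = [
    ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000'],
    ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000'],
]

line = find_line(data, 'frontend2', 'RESPOND')
if line is not None:
    # 'DONE' is a hypothetical value; this mutates the row stored in data.
    line[2] = 'DONE'
```

After this runs, data[1][2] holds the new value while the rest of that line, and every other line, is untouched.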

Alternative 3: using filter

Another alternative is to use filter with all the conditions that you need:

>>> print filter(lambda x: 'frontend2' in x and 'RESPOND' in x, data)
[['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000']]
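Note that the transcript here is Python 2, where filter returns a list. In Python 3, filter returns a lazy iterator, so printing it directly shows only a filter object; wrapping it in list() gives the rows. A small sketch, assuming a shortened copy of data and an illustrative variable name matches:

```python
# Shortened copy of the data from the question.
data = [
    ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'POST', '2016-10-28T09:29:07.940000'],
    ['worker42', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'HANDLE', '2016-10-28T09:29:07.970000'],
    ['frontend2', 'ac1b360e-daa8-4102-bc7e-aae01ac5f6ab', 'RESPOND', '2016-10-28T09:29:08.050000'],
]

# list() forces the lazy filter object on Python 3; on Python 2 it is harmless.
matches = list(filter(lambda x: 'frontend2' in x and 'RESPOND' in x, data))
print(matches)
```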
