I have lists of strings from which I want to extract a certain value:
["bla","blabla","blablabla","time taken to build model: 5.1 seconds", "blabla"]
Normally I would find the index of the element I am looking for with
list.index("time taken")
But since the time changes, I am thinking of using a regular expression. I just can't figure out how to do this.
So how can I find the index of a list element that matches a certain regex, e.g. with re.match()? (Without iterating through the list, since this would take too long.)
I'm not sure whether there is a built-in method, but it's easy to do this with a list comprehension in O(n) time.
With regular expressions:
import re
your_list = ["bla","blabla","blablabla","time taken to build model: 5.1 seconds", "blabla"]
regex = re.compile(r"^time taken")
idxs = [i for i, item in enumerate(your_list) if regex.search(item)]
And without regular expressions:
your_list = ["bla","blabla","blablabla","time taken to build model: 5.1 seconds", "blabla"]
query_term = 'time taken'
idxs = [i for i, item in enumerate(your_list) if item.startswith(query_term)]
You can make it return the first or the last found index, or wrap it in a function with a parameter to provide flexibility.
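As a rough sketch of that parameterised version: the function name find_index and its which parameter are made up here for illustration and are not part of any library.

```python
import re

def find_index(items, pattern, which="first"):
    """Return the index of the first or last item matching pattern, or None."""
    regex = re.compile(pattern)
    # Collect every matching index in one O(n) pass over the list.
    matches = [i for i, item in enumerate(items) if regex.search(item)]
    if not matches:
        return None
    return matches[0] if which == "first" else matches[-1]

your_list = ["bla", "blabla", "blablabla", "time taken to build model: 5.1 seconds", "blabla"]
print(find_index(your_list, r"^time taken"))        # index of the first match
print(find_index(your_list, r"^bla", which="last"))  # index of the last match
```

Returning None when nothing matches is a design choice in this sketch; list.index() raises ValueError instead, so pick whichever suits your calling code.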
To find an element in a list, unless you have extra information (such as the order of the elements), you have to iterate through it. If you really want to go faster, change the data structure, use a database, or use another language.
A regex solution needs to iterate through the sequence. If you want to get strings with some prefix or suffix, you could implement a trie, which is the fastest solution for this kind of lookup. You can also implement a solution with rolling hashes of different lengths, but in some cases it will be inefficient.
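To make the trie idea concrete, here is a minimal sketch for prefix lookups only. The class PrefixTrie and its method indices_with_prefix are hypothetical names, and the nested-dict layout is just one possible way to store the nodes.

```python
class PrefixTrie:
    """Hypothetical helper: maps each prefix of the stored strings to list indices."""

    def __init__(self, strings):
        # Each node keeps its child nodes and the indices of all strings
        # that pass through it (that is, that start with this prefix).
        self.root = {"children": {}, "indices": []}
        for i, s in enumerate(strings):
            node = self.root
            node["indices"].append(i)
            for ch in s:
                node = node["children"].setdefault(ch, {"children": {}, "indices": []})
                node["indices"].append(i)

    def indices_with_prefix(self, prefix):
        # Walk one node per character; cost depends on len(prefix), not on list size.
        node = self.root
        for ch in prefix:
            node = node["children"].get(ch)
            if node is None:
                return []
        return node["indices"]

your_list = ["bla", "blabla", "blablabla", "time taken to build model: 5.1 seconds", "blabla"]
trie = PrefixTrie(your_list)
print(trie.indices_with_prefix("time taken"))
```

Building the trie still touches every character of every string once, so it only pays off when the same list is queried many times. For a single lookup, the plain loop is simpler and no slower.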
If your priority is to get the first match in the sequence, you can still use index(). This is how to combine a regex with the index() method:
import re
lst = ["bla","blabla","blablabla","time taken to build model: 5.1 seconds", "blabla"]
# Raises IndexError if no element matches the pattern.
lst.index([i for i in lst if re.search(r'^time taken', i)][0])
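If only the first match is needed, a generator passed to next() stops at the first hit instead of scanning the whole list and then calling index() on it. A minimal sketch reusing the list from above; the name first_idx is only illustrative:

```python
import re

lst = ["bla", "blabla", "blablabla", "time taken to build model: 5.1 seconds", "blabla"]
regex = re.compile(r"^time taken")

# next() pulls items from the generator until one matches, then stops.
# The second argument (None) is returned if nothing matches at all.
first_idx = next((i for i, item in enumerate(lst) if regex.search(item)), None)
print(first_idx)
```

This is still a linear scan in the worst case, but it avoids the second pass and handles the no-match case without an exception.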