
Detecting a year in a list of strings

I have a list of strings like this:

words = ['hello', 'world', 'name', '1', '2018']

I am looking for the fastest way (Python 3.6) to detect a year "word" in the list. For example, "2018" is a year; "1" is not. Let's define the acceptable year range as 2000-2020.

Possible solution

Check whether the word is a number ('2018'.isdigit()), then convert it to int and check that it falls in the valid range.
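For reference, the digit-then-range check described above could be sketched as follows. The helper name is_year and its low/high parameters are made up for illustration. The sketch uses str.isdecimal() rather than str.isdigit(), because isdigit() also accepts characters such as superscript digits that int() rejects, and it adds a length check so that zero-padded strings like '02018' are not treated as years.

```python
def is_year(word, low=2000, high=2020):
    # Hypothetical helper: exactly four decimal digits whose value is in [low, high].
    return len(word) == 4 and word.isdecimal() and low <= int(word) <= high


words = ['hello', 'world', 'name', '1', '2018']

# Keep only the words that look like a year in the accepted range.
years = [w for w in words if is_year(w)]
print(years)  # ['2018']
```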

What is the fastest way to do it in Python?

You can build a set of your valid years (as strings). Then loop through each of the words you want to test to check if it is a valid year:

words = ['hello', 'world', 'name', '1', '2018']
valid_years = {str(x) for x in range(2000,2021)}

for word in words:
    if word in valid_years:
        print(word)

As Martijn Pieters mentioned in the comments, a set is the fastest option here because membership tests take O(1) time on average:

Sets let you test for membership in O(1) time, using a list has a linear O(length_of_list) cost
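One rough way to see the difference is to time the same membership test against a list and a set. This is only a sketch: the variable names are illustrative, the numbers depend on the machine, and with just 21 candidate years the gap is modest (it grows with the size of the collection).

```python
import timeit

valid_years_list = [str(y) for y in range(2000, 2021)]
valid_years_set = set(valid_years_list)

# A word that is not a year forces the list to compare against every element.
probe = 'hello'

list_time = timeit.timeit(lambda: probe in valid_years_list, number=100_000)
set_time = timeit.timeit(lambda: probe in valid_years_set, number=100_000)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```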


EDIT:

As the comments show, there are many ways to generate the set of valid_years. As long as the data structure is a set, the membership test stays fast regardless of how the set was built.
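A few equivalent ways to build the set, sketched here for comparison (the variable names are illustrative, not taken from the comments). All of them produce the same 21 strings; frozenset is simply the immutable variant.

```python
# Three equivalent constructions of the same collection of year strings.
by_comprehension = {str(y) for y in range(2000, 2021)}
by_map = set(map(str, range(2000, 2021)))
frozen = frozenset(by_comprehension)  # immutable, also hashable

words = ['hello', 'world', 'name', '1', '2018']

# A list comprehension keeps the original order and any duplicates.
found = [w for w in words if w in frozen]

# A set intersection is shorter but discards order and duplicates.
found_unordered = frozen & set(words)

print(found)  # ['2018']
```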

You can read more here:

Join the list into a single string using a separator character, then search it with a regex.

For example:

import re
word_tmp = " ".join(words)
# Raw string so \b means "word boundary"; the pattern matches 2000-2020 only
re.search(r"\b20(?:[01]\d|20)\b", word_tmp)
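A slightly fuller sketch of the regex approach, under a few assumptions: the pattern is written as a raw string (otherwise Python reads \b as a backspace character), it is narrowed to the 2000-2020 range from the question, and the name YEAR_RE is made up for illustration. Note that \b also matches next to punctuation, so a token like '2018-01' would yield a hit when searching the joined string; matching each word with fullmatch avoids that.

```python
import re

# 2000-2019 via 20[01]\d, plus 2020 itself.
YEAR_RE = re.compile(r"\b20(?:[01]\d|20)\b")

words = ['hello', 'world', 'name', '1', '2018']

# Search the joined string: returns every year-like token found.
joined = " ".join(words)
print(YEAR_RE.findall(joined))  # ['2018']

# Alternative: require each word to be a year in its entirety.
per_word = [w for w in words if YEAR_RE.fullmatch(w)]
print(per_word)  # ['2018']
```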

