detecting year in list of strings

Question

I have a list of strings like this:

words = ['hello', 'world', 'name', '1', '2018']

I am looking for the fastest way (Python 3.6) to detect a year "word" in the list. For example, "2018" is a year; "1" is not. Let's define the acceptable year range as 2000-2020.

Possible solution

Check whether the word is a number ('2018'.isdigit()), then convert it to int and check that it falls in the valid range.
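For reference, the check described above might look like the following minimal sketch. The helper name is_year and its first/last parameters are illustrative assumptions, not part of the question, and the sketch assumes any digits in the words are plain ASCII digits.

```python
def is_year(word, first=2000, last=2020):
    # isdigit() filters out 'hello' and '' before int() is attempted;
    # assumption: digits are ASCII, so int() will not fail on the word.
    return word.isdigit() and first <= int(word) <= last


words = ['hello', 'world', 'name', '1', '2018']
years = [word for word in words if is_year(word)]
print(years)
```

This does a string scan plus an int conversion for every numeric word, which is the cost the answers below try to avoid.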

What is the fastest way to do it in Python?

Answer 1

You can build a set of your valid years (as strings). Then loop through each of the words you want to test to check if it is a valid year:

words = ['hello', 'world', 'name', '1', '2018']
valid_years = {str(x) for x in range(2000,2021)}

for word in words:
    if word in valid_years:
        print(word)

As Martijn Pieters mentioned in the comments, a set gives the fastest membership test, with O(1) average complexity:

Sets let you test for membership in O(1) time, using a list has a linear O(length_of_list) cost
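If you want to see the difference on your own machine, a rough sketch with timeit could look like this. The function names and the repeat count are assumptions made for illustration; with only 21 candidate years the gap is small, and the actual numbers depend on the machine.

```python
import timeit

words = ['hello', 'world', 'name', '1', '2018']
valid_years_set = {str(year) for year in range(2000, 2021)}
valid_years_list = [str(year) for year in range(2000, 2021)]


def years_via_set():
    # Each membership test is a hash lookup.
    return [word for word in words if word in valid_years_set]


def years_via_list():
    # Each membership test scans the list from the start.
    return [word for word in words if word in valid_years_list]


set_seconds = timeit.timeit(years_via_set, number=10000)
list_seconds = timeit.timeit(years_via_list, number=10000)
print("set: ", set_seconds)
print("list:", list_seconds)
```

Both functions return the same words; only the cost of each lookup differs.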

EDIT:

As you can see in the comments, there are many different ways of generating the set of valid_years. As long as your data structure is a set, you will have the fastest way of doing what you want.

You can read more here:

List comprehension
Sets
Complexities for different Python data structures (so you can understand which data structures in Python are quicker for specific operations)

Answer 2

Join the list into one string with a separator character, then use a regex to search it.

For example:

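import re
word_tmp = " ".join(words)
# raw string so \b is a word boundary, not a backspace; the pattern covers 2000-2020 only
re.search(r"\b20(?:[01]\d|20)\b", word_tmp)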
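Note that re.search returns only the first match. If every year in the list is wanted, a sketch along these lines might help. The names YEAR_PATTERN and find_years are assumptions made for illustration, and the pattern is written to accept 2000-2020 only.

```python
import re

# Raw string so \b means "word boundary"; 20 followed by 00-19, or exactly 2020.
YEAR_PATTERN = re.compile(r"\b20(?:[01]\d|20)\b")


def find_years(words):
    # Joining with a space keeps neighbouring words from merging into one token.
    return YEAR_PATTERN.findall(" ".join(words))


words = ['hello', 'world', 'name', '1', '2018']
print(find_years(words))
```

Because the group is non-capturing, findall returns the full matched strings rather than the group contents.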

2 answers

Answer 1: score 4, accepted, 2018-04-18 09:02:19
Answer 2: score -2, 2018-04-18 09:01:30
