cycle through matches by index with re.finditer

Question

I want to know how to navigate, by index, the iterator returned by re.finditer.

My string is s = "fish oil X22 stack peanut C4"

Here is my code:

import re
s = "fish oil X22 stack peanut C4"
words = re.finditer(r'\S+', s)
has_digits = re.compile(r'\d').search
for word in words:
    if has_digits(word.group()):
        pass  # here I want to print the word that is two words back

Desired output:

fish
stack

Answer 1

You can use a deque with a maximum length of 2 to hold the two most recent words. Then this becomes easy:

import re
from collections import deque

s = 'fish oil X22 stack peanut C4'
words = re.finditer(r'\S+', s)
has_digits = re.compile(r'\d').search
deq = deque([], 2)  # keeps only the two most recent words
for word in words:
    wordtxt = word.group()
    if has_digits(wordtxt):
        print(deq[0])  # oldest of the two stored words, i.e. two words back
    deq.append(wordtxt)
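
To see why the bounded deque works as a two-word window, here is a small sketch of how it discards old items as new ones arrive. The names `window` and `snapshots` are only illustrative and are not part of the answer's code.

```python
from collections import deque

# A deque created with maxlen=2 silently drops its oldest item
# whenever a third item is appended.
window = deque(maxlen=2)
snapshots = []
for token in ["fish", "oil", "X22", "stack"]:
    window.append(token)
    snapshots.append(list(window))  # record the window after each append

for snap in snapshots:
    print(snap)
```

Because the check in the loop above happens before the current word is appended, `deq[0]` at that moment is the word two positions back.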

It's a little unclear what should happen with the string:

s = 'fish oil X22 stack C4'

Should it print "fish" and "oil", or "fish" and "X22"? The code above prints "fish" and "X22". Also, what if the first substring is "X22"? In my answer this raises an IndexError, because the deque is still empty at that point, but it's hard to know what you want to happen in that case.
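
If skipping such cases is acceptable, one possible way to avoid the IndexError is to print only once the deque is full. This is a sketch under that assumption, not necessarily the behaviour the question wants; the function name `words_two_back` is hypothetical.

```python
import re
from collections import deque


def words_two_back(text):
    """Return the word two positions before each word that contains a digit.

    Assumption: a digit word with fewer than two words before it is skipped.
    """
    has_digits = re.compile(r'\d').search
    window = deque(maxlen=2)  # the two most recent words
    found = []
    for match in re.finditer(r'\S+', text):
        word = match.group()
        # Only look back when two earlier words are actually available.
        if has_digits(word) and len(window) == 2:
            found.append(window[0])
        window.append(word)
    return found


print(words_two_back('fish oil X22 stack C4'))  # ['fish', 'X22']
print(words_two_back('X22 fish oil'))           # []
```

The length check also covers the case where the digit word is the second word: without it, `deq[0]` would return the word one position back instead of raising an error.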

Answer 2

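You can use itertools.tee and zip. On Python 2 use itertools.izip for a lazy version; it no longer exists in Python 3, where the built-in zip is already lazy:

import re
import itertools as it

s = "fish oil X22 stack peanut C4"
words = re.finditer(r'\S+', s)
has_digits = re.compile(r'\d').search
words, words_copy = it.tee(words)  # two independent iterators over the same matches
next(words); next(words)           # skip the first two words of one iterator
# words is now two matches ahead of words_copy
for word, back_word in zip(words, words_copy):
    if has_digits(word.group()):
        print(back_word.group())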
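
Since the question asks about navigating by index, another option (not taken from either answer) is to materialise the matches into a list, which does support indexing. This is a minimal sketch that assumes the input is small enough to hold every match in memory; `matches` and `two_back` are illustrative names.

```python
import re

s = "fish oil X22 stack peanut C4"
has_digits = re.compile(r'\d').search

# re.finditer returns a one-pass iterator; a list of its matches can be indexed.
matches = list(re.finditer(r'\S+', s))

two_back = [
    matches[i - 2].group()
    for i, m in enumerate(matches)
    if i >= 2 and has_digits(m.group())  # i >= 2 avoids wrapping to negative indexes
]
print(two_back)  # ['fish', 'stack']
```

The trade-off is memory: the deque and tee approaches keep only a couple of matches alive at a time, while the list keeps all of them.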

solution1
4 ACCEPTED 2013-04-19 17:15:48

solution2
1 2013-04-19 20:30:24
