Cycle through matches by index with re.finditer

I want to know how to navigate, by index, the object returned by re.finditer.

My string is s = "fish oil X22 stack peanut C4"

Here is my code:

import re

s = "fish oil X22 stack peanut C4"
words = re.finditer(r'\S+', s)
has_digits = re.compile(r'\d').search
for word in words:
    if has_digits(word.group()):
        pass  # here I want to print the word that is two words back

Desired output:

fish
stack

You can use a collections.deque with a maximum length of 2 to hold the two most recent words. When a word with digits comes up, deq[0] is then the word two back:

import re
from collections import deque

s = 'fish oil X22 stack peanut C4'
words = re.finditer(r'\S+', s)
has_digits = re.compile(r'\d').search
deq = deque(maxlen=2)  # keeps only the two most recent words
for word in words:
    wordtxt = word.group()
    if has_digits(wordtxt):
        print(deq[0])  # oldest of the two = two words back
    deq.append(wordtxt)

It's a little unclear what should happen with the string:

s = 'fish oil X22 stack C4'

Should it print "fish" and "oil", or "fish" and "X22"? Also, what if the first substring is "X22"? In my answer, this will cause an IndexError, but it's hard to know what you want to do with that...
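
One way to settle the edge case is to print only when two earlier words actually exist. The sketch below wraps the deque approach in a function (the name words_two_back is made up for illustration) and assumes that words containing digits still count as positions, so the second sample string gives "fish" and "X22". Whether that matches what the asker wants is a guess.

```python
import re
from collections import deque

def words_two_back(s):
    """Return the word two positions before each word that contains a digit.

    Hypothetical helper: digit-bearing words with fewer than two
    predecessors are skipped instead of raising IndexError.
    """
    has_digits = re.compile(r'\d').search
    window = deque(maxlen=2)  # the two most recent words
    result = []
    for match in re.finditer(r'\S+', s):
        word = match.group()
        if has_digits(word) and len(window) == 2:
            result.append(window[0])  # oldest entry = two words back
        window.append(word)
    return result

print(words_two_back('fish oil X22 stack peanut C4'))  # ['fish', 'stack']
print(words_two_back('fish oil X22 stack C4'))         # ['fish', 'X22']
print(words_two_back('X22 fish oil'))                  # []
```

If you would rather skip digit words when counting back, append to the window only in an else branch.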

You can use itertools.tee together with zip (on Python 2, use itertools.izip instead to keep it lazy; izip no longer exists on Python 3):

import re
import itertools as it

s = "fish oil X22 stack peanut C4"
words = re.finditer(r'\S+', s)
has_digits = re.compile(r'\d').search
words, words_copy = it.tee(words)
# Skip the first two words of one iterator; the None default avoids
# StopIteration when the string has fewer than two words
next(words, None); next(words, None)
for word, back_word in zip(words, words_copy):
    if has_digits(word.group()):
        print(back_word.group())
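
If plain index access is what you are after, note that re.finditer returns a one-shot iterator, which cannot be subscripted. A minimal sketch, assuming the input is small enough to keep all matches in memory, is to materialize them in a list and look back with enumerate (the names matches and found are just for illustration):

```python
import re

s = "fish oil X22 stack peanut C4"
matches = list(re.finditer(r'\S+', s))  # list of match objects, indexable
has_digits = re.compile(r'\d').search

found = []
for i, m in enumerate(matches):
    # i >= 2 guards against a negative index wrapping around to the end
    if has_digits(m.group()) and i >= 2:
        found.append(matches[i - 2].group())

print(found)  # ['fish', 'stack']
```

This trades the laziness of the iterator for simplicity; for very long inputs the deque or tee versions keep memory use constant.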
