How to check subsequent elements of string in python using iterators?

I have a sentence that I want to parse to check for some conditions:

a) If there is a period followed by whitespace and then a lowercase letter

b) If there is a period inside a sequence of letters with no adjacent whitespace (e.g. www.abc.com)

c) If there is a period preceded by one of a short list of titles (e.g. Mr., Dr., Mrs.) and followed by whitespace and then an uppercase letter

Currently I am iterating through the string (line) and using the next() function to see whether the next character is a space, a lowercase letter, etc., and then I just loop through the line. But how would I check the character after the next one? And how would I look at the previous characters?

line = "This is line.1 www.abc.com. Mr."

t = iter(line)
b = next(t)

for i in line[:len(line)-1]:
    a = next(t)
    if i == "." and a.isdigit():  # for example, this checks whether the character after the period is a digit
        print("True")

Any help would be appreciated. Thank you.

Regular expressions are what you want.

Since you're going to check for patterns in a string, you can use Python's built-in support for regular expressions through the re module.

Example:

# Check whether there is a period inside a sequence of letters with no adjacent whitespace
import re

text = 'www.google.com'
pattern = r'[A-Za-z]\.[A-Za-z]'  # a period with a letter directly on each side
obj = re.compile(pattern)
if obj.search(text):
    print("Pattern matched")

Similarly, write patterns for the other conditions you want to check in your string.

# A period followed by whitespace and then a lowercase letter (use with re.search)
regex = r'\.\s[a-z]'
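
As a rough sketch, the three conditions from the question could each get their own compiled pattern. The names `TITLES`, `PERIOD_THEN_LOWER`, `INTERNAL_PERIOD`, `TITLE_THEN_UPPER` and `check_conditions` are made up for this example, and the title list is only an assumption about which titles matter to you.

```python
import re

# Hypothetical title list (an assumption); extend it as needed.
TITLES = ("Mr", "Mrs", "Dr")

# (a) a period, then whitespace, then a lowercase letter
PERIOD_THEN_LOWER = re.compile(r"\.\s[a-z]")
# (b) a period with a letter directly on both sides, e.g. www.abc.com
INTERNAL_PERIOD = re.compile(r"[A-Za-z]\.[A-Za-z]")
# (c) a title, then a period, whitespace and an uppercase letter
TITLE_THEN_UPPER = re.compile(r"\b(?:" + "|".join(TITLES) + r")\.\s[A-Z]")


def check_conditions(line):
    """Report which of the three conditions occur anywhere in line."""
    return {
        "a": bool(PERIOD_THEN_LOWER.search(line)),
        "b": bool(INTERNAL_PERIOD.search(line)),
        "c": bool(TITLE_THEN_UPPER.search(line)),
    }


print(check_conditions("This is line.1 www.abc.com. Mr."))
```

Using search() rather than match() means the patterns do not need a leading or trailing `.*`. If you need the position of each hit rather than a yes/no answer, `finditer()` on the same compiled patterns returns match objects with `start()` and `end()`.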

You can generate and test your regular expressions online using this simple tool

Read more extensively about re library here

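You can look further ahead by using several iterators over the same string and calling next() a different number of times on each before the loop. Calling next() twice on a single iterator inside the loop would skip characters and eventually raise StopIteration.

line = "This is line.1 www.abc.com. Mr."

t1 = iter(line)
next(t1)              # t1 now yields the character after i
t2 = iter(line)
next(t2); next(t2)    # t2 now yields the character two positions after i

# zip stops at the shortest iterator, so no StopIteration is raised
for i, a, c in zip(line, t1, t2):
    # a is the next character, c is the one after that
    if i == "." and a.isdigit():  # for example, this checks whether the character after the period is a digit
        print("True")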

You can get previous ones by saving your iterations to a temporary list
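
A minimal sketch of that idea follows. It swaps the temporary list for a fixed-size `collections.deque`, so only the last few characters are kept. The function name `digits_after_period` and the window size of 2 are assumptions made for this example.

```python
from collections import deque


def digits_after_period(line):
    """Return the indexes of periods that are immediately followed by a digit."""
    hits = []
    prev = deque(maxlen=2)  # the last two characters seen, oldest first
    for index, ch in enumerate(line):
        # prev[-1] is the character just before ch (if there is one)
        if ch.isdigit() and prev and prev[-1] == ".":
            hits.append(index - 1)
        prev.append(ch)
    return hits


print(digits_after_period("This is line.1 www.abc.com. Mr."))
```

Looking backwards this way needs only one pass and no lookahead: the check fires when the loop reaches the digit, and the period is found in the saved history.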
