
Python Regular Expression for stopping in between

This is my string:

age: adult/child gender: male/female age range: 3 - 5 years/5 - 8 years/8 - 12 yrs/12 years and up product type: costume character: animals & insects material: polyester theme: animal age start: 3 years age end: adult features: -face is seen through the mouth of the zebra. -zipper closure in the front and a tail in the back. -set includes: jumpsuit and head mask. -animal collection. age: -adult/child. gender: -male/female. age group: -3 - 5 years/5 - 8 years/8 - 12 years/12 yrs and up

I want to capture only the "age range: ..." part (shown in bold in the original post) with a Python regex, but I have not been able to. The regex I tried does not work properly:

\bage[a-z]?\b.*\d+\s(?:years[a-z]?|yrs|month[a-z]+)

It gives a strange result and captures text I do not want.

You could try this pattern using re.search():

import re

string = 'age: adult/child  gender: male/female  age range: 3 - 5 years/5 - 8 years/8 - 12 years/12 years and up  product type: costume  character: animals & insects  material: polyester  theme: animal  age start: 3 years  age end: adult features:  -face is seen through the mouth of the zebra.  -zipper closure in the front and a tail in the back.  -set includes: jumpsuit and head mask.  -animal collection.  age: -adult/child.  gender: -male/female.  age range: -3 - 5 years/5 - 8 years/8 - 12 years/12 years and up'
match = re.search(r'(age range:.*?)  ', string)
if match:
    print(match.group(1))

Output:

age range: 3 - 5 years/5 - 8 years/8 - 12 years/12 years and up

This relies on the assumption that each data item is separated by two spaces, as in the string used in the code above. The pattern matches the literal text age range: followed by zero or more characters (non-greedy), followed by two spaces, so the match stops at the first double space after age range:.
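To make the non-greedy behaviour concrete, here is a small sketch comparing the lazy and greedy forms of the same pattern. The sample string is a shortened, hypothetical version of the data, and the last variant (allowing end of string instead of two spaces) is an assumption about how one might handle a field that comes last, not part of the answer above.

```python
import re

# Hypothetical shortened sample; fields are separated by two spaces.
sample = ("gender: male/female  age range: 3 - 5 years/12 years and up"
          "  product type: costume  theme: animal")

# Lazy quantifier: stops at the first double space after "age range:".
lazy = re.search(r"(age range:.*?)  ", sample)

# Greedy quantifier: backtracks only as far as the LAST double space.
greedy = re.search(r"(age range:.*)  ", sample)

print(lazy.group(1))    # age range: 3 - 5 years/12 years and up
print(greedy.group(1))  # age range: 3 - 5 years/12 years and up  product type: costume

# Assumed variation: also accept end of string, for a field that comes last.
last = re.search(r"(age range:.*?)(?:  |$)", "theme: animal  age range: 3 - 5 years")
print(last.group(1))    # age range: 3 - 5 years
```

Without the `(?:  |$)` alternative, the pattern finds no match when nothing follows the field, because there are no trailing two spaces to anchor on.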

You can use the following:

\bage range:\s*(?:\d+\s*-\s*\d+\s*y(?:ea)?rs/)+\d+\s*y(?:ea)?rs and up\b

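As a rough check, the pattern above can be run against a shortened copy of the question's string (the variable names here are only illustrative). It accepts both "years" and "yrs" through y(?:ea)?rs, and it does not depend on what separates the fields:

```python
import re

# Shortened copy of the string from the question (single spaces between fields).
text = ("age: adult/child gender: male/female "
        "age range: 3 - 5 years/5 - 8 years/8 - 12 yrs/12 years and up "
        "product type: costume character: animals & insects")

pattern = re.compile(
    r"\bage range:\s*"                   # literal label
    r"(?:\d+\s*-\s*\d+\s*y(?:ea)?rs/)+"  # one or more "N - M years/" or "N - M yrs/"
    r"\d+\s*y(?:ea)?rs and up\b"         # closing "N years and up"
)

m = pattern.search(text)
print(m.group(0) if m else None)
# age range: 3 - 5 years/5 - 8 years/8 - 12 yrs/12 years and up
```

The trade-off is that this pattern only matches values in exactly this "N - M years/.../N years and up" shape; a value such as "6 months and up" would need the unit alternation extended.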

If "product type" always follows your desired string, you can use a lookahead assertion:

>>> r = re.search(r'(age range:.*?)(?= product type)', s)
>>> r.group(1)
'age range: 3 - 5 years/5 - 8 years/8 - 12 years/12 years and up'
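A self-contained version of the same idea, as a sketch: the string s below is a shortened stand-in for the real data, and \s+ inside the lookahead is an assumed tweak so the pattern tolerates one or several spaces before "product type".

```python
import re

# Shortened stand-in for the real data.
s = ("gender: male/female age range: 3 - 5 years/5 - 8 years/8 - 12 yrs/"
     "12 years and up product type: costume")

# The lookahead checks that "product type" follows but does not consume it.
r = re.search(r"(age range:.*?)(?=\s+product type)", s)

print(r.group(1))   # age range: 3 - 5 years/5 - 8 years/8 - 12 yrs/12 years and up
print(s[r.end():])  # ' product type: costume' is still unconsumed after the match
```

Because the lookahead is zero-width, the separator and the next label stay out of the match, so no trailing whitespace needs to be stripped.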
