
How to select words with an apostrophe using a regular expression

I am trying to split a string into a list of words, keeping words that are joined by an apostrophe as a single item. For example:

String="My name is Melvin_JESUS, Guatemala, Dean'Olvier, 501soy...@ 1231 !"

should give me a result as:

['my', 'name', 'is', 'melvin', 'jesus', 'guatemala', "dean'olvier", 'soy']

I have tried the following regular expression:

my_patern= r"(?:^|(?<=\s)|-)[A-Za-z'\.]+(?=\s|\t|$|\b)"

but it doesn't give me the desired result.

You may use

(?<![^\W\d_])[^\W\d_]+(?:['.][^\W\d_]+)*(?![^\W\d_])

See the regex demo

Details

  • (?<![^\W\d_]) - no letter is allowed immediately before the match
  • [^\W\d_]+ - 1 or more letters
  • (?:['.][^\W\d_]+)* - 0 or more sequences of ' or . followed by 1 or more letters
  • (?![^\W\d_]) - no letter is allowed immediately after the match.
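
The letter class used throughout the pattern can be checked in isolation. This is a minimal sketch, assuming Python 3 str patterns (where \w is Unicode-aware by default); the name LETTER and the sample string are made up for illustration:

```python
import re

# [^\W\d_] reads as "a word character that is neither a digit nor an underscore",
# which leaves only letters (including non-ASCII ones for str patterns).
LETTER = r"[^\W\d_]"

# Hypothetical sample: the underscore and the digits break the runs of letters.
runs = re.findall(LETTER + "+", "abc_123déf")
print(runs)
```

Here the underscore and the digits act as separators, so the two runs found are abc and déf.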

In Python, use

import re
re.findall(r"(?<![^\W\d_])[^\W\d_]+(?:['.][^\W\d_]+)*(?![^\W\d_])", text)
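
Applied to the string from the question, a sketch might look like the following. The desired result in the question is all lowercase, so this assumes the input is lowercased first with str.lower(); that step and the variable names are illustrative and not part of the answer itself.

```python
import re

text = "My name is Melvin_JESUS, Guatemala, Dean'Olvier, 501soy...@ 1231 !"

# A double-quoted raw string, so the apostrophe inside ['.] does not end the literal.
pattern = r"(?<![^\W\d_])[^\W\d_]+(?:['.][^\W\d_]+)*(?![^\W\d_])"

# Assumption: lowercase before matching, since the desired result is lowercase.
words = re.findall(pattern, text.lower())
print(words)
```

Because the letter class excludes digits and underscores, Melvin_JESUS is split into two words and the leading 501 is dropped from 501soy, while the apostrophe keeps dean'olvier together as one item.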

