
Capturing text with Python regular expressions

I've been having a bit of trouble capturing strings between HTML tags using Python regular expressions. I'm trying to capture only the string "example link 2" from the text below:

<link>example link 1</link>
<item>
     <link>example link 2</link>
</item>

I've got this so far:

(?<=<link>)(.*)(?=</link>)

However, the regular expression above matches both "example link 1" and "example link 2". Could anyone please help with selecting only "example link 2"?
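For reference, the behaviour described above can be reproduced with a short script. This is only a minimal sketch: it assumes the markup is held in a string and searched with `re.findall`, and the names `text`, `pattern` and `matches` are illustrative rather than taken from the original code.

```python
import re

# The sample markup from the question; `text` is an illustrative name.
text = """<link>example link 1</link>
<item>
     <link>example link 2</link>
</item>
"""

# The pattern from the question: lookbehind for <link>, lookahead for </link>.
pattern = r"(?<=<link>)(.*)(?=</link>)"

# re.findall returns every non-overlapping match, so both links come back.
matches = re.findall(pattern, text)
print(matches)  # ['example link 1', 'example link 2']
```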

EDIT: Unfortunately I'm required to use regular expressions for this question, so I can't use a parser. Thanks for the recommendation though.

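In regex flavors that write patterns as /pattern/flags literals (JavaScript, for example), you add the 'g' modifier at the end to get every match. The regex then looks like:

/(?<=<link>)(.*)(?=<\/link>)/g

The 'g' modifier tells the engine not to stop after the first match has been found, but to continue until no more matches can be found.
Python's re module has no 'g' flag: re.search stops at the first match, while re.findall and re.finditer return every match. Getting all matches therefore does not by itself isolate "example link 2"; the pattern also has to distinguish the link inside <item> from the one outside it.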
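To select only the link inside <item> in Python, one option is to make the pattern itself require the enclosing tag. The following is a minimal sketch, not a definitive solution: it assumes the wanted <link> is the first element directly after an opening <item> tag, separated only by whitespace, and the names `text`, `item_link` and `result` are illustrative.

```python
import re

# The sample markup from the question; `text` is an illustrative name.
text = """<link>example link 1</link>
<item>
     <link>example link 2</link>
</item>
"""

# Require an opening <item> tag, then optional whitespace (including the
# newline and indentation), then <link>. The capture group holds the text.
item_link = re.compile(r"<item>\s*<link>(.*?)</link>")

match = item_link.search(text)
result = match.group(1) if match else None
print(result)  # example link 2
```

A capture group is used here instead of extending the lookbehind because Python's re module only accepts fixed-width lookbehind patterns, so something like (?<=<item>\s*<link>) would be rejected.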
