
Ruby split with regex - regex isn't doing what I want

I have this string:

string = "<p>para1</p><p>para2</p><p>para3</p>"

I want to split on the para2 text, so that I get this:

["<p>para1</p>", "<p>para3</p>"]

The catch is that sometimes para2 might not be wrapped in p tags (and there might be optional spaces outside and inside the p tags). I thought that this would do it:

string.split(/\s*(<p>)?\s*para2\s*(<\/p>)?\s*/)

But I get this:

["<p>para1</p>", "<p>", "</p>", "<p>para3</p>"]

It's not pulling the start and end p tags into the matching pattern; they should be eliminated as part of the split. Ruby's regular expression quantifiers are greedy by default, so I thought the tags would get pulled in. This seems to be confirmed if I do a gsub instead of a split:

string.gsub(/\s*(<p>)?\s*para2\s*(<\/p>)?\s*/, "XXX")
=> "<p>para1</p>XXX<p>para3</p>"

The tags are being pulled in and removed here, but not in the split. Any ideas, anyone?

Thanks, max

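Replace the capturing groups (…) with non-capturing groups (?:…). The tags are in fact being matched, but String#split also returns the text matched by each capturing group as extra elements of the array, which is where the stray "<p>" and "</p>" come from: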

/\s*(?:<p>)?\s*para2\s*(?:<\/p>)?\s*/
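As a quick check, here is a minimal sketch comparing the two patterns on the string from the question. The second input, where para2 is not wrapped in p tags and has spaces around it, is an assumed example of the "catch" described in the question, not something taken from the original post.

```ruby
string = "<p>para1</p><p>para2</p><p>para3</p>"

# Capturing groups: split appends each group's match to the result array.
with_captures = string.split(/\s*(<p>)?\s*para2\s*(<\/p>)?\s*/)

# Non-capturing groups: the tags are still consumed by the match,
# but nothing extra is returned.
without_captures = string.split(/\s*(?:<p>)?\s*para2\s*(?:<\/p>)?\s*/)

# Assumed variant of the input: para2 without p tags, surrounded by spaces.
unwrapped = "<p>para1</p> para2 <p>para3</p>"
unwrapped_result = unwrapped.split(/\s*(?:<p>)?\s*para2\s*(?:<\/p>)?\s*/)

p with_captures     # => ["<p>para1</p>", "<p>", "</p>", "<p>para3</p>"]
p without_captures  # => ["<p>para1</p>", "<p>para3</p>"]
p unwrapped_result  # => ["<p>para1</p>", "<p>para3</p>"]
```

gsub never showed the problem because it only replaces the overall match and does not return the groups.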

