Regular Expression (PHP/Perl) Backreferences

I want to match strings of the form:

Sections 158, 417, 418, 500, 501, and 1111

Where I get all the numbers as backreferences (replacing them with hyperlinks). I got this far:

$text = preg_replace_callback("/Sections? ([0-9.]+)(?:(?:and|,| |or)+([0-9.]+))+/","hyperlink_me", $text); 

That's PHP (which uses Perl-compatible regexes). The PHP part is fine, I think; it's the regex that's not doing what I want, but I'm giving the whole line of code for context.

The problem is that the only two backreferences I get (other than the whole string) are for the first number ("158") and the last one ("1111"). It seems like the second capture group ("([0-9.]+)") is just getting overwritten from the second number onwards. From this page, at "Repetition and Backreferences", I get the idea that this is often an issue, but I can't figure out how to solve it in this context. Any regex geniuses out there who can help?
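To make the symptom concrete, here is a minimal sketch that runs the same pattern through preg_match instead of preg_replace_callback, just to inspect the captured groups. The variable names are only for illustration.

```php
<?php
// Same pattern as in the question, applied to the sample sentence.
$pattern = '/Sections? ([0-9.]+)(?:(?:and|,| |or)+([0-9.]+))+/';
$subject = 'Sections 158, 417, 418, 500, 501, and 1111';

preg_match($pattern, $subject, $m);

// $m[0] is the whole match, $m[1] is "158", and $m[2] is "1111":
// the repeated second group only keeps the text from its final iteration.
print_r($m);
```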

That is simply how capturing parentheses work, and it can't be avoided: a capture group that repeats keeps only its last match. But you can create a "helper" callback, hyperlink_us, that does this:

$output = preg_replace_callback("/[^0-9.]+([0-9.]+)/","hyperlink_me", $input);

and then you use it like this:

$text = preg_replace_callback("/(Sections? [0-9.]+(?:(?:and|,| |or)+[0-9.]+)+)/","hyperlink_us", $text);

That way, hyperlink_us passes all the numbers to hyperlink_me, but the caller of hyperlink_us can make sure that numbers are only passed in the appropriate contexts (the "Section(s)..." stuff).
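Put together, the two-step idea might look roughly like the sketch below. It is not a definitive implementation: the body of hyperlink_me and its "/section/..." URL format are made up for illustration, since the question doesn't show them. The inner pattern here also matches only the numbers (/[0-9.]+/), a small variation on the helper pattern above, so that "Sections", the commas and "and" are left in place.

```php
<?php
// Hypothetical link format; the original hyperlink_me is not shown in the question.
function hyperlink_me(array $m): string
{
    return '<a href="/section/' . $m[0] . '">' . $m[0] . '</a>';
}

// Helper: receives the whole "Sections ..." match and links every number in it.
function hyperlink_us(array $m): string
{
    // Only the numbers are matched, so the text between them passes through unchanged.
    return preg_replace_callback('/[0-9.]+/', 'hyperlink_me', $m[0]);
}

$text = 'See Sections 158, 417 and 1111 for details.';

// Outer pass: find each "Section(s) ..." list and hand it to the helper.
$result = preg_replace_callback(
    '/Sections? [0-9.]+(?:(?:and|,| |or)+[0-9.]+)+/',
    'hyperlink_us',
    $text
);

echo $result, "\n";
```

Numbers outside a "Section(s) ..." list are never seen by hyperlink_me, because the outer pattern decides which parts of the text the helper gets to process.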

(Disclaimer: I'm not much of a PHP-er. I'm assuming that this function behaves like its analogue in JavaScript. I do know Perl, but it doesn't have a callback-based regex-replacement function.)
