How to generate PHP regex to match all consecutive digits preceded by the hash character unless in an anchor / link

I would expect lines 1, 3, 4 and 6 to be matched but only line 6 is being matched.

Regex:

(#[0-9]+\b)(?!.*?\<\/a\>)

Sample string:

#2222
<a target="_blank" href="http://localhost/#/app/job/2222/1">#2222</a>
#3535
#3553
<a target="_blank" href="http://localhost/#/app/job/5242/1">#5242</a>
#3333

The regex is shown here: https://regex101.com/r/JpyfzQ/3

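In the live demo you set the s modifier, which you shouldn't. With s, the dot also matches newlines, so the lookahead .*?<\/a> can reach a closing </a> on any later line and rejects every #number except the last one. Remove s and set the g (global) modifier instead.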
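A minimal PHP sketch of that difference, assuming the sample string from the question. PHP has no g flag; preg_match_all() plays that role, so the only thing varied here is the s modifier. The variable names are illustrative.

```php
<?php
$input = <<<DATA
#2222
<a target="_blank" href="http://localhost/#/app/job/2222/1">#2222</a>
#3535
#3553
<a target="_blank" href="http://localhost/#/app/job/5242/1">#5242</a>
#3333
DATA;

// Without s: "." stops at the end of the line, so the lookahead only
// inspects the rest of the current line.
preg_match_all('~(#[0-9]+\b)(?!.*?</a>)~', $input, $perLine);

// With s: "." also matches newlines, so any later </a> anywhere in the
// string makes the negative lookahead fail.
preg_match_all('~(#[0-9]+\b)(?!.*?</a>)~s', $input, $dotAll);

print_r($perLine[1]); // #2222, #3535, #3553, #3333
print_r($dotAll[1]);  // #3333
```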

Regex (a better way):

<a\b[^>]*>.*<\/a>(*SKIP)(*F)|#\d+\b

Live demo

PHP:

preg_replace('@<a\b[^>]*>.*</a>(*SKIP)(*F)|#\d+\b@', 'replacement', $input);
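As a rough illustration of how the (*SKIP)(*F) alternation behaves, the sketch below runs the pattern over the sample string from the question. The '[$0]' replacement, the one-line test string and the lazy .*? variant are assumptions added for illustration, not part of the answer. The lazy form may be safer when a single line holds several anchors, since the greedy .* runs to the last </a> on that line.

```php
<?php
$input = <<<DATA
#2222
<a target="_blank" href="http://localhost/#/app/job/2222/1">#2222</a>
#3535
#3553
<a target="_blank" href="http://localhost/#/app/job/5242/1">#5242</a>
#3333
DATA;

$pattern = '@<a\b[^>]*>.*</a>(*SKIP)(*F)|#\d+\b@';

// Anchors are consumed by the first alternative and then discarded,
// so only the bare #numbers are reported.
preg_match_all($pattern, $input, $m);
print_r($m[0]); // #2222, #3535, #3553, #3333

// The same pattern in preg_replace(): wrap each bare #number in brackets.
$replaced = preg_replace($pattern, '[$0]', $input);
echo $replaced, PHP_EOL;

// Illustrative edge case: two anchors on one line.
$oneLine = '<a href="x">#1</a> #2 <a href="y">#3</a>';
preg_match_all($pattern, $oneLine, $greedy);
preg_match_all('@<a\b[^>]*>.*?</a>(*SKIP)(*F)|#\d+\b@', $oneLine, $lazy);
print_r($greedy[0]); // empty: ".*" ran to the last </a> on the line
print_r($lazy[0]);   // #2
```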

A more sophisticated way is to combine the DOM functions with regex functions such as preg_match():

<?php

$html = <<<DATA
#2222
<a target="_blank" href="http://localhost/#/app/job/2222/1">#2222</a>
#3535
#3553
<a target="_blank" href="http://localhost/#/app/job/5242/1">#5242</a>
#3333
DATA;

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED);

$xpath = new DOMXPath($dom);
// you need to register the namespace "php" to make it available in the query
$xpath->registerNamespace("php", "http://php.net/xpath"); 
$xpath->registerPHPFunctions('preg_match');

$regex = '~#\d+~';

$items = $xpath->query("//*[not(self::a)][php:functionString('preg_match', '$regex', text()) = '1']");
foreach ($items as $item) {
    print_r($item);
}

?>
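The query above selects whole elements, so print_r() dumps DOM objects rather than the matched numbers. A possible variant, sketched under the assumption that only the #number tokens are wanted, walks the text nodes that have no <a> ancestor and applies the regex in PHP. The function name findHashNumbersOutsideLinks and the <div> wrapper are hypothetical choices for illustration, not part of the answer.

```php
<?php
// Hypothetical helper: collect every "#<digits>" token found in a text
// node that is not inside an <a> element.
function findHashNumbersOutsideLinks(string $html): array
{
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom = new DOMDocument();
    // Assumption: wrapping in <div> gives the fragment a single root element.
    $dom->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $found = [];
    // Text nodes with an <a> ancestor are excluded by the XPath predicate.
    foreach ($xpath->query('//text()[not(ancestor::a)]') as $textNode) {
        if (preg_match_all('~#\d+\b~', $textNode->nodeValue, $m)) {
            $found = array_merge($found, $m[0]);
        }
    }
    return $found;
}

$html = <<<DATA
#2222
<a target="_blank" href="http://localhost/#/app/job/2222/1">#2222</a>
#3535
#3553
<a target="_blank" href="http://localhost/#/app/job/5242/1">#5242</a>
#3333
DATA;

print_r(findHashNumbersOutsideLinks($html)); // #2222, #3535, #3553, #3333
```

Because the filtering happens on the parsed tree, this variant does not depend on how the anchors are laid out across lines.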
