
JavaScript: wrap specific text using a regex capture, but exclude HTML tag attributes

I have a regex that targets alphanumeric product numbers (always combinations of capital letters and digits, of various lengths) and wraps them in bold tags across hundreds of generated HTML emails.

This works well for bolding product numbers, but it also captures parts of URLs and hex colors inside the attributes of my HTML email's tags.

I've tried to exclude hex colors and to include only text after ">" and before "<", but certain URLs and hex colors still get through. For example, with this regex and replace call:

var newHtml = html.replace(new RegExp(/([0-9][^ ]*[A-Z][^ ]*)|([A-Z][^ ]*[0-9][^ ]*)(?=[^<|&lt;|http|#]*(>|&gt;|$))/g), "<strong>$1</strong>");

and this text, from which I only want to wrap 09D623 that appears outside of tags:

Lorem ipsum <a href="http://www.example.com/09D623" target="blank"  
style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc
egestas. Nunc sit amet imperdiet 09D623 magnat.

I still capture 66BB12, a hex color inside a tag, along with extra characters following the color, and parts of URLs if they contain capitals and numbers, as in this example. I've tried to exclude hex colors using this: ^(#[0-9a-f]{3}|[0-9a-f]{6})$

and, separately, tag contents using this expression: (?!([^<]+)?>)

but neither seems to work as expected. I'm not even sure the exclusion is correct when it is appended to the expression I started with (the one passed to new RegExp above).

Thanks for any insights you can share...

A test is at https://regex101.com/r/rW6iL6/13 (regex101 test results, with the matches highlighted in blue).

I don't know enough about the strings to generalize this better, but it matches what you're looking for in the example:

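var email = 'Lorem ipsum <a href="http://www.example.com/09D623" target="blank" style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc egestas. Nunc sit amet imperdiet 09D623 magnat.';
// Match digits, capitals, digits (e.g. 09D623) delimited by whitespace.
// The whitespace stays outside <strong>: the leading one is re-emitted as $1, the trailing one is only looked ahead at.
var modded = email.replace(/(\s)(\d+[A-Z]+\d+)(?=\s)/g, "$1<strong>$2</strong>");
document.write(modded);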

So your regex seems a lot more complicated than it needs to be:

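\s([0-9A-Z]{2,})\s matches what you want in the example:

It finds any run of 2 or more capitals or digits surrounded by whitespace and captures only the product number, without the whitespace. Note that it would also match an all-caps word or a plain number of 2 or more characters.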
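A minimal sketch of the whitespace-delimited idea, assuming product numbers always contain at least one digit and one capital letter (as the question describes). The helper name boldDelimitedCodes is made up for illustration. The trailing delimiter is only checked with a lookahead, so two product numbers separated by a single space are both wrapped:

```javascript
// Hypothetical helper: bold whitespace-delimited tokens made of capitals and digits.
function boldDelimitedCodes(text) {
  // (^|\s)               leading delimiter, re-emitted as $1
  // (?=[0-9A-Z]*\d)      the token must contain a digit
  // (?=[0-9A-Z]*[A-Z])   ...and a capital letter
  // ([0-9A-Z]{2,})       the product number, captured as $2
  // (?=\s|$)             trailing delimiter, checked but not consumed
  return text.replace(
    /(^|\s)(?=[0-9A-Z]*\d)(?=[0-9A-Z]*[A-Z])([0-9A-Z]{2,})(?=\s|$)/g,
    "$1<strong>$2</strong>"
  );
}
```

A token that sits between spaces inside an attribute value (an alt text, for example) would still be wrapped, so this only helps when attributes never contain such tokens.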

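You could also allow punctuation at the edges; as long as you leave out # and ; it won't match the hex color:

[.,"' -]([0-9A-Z]{2,})[.,"' -] will match most other characters that could be next to the product number. The hyphen is placed last in the character class so that it is read as a literal hyphen; written as ,-" it would be parsed as an out-of-order range and rejected.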
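Here is a rough sketch of the punctuation-delimited variant. The function name boldPunctuatedCodes and the exact delimiter set are assumptions for illustration; adjust the set to whatever actually surrounds the product numbers in your emails:

```javascript
// Hypothetical helper: bold capital/digit runs delimited by common punctuation or a space.
function boldPunctuatedCodes(text) {
  // Allowed delimiters on either side; "-" is last so it is a literal hyphen, not a range.
  const edge = "[.,\"' -]";
  // The leading delimiter is captured and re-emitted as $1; the trailing one is only looked ahead at.
  const pattern = new RegExp("(^|" + edge + ")([0-9A-Z]{2,})(?=" + edge + "|$)", "g");
  return text.replace(pattern, "$1<strong>$2</strong>");
}
```

Because the double quote is an allowed delimiter, a quoted attribute value made only of capitals and digits (something like id="A1") would be wrapped as well, and so would all-caps words in the text.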

If you want to do it based on location according to > and < :

>[^<]*?([0-9A-Z]{2,})(?:[^<]*?([0-9A-Z]{2,}))*

This lets it look through any non-tag text for product numbers and return up to 2 results per stretch of text between > and <. You can chain more groups if you need more, because that's how regex capture groups work: a repeated group only keeps its last match.
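A different way to get location-based matching without chaining groups is to let the regex consume whole tags first and only wrap what it finds outside them. This is a sketch, not the method from the answer above; boldOutsideTags is a made-up name, and it assumes attribute values never contain a literal > and that product numbers contain at least one digit and one capital letter:

```javascript
// Hypothetical helper: wrap product numbers in <strong>, skipping anything inside <...>.
function boldOutsideTags(html) {
  // Alternative 1 (group 1): a whole tag, which also swallows its URLs and hex colors.
  // Alternative 2 (group 2): a standalone run of capitals/digits with at least one of each.
  const pattern = /(<[^>]*>)|\b(?=[0-9A-Z]*\d)(?=[0-9A-Z]*[A-Z])([0-9A-Z]{2,})\b/g;
  return html.replace(pattern, (match, tag, code) =>
    // When the tag alternative matched, put the tag back unchanged.
    tag !== undefined ? tag : "<strong>" + code + "</strong>"
  );
}

var email = 'Lorem ipsum <a href="http://www.example.com/09D623" target="blank" style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc egestas. Nunc sit amet imperdiet 09D623 magnat.';
var newHtml = boldOutsideTags(email);
```

Since the match is not tied to a preceding >, text at the very start of the string is handled too, and any number of product numbers per text run are wrapped. It is still a regex over HTML, so comments or script blocks containing > would need a real parser.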


 