JavaScript to wrap specific text using regex capture but exclude HTML tag attributes
I've got a regex targeting alphanumeric strings that are product numbers (all are combinations of capital letters and digits, of various lengths) and wrapping these product numbers in bold tags for hundreds of generated HTML emails.
This works well for bolding product numbers, but it also captures random parts of URLs and hex colors in the tag attributes of my HTML emails.
I've tried to exclude hex colors, and to include only text after ">" and before "<", but neither attempt reliably skips URLs and hex colors. For example, with this regex and replace call:
var newHtml = html.replace(new RegExp(/([0-9][^ ]*[A-Z][^ ]*)|([A-Z][^ ]*[0-9][^ ]*)(?=[^<|<|http|#]*(>|>|$))/g), "<strong>$1</strong>");
and this text, from which I only want to wrap 09D623 that appears outside of tags:
Lorem ipsum <a href="http://www.example.com/09D623" target="blank"
style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc
egestas. Nunc sit amet imperdiet 09D623 magnat.
I still capture 66BB12, a hex color inside a tag, along with extra characters following the color, and parts of URLs if they contain capitals and digits, as in this example. I've tried to exclude hex colors using this: ^(#[0-9a-f]{3}|[0-9a-f]{6})$
and, separately, tag contents using this expression: (?!([^<]+)?>)
but neither seems to work as expected. I'm not even sure I placed the exclusion correctly when appending it to the expression I started with (the one following new RegExp above).
Thanks for any insights you can share.
A test is at https://regex101.com/r/rW6iL6/13
I don't know enough about the strings to generalize this better, but it matches what you're looking for in the example:
var email = 'Lorem ipsum <a href="http://www.example.com/09D623" target="blank" style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc egestas. Nunc sit amet imperdiet 09D623 magnat.';
var modded = email.replace(/(\s\d+[A-Z]+\d+\s)/g, "<strong>$1</strong>");
document.write(modded);
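One side effect of the pattern above is that the surrounding whitespace is part of the capture, so it ends up inside the <strong> tags. A minimal variation, assuming that is unwanted, keeps the leading whitespace in its own group and checks the trailing whitespace with a lookahead. It still only handles numbers shaped like digits, capitals, digits, and it uses console.log instead of document.write so it can run outside a browser:

```javascript
// Sample text from the question.
var email = 'Lorem ipsum <a href="http://www.example.com/09D623" target="blank" ' +
  'style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc egestas. ' +
  'Nunc sit amet imperdiet 09D623 magnat.';

// Group 1 holds the leading whitespace, group 2 the product number.
// The lookahead (?=\s) requires trailing whitespace without consuming it,
// so two product numbers separated by a single space can both match.
var modded = email.replace(/(\s)(\d+[A-Z]+\d+)(?=\s)/g, "$1<strong>$2</strong>");

console.log(modded);
```

A product number directly followed by punctuation (for example "09D623,") would not match here, since both sides must be whitespace.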
Your regex seems a lot more complicated than it needs to be:
\s([0-9A-Z]{2,})\s
matches what you want in the example. It finds any run of 2 or more capital letters or digits surrounded by whitespace and captures only the run itself, not the whitespace.
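To see what that pattern picks up on other input, here is a small sketch. The sample sentence and variable names are made up for illustration. Because the class accepts any mix of capitals and digits, all-caps words and plain numbers match too; the stricter variant is an assumption about what a product number looks like (at least one digit and at least one capital letter), not something stated in the answer:

```javascript
var text = "Ships 09D623 via UPS in 2024 boxes.";

// The pattern from the answer: any run of 2+ capitals/digits between whitespace.
var loose = Array.from(text.matchAll(/\s([0-9A-Z]{2,})\s/g), function (m) {
  return m[1];
});

// Stricter sketch: the two lookaheads require at least one digit and at least
// one capital letter somewhere in the run.
var strictPattern = /\s((?=[0-9A-Z]*[0-9])(?=[0-9A-Z]*[A-Z])[0-9A-Z]{2,})\s/g;
var strict = Array.from(text.matchAll(strictPattern), function (m) {
  return m[1];
});

console.log(loose);  // [ '09D623', 'UPS', '2024' ]
console.log(strict); // [ '09D623' ]
```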
You could also allow punctuation at the edges; as long as you leave out # and ;, it won't match the hex color:
[.,"' -]([0-9A-Z]{2,})[.,"' -]
will match most other characters that could sit next to the product number. (The hyphen goes last in the class so that it is read as a literal hyphen rather than as the start of a range.)
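A quick check of the bracket expression may help here. In a JavaScript character class, a hyphen between two characters denotes a range, and a range such as ,-" that runs backwards is rejected, so the sketch below places the hyphen last. The helper name and the sample text are hypothetical:

```javascript
// Returns true when the source string compiles as a regular expression.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    return false;
  }
}

// ,-" would be a range from character code 44 down to 34, which is invalid.
console.log(compiles("[.,-\"' ]")); // false
console.log(compiles("[.,\"' -]")); // true: hyphen last, so it is literal

var edge = "[.,\"' -]";
var pattern = new RegExp(edge + "([0-9A-Z]{2,})" + edge, "g");
var text = 'Item "09D623", then 7XK22. Color #66BB12; done';
var found = Array.from(text.matchAll(pattern), function (m) {
  return m[1];
});

console.log(found); // [ '09D623', '7XK22' ]
```

The hex color is skipped because neither # nor ; is in the edge class.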
If you want to do it based on location relative to > and <:
>[^<]*?([0-9A-Z]{2,})(?:[^<]*?([0-9A-Z]{2,}))*
This lets it look through any non-tag string for product numbers and return up to 2 results per >< pair. You can chain more groups if you need more; a repeated capture group keeps only its last match, which is how regex capture groups work.
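The two-result limit can be seen in a short sketch, followed by one possible way around it. The second part is a different technique from the pattern above: it splits the HTML on tags and replaces only in the text between them. The function name boldProductNumbers, the sample strings, and the assumption that a product number contains at least one digit and one capital letter are all illustrative, not taken from the answers:

```javascript
// Part 1: what the pattern above captures from one run of text.
var pattern = />[^<]*?([0-9A-Z]{2,})(?:[^<]*?([0-9A-Z]{2,}))*/;
var m = pattern.exec("<p>Order 09D623, 7XK22 and AB12CD today</p>");
console.log(m[1]); // 09D623 (the first run)
console.log(m[2]); // AB12CD (the last run; 7XK22 was overwritten)

// Part 2: hypothetical helper that wraps every product number outside tags.
function boldProductNumbers(html) {
  // At least one digit and one capital, delimited by word boundaries.
  var productNumber = /\b(?=[0-9A-Z]*[0-9])(?=[0-9A-Z]*[A-Z])[0-9A-Z]{2,}\b/g;
  // Splitting on a captured group keeps the tags; they land at odd indices.
  return html.split(/(<[^>]*>)/).map(function (part, i) {
    return i % 2 === 1 ? part : part.replace(productNumber, "<strong>$&</strong>");
  }).join("");
}

var email = 'Lorem ipsum <a href="http://www.example.com/09D623" target="blank" ' +
  'style="color: #66BB12;">dolor sit</a> amet, 09D623 non pulvinar nunc egestas. ' +
  'Nunc sit amet imperdiet 09D623 magnat.';
console.log(boldProductNumbers(email));
```

This is only a sketch: the tag pattern is naive, so a ">" inside a quoted attribute value, an HTML comment, or the contents of script and style elements would need separate handling, for which a real HTML parser is the safer choice.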