
Add HTML tags to this regex string

I'm using a tiny JS plugin to truncate multiple lines of text on a site I'm working on.

The only problem is that the script counts HTML tags, <a href="..."></a> for example, in the character count, which throws things off a little.

This is how the script currently excludes characters:

regex = /[!-\/:-@\[-`{-~]$/

This basically just strips out certain punctuation characters.

I've tried changing it to this:

regex = [!-\/:-@\[-`{-~]$<[^>]*>

But I'm not too familiar with regex, and it didn't seem to work.

If someone could nudge me in the right direction, that would be great.

Your initial regex looks for a single character at the tail of the string you are matching against, whether that string is a character, a word, or a line. Note the dollar sign '$', which anchors the match to the end.

regex = /[!-\/:-@\[-`{-~]$/
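To see what the '$' anchor does, the pattern can be tried against a few strings. This is only a minimal sketch: the sample strings and variable names are made up for illustration, and the use of replace() is an assumption, since the plugin's own code is not shown here.

```javascript
// The original pattern: one punctuation character at the very end of the string.
const trailingPunct = /[!-\/:-@\[-`{-~]$/;

// Hypothetical inputs, chosen only to show the effect of `$`.
const endsWithComma = trailingPunct.test("Hello,");       // "," is the last character
const commaInMiddle = trailingPunct.test("Hello, world"); // last character is "d"

// Assumption: the match is removed with replace(); only the trailing "!" goes.
const stripped = "Hello, world!".replace(trailingPunct, "");

console.log(endsWithComma, commaInMiddle, stripped); // true false Hello, world
```

The comma in the middle of the string is left alone because the pattern can only match at the end.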

Now you want to match anything between < and >.

regex = /[!-\/:-@\[-`{-~]$|<[^>]*$/

Note that this matches an unclosed tag that runs to the end of the string you are matching against, for example < , <aaaa or <aaaa< .
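The behaviour of that pattern can be checked with a few inputs. This is a rough sketch; the sample strings are hypothetical and only there to show which substring is matched in each case.

```javascript
// Pattern from the answer: trailing punctuation, OR an unclosed "<..." running to the end.
const tailRegex = /[!-\/:-@\[-`{-~]$|<[^>]*$/;

// Hypothetical sample strings, chosen only for illustration.
const samples = ['text <', 'text <aaaa', 'text <aaaa<', 'text <a href="#">link</a>'];

// Collect the matched substring for each sample (null when nothing matches).
const matches = samples.map((s) => {
  const m = s.match(tailRegex);
  return m ? m[0] : null;
});

console.log(matches); // [ '<', '<aaaa', '<aaaa<', '>' ]
```

In the last sample the complete tag is not matched as a unit. Only the final > is matched, because > falls inside the punctuation range :-@ and sits at the end of the string.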

greedy_regex = /[!-\/:-@\[-`{-~]$|<[^>]*/
non_greedy_regex = /[!-\/:-@\[-`{-~]$|<[^>]*?/

If you remove the second '$' (greedy_regex), the tag part does a greedy match: against a<b>c</b>d it matches <b , that is, the < and every following character up to, but not including, the next > . Using the ? as in non_greedy_regex, it will match the '<' only. Neither variant consumes the closing > .
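Running both patterns against a<b>c</b>d shows what each one actually captures. The last part of this sketch is an assumption rather than part of the answer: it uses a separate pattern with a closing > and the g flag to drop whole tags before counting characters, which is one possible way to get the count the question is after. The names wholeTag and visibleText are made up for illustration.

```javascript
const greedyRegex = /[!-\/:-@\[-`{-~]$|<[^>]*/;
const nonGreedyRegex = /[!-\/:-@\[-`{-~]$|<[^>]*?/;
const input = "a<b>c</b>d";

// Greedy: "<" plus as many non-">" characters as possible.
const greedyMatch = input.match(greedyRegex)[0];       // "<b"
// Non-greedy: "<" plus as few non-">" characters as possible, i.e. none.
const nonGreedyMatch = input.match(nonGreedyRegex)[0]; // "<"

// Assumption (not from the answer): include the closing ">" and use the g flag
// to remove every complete tag, then count what is left.
const wholeTag = /<[^>]*>/g;
const visibleText = input.replace(wholeTag, "");       // "acd"

console.log(greedyMatch, nonGreedyMatch, visibleText, visibleText.length); // <b < acd 3
```

A pattern like this is fine for simple markup such as links, but it is not an HTML parser; a > inside an attribute value would end the match early.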
