Split string ignoring HTML tags
Is it possible to split a string on the space character " " while ignoring the spaces inside the HTML tags in it?
The HTML tags may have style attributes such as style="font-size:14px; color: rgb(0, 0, 0)".
The string I'm talking about is:
<div class="line"><span style="color: rgb(0,0,0)">John</span><u> has</u><b> apples</b></div>
As you can see, there is a space character inside the u tag and inside the b tag.
What I am trying to get is the text split as follows:
<div class="line"><span style="color: rgb(0,0,0)">John</span><u>
has</u><b>
apples</b></div>
I have the following regex, but it does not give me the rest of the string, just the first 2 parts:
[\<].+?[\>]\s
Split using the following regexp:
str.split(/ (?=[^>]*(?:<|$))/)
[
"<div class="line"><span style="color: rgb(0,0,0)">John</span><u>",
"has</u><b>",
"apples</b></div>"
]
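As a quick check, the split above can be run against the exact string from the question. This is only a minimal sketch: the variable names `str` and `parts` are chosen here for illustration, and the pattern is a heuristic that assumes no literal ">" appears in the text content or inside attribute values.

```javascript
// The sample string from the question.
const str = '<div class="line"><span style="color: rgb(0,0,0)">John</span><u> has</u><b> apples</b></div>';

// Split on a space only when the next angle bracket ahead of it is "<"
// (or there is no bracket left), i.e. when the space sits in text content
// rather than inside a tag such as <div class="line">.
const parts = str.split(/ (?=[^>]*(?:<|$))/);

// Print one piece per line.
parts.forEach((piece) => console.log(piece));
```

The spaces inside `<div class="line">` and inside the `style` attribute are left alone, so only the two spaces in the text content produce a split.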
The ?= is a look-ahead. It says, "find spaces which are followed by some sequence of characters that are NOT greater-than signs, then a less-than sign (or end of string)."
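To see which spaces the look-ahead accepts, one can list the positions of the matches in a smaller string. The sketch below uses a made-up sample (`sample`, `re`, and `hits` are illustrative names, not from the question) with one space in the text and two inside the tag.

```javascript
// Hypothetical sample: spaces at index 2 and 11 are inside the tag,
// the space at index 18 is in the text content.
const sample = '<b class="x y">one two</b>';

// Same pattern as in the answer, with the g flag so matchAll can be used.
const re = / (?=[^>]*(?:<|$))/g;

// Collect the index and the matched text of every accepted space.
const hits = [...sample.matchAll(re)].map((m) => ({ index: m.index, text: m[0] }));

console.log(hits);
```

Only the space at index 18 is reported, and the matched text is the single space itself: the look-ahead checks what follows without consuming it, so nothing after the space is removed by the split.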
The ?: is a non-capturing group. We need that here, because split has a special behavior: the presence of a capturing group tells it to include the captured text in the resulting array of pieces, which we don't want.
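The difference is easy to observe by running both variants on a fragment of the question's string. This is a small sketch; `snippet`, `nonCapturing`, and `capturing` are names introduced here for illustration.

```javascript
// A fragment of the string from the question, containing one text space.
const snippet = 'John</span><u> has</u>';

// Non-capturing group: only the pieces are returned.
const nonCapturing = snippet.split(/ (?=[^>]*(?:<|$))/);

// Capturing group: whatever the group matched (here the "<") is spliced
// into the result between the pieces.
const capturing = snippet.split(/ (?=[^>]*(<|$))/);

console.log(nonCapturing); // 2 elements
console.log(capturing);    // 3 elements, with "<" in the middle
```

With the capturing variant the extra "<" element would have to be filtered out afterwards, which is why the answer uses ?: instead.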