I want to split the following string at the <p> tags whose text is shorter than 4 characters, for example <p>1</p> and <p>2</p>, using a regex.
<span id="_ctl0_contentMain__kDP_dp_Text" class="kDPText">
<p>1</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p>
<p>2</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p>
</span>
The following regex matches <p>...</p>
with up to three characters between the tags:
<p>.{0,3}<\/p>
Demo:
var input = `<span id="_ctl0_contentMain__kDP_dp_Text" class="kDPText"> <p>1</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p> <p>2</p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p> <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. </p> </span>`;
// Split on every <p>...</p> that holds at most three characters.
console.log(input.split(/<p>.{0,3}<\/p>/));
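As a quick sanity check of where the {0,3} quantifier draws the line, here is a minimal sketch that tests the same pattern against a few strings. The sample strings are made up for illustration and are not from the question. Note that an empty <p></p> also matches, and that . does not match a line break, so a short paragraph containing a newline is not treated as a separator.

```javascript
// Same pattern as above: <p>, up to three characters, then </p>.
const shortParagraph = /<p>.{0,3}<\/p>/;

// Hypothetical sample strings, chosen only to probe the boundary.
const samples = ["<p>1</p>", "<p>abc</p>", "<p>abcd</p>", "<p></p>", "<p>a\nb</p>"];

// No "g" flag is used, so test() keeps no state between calls.
const matches = samples.map(function (s) {
  return shortParagraph.test(s);
});

console.log(matches); // [ true, true, false, true, false ]
```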
If you want to use a regular expression, you can try something similar to this code:
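// Read the markup inside the span as a string.
var string_to_split = document.getElementById("_ctl0_contentMain__kDP_dp_Text").innerHTML;
// Separator: a <p> tag holding at most three characters (case-insensitive).
var your_regExp = new RegExp("<p>.{0,3}<\/p>","ig");
// Split, then drop the pieces that are empty or whitespace only.
var result = string_to_split.split(your_regExp).filter(function(x) {return x.trim().length;});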
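To try the split-and-filter step outside a browser, here is a minimal sketch that works on a plain string instead of innerHTML. The helper name splitByShortParagraphs, its maxLen parameter, and the sample string are assumptions made for illustration.

```javascript
// Hypothetical helper: split an HTML string on <p> tags that hold at most
// maxLen characters, then drop pieces that are empty or whitespace only.
function splitByShortParagraphs(html, maxLen) {
  const separator = new RegExp("<p>.{0," + maxLen + "}</p>", "i");
  return html.split(separator).filter(function (piece) {
    return piece.trim().length > 0;
  });
}

// Made-up sample input with two short marker paragraphs.
const sample = "<p>1</p> <p>First block.</p> <p>2</p> <p>Second block.</p>";
const pieces = splitByShortParagraphs(sample, 3);

console.log(pieces);
```

Without the filter, the result would start with an empty string, because the input begins with a separator.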
If you do not want to use a regex, you can use a script like this one (still vanilla JavaScript, but in older browsers, e.g. IE8, you might need polyfills for querySelectorAll or Array.prototype.reduce, I guess):
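var allParagraph = document.querySelectorAll("#_ctl0_contentMain__kDP_dp_Text > p");
var split_para = Array.prototype.reduce.call(
  allParagraph,
  function(acc, x) {
    if (x.innerHTML.length < 4) {
      // A short paragraph is a separator: start a new group.
      acc.unshift([]);
    } else {
      // Guard: open a group if no separator has been seen yet.
      if (acc.length === 0) {
        acc.unshift([]);
      }
      acc[0].push(x);
    }
    return acc;
  },
  []
).reverse();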
Sure, the first solution is simpler, but its result variable holds an array of strings, whereas the split_para array holds the original paragraph elements, grouped into arrays according to your splitting rule.
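To make the shape of the grouped result concrete, here is a minimal sketch of the same reduce-based grouping, run on plain strings instead of DOM elements so it works outside a browser. The helper name groupByShortMarkers and the sample data are assumptions. Putting long items that appear before the first marker into a group of their own is a choice made for this sketch.

```javascript
// Hypothetical stand-in for the DOM version: plain strings replace the
// <p> elements, and item.length replaces x.innerHTML.length.
function groupByShortMarkers(items, maxLen) {
  return items.reduce(function (acc, item) {
    if (item.length <= maxLen) {
      acc.unshift([]);              // a short item starts a new group
    } else {
      if (acc.length === 0) {
        acc.unshift([]);            // no marker seen yet: open a group
      }
      acc[0].push(item);            // a long item joins the newest group
    }
    return acc;
  }, []).reverse();                 // newest-first becomes document order
}

const groups = groupByShortMarkers(
  ["1", "Lorem ipsum", "dolor sit", "2", "amet elit"],
  3
);

console.log(groups); // [ [ 'Lorem ipsum', 'dolor sit' ], [ 'amet elit' ] ]
```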