
JavaScript regex to match the text between a closing tag and the next opening tag

How do I select, with a regular expression, the text after the </h2> closing tag up to the next <h2> opening tag?

<h2>my title here</h2>
Lorem ipsum dolor sit amet <b>with more tags</b>
<h2>my title here</h2>
consectetur adipisicing elit quod tempora

In this case I want to select this text: Lorem ipsum dolor sit amet <b>with more tags</b>

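Try this: /<\/h2>(.*?)<h2/g

This finds a closing </h2> tag, then lazily captures everything up to the next opening <h2> tag. Ending the pattern at a bare < would stop at the first inner tag such as <b>, so the pattern ends at <h2 instead. Note that . does not match line breaks, so this works when the text sits on a single line.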

In JS, you'd do this to get just the text:

substr = str.match(/<\/h2>(.*?)<h2/)[1];
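The one-liner above throws if the string contains no matching pair of headings, because match() returns null in that case. A minimal sketch of a null-safe variant follows; the function name textAfterFirstH2 and the sample string are assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper: returns the markup between the first </h2> and the
// next <h2, or null when the input has no such pair.
function textAfterFirstH2(html) {
  const match = html.match(/<\/h2>(.*?)<h2/);
  return match ? match[1] : null;
}

// Sample input (assumed): the question's markup flattened onto one line.
const sample =
  '<h2>my title here</h2>Lorem ipsum dolor sit amet <b>with more tags</b>' +
  '<h2>my title here</h2>consectetur adipisicing elit quod tempora';

console.log(textAfterFirstH2(sample));
// Lorem ipsum dolor sit amet <b>with more tags</b>
console.log(textAfterFirstH2('<p>no headings</p>'));
// null
```

Inner tags such as <b> are kept here; chain .replace(/<.*?>/g, '') onto the result if only the plain text is wanted.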

Regex101

var str = '<h2>my title here</h2>Lorem ipsum <b>dolor</b> sit amet<h2>my title here</h2>consectetur adipisicing elit quod tempora';
// Capture what lies between </h2> and the next <h2, then strip any remaining tags.
var substr = str.match(/<\/h2>(.*?)<h2/)[1].replace(/<.*?>/g, '');
console.log(substr); // logs: Lorem ipsum dolor sit amet

Try

/<\/h2>((?:\s|.)*)<h2/

You can see it in action on this regex tester, and in the example below.

(function() {
  "use strict";
  var inString, regEx, res, outEl;
  outEl = document.getElementById("output");
  inString = "<h2>my title here</h2>\n" +
    "Lorem ipsum dolor sit amet <b>with more tags</b>\n" +
    "<h2> my title here </h2>\n" +
    "consectetur adipisicing elit quod tempora";
  // (?:\s|.) also matches line breaks, which a bare . does not.
  regEx = /<\/h2>((?:\s|.)*)<h2/;
  res = regEx.exec(inString);
  console.log(res);
  // Show each captured group, escaped so the markup is displayed as text.
  res.slice(1).forEach(function(match) {
    var newEl = document.createElement("pre");
    newEl.innerHTML = match.replace(/</g, "&lt;").replace(/>/g, "&gt;");
    outEl.appendChild(newEl);
  });
}());

<main>
  <div id="output"></div>
</main>

I added \n to your example to simulate new lines. No idea why you aren't just selecting the <h2> with querySelector() and getting the text that way.
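One caveat with the /<\/h2>((?:\s|.)*)<h2/ pattern: the * is greedy, so when the input has three or more headings the capture runs on to the last <h2. A rough sketch of a lazy alternative that collects every section is shown below. The function name sectionsBetweenH2 and the three-heading input are made up for illustration, and [\s\S] is swapped in for (?:\s|.) as a single character class that also matches line breaks.

```javascript
// Hypothetical helper: collects every chunk of markup that sits between
// a closing </h2> and the following opening <h2.
function sectionsBetweenH2(html) {
  // [\s\S] matches any character, line breaks included; *? keeps each match short.
  // The lookahead (?=<h2) leaves the next heading unconsumed for the next match.
  const pattern = /<\/h2>([\s\S]*?)(?=<h2)/g;
  return Array.from(html.matchAll(pattern), (m) => m[1].trim());
}

// Assumed input with three headings, separated by line breaks.
const multi =
  "<h2>one</h2>\nfirst <b>body</b>\n" +
  "<h2>two</h2>\nsecond body\n" +
  "<h2>three</h2>\ntrailing text";

console.log(sectionsBetweenH2(multi));
// [ 'first <b>body</b>', 'second body' ]

// For comparison, the greedy pattern swallows the middle heading:
const greedy = /<\/h2>((?:\s|.)*)<h2/.exec(multi)[1];
console.log(greedy.includes("<h2>two</h2>")); // true
```

Text after the last heading is not returned, since no <h2 follows it; that matches the question, which asks only for text up to the next opening tag.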

Match the tags and remove them using the string replace() function. This solution also removes any self-closing tags such as <br/> and <hr/>.

var htmlToParse = document.getElementsByClassName('input')[0].innerHTML;
// Collapse the multi-line HTML string into a single line.
htmlToParse = htmlToParse.replace(/[\r\n]+/g, "");
// Match from the first <h2> through the last <h2> (the pattern is greedy).
var selectedRangeString = htmlToParse.match(/(<h2>.+<h2>)/g);
// Remove paired tags together with the text inside them, plus any remaining
// lone tags such as <br/> and <hr/>.
var parsedString = selectedRangeString[0].replace(/((<\w+>(.*?)<\/\w+>)|<.*?>)/g, "");
document.getElementsByClassName('output')[0].innerHTML += parsedString;

<div class='input'>
  <i>Input</i>
  <h2>my title here</h2>
  Lorem ipsum dolor sit amet <br/> <b>with more tags</b> <hr/>
  <h2>my title here</h2>
  consectetur adipisicing elit quod tempora
</div>
<hr/>
<div class='output'>
  <i>Output</i> <br/>
</div>

A couple of things to note about the code:

htmlToParse.match(/(<h2>.+<h2>)/g) returns an array of strings, i.e. all the substrings matched by this regex.

selectedRangeString[0]: I am only using the first match for demo purposes. If you want to work with all the matches, you can loop over the array with the same logic.
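Looping over every match might look like the sketch below. It rests on two assumptions that are not in the answer: a lazy pattern with a lookahead is swapped in for the greedy /(<h2>.+<h2>)/g (which yields a single match spanning all headings), and a made-up string stands in for the flattened innerHTML. The tag-stripping expression is the one from the answer.

```javascript
// Assumed input: stands in for the single-line innerHTML of the .input element.
const html =
  "<h2>first</h2> Lorem ipsum <br/> <b>bold</b> <hr/> " +
  "<h2>second</h2> dolor sit amet <h2>third</h2> trailing";

// Lazy match from one <h2> up to, but not including, the next <h2>.
const sectionPattern = /<h2>.+?(?=<h2>)/g;
// Same expression as in the answer: drops paired tags with their content,
// then any remaining lone tags such as <br/> and <hr/>.
const tagPattern = /((<\w+>(.*?)<\/\w+>)|<.*?>)/g;

const results = [];
// match() returns null when nothing matches, hence the "|| []" fallback.
for (const section of html.match(sectionPattern) || []) {
  const text = section.replace(tagPattern, "");
  // Collapse the whitespace left behind by the removed tags.
  results.push(text.replace(/\s+/g, " ").trim());
}

console.log(results); // [ 'Lorem ipsum', 'dolor sit amet' ]
```

Because the expression removes paired tags together with their content, the word inside <b>…</b> disappears as well; use replace(/<.*?>/g, "") instead if that text should be kept.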
