I am trying to write some code in HTML, CSS and JavaScript, and I am having trouble with a regex.
Let me use a simple example to explain the problem, because I can't find the solution.
<script>
var str = "I am <b>a tennis player</b> but I like also playing <i>football</i> and <i>rugby</i>, I am <b>34</b> years old, I like <u>cooking</u> even if there is nothing in common with <i>tennis</i>, <i>football</i> or <i>rugby</i>.";
// Find every <b>...</b> pair, then strip the tags from each match
var result = str.match(/<b>(.*?)<\/b>/g).map(function(val){
    return val.replace(/<\/?b>/g,'');
});
alert(result);
</script>
As you may have guessed, I want to select all the text between the tags <b></b>, <i></i> and <u></u>. To be clearer, I want to be able to select "a tennis player", "football", "rugby", "34", "cooking", etc.
For the moment, I have managed to handle only one tag. When I try with several, I fail. I have no experience with regex (I didn't study or work in this field), and the courses I found on the internet didn't answer my question. I don't think it is difficult to combine three regexes, but I am lost with classes, AND, OR, etc. :/
You can use the following regex to extract the inner text of the elements.
/<([biu])>(.*?)<\/\1>/gi
Explanation:
1. <([biu])> : Matches < followed by one of b, i or u, and then >. It can also be written as <(b|i|u)>. The tag name is stored in the first captured group.
2. (.*?) : Non-greedy match. Matches as few characters as possible while still letting the rest of the pattern match.
3. <\/\1> : Matches </ followed by the first captured group (see #1 above) followed by >, thus matching the closing tag.
4. gi : g is the global flag, to match all possible results. i makes the match case-insensitive.

var str = "I am <b>a tennis player</b> but I like also playing <i>football</i> and <i>rugby</i>, I am <b>34</b> years old, I like <u>cooking</u> even if there is nothing in common with <i>tennis</i>, <i>football</i> or <i>rugby</i>.";
var regex = /<([biu])>(.*?)<\/\1>/gi,
    result = [],
    match;
// exec() returns the next match on each call, and null when none are left
while ((match = regex.exec(str)) !== null) {
    result.push(match[2]); // group 2 holds the text between the tags
}
console.log(result);
document.body.innerHTML = '<pre>' + JSON.stringify(result, 0, 4) + '</pre>';
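Because the first captured group holds the tag name, the same pattern can also tell you which tag each piece of text came from. The following is only a sketch: groupByTag is a hypothetical helper name that is not part of the answer above, and it assumes an environment that supports String.prototype.matchAll (ES2020).

```javascript
// Sketch: collect the text of <b>, <i> and <u> elements, grouped by tag name.
// groupByTag is a hypothetical name chosen for illustration.
function groupByTag(str) {
    var regex = /<([biu])>(.*?)<\/\1>/gi;
    var groups = {};
    // matchAll yields one match per occurrence: m[1] is the tag name, m[2] the text
    for (var m of str.matchAll(regex)) {
        var tag = m[1].toLowerCase(); // normalise <B> and <b> to the same key
        (groups[tag] = groups[tag] || []).push(m[2]);
    }
    return groups;
}

var sample = "I am <b>a tennis player</b>, I like <i>football</i> and <u>cooking</u>, I am <b>34</b>.";
console.log(groupByTag(sample));
```

For the sample string, the b key should hold "a tennis player" and "34", while i and u hold one entry each. Like the original regex, this does not handle tags with attributes or nested tags of the same name.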
You can also use jQuery.
var str = "I am <b>a tennis player</b> but I like also playing <i>football</i> and <i>rugby</i>, I am <b>34</b> years old, I like <u>cooking</u> even if there is nothing in common with <i>tennis</i>, <i>football</i> or <i>rugby</i>.";
var result = [];
// Parse the string into a detached <div>, then read the text of each b, i and u element
$('<div/>').html(str).find('b, i, u').each(function(i, e) {
    result.push(e.innerText);
});
console.log(result);
$('body').html('<pre>' + JSON.stringify(result, 0, 4) + '</pre>');
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min.js"></script>
Getting all the text from u, b and i tags can easily be achieved with a plain JS DOM parser:
function getTagTexts(str, tag) {
    var el = document.createElement( 'html' ); // create an empty element
    el.innerHTML = '<faketag>' + str + '</faketag>'; // init the innerHTML property of the element
    var arr = []; // declare the array for the results
    [].forEach.call(el.getElementsByTagName(tag), function(v,i,a) { // iterate through the tags we want
        arr.push(v.innerText); // and add the innerText property to the array
    });
    return arr;
}

var txt = "I am <b>a tennis player</b> but I like also playing <i>football</i> and <i>rugby</i>, I am <b>34</b> years old, I like <u>cooking</u> even if there is nothing in common with <i>tennis</i>, <i>football</i> or <i>rugby</i>.";
var arrayI = getTagTexts(txt, "i");
var arrayU = getTagTexts(txt, "u");
var arrayB = getTagTexts(txt, "b");
document.body.innerHTML += JSON.stringify(arrayI, 0, 4) + "<br/>"; // => ["football", "rugby", "tennis", "football", "rugby"]
document.body.innerHTML += JSON.stringify(arrayU, 0, 4) + "<br/>"; // => ["cooking"]
document.body.innerHTML += JSON.stringify(arrayB, 0, 4); // => ["a tennis player", "34"]
Note that the faketag is necessary if you need to parse an HTML fragment without html / body tags.
See the code below:
var str = "I am <b>a tennis player</b> but I like also playing <i>football</i> and <i>rugby</i>, I am <b>34</b> years old, I like <u>cooking</u> even if there is nothing in common with <i>tennis</i>, <i>football</i> or <i>rugby</i>.";
// Match each <b>, <i> or <u> pair, then strip the opening and closing tags
var result = str.match(/<(b|i|u)>(.*?)<\/\1>/g).map(function(val){
    return val.replace(/<\/?b>|<\/?i>|<\/?u>/g,'');
});
alert(result);
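If the list of tags is not fixed, the same pattern can be built from an array of tag names. This is a minimal sketch, not part of the answers above: extractTagTexts is a hypothetical helper name, and it assumes the tag names are plain words such as b or i that need no regex escaping. Note that inside a string passed to new RegExp the backslashes must be doubled, whereas in a regex literal a single backslash is correct.

```javascript
// Sketch: extract the text between any of the given tags.
// extractTagTexts is a hypothetical name; tags are assumed to be simple names like "b".
function extractTagTexts(str, tags) {
    // In a string, "\\/" becomes \/ and "\\1" becomes \1 once the RegExp is built
    var regex = new RegExp('<(' + tags.join('|') + ')>(.*?)<\\/\\1>', 'gi');
    var result = [];
    var match;
    // exec() walks through the matches one at a time and returns null at the end
    while ((match = regex.exec(str)) !== null) {
        result.push(match[2]); // group 2 is the text between the tags
    }
    return result;
}

var input = "I am <b>a tennis player</b>, I like <i>football</i> and <u>cooking</u>.";
console.log(extractTagTexts(input, ['b', 'i', 'u']));
```

Unlike str.match(...).map(...), this returns an empty array instead of throwing when the string contains none of the tags.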