
Extracting Partial Text (before the <br>) using getElementsByClassName

I'm having trouble grabbing a specific piece of text out of elements that I select by class name. Each element's text contains both a name and an ID. Both are important to me, but I need them split and placed in separate arrays.

<span class="locDescription"><b>Name1</b><br> ID1</span>
<span class="locDescription"><b>Name2</b><br>ID2</span>
<span class="locDescription"><b>Name3</b><br> ID3</span>

My first thought was to pop off the last item in each element (convert to string or list, delimit by a " " and pop off the last item). But, I realized that there is not always a space between the Name and ID so that doesn't work.

My second thought was to use outerHTML, grab everything before the <br>, and then do the same for the ID after the <br>.

However, this is what the returned text looks like when using outerHTML:

"&lt;span class=\&quot;locDescription\&quot;&gt;&lt;b&gt;Name1&lt;/b&gt;&lt;br&gt;ID1&lt;/span&gt;"

I could not find a way to simply grab before the <br> ... that would seem like something one could do easily... maybe I'm missing it.

In lieu of that, I attempted to use indexing to grab the text:

var product_name = []
var elements = document.getElementsByClassName('locDescription');
for(var i=0; i<elements.length; i++) product_name.push(elements[i].outerHTML)

test1 = product_name[0].indexOf('&gt;&lt;b&gt;')

console.log(test1)

That came back as -1 so it's not interpreting the garble in that text. Any idea of how I can accomplish this? I think I'm going down a rabbit hole at the moment.

querySelector and childNodes

const spans = [...document.querySelectorAll(".locDescription")];
const details = spans.map(span => {
  const name = span.querySelector("b").textContent;
  // childNodes: [0] is <b>, [1] is <br>, [2] is the text node holding the ID
  const id = span.childNodes[2].nodeValue.trim(); // trim() drops the optional leading space
  return { name, id };
});
console.log(details);

<span class="locDescription"><b>Name1</b><br> ID1</span>
<span class="locDescription"><b>Name2</b><br>ID2</span>
<span class="locDescription"><b>Name3</b><br> ID3</span>

// Same approach without arrow functions or shorthand properties
const spans = Array.from(document.querySelectorAll(".locDescription"));
const details = spans.map(function (span) {
  const name = span.querySelector("b").textContent;
  // childNodes: [0] is <b>, [1] is <br>, [2] is the text node holding the ID
  const id = span.childNodes[2].nodeValue.trim(); // trim() drops the optional leading space
  return { name: name, id: id };
});
console.log(details);

<span class="locDescription"><b>Name1</b><br> ID1</span>
<span class="locDescription"><b>Name2</b><br>ID2</span>
<span class="locDescription"><b>Name3</b><br> ID3</span>
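Since the question asks for two separate arrays rather than an array of objects, here is a minimal sketch of one way to adapt the approach above. The helper name splitNamesAndIds is hypothetical, and the sketch assumes each span has one <b> child for the name and keeps the ID in text nodes that are direct children of the span. It filters on nodeType instead of relying on the fixed index 2, which may be a little more tolerant of stray whitespace in the markup.

```javascript
// Hypothetical helper (not from the original answer): collect names and IDs
// into two parallel arrays. Accepts any iterable of span elements.
function splitNamesAndIds(spans) {
  const names = [];
  const ids = [];
  for (const span of spans) {
    const bold = span.querySelector("b");
    // nodeType 3 is Node.TEXT_NODE; join the text nodes directly under the span.
    const idText = Array.from(span.childNodes)
      .filter(node => node.nodeType === 3)
      .map(node => node.nodeValue)
      .join("");
    names.push(bold ? bold.textContent.trim() : "");
    ids.push(idText.trim());
  }
  return { names, ids };
}

// Only touch the DOM when one is available (for example, in a browser).
if (typeof document !== "undefined") {
  const { names, ids } = splitNamesAndIds(
    document.querySelectorAll(".locDescription")
  );
  console.log(names, ids);
}
```

The two arrays stay index-aligned, so names[i] and ids[i] always come from the same span.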

You can use the .previousSibling and .nextSibling properties of a Node. These properties return any adjacent node, not only elements, so they pick up text nodes as well.

Note that you may want to trim() the .textContent of the nodes you read. .textContent returns the text as it is written in your HTML, with character references (such as &amp;) decoded, so it includes any whitespace and line breaks.

Here is a quick example:

  1. Query for <br>
  2. Use .previousSibling / .nextSibling
  3. Get their .textContent
  4. (Optional) trim() the returned text

var brElement = document.querySelector('br');
console.log(brElement.previousSibling.textContent.trim());
console.log(brElement.nextSibling.textContent.trim());

<p><b>First text</b><br> Second text</p>
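The example above reads only the first <br> in the whole document. As a rough sketch of how the same steps might be applied to every .locDescription span from the question, the following looks up the <br> inside each span instead. The helper name readAroundBr is hypothetical, and the sketch assumes the name sits immediately before the <br> and the ID immediately after it.

```javascript
// Hypothetical helper: read the text on either side of the <br> in one element.
function readAroundBr(element) {
  const br = element.querySelector("br");
  if (!br) return null; // this element has no <br>
  const before = br.previousSibling; // in the question's markup: the <b> element
  const after = br.nextSibling;      // in the question's markup: the ID text node
  return {
    name: before ? before.textContent.trim() : "",
    id: after ? after.textContent.trim() : "",
  };
}

// Only touch the DOM when one is available (for example, in a browser).
if (typeof document !== "undefined") {
  const names = [];
  const ids = [];
  for (const span of document.querySelectorAll(".locDescription")) {
    const parts = readAroundBr(span);
    if (parts) {
      names.push(parts.name);
      ids.push(parts.id);
    }
  }
  console.log(names, ids);
}
```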

You can use regex to find the two sides:

var element = document.getElementsByClassName("locDescription")[0];
var array = [];
array[0] = element.innerHTML.match(/.*(?=<br>)/)[0];
array[1] = element.innerHTML.match(/(?<=<br>).*/)[0];
console.log(array);

<span class="locDescription"><b>Name1</b><br> ID1</span>

If you want to exclude the <b> tags:

var element = document.getElementsByClassName("locDescription")[0];
var array = [];
array[0] = element.innerHTML.match(/(?<=<b>).*(?=<\/b>)/)[0];
array[1] = element.innerHTML.match(/(?<=<br>).*/)[0];
console.log(array);

<span class="locDescription"><b>Name1</b><br> ID1</span>
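The regex snippets above rely on lookbehind ((?<=...)), which older JavaScript engines may not support, and they handle only the first element. As a hedged alternative, here is a small sketch that works on the innerHTML string directly: it finds the first <br>, slices the string on either side, and strips any remaining tags. The helper name splitAtBr is hypothetical. It relies on innerHTML containing literal < and > characters, and pattern matching on HTML is fragile in general, so treat this as an illustration for simple markup like the question's, not a general parser.

```javascript
// Hypothetical helper: split an HTML string at its first <br> and strip tags.
function splitAtBr(html) {
  const match = /<br\s*\/?>/i.exec(html); // matches <br>, <br/> and <br />
  if (!match) return null; // no <br> found
  // Naive tag removal; good enough for simple markup such as <b>Name1</b>.
  const stripTags = text => text.replace(/<[^>]*>/g, "").trim();
  return {
    name: stripTags(html.slice(0, match.index)),
    id: stripTags(html.slice(match.index + match[0].length)),
  };
}

// Only touch the DOM when one is available (for example, in a browser).
if (typeof document !== "undefined") {
  const names = [];
  const ids = [];
  for (const span of document.getElementsByClassName("locDescription")) {
    const parts = splitAtBr(span.innerHTML);
    if (parts) {
      names.push(parts.name);
      ids.push(parts.id);
    }
  }
  console.log(names, ids);
}
```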
