
How to clean a DOM node

I need to remove unnecessary whitespace, in particular at the start of an HTML node.

For example, given this p node:

<p>
   The cat 
   <b>
       <span>is on </span>
       <em><span>the bed</span></em>
   </b>
</p>

I would like to obtain:

<p>The cat <b><span>is on </span><em><span>the bed</span></em></b></p>

so that, if node holds the DOM tree rooted at p and I run the following code:

var text = node.innerText;
console.log(text);

I get "The cat ..." with no leading whitespace, and not " The cat ..." with the whitespace from the markup.

I found this method:

function clean(node)
{
  for(var n = 0; n < node.childNodes.length; n ++)
  {
    var child = node.childNodes[n];
    if
    (
      child.nodeType === 8 
      || 
      (child.nodeType === 3 && !/\S/.test(child.nodeValue))
    )
    {
      node.removeChild(child);
      n --;
    }
    else if(child.nodeType === 1)
    {
      clean(child);
    }
  }
}

I tried doing:

clean(node);
var text = node.innerText;
console.log(text);

but I still get " The cat ..." with the leading whitespace.

Why? How can I solve my problem?

Thanks


If I had:

 <p>cat_</p>

or

 <p>
     cat_
 </p>

In both cases I would like to get "cat_", and not " cat_ " with the surrounding whitespace.

You can use the String.prototype.trim() method; it removes leading and trailing whitespace:

var spaces = "       your text     ";
var required = spaces.trim();

Now required is "your text".
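
As a rough sketch of how trim() could be applied to the case in the question: the helper below collapses runs of whitespace and then trims both ends. The name normalizeText is hypothetical (it is not from the original post), and the raw string is what the p node's textContent would be under the assumption that the markup is indented exactly as shown in the question.

```javascript
// Hypothetical helper: collapse each run of whitespace into one space,
// then strip leading and trailing whitespace.
function normalizeText(str) {
  return str.replace(/\s+/g, " ").trim();
}

// Assumed textContent of the question's <p> node, source indentation included.
var raw = "\n   The cat \n   \n       is on \n       the bed\n   \n";

console.log(normalizeText(raw)); // The cat is on the bed
```

In the browser this would be called as normalizeText(node.textContent), which leaves the DOM itself untouched and only cleans the extracted string.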

This will help you:

function whitespaceSimplify(str: string): string {
    // Collapse each run of whitespace into a single space
    return str.replace(/\s+/g, " ")
}

You can apply this to the HTML to remove any duplicated whitespace:

clean(node);
node.innerHTML = whitespaceSimplify(node.innerHTML)

or call whitespaceSimplify on the text nodes inside clean.
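
The second option could look roughly like the sketch below. The name cleanAndSimplify is hypothetical, and the function is only one possible way to combine the two steps: it removes comments and whitespace-only text nodes as clean does, and collapses whitespace in the text nodes it keeps. It relies only on childNodes, nodeType, nodeValue and removeChild.

```javascript
// Sketch (hypothetical name): clean() plus whitespace collapsing in kept text nodes.
function cleanAndSimplify(node) {
  // Walk backwards so that removing a child does not shift the indexes still to visit.
  for (var n = node.childNodes.length - 1; n >= 0; n--) {
    var child = node.childNodes[n];
    if (
      child.nodeType === 8 ||                                  // comment node
      (child.nodeType === 3 && !/\S/.test(child.nodeValue))    // whitespace-only text
    ) {
      node.removeChild(child);
    } else if (child.nodeType === 3) {
      // Text with real content: collapse each whitespace run into one space.
      child.nodeValue = child.nodeValue.replace(/\s+/g, " ");
    } else if (child.nodeType === 1) {
      cleanAndSimplify(child);                                 // recurse into elements
    }
  }
}
```

Working on text nodes avoids re-parsing the markup through innerHTML. Note that dropping whitespace-only text nodes between inline elements, as clean also does, can join words that were separated only by that whitespace.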

Demo:

function clean(node) {
  for (var n = 0; n < node.childNodes.length; n++) {
    var child = node.childNodes[n];
    if (
      child.nodeType === 8 ||
      (child.nodeType === 3 && !/\S/.test(child.nodeValue))
    ) {
      // Remove comments and whitespace-only text nodes
      node.removeChild(child);
      n--;
    } else if (child.nodeType === 1) {
      clean(child);
    }
  }
}

function whitespaceSimplify(str) {
  // Collapse each run of whitespace into a single space
  return str.replace(/\s+/g, " ");
}

var node = document.getElementById('node');
clean(node);
node.innerHTML = whitespaceSimplify(node.innerHTML);
document.getElementById('output').innerText = node.innerHTML;

<div id="node">
  <p>
    The cat
    <b>
      <span> is on </span>
      <em><span>the bed</span></em>
    </b>
  </p>
</div>
<code id="output"></code>

returns: <p> The cat <b><span> is on </span><em><span>the bed</span></em></b></p>
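
The result above still has a space right after the opening p tag, because that text node contains real content and is only collapsed, not trimmed. If that space also needs to go, one possible approach is sketched below. The name trimEdges is hypothetical, and the sketch assumes that only the first and last direct child text nodes need trimming; the sample object stands in for a real DOM node so that the snippet runs on its own.

```javascript
// Hypothetical helper: strip whitespace at the start of an element's first
// text child and at the end of its last text child.
function trimEdges(node) {
  var first = node.childNodes[0];
  var last = node.childNodes[node.childNodes.length - 1];
  if (first && first.nodeType === 3) {
    first.nodeValue = first.nodeValue.replace(/^\s+/, "");
  }
  if (last && last.nodeType === 3) {
    last.nodeValue = last.nodeValue.replace(/\s+$/, "");
  }
}

// Plain-object stand-in for the cleaned <p>: a text node followed by the <b> element.
var p = {
  childNodes: [
    { nodeType: 3, nodeValue: " The cat " },
    { nodeType: 1, childNodes: [] }
  ]
};
trimEdges(p);
console.log(JSON.stringify(p.childNodes[0].nodeValue)); // "The cat "
```

The trailing space after "cat" is kept on purpose, since it separates the word from the text inside the b element that follows.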

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 