How do I get the text of a nested element in HTML for automation using Selenium or Protractor?

Question

I have the HTML below. I need to log or print only the text of the desc class element, "Print this", and not the text of the spell class element, in Protractor or Selenium.

<span class="desc">
Print this
    <a class="new-link" href="#">
        <span class="spell">And not this</span>
    </a>
</span>

I tried getText(), but with the code below it prints the complete text:

Print this And not this

In Protractor using JavaScript:

element(by.css('.desc')).getText().then(function(text){
    console.log(text);
});

In Selenium using Java:

System.out.println(driver.findElement(By.xpath("//*[@class='desc']")).getText());

How do I print only the first part of the text (i.e., "Print this")?

Any suggestions or help will be appreciated. Thanks.

Answer 1

ElementFinder.getText() returns the visible text of the element with leading and trailing whitespace removed, and that text also includes the text of all child elements at any level of nesting. There is no special property in the DOM that returns only the first-level text, but you can implement it yourself. Text in the DOM is also a node and is stored in the DOM tree the same way as any tag element; it just has a different type and set of properties. The childNodes property gives us the first-level children of an element, of all types, so we can iterate over them, keep only the text nodes, concatenate their content, and return the result.
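To make the idea concrete before wiring it into Protractor, here is a minimal sketch that runs without a browser. The desc object is a hypothetical stand-in for the span from the question, built from plain objects that only mimic the nodeType, nodeValue and childNodes properties of real DOM nodes; ownText is an illustrative name, not a DOM or Protractor API.

```javascript
// Numeric values of Node.ELEMENT_NODE and Node.TEXT_NODE in the DOM.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical stand-in for <span class="desc"> from the question.
// Its first-level children are: a text node, the <a> element, a text node.
var desc = {
    nodeType: ELEMENT_NODE,
    childNodes: [
        { nodeType: TEXT_NODE, nodeValue: '\nPrint this\n    ' },
        {
            nodeType: ELEMENT_NODE, // <a class="new-link">
            nodeValue: null,
            childNodes: [
                {
                    nodeType: ELEMENT_NODE, // <span class="spell">
                    nodeValue: null,
                    childNodes: [
                        { nodeType: TEXT_NODE, nodeValue: 'And not this' }
                    ]
                }
            ]
        },
        { nodeType: TEXT_NODE, nodeValue: '\n' }
    ]
};

// Concatenate only the direct text-node children; nested elements are skipped.
function ownText(el) {
    var text = '';
    for (var i = 0; i < el.childNodes.length; i++) {
        if (el.childNodes[i].nodeType === TEXT_NODE) {
            text += el.childNodes[i].nodeValue;
        }
    }
    return text.trim();
}

console.log(ownText(desc)); // Print this
```

Because the loop never descends into the a element, "And not this" is never reached, which is exactly the behaviour the question asks for.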

In Protractor I've decided to add a custom method to the prototype of ElementFinder to make it easy to use, so any Protractor element would have it. It's up to you where to place this extension code, but I'd suggest including it somewhere before your tests, maybe in protractor.conf.js.

protractor.ElementFinder.prototype.getTextContent = function () {
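    // inject script on the page
    // (this.ptor_ is an internal reference to the browser instance; if your
    // Protractor version names it differently, the global
    // browser.executeScript can be used instead)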
    return this.ptor_.executeScript(function () {
        // note: this is not a Protractor scope

        // current element
        var el = arguments[0];
        var text = '';

        for (var i = 0, l = el.childNodes.length; i < l; i++) {
            // get text only from text nodes
            if (el.childNodes[i].nodeType === Node.TEXT_NODE) {
                text += el.childNodes[i].nodeValue;
            }
        }

        // if you want to exclude leading and trailing whitespace
        text = text.trim();

        return text; // the final result, Promise resolves with this value

    }, this.getWebElement()); // pass current element to script
};

This method returns a Promise that resolves with the value of the text variable. How to use it:

var el = $('.desc');

expect(el.getTextContent()).toContain('Print this');

// or 

el.getTextContent().then(function (textContent) {
    console.log(textContent); // 'Print this'
});
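Since the question also mentions plain Selenium, the same logic can be kept as a script string instead of a function. This is only a sketch under assumptions: OWN_TEXT_SCRIPT, runLocally and fakeDesc are names I made up for illustration, and the dry run uses new Function on a plain-object stand-in rather than a real browser. WebDriver's executeScript treats a string as the body of an anonymous function and passes extra arguments through arguments, which new Function mirrors here.

```javascript
// Script body in the form executeScript expects: it reads the element
// from arguments[0] and returns the text of its direct text nodes.
var OWN_TEXT_SCRIPT =
    'var el = arguments[0], text = "";' +
    'for (var i = 0; i < el.childNodes.length; i++) {' +
    '    if (el.childNodes[i].nodeType === 3) {' + // 3 is Node.TEXT_NODE
    '        text += el.childNodes[i].nodeValue;' +
    '    }' +
    '}' +
    'return text.trim();';

// Dry run without a browser: wrap the string as a function body.
var runLocally = new Function(OWN_TEXT_SCRIPT);

// Hypothetical stand-in for the .desc element (plain objects, not real DOM).
var fakeDesc = {
    childNodes: [
        { nodeType: 3, nodeValue: '\nPrint this\n    ' },
        { nodeType: 1, nodeValue: null }, // the nested <a> element
        { nodeType: 3, nodeValue: '\n' }
    ]
};

console.log(runLocally(fakeDesc)); // Print this
```

In Protractor the string should be usable as browser.executeScript(OWN_TEXT_SCRIPT, $('.desc').getWebElement()). In Selenium with Java, the same script text could be passed to JavascriptExecutor.executeScript together with the WebElement found for the desc class, though I have not tested that variant here.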

Answer 2

I used Michael's solution and embedded it directly in my test spec instead of adding a separate method. A separate function is still the better choice if you need this repeatedly. However, if you want an inline solution, here's how to do it:

it("Get First part of text", function(){
    browser.executeScript(function () {
        var el = arguments[0], text = '';
        for (var i = 0, l = el.childNodes.length; i < l; i++)
            if (el.childNodes[i].nodeType === Node.TEXT_NODE)
                text += el.childNodes[i].nodeValue;
        return text.trim();
    },$('.desc').getWebElement()).then(function(text){
        //use expect statements with "text" here as needed
    });
});

Hope it helps.

2 answers

solution1
3 ACCEPTED 2015-09-09 13:36:42

solution2
1 2015-09-09 19:33:23