XPath: select node that contains normalized text, excluding ancestor nodes

I'm looking for a performant, generic query to select "The quick brown fox jumped over the lazy dog." similar to how a simple CMD+C would copy the text.

It seems that I need to use textContent, but my XPath also selects every ancestor (up to body) rather than just the nearest one or few. How can I limit the scope?


Example node.textContent:

"

    The
    quick brown fox
    jumped over the


    lazy dog.

"

Run Code Snippet → Full Page → Follow instruction

<div id='dont-select-this' style='text-align:center'>
  <div id='dont-select-this-but-it-would-be-cool-if-you-could' style='background:lightgreen;'>
    <div id='select-this-one'>
      <p> <span>The</span> <em>quick brown fox</em> <span>jumped over the</span> </p>
      <p> lazy<span> </span><b>dog.</b> </p>
    </div>
  </div>
  <hr>
  <div>open chrome devtools → console</div>
  <div>change frame (see below)</div>
  <div>type <code>$x('//*[contains(., "quick brown fox")]')</code></div>
  <hr>
  <img alt='select chrome devtools frame' height=300 src='https://i.imgur.com/L1MhCY8.png'/>
</div>

This XPath,

//div[normalize-space() = 'The quick brown fox jumped over the lazy dog.']

selects only the two div elements with the following @id values,

dont-select-this-but-it-would-be-cool-if-you-could
select-this-one

as requested.
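
To see why exactly those two elements match, the predicate can be reproduced outside the browser. The following is only a sketch: it uses Python's standard xml.etree.ElementTree, which does not support normalize-space() in its XPath subset, so the helper functions (normalize_space, string_value) are hand-written stand-ins for the XPath behaviour, and MARKUP is a trimmed, well-formed copy of the question's snippet rather than the full page.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's markup (assumption: styles,
# the <img>, and most instruction divs are left out for brevity).
MARKUP = """
<div id='dont-select-this'>
  <div id='dont-select-this-but-it-would-be-cool-if-you-could'>
    <div id='select-this-one'>
      <p> <span>The</span> <em>quick brown fox</em> <span>jumped over the</span> </p>
      <p> lazy<span> </span><b>dog.</b> </p>
    </div>
  </div>
  <hr/>
  <div>open chrome devtools</div>
</div>
"""

TARGET = "The quick brown fox jumped over the lazy dog."


def normalize_space(text):
    """Mimic XPath normalize-space(): collapse runs of space/tab/CR/LF, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def string_value(element):
    """Mimic the XPath string value: all descendant text nodes concatenated."""
    return "".join(element.itertext())


root = ET.fromstring(MARKUP)

# Equivalent in spirit to: //div[normalize-space() = TARGET]
matching_ids = [
    div.get("id")
    for div in root.iter("div")
    if normalize_space(string_value(div)) == TARGET
]
print(matching_ids)
```

The outermost div is rejected because its string value also includes the instruction text, while the two inner divs normalize to exactly the target sentence.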


If you actually want to exclude the div element with an id value of dont-select-this-but-it-would-be-cool-if-you-could (despite its name) and only select the deepest element with the noted string value, then:

  1. Add an additional predicate to the above XPath.
  2. Change div to *.

Altogether:

//*[         normalize-space() = 'The quick brown fox jumped over the lazy dog.' ]
   [not(.//*[normalize-space() = 'The quick brown fox jumped over the lazy dog.' ])]

This selects only the div with an id attribute value of select-this-one.
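
The effect of the second predicate, "keep an element only if no descendant also matches", can be sketched in plain Python. This is an illustration under assumptions, not the XPath engine itself: it relies on xml.etree.ElementTree from the standard library, a trimmed well-formed copy of the markup, and hypothetical helpers (normalized_text, is_match, deepest_matches) that imitate the two predicates.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's markup (assumption).
MARKUP = """
<div id='dont-select-this'>
  <div id='dont-select-this-but-it-would-be-cool-if-you-could'>
    <div id='select-this-one'>
      <p> <span>The</span> <em>quick brown fox</em> <span>jumped over the</span> </p>
      <p> lazy<span> </span><b>dog.</b> </p>
    </div>
  </div>
  <div>open chrome devtools</div>
</div>
"""

TARGET = "The quick brown fox jumped over the lazy dog."


def normalized_text(element):
    """String value of the element with XPath-style whitespace normalization."""
    raw = "".join(element.itertext())
    return re.sub(r"[ \t\r\n]+", " ", raw).strip(" ")


def is_match(element):
    """First predicate: normalize-space() = TARGET."""
    return normalized_text(element) == TARGET


def deepest_matches(root):
    """Both predicates: the element matches and no descendant matches."""
    result = []
    for element in root.iter():  # any element, like //*
        if not is_match(element):
            continue
        # element.iter() yields the element itself first, so skip it.
        descendants = (d for d in element.iter() if d is not element)
        if not any(is_match(d) for d in descendants):
            result.append(element)
    return result


root = ET.fromstring(MARKUP)
deepest_ids = [element.get("id") for element in deepest_matches(root)]
print(deepest_ids)
```

The wrapper div still satisfies the first predicate, but it is dropped because its child select-this-one satisfies it too.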

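I ended up going with this, which uses contains() rather than an exact match (the XPath first, then the same query as a DevTools console call):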

//*[         contains(normalize-space(.), 'The quick brown fox jumped over the lazy dog.') ]
   [not(.//*[contains(normalize-space(.), 'The quick brown fox jumped over the lazy dog.') ])]

$x("//*[contains(normalize-space(.), 'The quick brown fox jumped over the lazy dog.')][not(.//*[contains(normalize-space(.), 'The quick brown fox jumped over the lazy dog.') ])]")
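
With contains() the first predicate is looser, so the "no matching descendant" predicate does more of the work. The sketch below imitates that in Python under the same kind of assumptions as any offline reproduction: standard-library xml.etree.ElementTree, a trimmed well-formed copy of the markup, and hypothetical helper names (normalized_text, contains_matches, deepest_contains_matches, label) that are not part of any XPath API.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's markup (assumption).
MARKUP = """
<div id='dont-select-this'>
  <div id='dont-select-this-but-it-would-be-cool-if-you-could'>
    <div id='select-this-one'>
      <p> <span>The</span> <em>quick brown fox</em> <span>jumped over the</span> </p>
      <p> lazy<span> </span><b>dog.</b> </p>
    </div>
  </div>
  <div>open chrome devtools</div>
</div>
"""


def normalized_text(element):
    """String value of the element with XPath-style whitespace normalization."""
    raw = "".join(element.itertext())
    return re.sub(r"[ \t\r\n]+", " ", raw).strip(" ")


def contains_matches(root, needle):
    """Imitates //*[contains(normalize-space(.), needle)]."""
    return [el for el in root.iter() if needle in normalized_text(el)]


def deepest_contains_matches(root, needle):
    """Adds the second predicate: no descendant also contains the needle."""
    return [
        el
        for el in contains_matches(root, needle)
        if not any(
            needle in normalized_text(d) for d in el.iter() if d is not el
        )
    ]


def label(element):
    """Readable name for printing: the id if present, else the tag."""
    return element.get("id") or element.tag


root = ET.fromstring(MARKUP)
sentence = "The quick brown fox jumped over the lazy dog."

all_labels = [label(el) for el in contains_matches(root, sentence)]
deepest_labels = [label(el) for el in deepest_contains_matches(root, sentence)]
fragment_labels = [label(el) for el in deepest_contains_matches(root, "quick brown fox")]
print(all_labels, deepest_labels, fragment_labels)
```

Without the second predicate, every ancestor whose text contains the sentence matches, including the outermost div. Note that the deepest match depends on the search string: for a fragment that sits inside a single inline element, the result is that inline element rather than the enclosing div.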
