
WWW::Mechanize::Firefox: How do you extract the text within HTML element tags?

Good Day,

How do you print the text of an HTML tag with WWW::Mechanize::Firefox?

I have tried:

    print $_->text, '/n' for $mech->selector('td.dataCell');

    print $_->text(), '/n' for $mech->selector('td.dataCell');


    print $_->{text}, '/n' for $mech->selector('td.dataCell');

    print $_->content, '/n' for $mech->selector('td.dataCell');

Note that I do not want {innerHTML}, although that does work.

    print $_->{text}, '/n' for $mech->selector('td.dataCell');

The line above runs without error, but the output is just a series of "/n".

    my $node = $mech->xpath('//td[@class="dataCell"]/text()');

    print $node->{nodeValue};

Note that if you're retrieving text interspersed with other tags, like "Test_1" and "Test_3" in this example...

    <html>
      <body>
        <form name="input" action="demo_form_action.asp" method="get">
          <input name="testRadioButton" value="test 1" type="radio">Test_1<br>
          <input name="testRadioButton" value="test 3" type="radio">Test_3<br>
          <input value="Submit" type="submit">
        </form>
      </body>
    </html>

You need to refer to them by their position within the parent element, counting any whitespace-only text nodes created by newlines:

    $node = $mech->xpath('//form/text()[2]', single => 1);

    print $node->{nodeValue};

This prints "Test_1".
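
If counting whitespace nodes by hand feels brittle, one alternative is to fetch all of the form's text nodes and discard the whitespace-only ones in Perl. The sketch below assumes each node exposes nodeValue as a hash key, as in the snippet above. nonblank_values is a hypothetical helper name, and the stand-in hashes only imitate the nodes that $mech->xpath('//form/text()') would return for the example form, so the filter can be tried without a running Firefox.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only text nodes that contain non-whitespace,
# returning their values with surrounding whitespace trimmed.
sub nonblank_values {
    my @nodes = @_;
    my @values;
    for my $node (@nodes) {
        my $value = $node->{nodeValue};
        next unless defined $value && $value =~ /\S/;
        $value =~ s/^\s+|\s+$//g;
        push @values, $value;
    }
    return @values;
}

# Stand-ins for the text nodes under <form> in the example above.
# With a live browser you would pass $mech->xpath('//form/text()') instead.
my @nodes = (
    { nodeValue => "\n      " },
    { nodeValue => 'Test_1' },
    { nodeValue => "\n      " },
    { nodeValue => 'Test_3' },
    { nodeValue => "\n      " },
    { nodeValue => "\n    " },
);

print "$_\n" for nonblank_values(@nodes);
# prints "Test_1" and "Test_3", one per line
```

Filtering on the Perl side means the result no longer depends on how the page happens to be indented.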

I would do:

    print $mech->xpath('//td[@class="dataCell"]/text()');

using an XPath expression.

The only solution I have is to use:

    my $element = $mech->selector('td.dataCell');

    my $string = $element->{innerHTML};

and then to format the HTML within each dataCell.

Either:

    $element->{textContent};

or

    $element->{innerText};

will work.
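
Combined with the loop from the question, a minimal sketch might look like the following. Note that the newline is written as "\n" in double quotes, because '/n' in single quotes prints a literal slash followed by an n. cell_texts is a hypothetical helper name, and the stand-in hashes merely imitate what $mech->selector('td.dataCell') would return, so the snippet runs without a browser.

```perl
use strict;
use warnings;

# Hypothetical helper: read textContent from each element and trim it.
sub cell_texts {
    my @elements = @_;
    my @texts;
    for my $element (@elements) {
        my $text = $element->{textContent};
        $text = '' unless defined $text;
        $text =~ s/^\s+|\s+$//g;
        push @texts, $text;
    }
    return @texts;
}

# Stand-ins for DOM elements; with a live browser you would call
#   cell_texts($mech->selector('td.dataCell'))
my @cells = ({ textContent => "  Alice \n" }, { textContent => 'Bob' });

# Double-quoted "\n" is a real newline; '/n' would print a slash and an n.
print "$_\n" for cell_texts(@cells);
# prints "Alice" and "Bob", one per line
```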

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 