Parsing an HTML page with PHP to find the text a link is assigned to

Say I have HTML code like this:

$html = "This is some stuff right here. <a href='index.html'>Check this out!</a> <a href=herp.html>And this is another thing!</a> <a href=\"derp.html\">OH MY GOSH</a>";

I am trying to get the href values and also the text each anchor wraps, i.e. the "Check this out!" text. I am able to get the href value with the following code:

foreach ($displaybody->find('a') as $element) {
    echo $element->href; // prints the href attribute of each anchor
}

It works for me, but how do I get the "Check this out!" text? I searched but could not find an answer. Thanks in advance.

My actual HTML looks like this:

<a href="www.myurl/point.html" class="l" style="color:#436DBA;" onclick="return rs(this,'8 Stunning Linguistic Miracles of The Holy Quran | Kinetic Typography 144p (Video Only).mp4');">&raquo; Download MP4 &laquo;</a> - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />

With this markup, the code above returns "Download MP4", but I want something like "Download MP4 144p (Video Only) 19.1 MB". How do I do that?

If you are using SimpleHTMLDOM, ->innertext works on the anchor elements you have found:

include 'simple_html_dom.php';
$html = "This is some stuff right here. <a href='index.html'>Check this out!</a> <a href=herp.html>And this is another thing!</a> <a href=\"derp.html\">OH MY GOSH</a>";

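// Parse the string into a Simple HTML DOM object
$displaybody = str_get_html($html);
foreach ($displaybody->find('a') as $element) {
    // innertext holds the markup between <a> and </a>
    echo $element->innertext . '<br/>';
}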

If you meant PHP's DOMDocument, there is no find() method. To target each anchor element, use ->getElementsByTagName(), then read ->nodeValue on each selected element:

$html = "This is some stuff right here. <a href='index.html'>Check this out!</a> <a href=herp.html>And this is another thing!</a> <a href=\"derp.html\">OH MY GOSH</a>";

$dom = new DOMDocument();
$dom->loadHTML($html);
foreach($dom->getElementsByTagName('a') as $element) {
    echo $element->nodeValue . '<br/>';
}
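
The second part of the question (link text plus the quality and size that follow the link) could be handled along the same lines with DOMDocument. The following is only a sketch under assumptions: describeLinks() is a hypothetical helper name, the sample string is a shortened version of the markup from the question, and it assumes each row ends at a <br>, as it does there.

```php
<?php
// Shortened sample of the markup from the question (onclick and style omitted).
$html = '<a href="www.myurl/point.html" class="l">&raquo; Download MP4 &laquo;</a>'
      . ' - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />';

/**
 * Hypothetical helper (not part of DOMDocument): returns one entry per <a>,
 * holding its href and its text joined with the sibling text up to the next <br>.
 */
function describeLinks(string $html): array
{
    // Keep libxml from printing warnings for imperfect real-world markup.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $text = $a->textContent;
        // Walk the following siblings and stop at the <br> that ends the row.
        for ($node = $a->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeName === 'br') {
                break;
            }
            $text .= $node->textContent;
        }
        $links[] = [
            'href' => $a->getAttribute('href'),
            // Collapse runs of whitespace into single spaces.
            'text' => trim(preg_replace('/\s+/u', ' ', $text)),
        ];
    }
    return $links;
}

foreach (describeLinks($html) as $link) {
    echo $link['href'] . ' => ' . $link['text'] . PHP_EOL;
}
```

The " - " separators and the » « characters remain in the combined text; if they are unwanted, they can be stripped afterwards with str_replace().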
