I have been struggling with this for a while now, so maybe someone can help me.
I'm trying to get the email link text from this HTML:
<div id="field_11" class="fieldRow span12 lastFieldRow">
<span class="caption">E-mail</span>
<span class="output">
<script type="text/javascript">
<!--
document.write('<a hr'+'ef="mai'+'lto'+':'+
'%40;%67;%6d;%61;%69;%6c;<\/a>');
//-->
</script>
<a href="mailto:%40%67%6d%61%69%6c">@mail</a>
</span>
</div>
I'm trying to get the '@mail' text of the a element whose href starts with "mailto:". NOT the document.write() part, but the last tag in the markup.
For some reason, whenever I try to get the children of the span with the output class, it reports only one child, the script tag, and I can't seem to grab the email as plain text.
What I have so far:
$target_url = "some_web_site";
$html = new simple_html_dom();
$html->load_file($target_url);
foreach($html->find('span[class=output]') as $d){
echo $d->children(1)->plaintext . "<br />";
}
Any help?
It is possible with just DOM+Xpath, too.
$dom = new DOMDocument();
$dom->loadHTML($html);
// $dom->loadHTMLFile($htmlFile);
$xpath = new DOMXPath($dom);
var_dump(
    $xpath->evaluate(
        'string(//span[@class="output"]//a[starts-with(@href, "mailto:")])'
    )
);
Output: https://eval.in/148063
string(5) "@mail"
The XPath first selects all span elements whose class attribute is "output":
//span[@class="output"]
Then it looks for a elements inside them whose href attribute starts with "mailto:":
//span[@class="output"]//a[starts-with(@href, "mailto:")]
The result is a list of a element nodes (a single node for the example content). The string() function casts the first node of the list to a string; if the list is empty, it returns an empty string:
string(//span[@class="output"]//a[starts-with(@href, "mailto:")])
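Since string() only looks at the first matching node, a page with several such links needs a loop. The following is a minimal sketch, assuming the markup from the question with the script block left out for brevity. The $results array and its 'text' and 'address' keys are names made up for this illustration. It uses query() to iterate over every match and rawurldecode() to decode the percent-encoded href.

```php
<?php
// Sample markup copied from the question (script block omitted for brevity).
$html = <<<'HTML'
<div id="field_11" class="fieldRow span12 lastFieldRow">
<span class="caption">E-mail</span>
<span class="output">
<a href="mailto:%40%67%6d%61%69%6c">@mail</a>
</span>
</div>
HTML;

// Collect libxml parse warnings internally instead of printing them;
// real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// query() returns a DOMNodeList, so every matching link can be visited.
$results = [];
$links = $xpath->query('//span[@class="output"]//a[starts-with(@href, "mailto:")]');
foreach ($links as $a) {
    $results[] = [
        // Visible link text, e.g. "@mail" for the sample above.
        'text' => trim($a->textContent),
        // href without the "mailto:" prefix, percent-decoded.
        'address' => rawurldecode(substr($a->getAttribute('href'), strlen('mailto:'))),
    ];
}

print_r($results);
```

For the sample markup this yields one entry, with the link text '@mail' and the decoded href value '@gmail'. If nothing matches, $results simply stays empty.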