
Simple HTML DOM getting href and anchor text from within heading

For starters, this is the code that I have:

    <?php
    include ('parser_class.php');
        $source = file_get_html('http://www.billboard.com/search/site/awards?f[0]=ss_bb_type%3Aarticle');
        $title = $source->find('h3.title'); //getting song title
    ?>
    <div id="awar">
    <?php
        if ($title){
            $title = array_slice($title, 0, 10);
            foreach($title as $titles){
                $links = $titles->href;
                $string = $titles->innertext;
                //$string = (strlen($string) > 75) ? substr($string,0,72).'...' : $string;
    ?>
            <center>
            <table style="width: 100%;">
                <tr>
                    <td style="width: 50%; text-align: left; padding-left: 5px;"><span class="song"><?php echo $string ?></span></td><td style="width: 25%; text-align: left; padding-left: 5px;"><a href="http://www.billboard.com<?php echo $links ?>" class="download">Read Article</a></td>
                </tr>
            </table>
            </center>
            <hr class="betw" />

    <?php
            }
        }
        else{
            echo"<p class='song'>No Articles Found</p>";
        }
    ?>

Since the website has no classes on its links, I have to pull my information from markup like this:

    <h3 class="title"> <a href="/articles/columns/country/6784891/lady-antebellum-charles-kelley-steps-out-on-his-own">Lady Antebellum's Charles Kelley Steps Out On His Own In New York City</a> </h3>

Calling innertext on the h3 gives me everything within the h3, including the anchor tag itself.

What I need is to figure out how to get the href and the anchor text separately from within the h3.

Is there a way to get the href from the innertext, and then the innertext of that link?

I wish this site had a class on its links, as that would of course make this much easier. I have used these functions with no issues on other websites because they actually use classes on their links, but it looks like Billboard has decided to make things harder for me!

A point in the right direction would be greatly appreciated.

NOTE: My parser_class.php is the one that is located here

Instead of selecting the h3 with class title, you have to select the anchor inside it, so the selector becomes h3.title a. From that anchor you can get both the href and the anchor text. In order to get the href you can create a SimpleXMLElement object from the anchor's HTML.

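    <?php
    include ('parser_class.php');
    $source = file_get_html('http://www.billboard.com/search/site/awards?f[0]=ss_bb_type%3Aarticle');
    // Select the anchors inside each h3.title rather than the h3 itself
    foreach ($source->find('h3.title a') as $anchor) {
        // Parse the anchor's own HTML; this only works if that markup is well-formed XML
        $anch = new SimpleXMLElement((string) $anchor);
        echo "Anchor text is : ".$anch;
        echo "<br>";
        echo "href is : ";
        echo $link_href = $anch['href']; // attribute lookup on the <a> element
        echo "<hr>";
    }
    ?>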
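
For comparison, here is a minimal sketch of the same "select the anchor inside h3.title" idea using only PHP's built-in DOM extension, so it runs without parser_class.php. It works on the sample markup from the question rather than the live page, and the helper name `extractTitleLinks` is made up for illustration. It is one possible approach, not necessarily how the parser class does it internally.

```php
<?php
// Sample markup copied from the question; a real page would have to be fetched first.
$html = '<h3 class="title"> <a href="/articles/columns/country/6784891/lady-antebellum-charles-kelley-steps-out-on-his-own">Lady Antebellum\'s Charles Kelley Steps Out On His Own In New York City</a> </h3>';

// Hypothetical helper: returns href/text pairs for every <a> inside an <h3 class="title">.
function extractTitleLinks(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them for imperfect HTML.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector "h3.title a".
    $query = '//h3[contains(concat(" ", normalize-space(@class), " "), " title ")]//a';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),   // the link target
            'text' => trim($a->textContent),      // the anchor text only
        ];
    }
    return $links;
}

foreach (extractTitleLinks($html) as $link) {
    echo $link['text'], ' => http://www.billboard.com', $link['href'], PHP_EOL;
}
```

Each entry keeps the href and the anchor text apart, which is what the table in the question needs for its two columns.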
