
Scraping href value from <a> tag using PHP

I am using simple_html_dom to retrieve the href attribute value from an HTML tag. The data I am scraping sits in an HTML table. The following code successfully selects the links and displays them as hyperlinks.

// Include the library to use it.
include_once('simple_html_dom.php');

// Get the HTML from the file or website.
$html = file_get_html('source.html');

// Put every <a> tag found inside a table cell into an array named $result
$result = $html->find('table tbody tr td a');

// Run through the array using a foreach loop and print each link out using echo
foreach ($result as $link) {
    echo $link . "<br/>";
}

But I need the href value from the <a> tag. For this I followed the explanation in "Retrieve multiple value of a <a href> tag using php" and used the following code, but it searches the whole document again and also extracts links located outside the table. I also tried replacing $html with $result and with $link, but then nothing is displayed.

preg_match_all("/href=\"(.*?)\"/i", $html, $matches);
print_r($matches);

How can I use one of the above methods to get the value of the href attribute for links inside the HTML table? The table doesn't have a class or id to use in the selector.

Note: I also looked into "Retrieve The link value form <a href> tag using php" but couldn't figure it out.
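One possible way to limit the extraction to links inside the table is sketched below. With simple_html_dom itself, element attributes are exposed as properties, so `$link->href` inside the existing foreach loop should return the value directly. The sketch swaps in PHP's built-in DOMDocument and DOMXPath instead, so it runs without the library. The sample markup and URLs are hypothetical stand-ins for source.html, and the XPath expression assumes the links sit inside `<td>` cells.

```php
<?php
// Hypothetical sample markup standing in for source.html.
$markup = <<<'MARKUP'
<html><body>
<a href="/outside.html">Outside the table</a>
<table>
  <tr><td><a href="/page-one.html">One</a></td></tr>
  <tr><td><a href="/page-two.html">Two</a></td></tr>
</table>
</body></html>
MARKUP;

// Collect parser warnings internally instead of printing them,
// since real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select only <a> elements that are nested inside a table cell,
// so the link outside the table is skipped.
$hrefs = [];
foreach ($xpath->query('//table//td//a[@href]') as $anchor) {
    $hrefs[] = $anchor->getAttribute('href');
}

print_r($hrefs);
```

The `//table//td//a` path matches whether or not the markup contains a `<tbody>` element, which is why it is written more loosely than the `table tbody tr td a` selector above.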

I used preg_match instead of preg_match_all and echo instead of print_r. The following code targets the first link and displays it on the screen.

preg_match("/href=\"(.*?)\"/i", $html, $matches);
// $matches is an array; index 1 holds the captured href value.
echo $matches[1];
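For reference, preg_match fills `$matches` with an array: index 0 holds the whole match and index 1 the first captured group, so the href value itself lives in `$matches[1]`. A minimal sketch on a plain string follows. The fragment and its URL are hypothetical, and the same pattern could in principle be applied to the string form of each table link rather than to the whole document.

```php
<?php
// Hypothetical fragment; the URL is illustrative only.
$fragment = '<td><a href="/page-one.html">One</a></td>';

$full = null;
$href = null;

// preg_match() returns 1 on a match, 0 on no match, and false on error.
if (preg_match('/href="(.*?)"/i', $fragment, $matches) === 1) {
    $full = $matches[0]; // the whole matched text, including href="..."
    $href = $matches[1]; // only the part captured by (.*?)
    echo $href, PHP_EOL;
}
```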
