
How to fetch all images of a page?

I need two tools or scripts in PHP.

First, I need a tool or PHP script that can fetch all the images from a given page link, so that I can store those images in my database and later show them as the link's thumbnail.

Second, I need a tool or PHP script that can fetch the title, description, and a snapshot thumbnail of the given page link.

How can I do this? Is there any tool or PHP script for it?

EDIT: I need something similar to what Facebook shows when you try to post a link on someone's wall.

Maybe this tool is what you are looking for: http://simplehtmldom.sourceforge.net/ (the Quick Start there has an example that gets all the images).

Edit: Here is a tutorial if you want one: http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/

Another way to do it is to use the DOM classes included in PHP (docs: http://fr2.php.net/manual/en/book.dom.php ). To fetch all the meta tags of your page, you can do:

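<?php
$doc = new DOMDocument();
// loadHTML() expects a string of markup; loadHTMLFile() takes a file path or URL.
// Prefix the call with @ if the page's markup produces parser warnings.
$doc->loadHTMLFile('your_page.php');

// All <meta> elements in the document
$metas = $doc->getElementsByTagName('meta');

foreach ($metas as $meta)
{
    // To get a specific attribute, e.g. 'name' or 'content'
    echo $meta->getAttribute('your_attribute');
}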
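
The same DOM approach also covers the first half of the question (collecting every image on a page). Here is a minimal sketch, not a definitive implementation: extractImageUrls() is a hypothetical helper name, and the sample markup is made up so the snippet runs without network access. For a live page you would pass it the result of file_get_contents($url).

```php
<?php
// Hypothetical helper: returns the src of every <img> found in an HTML string.
function extractImageUrls(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; buffer parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {          // skip <img> tags without a src
            $urls[] = $src;
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

// Made-up sample document; the third <img> has no src and is ignored.
$sample = '<html><body><img src="/a.png"><img src="http://example.com/b.jpg"><img></body></html>';
print_r(extractImageUrls($sample));
```

Note that relative paths such as /a.png are returned as written; you would still have to resolve them against the page URL before downloading and storing the images.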

You could go with the current trend and use Node: "Scrape web pages in real time with Node.js".

Though if you're on Windows and Unix scares you, it may be more trouble than it's worth.

Justin

++ for SimpleHtmlDom

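$html = file_get_html('http://www.example.com/'); // needs simple_html_dom.php included
$ret = $html->find('a, img'); // all <a> and <img> elements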

To get the title and so on, you can use the same approach; refer to the manual:

http://simplehtmldom.sourceforge.net/manual.htm

Facebook doesn't display a screenshot of the website, but an image that it thinks is relevant. It also follows the Open Graph protocol.

For example, if your website has the meta tag

<meta property="og:image" content="http://ia.media-imdb.com/rock.jpg"/>

then Facebook will use that image as the thumbnail for the wall post or status.
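
To make that concrete, here is a rough sketch of reading the title, description and og:image from a page with PHP's built-in DOM classes. extractLinkPreview() is a hypothetical helper name, and the fallback order (Open Graph tags first, then the plain <title> and description meta tag) is my assumption, not Facebook's actual logic. The sample markup is made up so the snippet runs offline.

```php
<?php
// Hypothetical helper: collects title, description and thumbnail URL from an HTML string.
function extractLinkPreview(string $html): array
{
    $doc = new DOMDocument();
    // Buffer parser warnings caused by invalid markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Index every <meta> by its property (Open Graph) or name (classic) attribute.
    $meta = [];
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        $key = $tag->getAttribute('property') ?: $tag->getAttribute('name');
        if ($key !== '') {
            $meta[strtolower($key)] = $tag->getAttribute('content');
        }
    }

    $titles = $doc->getElementsByTagName('title');
    $pageTitle = $titles->length > 0 ? trim($titles->item(0)->textContent) : null;

    // Assumed fallback order: Open Graph value first, then the plain HTML equivalent.
    return [
        'title'       => $meta['og:title'] ?? $pageTitle,
        'description' => $meta['og:description'] ?? $meta['description'] ?? null,
        'image'       => $meta['og:image'] ?? null,
    ];
}

// Made-up sample page reusing the og:image tag from above.
$sample = '<html><head><title>Example</title>'
    . '<meta name="description" content="A sample page">'
    . '<meta property="og:image" content="http://ia.media-imdb.com/rock.jpg"/>'
    . '</head><body></body></html>';
print_r(extractLinkPreview($sample));
```

If a page has no og:image, 'image' comes back as null; in that case you could fall back to the images found in the page body.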


 