
How to download image from multiple url using PHP?

I am new to PHP and also to SO. My task is to download images from multiple websites and place them in a folder, with the product number as the name of each image.

I have a CSV file, Book1.csv, that contains all the product numbers in column A.

[Image: snippet of the CSV file]

Each URL contains the product number. An example of a page from which I am trying to download the image: https://www.rockwellautomation.com/en-us/products/details.20G1ABC460JN0NNNNN.html

Here is a snippet of my updated code:

<?php
$file = fopen("Book1.csv","r");

// Remote image URL
while (($result = fgetcsv($file)) !== false)
{
  $fileName = rawurlencode($result[0]).".html";
  $link = "https://www.rockwellautomation.com/en-us/products/details.".$fileName;
  echo $link."<br>";
  file_put_contents($fileName, file_get_contents($link));
}

?>

Update:

Now it downloads the page at each URL along with its images. However, I just want to download one image which is the product image on every website and place it in a folder.

Any help is appreciated, thank you.

If the aim is to download only the product image rather than the HTML, as per the update in the question ("I just want to download one image which is the product image on every website and place it in a folder"), then an XPath query that specifically targets that image is probably the way to proceed.
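To illustrate the idea in isolation, here is a minimal sketch that runs an XPath query against an inline HTML string instead of a live page. The markup and the example.com image URL are made up for illustration; the class name `carousel__currentImage` is the one used in the script in this answer, and the real page may of course change it.

```php
<?php
# Hypothetical, trimmed-down markup standing in for a product page
$html = '<html><body>'
      . '<img class="logo" src="/img/logo.png">'
      . '<img class="carousel__currentImage" src="https://example.com/media/20G1ABC460JN0NNNNN.png">'
      . '</body></html>';

# real-world HTML is rarely valid, so collect parser warnings instead of printing them
libxml_use_internal_errors( true );
$dom = new DOMDocument;
$dom->loadHTML( $html );

# select only <img> elements whose class attribute is exactly the carousel class
$xp    = new DOMXPath( $dom );
$nodes = $xp->query( '//img[@class="carousel__currentImage"]' );

# take the src of the first match, or null when nothing matched
$src = ( $nodes && $nodes->length > 0 ) ? $nodes->item(0)->getAttribute('src') : null;

echo $src, PHP_EOL;
libxml_clear_errors();
```

Note that `@class="..."` matches the whole attribute value, so an element carrying additional classes would need a `contains()` test instead.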

The following has been tested with the two product codes shown in the screenshot of the CSV. A more explicit XPath pattern might make this slightly faster, but that is hard to quantify currently.

As you mention that the CSV has a single column, each row is a distinct product. Using file() to open the CSV file therefore yields an array of lines, which makes processing very simple with a foreach loop.
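The reading step on its own might look like the sketch below. It writes a small temporary file so that it runs stand-alone; the second product code (`EXAMPLE-CODE-2`) is invented, and the Windows-style line endings are an assumption about how a spreadsheet export could look.

```php
<?php
# Hypothetical sample file so the sketch runs on its own
$file = tempnam( sys_get_temp_dir(), 'csv' );
file_put_contents( $file, "20G1ABC460JN0NNNNN\r\n\r\n  EXAMPLE-CODE-2  \r\n" );

# read every line into an array, dropping the trailing "\n" and truly empty lines
$lines = file( $file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES );

# trim stray whitespace (such as "\r" from Windows line endings) and drop blank entries
$codes = array_values( array_filter( array_map( 'trim', $lines ), 'strlen' ) );

# one product page URL per remaining code
foreach( $codes as $code ){
    printf( "https://www.rockwellautomation.com/en-us/products/details.%s.html\n", rawurlencode( $code ) );
}

unlink( $file );
```

The explicit trim is still needed because the flags only remove the newline character, so a line holding just "\r" or spaces would otherwise count as a product.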

<?php   
    # directory in which to store images (created if it does not exist yet)
    $dir=__DIR__ . '/images/tmp';
    if( !is_dir( $dir ) ) mkdir( $dir, 0755, true );
    
    # The CSV file obvs '-)
    $file='rockwell-products.csv';
    
    # variable to track results
    $output=array();
    
    # size of all files downloaded
    $size=0;
    
    # XPath pattern to find file 
    $pttn='//img[@class="carousel__currentImage"]';
    # product url template
    $baseurl='https://www.rockwellautomation.com/en-us/products/details.%s.html';
    
    # create the DOMDocument instance & ignore errors with remote HTML
    libxml_use_internal_errors( true );
    $dom=new DOMDocument;
    $dom->validateOnParse=false;
    $dom->strictErrorChecking=false;
    $dom->recover=true;
    
    
    # open the csv file - as an array!
    $lines=file( $file );
    $count=0;
    
    # iterate through all lines in CSV file
    foreach( $lines as $code ){
        if( !empty( trim( $code ) ) ){
            
            # use the product code in the template string
            $url=sprintf( $baseurl, trim( $code ) );
            
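            # fetch the page and skip this product if the request fails
            $html=@file_get_contents( $url );
            if( $html === false || $html === '' ){
                $output[]=sprintf('<div>Failed to download page: <a href="%1$s" target="_blank">%1$s</a></div>', $url );
                continue;
            }

            # load the HTML into the DOMDocument instance
            $dom->loadHTML( $html );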
            
            # Query the dom using XPath;
            $xp=new DOMXPath( $dom );
            $col=$xp->query( $pttn );
            
            if( $col && $col->length > 0 ){
                # get the src attribute from the image
                $src=$col->item(0)->getAttribute('src');
                $filename=sprintf( '%s.%s', trim( $code ), pathinfo( basename( $src ),PATHINFO_EXTENSION ) );
                $bytes=file_put_contents( sprintf( '%s/%s', $dir, $filename ), file_get_contents( $src ) );
                
                if( $bytes > 0 ) {
                    $output[]=sprintf('<div>Found and saved image <a href="%2$s" target="_blank">%1$s</a> - %3$sKb</div>', $filename, $src, round( $bytes/1024,2) );
                    $size+=$bytes;
                    $count++;
                }else{
                    $output[]=sprintf('<div>Found image but unable to save to disk - %s</div>', $filename );
                }
            }else{
                $output[]=sprintf('<div>Failed to find content for: "%1$s" - <a href="%2$s" target="_blank">%2$s</a></div>', trim( $code ), $url );
            }
            libxml_clear_errors();
            $xp=null;
        }
    }
    
    $dom=null;
    
    
    
    printf('<div>Finished! Processed %s entries and downloaded %s files - Total: %sMb</div>', count( $lines ), $count, round($size/pow(1024,2),2) );
    foreach($output as $item)echo $item;
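
The script above takes the file extension straight from the `src` attribute and passes `src` to `file_get_contents` unchanged. That works when `src` is an absolute URL ending in the file name. If a page ever served a relative `src`, or one with a query string, something like the sketch below could be dropped in. Both helper names (`imageFilename`, `absoluteUrl`) are hypothetical, the `jpg` fallback is an arbitrary choice, and treating every non-absolute path as root-relative is a simplification.

```php
<?php
# Hypothetical helper: build "<code>.<ext>" from a product code and an image URL
function imageFilename( string $code, string $src, string $fallback = 'jpg' ): string {
    # keep only the path part, so "?w=300" or "#top" never ends up in the extension
    $path = parse_url( $src, PHP_URL_PATH );
    $ext  = is_string( $path ) ? strtolower( pathinfo( $path, PATHINFO_EXTENSION ) ) : '';
    return sprintf( '%s.%s', trim( $code ), $ext !== '' ? $ext : $fallback );
}

# Hypothetical helper: make a protocol-relative or root-relative src absolute
function absoluteUrl( string $src, string $pageUrl ): string {
    # already absolute: nothing to do
    if( preg_match( '#^https?://#i', $src ) ) return $src;

    $scheme = parse_url( $pageUrl, PHP_URL_SCHEME );
    $host   = parse_url( $pageUrl, PHP_URL_HOST );

    # "//cdn.example.com/a.png" only lacks the scheme
    if( strpos( $src, '//' ) === 0 ) return $scheme . ':' . $src;

    # simplification: everything else is treated as relative to the site root
    return $scheme . '://' . $host . '/' . ltrim( $src, '/' );
}

$page = 'https://www.rockwellautomation.com/en-us/products/details.20G1ABC460JN0NNNNN.html';
echo imageFilename( "20G1ABC460JN0NNNNN\n", 'https://example.com/media/photo.PNG?w=300' ), PHP_EOL;
echo absoluteUrl( '/media/photo.png', $page ), PHP_EOL;
```

In the loop, `$src` would be passed through `absoluteUrl( $src, $url )` before the download, and `$filename` would come from `imageFilename( $code, $src )`.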


 