
PHP preg_match: get an image from a URL

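I am trying to parse a website and get the name or URL of an image.

Example URL: http://www.theworkingmanstore.com/georgia-gr14-infants-romeo.aspx

One <td> contains six or more images, and I only want the first img src in that <td>.

I am sure this can be done with a DOM parser, but I have no experience with one.

Any assistance would be greatly appreciated.

Thanks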

$html = file_get_contents($url);
$reg = '/img src=["\']?([^"\' ]*)["\' ]/';
preg_match_all($reg, $html, $m);
$arr = array_map(function ($v) {
    return trim(str_replace(array('img src=', 'http://www.theworkingmanstore.com'), '', $v), '"');
}, $m[0]);
print_r($arr);

Output: this is what the regular expression produces

Array
(
    [0] => /images/logo2.png
    [1] => /images/mod_head_category_lt.gif
    [2] => '/images/products/display/GR14_EXTRALARGE.jpg'
    [3] => '/images/products/thumb/GR14_EXTRALARGE.jpg'
    [4] => '/images/products/thumb/GR14_8_EXTRALARGE.jpg'
    [5] => '/images/products/thumb/GR14_5_EXTRALARGE.jpg'
    [6] => '/images/products/thumb/GR14_3_EXTRALARGE.jpg'
    [7] => '/images/products/thumb/GR14_42_EXTRALARGE.jpg'
    [8] => '/images/products/thumb/GR14_2_EXTRALARGE.jpg'
    [9] => /images/freeshipping.jpg
    [10] => /images/facebook_32.png
    [11] => images/twitter_32.png
    [12] => images/googleplus_32.png
    [13] => images/pinterest_32.png
    [14] => /images/payments.gif
    [15] => /images/brands/the-working-man.jpg
)

I tried the suggested DOM parser approach:

$html = file_get_contents($url);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
echo $xpath->evaluate(
  'string(//td/a[@id = "Zoomer"]/descendant::img[1]/@src)'
);

Error output: Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Tag nav invalid in Entity

In the DOM everything is a node, including the img element and its src attribute. XPath lets you fetch lists of nodes from the DOM.

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->evaluate('//img/@src') as $src) {
  echo $src->value, "\n";
}

Output:

http://www.theworkingmanstore.com/images/products/display/GR14_EXTRALARGE.jpg
http://www.theworkingmanstore.com/images/products/detail/GR14_EXTRALARGE.jpg
/images/products/thumb/GR14_EXTRALARGE.jpg
/images/products/thumb/GR14_8_EXTRALARGE.jpg
/images/products/thumb/GR14_5_EXTRALARGE.jpg
/images/products/thumb/GR14_3_EXTRALARGE.jpg
/images/products/thumb/GR14_42_EXTRALARGE.jpg
/images/products/thumb/GR14_2_EXTRALARGE.jpg
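
The list above mixes absolute URLs with root-relative paths. If full URLs are wanted for every image, a small helper can normalize them. The following is only a sketch: the function name toAbsoluteUrl and the $base value are assumptions made for illustration, and it treats every non-absolute value as relative to the site root, which fits the example page but is not a general URL resolver.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes $base is the site root without a trailing slash.
function toAbsoluteUrl(string $src, string $base): string
{
    // Already absolute: leave unchanged.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    // Otherwise join base and path with exactly one slash.
    return $base . '/' . ltrim($src, '/');
}

$base = 'http://www.theworkingmanstore.com';
$sources = [
    'http://www.theworkingmanstore.com/images/products/display/GR14_EXTRALARGE.jpg',
    '/images/products/thumb/GR14_EXTRALARGE.jpg',
];

foreach ($sources as $src) {
    echo toAbsoluteUrl($src, $base), "\n";
}
```

In the XPath loop, the same call could wrap $src->value before it is echoed.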

XPath allows quite complex conditions. The following example outputs the src attribute of the first img inside each td.

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

foreach ($xpath->evaluate('//td/descendant::img[1]/@src') as $src) {
  echo $src->value, "\n";
}

Output:

http://www.theworkingmanstore.com/images/products/display/GR14_EXTRALARGE.jpg

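The HTML in the question contains only a single td and, more importantly, the img sits inside an a element that has an id attribute. An id must be unique within a document, so the expression matches at most one a element. That makes it possible to cast the node list to a string directly in XPath, which returns the src of the first matching node as a string.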

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
echo $xpath->evaluate(
  'string(//td/a[@id = "Zoomer"]/descendant::img[1]/@src)'
);

Output:

http://www.theworkingmanstore.com/images/products/display/GR14_EXTRALARGE.jpg
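
The warning quoted in the question comes from libxml, which reports markup it considers invalid while still building the document tree. One common way to keep such messages out of the output is libxml_use_internal_errors(). The sketch below uses a made-up HTML string in place of the real page; the assumption that the nav element triggers the warning is taken from the message text, not from the actual page source.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$html = '<html><body><nav>Menu</nav><table><tr><td>'
      . '<a id="Zoomer" href="#"><img src="/images/products/display/GR14_EXTRALARGE.jpg"></a>'
      . '<img src="/images/products/thumb/GR14_EXTRALARGE.jpg">'
      . '</td></tr></table></body></html>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Discard the collected errors and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$src = $xpath->evaluate('string(//td/a[@id = "Zoomer"]/descendant::img[1]/@src)');
echo $src, "\n";
```

Calling libxml_get_errors() before libxml_clear_errors() would return the collected messages if they need to be inspected or logged.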

You can try this regular expression.

$html = 'Your HTML';
$reg = '/img src=["\']?([^"\' ]*)["\' ]/';
preg_match_all($reg, $html, $m);
$arr = array_map(function($v){
    return trim(str_replace(array('img src=', 'http://www.theworkingmanstore.com'), '', $v), '"');
}, $m[0]);

print '<pre>';
print_r($arr);
print '</pre>';

Output:

Array
(
    [0] => /images/products/display/GR14_EXTRALARGE.jpg
    [1] => /images/products/detail/GR14_EXTRALARGE.jpg
    [2] => /images/products/thumb/GR14_EXTRALARGE.jpg
    [3] => /images/products/thumb/GR14_8_EXTRALARGE.jpg
    [4] => /images/products/thumb/GR14_5_EXTRALARGE.jpg
    [5] => /images/products/thumb/GR14_3_EXTRALARGE.jpg
    [6] => /images/products/thumb/GR14_42_EXTRALARGE.jpg
    [7] => /images/products/thumb/GR14_2_EXTRALARGE.jpg
)
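
The callback above works on $m[0], the full match, which is why it has to strip the "img src=" prefix and trim the quotes afterwards. Because trim() is only given the double quote, single-quoted values keep their quotes, as the question's output shows. A possible variation, sketched here with made-up markup, is to read the capture group $m[1] instead, since it already excludes the prefix and both kinds of quotes.

```php
<?php
// Hypothetical sample markup with double-quoted, single-quoted and unquoted src values.
$html = '<img src="http://www.theworkingmanstore.com/images/logo2.png">'
      . "<img src='/images/products/display/GR14_EXTRALARGE.jpg'>"
      . '<img src=images/twitter_32.png >';

// Same pattern as above; group 1 captures the value without its quotes.
$reg = '/img src=["\']?([^"\' ]*)["\' ]/';
preg_match_all($reg, $html, $m);

// $m[1] holds the captured values, so only the host prefix needs removing.
$arr = str_replace('http://www.theworkingmanstore.com', '', $m[1]);
print_r($arr);
```

The pattern still assumes that src directly follows img, so tags that place other attributes first are not matched; the DOM approach does not have that limitation.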
