Regex: how to stop at the first occurrence of a character

I am trying to extract the src value from a tag. So far I can only extract the string between the start of the src value and the final quotation mark in the whole string.

String:

<img  border="0"  src="http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif" width="89" height="31" alt="">

E.g. in PHP:

preg_match('/src=\"(.*)\"/', $row->find('a img',0), $matches);
if($matches){
   echo $matches[0];
}

prints out src="http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif" width="89" height="31" alt=""

but what I really want printed is... src="http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif"

or, if possible, just... http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif

What should I be adding to the regex? Thanks

You were actually very close:

Yours:        preg_match('/src=\"(.*)\"/',  $row->find('a img',0), $matches);
Correct one:  preg_match('/src=\"(.*?)\"/', $row->find('a img',0), $matches);

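Adding ? makes the .* quantifier lazy: it matches as few characters as possible, so it stops at the first closing double quote instead of consuming as much as it can. Without the lazy modifier, .* is greedy and runs all the way to the last double quote " in the string, which is the one closing alt="". Note that $matches[0] holds the whole match including src="..."; the URL on its own is in $matches[1].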
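To make the difference concrete, here is a minimal sketch that runs both patterns against the tag from the question. The variable names $html, $greedy and $lazy are only illustrative, and the tag is passed as a plain string rather than through $row->find().

```php
<?php
// The <img> tag from the question, used as sample input.
$html = '<img  border="0"  src="http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif" width="89" height="31" alt="">';

// Greedy: (.*) grabs as much as possible, then backtracks to the LAST double quote.
preg_match('/src="(.*)"/', $html, $greedy);

// Lazy: (.*?) grabs as little as possible, so it stops at the FIRST closing double quote.
preg_match('/src="(.*?)"/', $html, $lazy);

echo $greedy[1], PHP_EOL; // URL plus the trailing width, height and alt attributes
echo $lazy[1], PHP_EOL;   // only the URL
```

The backslashes before the double quotes in the original patterns are harmless but unnecessary, since a double quote has no special meaning in a regex or inside a single-quoted PHP string.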

With a regex, match everything up to the next double quote:

preg_match('/src="([^"]+)"/', $row->find('a img',0), $matches);
echo $matches[1];
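The negated character class [^"]+ can never run past a double quote, so it stops at the first closing quote without needing a lazy quantifier. As a rough sketch, it could be wrapped in a small helper; extractSrc is a hypothetical name, not part of any library, and it assumes the attribute value is double-quoted.

```php
<?php
// Hypothetical helper: returns the src value of the first match, or null if none is found.
// Assumes the attribute value is wrapped in double quotes.
function extractSrc(string $html): ?string
{
    // [^"]+ matches one or more characters that are not a double quote.
    if (preg_match('/src="([^"]+)"/', $html, $matches) === 1) {
        return $matches[1]; // group 1 is the URL; $matches[0] would include src="..."
    }
    return null;
}

$html = '<img  border="0"  src="http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif" width="89" height="31" alt="">';
$src = extractSrc($html);
echo $src, PHP_EOL;
```

Returning null on a failed match avoids reading $matches[1] when nothing was captured.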

If I'm right, you are working with the simple_html_dom_parser library. If so, you can skip the regex and read the attribute directly:

$row->find('a img',0)->src
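If that library is not available, the same idea of asking a parser for the attribute can be sketched with PHP's built-in DOMDocument. This is a swapped-in alternative, not the simple_html_dom API, and it assumes the DOM extension is enabled; the variable names are illustrative.

```php
<?php
// The <img> tag from the question, used as sample input.
$html = '<img  border="0"  src="http://i.bookfinder.com/about/booksellers/logo_borderless/amazon_uk.gif" width="89" height="31" alt="">';

// Collect parser warnings internally instead of printing them for imperfect HTML.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

libxml_clear_errors();
libxml_use_internal_errors($previous);

// Take the first <img> element, if there is one, and read its src attribute.
$img = $doc->getElementsByTagName('img')->item(0);
$src = $img !== null ? $img->getAttribute('src') : null;

echo $src, PHP_EOL;
```

A parser also copes with single-quoted or unquoted attribute values, which the regex versions do not handle.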

Try this; it should do what you need:

/src=\"[^\"]+\"/

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please credit the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 