
PHP: find a substring using regex

I have the source code of a web page that I want to use in my project. I want to use an image link from this code, so I want to extract that link using regex in PHP.

Here it is:

img src="http://imagelinkhere.com" class="image"

There is only one line like this. My idea is to get the string between the characters

="

and

" class="image"

How can I do that with regex? Thank you very much.

Don't use regex for HTML. Try DOMDocument instead:

$html = '<html><img src="http://imagelinkhere.com" class="image" /></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$img = $dom->getElementsByTagName("img");

foreach ( $img as $v ) {
    if ($v->getAttribute("class") == "image")
        print($v->getAttribute("src"));
}

Output

http://imagelinkhere.com

Using

.*="(.*)?" .*

with preg_replace leaves you with only the URL, which is captured in the first regex group (\\1).

The complete code would look like this:

$str = 'img src="http://imagelinkhere.com" class="image"';
// The pattern needs delimiters (here /), otherwise preg_replace fails.
$str = preg_replace('/.*="(.*)?" .*/', '$1', $str);
echo $str;

-->

http://imagelinkhere.com
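
The logic described in the question (take the text between =" and " class="image") can also be written with preg_match and a capture group. This is only a sketch: the function name extractImageUrl is hypothetical, and it assumes the src attribute comes directly before class="image", as in the question's line.

```php
<?php
// Hypothetical helper (not from the original answers): returns the text
// between src=" and " class="image", or null when nothing matches.
function extractImageUrl(string $source): ?string
{
    // [^"]* cannot cross a double quote, so the capture stays inside
    // a single attribute value.
    if (preg_match('/src="([^"]*)"\s+class="image"/', $source, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

$str = 'img src="http://imagelinkhere.com" class="image"';
echo extractImageUrl($str), PHP_EOL;
```

Unlike the preg_replace version, this leaves the input string untouched and makes the "no match" case explicit.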

Edit: Or just follow Baba's advice and use a DOM parser. I'll remember that regex will give you headaches when parsing HTML with it.

// ~ is used as the delimiter so the slashes in http:// need no escaping.
preg_match('~(http://.*?)"~', $text, $matches);
var_dump($matches);

The link would be in $matches[1].

There are several ways to do this:

1. You can use Simple HTML DOM Parser, which I prefer for simple HTML.

2. You can also use preg_match:

$foo = '<img class="foo bar test" title="test image" src="http://example.com/img/image.jpg" alt="test image" class="image" />';
$array = array();
preg_match( '/src="([^"]*)"/i', $foo, $array ) ;

See this thread.

I can hear the sound of hooves, so I have gone with DOM parsing instead of regex.
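
The preg_match call above fills the array but does not show how to read the result. A minimal sketch, assuming the same pattern and a shortened sample tag; the variable names $matches and $url are illustrative:

```php
<?php
$foo = '<img title="test image" src="http://example.com/img/image.jpg" alt="test image" class="image" />';
$matches = array();
$url = null;

// preg_match returns 1 when the pattern matched, 0 when it did not.
if (preg_match('/src="([^"]*)"/i', $foo, $matches) === 1) {
    // $matches[0] is the whole match, $matches[1] the first capture group.
    $url = $matches[1];
}

echo $url, PHP_EOL;
```

Checking the return value first avoids reading an index that does not exist when the page contains no src attribute.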

$dom = new DOMDocument();
$dom->loadHTMLFile('path/to/your/file.html');
foreach ($dom->getElementsByTagName('img') as $img)
{
    if ($img->hasAttribute('class') && $img->getAttribute('class') == 'image')
    {
        echo $img->getAttribute('src');
    }
}

This will echo only the src attribute of an img tag with class="image".

Try using preg_match_all, like this:
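
The exact string comparison above only matches when the class attribute is exactly image, and real pages often trigger libxml warnings while loading. The following sketch addresses both under some assumptions: the helper name imageSourcesByClass is hypothetical, and splitting the class attribute on whitespace is one possible way to handle values such as "image large".

```php
<?php
// Hypothetical helper, not part of the original answer.
function imageSourcesByClass(string $html, string $class): array
{
    // Collect parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        // Split the class attribute on whitespace so "image large" still counts.
        $classes = preg_split('/\s+/', trim($img->getAttribute('class')));
        if (in_array($class, $classes, true)) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}

$html = '<html><body><img src="http://imagelinkhere.com" class="image large">'
      . '<img src="other.png" class="thumb"></body></html>';
print_r(imageSourcesByClass($html, 'image'));
```

Returning an array rather than echoing keeps the helper usable whether the page contains one matching image or several.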

preg_match_all('/img src="([^"]*)"/', $source, $images);

That should put all the URLs of the images in the $images variable (the captured URLs themselves are in $images[1]). What the regex does is find every img src occurrence in the code and match the part between the quotes.
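
To make the shape of the result concrete, here is a small sketch with a made-up $source string containing two images. With the default flag (PREG_PATTERN_ORDER), $images[0] holds the full matches and $images[1] holds the captured URLs.

```php
<?php
// Sample input invented for illustration.
$source = '<img src="a.png" class="image"><p>text</p><img src="b.png" alt="">';

// preg_match_all returns the number of full matches found.
$count = preg_match_all('/img src="([^"]*)"/', $source, $images);

echo $count, PHP_EOL;   // number of img src occurrences
print_r($images[1]);    // the captured URLs only
```

Note that the pattern expects src to follow img directly, so a tag such as <img class="image" src="..."> would not be picked up by it.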
