PHP regex to match some characters and get the end of the string

Question

The title is a bit unclear, but I couldn't find a better way to describe my problem. I am trying to get some pictures from Reddit, and I ran into problems when I tried to get the URL of each image.

$url = 'http://www.reddit.com/r/pics';
$str = file_get_contents($url);

This is what I currently have. To get the part of the page source that holds the image URL, I need to find this part of the HTML:

`<a class="thumbnail may-blank " href="http://i.imgur.com/K4q9i5c.jpg">`

While trying to figure out how to get the href of every such link on the page, I could only think of regex. Finding the part

<a class="thumbnail may-blank "

and then the following > sign would give me the whole tag, from which I could eventually get the URL of the picture.

I have tried repeatedly to find a regex that matches this, but I couldn't get it to work. Maybe someone here can help me, or has a better solution.

It would be highly appreciated. Thanks.

Answer 1

You shouldn't use regex to parse HTML; it's really a bad choice.
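For comparison, a parser-based approach might look like the sketch below. It assumes PHP's bundled DOM extension is enabled and uses a small hard-coded HTML string in place of the fetched page; the second link and the variable names are made up for illustration.

```php
<?php
// Sample markup standing in for the fetched page (an assumption for illustration).
$html = '<div>'
      . '<a class="thumbnail may-blank " href="http://i.imgur.com/K4q9i5c.jpg">pic</a>'
      . '<a class="title" href="http://example.com/other">other</a>'
      . '</div>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select every <a> whose class attribute contains the token "thumbnail".
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " thumbnail ")]';

$urls = [];
foreach ($xpath->query($query) as $anchor) {
    $urls[] = $anchor->getAttribute('href');
}

print_r($urls);
```

The XPath test on the class attribute is what replaces the `<a class="thumbnail may-blank "` text search, so attribute order, quoting style, and extra whitespace stop mattering.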
But if you really have to, something like this might work.
(untested)

 #  '/(?s)<a\s+class\s*=\s*(["\'])(?:(?!\1|[<>]).)*\1\s+href\s*=\s*(["\'])((?:(?!\2|[<>]).)*)\2/'

 (?s)                               # Dot-All
 <a \s+ class \s* = \s*             # class
 ( ["'] )                           # (1), delimiter
 (?:
      (?! \1 | [<>] )
      . 
 )*
 \1                                 # delimiter 
 \s+ 
                                    # [^<>]* ( add if necessary )
 href \s* = \s*                     # href
 ( ["'] )                           # (2), delimiter
 (                                  # (3 start), Url
      (?:
           (?! \2 | [<>] )
           . 
      )*
 )                                  # (3 end)
 \2                                 # delimiter
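A possible way to apply the pattern is with preg_match_all, reading the URLs from capture group 3. This is only a sketch: the sample HTML is hard-coded, and the second and third links are invented to show the single-quote case and a tag without a class. Note that the pattern accepts any class value, so it would also pick up links that are not thumbnails.

```php
<?php
// Hard-coded sample input (an assumption; the real page would come from file_get_contents).
$html = <<<'HTML'
<a class="thumbnail may-blank " href="http://i.imgur.com/K4q9i5c.jpg">
<a class='thumbnail' href='http://example.com/b.jpg'>
<a href="http://example.com/no-class">
HTML;

// The one-line pattern from above. Group 1 and 2 are the quote delimiters,
// group 3 is the href value.
$pattern = '/(?s)<a\s+class\s*=\s*(["\'])(?:(?!\1|[<>]).)*\1\s+href\s*=\s*(["\'])((?:(?!\2|[<>]).)*)\2/';

preg_match_all($pattern, $html, $matches);

// $matches[3] holds one URL per matched tag; the third tag has no class
// attribute before href, so the pattern skips it.
$urls = $matches[3];
print_r($urls);
```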

Answer 2

If you only need the href from the a tag, try:

'<a.*href=\"(.*)\".*$'
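As written, the pattern has no regex delimiters for PHP's preg functions, and the greedy `.*` can run past the end of a tag when several links share a line. The sketch below is an assumed adaptation rather than the original pattern: it adds `/` delimiters and swaps `.*` for negated character classes. The sample HTML and variable names are made up for illustration.

```php
<?php
// Two links on one line, to show that each match stays inside its own tag.
$html = '<a class="thumbnail may-blank " href="http://i.imgur.com/K4q9i5c.jpg">'
      . '<a class="title" href="http://example.com/page">';

// Assumed adaptation: "/" delimiters added, [^>]* keeps the match inside one tag,
// and [^"]* stops the capture at the closing double quote.
$pattern = '/<a\b[^>]*\bhref="([^"]*)"/i';

preg_match_all($pattern, $html, $matches);

// Capture group 1 holds the href values in document order.
$hrefs = $matches[1];
print_r($hrefs);
```

This variant only handles double-quoted href values, which matches the markup shown in the question.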
