
How to Ignore Whitespaces using preg_match()

I have a string that looks like:

">ANY CONTENT</span>(<a id="show

I need to fetch ANY CONTENT. However, there is whitespace between

</span> and (<a id="show

Here is my preg_match:

$success = preg_match('#">(.*?)</span>\s*\(<a id="show#s', $basicPage, $content);

\s* represents the whitespace. I get an empty array!

Any idea how to fetch ANY CONTENT?

Use a real HTML parser. Regular expressions are not really suitable for the job. See this answer for more detail.

You can use DOMDocument::loadHTML() to parse the page into a structured DOM object that you can then query, as in this very basic example (you need to add error checking, though):

$dom = new DOMDocument;
$dom->loadHTML($data);
$span = $dom->getElementsByTagName('span');
$content = $span->item(0)->textContent;
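
The error checking mentioned above could look roughly like the following. This is only a sketch: the sample markup in $data is made up for illustration, and picking the first span is an assumption that will not hold for every page.

```php
<?php
// Hypothetical input; the real page markup is not shown in the question.
$data = '<div><span class="label">ANY CONTENT</span>   (<a id="show" href="#">more</a>)</div>';

// Collect libxml parse errors internally instead of emitting PHP warnings,
// since real-world HTML is often malformed.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$loaded = $dom->loadHTML($data);

// Discard the collected errors and restore the previous error handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$content = null;
if ($loaded) {
    $spans = $dom->getElementsByTagName('span');
    // Only read the first span if the document actually contains one.
    if ($spans->length > 0) {
        $content = $spans->item(0)->textContent;
    }
}

var_dump($content);
```

If the page contains several span elements, the index passed to item() (or an XPath query) would have to be chosen to match the real markup.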

I just had to define the "> anchor properly. There were too many "> sequences in the page, so the pattern could not tell which one I meant. As a result, it returned everything from the first "> until it hit the (.

Solution:

.">

Sample:

$success = preg_match('#\.">(.*?)</span>\s*\(<a id="show#s', $basicPage, $content);
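
To illustrate the difference, here is a small sketch that runs both patterns against a made-up fragment. The contents of $basicPage are an assumption (the real page is not shown), chosen so that an earlier "> appears before the one that follows a dot.

```php
<?php
// Hypothetical fragment: an earlier "> precedes the span we want, and the
// wanted "> follows a literal dot. A newline and spaces separate </span> and (.
$basicPage = '<a href="x">link</a> <span title="Details.">ANY CONTENT</span>'
    . "\n   "
    . '(<a id="show">more</a>)';

// Original pattern: matching starts at the first "> found in the page.
preg_match('#">(.*?)</span>\s*\(<a id="show#s', $basicPage, $loose);

// Stricter pattern: only a "> preceded by a literal dot can start the match.
preg_match('#\.">(.*?)</span>\s*\(<a id="show#s', $basicPage, $tight);

var_dump($loose[1]); // starts at "link</a>" and runs through "ANY CONTENT"
var_dump($tight[1]); // just "ANY CONTENT"
```

In both cases \s* copes with the whitespace between </span> and (. Only the starting anchor changes what ends up in the capture group.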
