
Regular Expression in PHP: How to create a pattern for tables in HTML

I am using the latest PHP. I want to parse an HTML page to get data.

HTML:

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

PHP Code:

<?php

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.test.com/mypage.html');  
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);


$pattern = '/<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="1" cellpadding="0" cellspacing="0">[^~]</table>/';
preg_match_all($pattern, $result, $matches);
print_r($matches);

?>

I am not able to get all the tables. When I use the simple pattern $pattern='/table/'; it gives me the expected result. How do I create a pattern that captures each whole table in a single array element?

Parsing HTML with regex is painful at best, because HTML is not a regular language. I suggest you use Simple HTML DOM instead.

You can't reliably parse [X]HTML with regex, but you can try:

$pattern = '#<table(?:.*?)>(.*?)</table>#s';

This won't work if there are nested tables.
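
As a minimal sketch of this regex approach, assuming the tables are not nested: the helper name extract_tables_regex and the sample markup in $html are illustrative only and stand in for the page fetched with cURL. The s modifier lets . match newlines, which matters because each table spans several lines, and the i modifier tolerates upper-case tags.

```php
<?php
// Hypothetical helper: returns the full markup of every non-nested <table>.
function extract_tables_regex(string $html): array
{
    // \b stops tags such as <tablefoo> from matching; [^>]* skips attributes;
    // .*? is lazy, so each match ends at the first closing </table>.
    $pattern = '#<table\b[^>]*>.*?</table>#si';
    preg_match_all($pattern, $html, $matches);
    return $matches[0]; // index 0 holds the whole-pattern matches
}

// Sample input (an assumption) standing in for the fetched page.
$html = <<<HTML
<table class="margin15" border="0">
<tr><td>first</td></tr>
</table>
<p>other markup</p>
<table class="margin15" border="0">
<tr><td>second</td></tr>
</table>
HTML;

$tables = extract_tables_regex($html);
print_r($tables);
```

Each element of $tables holds one complete table, opening tag through closing tag. With nested tables the lazy match would stop at the inner closing tag, which is the limitation noted above.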

Please have a look at this answer. It describes how to use an HTML parser in PHP, which is what you want to do.

Or just use the DOM classes that PHP offers. I think they can do the same as Simple HTML DOM but much faster (don't get me wrong, I really like Simple HTML DOM, but it's slow even for files with a few dozen lines).
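
A rough sketch of the built-in DOM approach might look like the following. It uses DOMDocument and DOMXPath from PHP's DOM extension; the helper name extract_tables_dom and the sample $html are assumptions made for illustration, and the XPath expression matches the class attribute exactly as it appears in the question's markup.

```php
<?php
// Hypothetical helper: returns the markup of each <table class="margin15">.
function extract_tables_dom(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $tables = [];
    // Exact match on the class attribute value.
    foreach ($xpath->query('//table[@class="margin15"]') as $node) {
        $tables[] = $doc->saveHTML($node); // serialize only this node
    }
    return $tables;
}

// Sample input (an assumption) standing in for the fetched page.
$html = '<html><body>'
      . '<table class="margin15"><tr><td>first</td></tr></table>'
      . '<table class="margin15"><tr><td>second</td></tr></table>'
      . '</body></html>';

$tables = extract_tables_dom($html);
print_r($tables);
```

Unlike the regex, this handles nested tables correctly, although a nested table with the same class would also be returned as its own entry.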


 