
PHP: Trying to get a string from a table cell with regular expressions

I have the following web page, and I want to use a regular expression to get the text between the following tags:

<td colspan="2" align="left" valign="top" bgcolor="#FBFAF4"> ..... </td>

I am trying the following, but it returns an empty array in $matches.

preg_match_all("/<td(.*) bgcolor=\"#FBFAF4\"\>(.*)\<\/td>/",$old_filecontents,$matches);

Is this the correct pattern?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>Exotiq - Ðñïúüíôá</title> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-7"> <link href="Styles.css" rel="stylesheet" type="text/css"> <link href="stylesheets/Styles.css" rel="stylesheet" type="text/css"> <script src="scripts/PopBox.js" type="text/javascript"></script> <script type="text/javascript"> popBoxWaitImage.src = "images/spinner40.gif"; popBoxRevertImage = "images/magminus.gif"; popBoxPopImage = "images/magplus.gif"; </script> <script type="text/javascript"> AC_FL_RunContent('codebase', 'http://download.macromedia.com/pub/shockwave/ cabs/flash/swflash.cab#version=9,0,28,0', 'width','675','height','445','title','Morpork', 'src','assets/flash/morepork','loop', 'false','quality','high','pluginspage', 'http://www.adobe.com/shockwave/download/download.cgi?P1_Prod_Version=ShockwaveFlash', 'wmode','transparent','movie','assets/flash/morepork'); </script> </head> <body background="images/fonto2.jpg" topmargin="0"> <table width="948" border="0" align="center" cellpadding="0" cellspacing="0"> <tr> <td><table width="948" border="0" align="center" cellpadding="0" cellspacing="0"> <tr> <td width="24">&nbsp;</td> <td height="150" colspan="3"><object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,29,0" width="900" height="150"> <param name="movie" value="flash/top02.swf"> <param name="quality" value="high"> <param name="wmode" value="transparent"> <embed src="flash/top02.swf" quality="high" pluginspage="http://www.macromedia.com/go/getflashplayer" type="application/x-shockwave-flash" width="900" height="150"></embed></object></td> <td width="24" height="150">&nbsp;</td> </tr> <tr> <td height="31" colspan="5" valign="middle"> <div align="center"> <script src="menu/xaramenu.js"></script> <script Webstyle4 src="menu/menu_.js"></script> </div></td> </tr> <tr> <td 
width="24">&nbsp;</td> <td width="200" valign="top" background="images/GreenFasa.jpg"> <br> <table width="180" border="0" align="center" cellpadding="0" cellspacing="1"> <tr> <td height="25" class="styles"> &nbsp;<a href="MakutiUmbrela.html" class="styles">Makuti</a><br> <hr> </td> </tr> <tr> <td height="25" class="styles"> &nbsp;<a href="FunPalmUmbrela.html" class="styles">Fun Palm</a><br> <hr> </td> </tr> <tr> <td height="25" class="styles"> &nbsp;<a href="AlangUmbrela.html" class="styles">Alang-Alang</a><br> <hr> </td> </tr> <tr> <td height="25" class="styles"> &nbsp;<a href="ThatchUmbrela.html" class="styles">Thatch</a><br> <hr> </td> </tr> <tr> <td height="25" class="styles"> &nbsp;<a href="AbacaUmbrela.html" class="styles"><strong>Abaca</strong></a><br> <hr> </td> </tr> <tr> <td height="25" class="styles">&nbsp; </td> </tr> </table></td> <td colspan="2" align="left" valign="top" bgcolor="#FBFAF4"> <div align="left"> <table width="680" border="0" align="center" cellpadding="0" cellspacing="0"> <tr> <td width="600" height="40" class="titles">ÊáôáóêåõÝò - ÏìðñÝëåò - Abaca</td> <td width="50" align="right" valign="middle" class="titles"> <div align="right"><a href="/AbacaUmbrela_en.html"><img src="images/uk-flag.jpg" width="30" height="17" border="0"></a></div></td> </tr> <tr> <td colspan="2" class="body"><p>Ç ïìðñÝëá <strong>Abaca</strong> Ýñ÷åôáé ùò Üîéïò áíôéêáôáóôÜôçò ôçò ïìðñÝëáò Rattan ðïõ åðß 15 ÷ñüíéá óôïëßæåé ôéò åëëçíéêÝò ðáñáëßåò. Ôï <strong>Abaca</strong> åßíáé Ýíá öõóéêü õëéêü ðéï <strong>áíèåêôéêü</strong> êáé ðéï üìïñöï áðü ôï Rattan. 
<br> Ðáñáäßäåôáé ìå <strong>îýëéíï êïñìü åìðïôéóìïý</strong> Ö8åê.<br> <br> </p> <table width="680" border="0" cellspacing="0" cellpadding="0"> <tr> <td width="340" height="150" valign="middle"> <div align="left"><img src="images/Manufactures/Umbrelas/Abaca/AbacaUmbrela.jpg" width="328" height="500"></div></td> <td width="340" height="150" valign="bottom" class="body"> <table width="340" border="0" cellspacing="0" cellpadding="0"> <tr> <td width="170" height="130"> <div align="center"><img src="images/Manufactures/Umbrelas/Abaca/1_Abaca02_s.jpg" width="152" height="101" class="PopBoxImageSmall" onclick="Pop (this,50,'PopBoxImageLarge');" title="ÌåãÝèõíóç" pbsrc="images/Manufactures/Umbrelas/Abaca/1_Abaca02.jpg" pbCaption="Abaca - ÏìðñÝëá ðáñáëßáò" popBoxCaptionBelow="true" /></div></td> <td width="170" height="130"> <div align="center"><img src="images/Manufactures/Umbrelas/Abaca/2_Abaca03_s.jpg" width="150" height="112" class="PopBoxImageSmall" onclick="Pop (this,50,'PopBoxImageLarge');" title="ÌåãÝèõíóç" pbsrc="images/Manufactures/Umbrelas/Abaca/2_Abaca03.jpg" pbCaption="Abaca - ÏìðñÝëá ðáñáëßáò" popBoxCaptionBelow="true" /></div></td> </tr> <tr> <td width="170" height="130"> <div align="center"><img src="images/Manufactures/Umbrelas/Abaca/3_Abaca01_s.jpg" width="150" height="112" class="PopBoxImageSmall" onclick="Pop (this,50,'PopBoxImageLarge');" title="ÌåãÝèõíóç" pbsrc="images/Manufactures/Umbrelas/Abaca/3_Abaca01.jpg" pbCaption="Abaca - ÏìðñÝëá ðáñáëßáò" popBoxCaptionBelow="true" /></div></td> <td width="170" height="130"> <div align="center"></div></td> </tr> <tr> <td width="170" height="130"> <div align="center"></div></td> <td width="170" height="130"> <div align="center"></div></td> </tr> <tr> <td width="170" height="130"> <div align="center"></div></td> <td width="170" height="130"> <div align="center"></div></td> </tr> </table></td> </tr> <tr> <td width="340" height="50" valign="top"> <p align="center">&nbsp;</p></td> <td width="340" height="50" 
valign="top"> <div align="center" class="perigrafes">ÊëéêÜñåôáé ðÜíù óôéò öùôïãñáößåò ãéá ìåãÝèõíóç</div></td> </tr> <tr> <td width="340" valign="bottom"> <div align="center"> </div></td> <td width="340" valign="bottom"> <p align="center">&nbsp; </p></td> </tr> <tr> <td width="340" valign="top"> <div align="center"></div></td> <td width="340" valign="top"> <p align="center">&nbsp;</p></td> </tr> <tr> <td height="20" colspan="2" valign="top">&nbsp;</td> </tr> </table></td> </tr> </table> <font color="#FFFFFF"></font></div></td> <td width="24" height="420">&nbsp;</td> </tr> <tr> <td width="24">&nbsp;</td> <td width="200">&nbsp;</td> <td width="600">&nbsp;</td> <td width="100">&nbsp;</td> <td width="24">&nbsp;</td> </tr> </table></td> </tr> <tr> <td height="22"><table width="900" border="0" align="center" cellpadding="0" cellspacing="0" bgcolor="#007F3E"> <tr> <td height="25"> <div align="center" class="styles">All rights reserved &reg; Designed by CONTINENTAL ADVERTISING </div></td> </tr> </table></td> </tr> </table> <script type="text/javascript"> var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); </script> <script type="text/javascript"> try { var pageTracker = _gat._getTracker("UA-12742174-1"); pageTracker._trackPageview(); } catch(err) {}</script> </body> </html>

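Because the cell you are after contains HTML (in fact, another table), a conventional end-tag check will not work: a pattern that stops at the nearest closing tag gives you only the content between the start of the cell and the first </td> it finds. In addition, '.' does not match newlines by default, so unless the cell opens and closes on the same line you will get no match at all.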
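Both problems can be reproduced on a small input. The snippet below is only a sketch: `$html` is a hypothetical, shortened stand-in for the page in the question, and the pattern uses `[^>]*` in place of the question's `(.*)` for the attribute part.

```php
<?php
// Hypothetical, shortened stand-in for the page in the question.
$html = "<td colspan=\"2\" bgcolor=\"#FBFAF4\">\n"
      . "<table><tr><td>inner</td></tr></table>\n"
      . "</td>";

// Without the "s" modifier, "." does not match a newline,
// so the pattern cannot span the lines of the cell.
$countPlain = preg_match_all('/<td[^>]* bgcolor="#FBFAF4">(.*)<\/td>/', $html, $plain);

// With "s" and a lazy quantifier the pattern does match,
// but it stops at the </td> of the inner cell.
$countLazy = preg_match_all('/<td[^>]* bgcolor="#FBFAF4">(.*?)<\/td>/s', $html, $lazy);

var_dump($countPlain);  // int(0)
var_dump($lazy[1][0]);  // the captured text ends at "inner", not at the outer cell's end
```

So the `s` modifier alone fixes the empty result, but not the nesting problem.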

I would say don't use a regular expression here. Try using an XML parser instead.
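As one possible way to follow this advice, PHP's bundled DOM extension can parse the page and select the cell by its attribute. Note that this swaps in `DOMDocument::loadHTML` (an HTML parser) rather than a strict XML parser, since the page is HTML 4 and not well-formed XML. The `$html` string is a hypothetical, shortened stand-in for the real page.

```php
<?php
// Hypothetical, shortened stand-in for the page in the question.
$html = '<table><tr>'
      . '<td colspan="2" align="left" valign="top" bgcolor="#FBFAF4">'
      . '<table><tr><td class="titles">Abaca</td></tr></table>'
      . '</td>'
      . '<td width="24">&nbsp;</td>'
      . '</tr></table>';

// Collect parser warnings internally instead of printing them;
// old HTML markup tends to trigger many of them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <td> whose bgcolor attribute has the wanted value.
$xpath = new DOMXPath($doc);
$cells = $xpath->query('//td[@bgcolor="#FBFAF4"]');

$texts = [];
$inner = [];
foreach ($cells as $cell) {
    // Plain text of the cell, nested markup stripped.
    $texts[] = trim($cell->textContent);

    // Inner HTML of the cell, rebuilt from its child nodes.
    $htmlOfCell = '';
    foreach ($cell->childNodes as $child) {
        $htmlOfCell .= $doc->saveHTML($child);
    }
    $inner[] = $htmlOfCell;
}

var_dump($texts[0]);  // string(5) "Abaca"
```

Because the parser tracks nesting, the nested table inside the cell does not cut the result short.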

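If you were only extracting plain text, that would be fine. But because the content you want back is HTML that itself contains closing tags, you need a parser with some kind of DOM depth awareness, or a way to count the closing tags inside the regular expression.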
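For the second option, PCRE's recursive subpatterns can balance nested cells. This is a rough sketch only: the group names `cell` and `content` and the sample `$html` are made up for illustration, and on large or malformed pages a pattern like this may hit PCRE's backtracking limits, so a DOM parser is usually the safer choice.

```php
<?php
// Hypothetical, shortened stand-in for the page in the question.
$html = '<td bgcolor="#FBFAF4"><table><tr><td>a</td><td>b</td></tr></table></td>'
      . '<td>after</td>';

$pattern = '~'
    // Define "cell": a complete <td>...</td> element that may contain other cells.
    . '(?(DEFINE)(?<cell><td\b[^>]*>(?:(?!</?td\b).|(?&cell))*</td>))'
    // The opening tag of the wanted cell.
    . '<td\b[^>]*\bbgcolor="#FBFAF4"[^>]*>'
    // Its content: any character that does not start a td tag, or a whole nested cell.
    . '(?<content>(?:(?!</?td\b).|(?&cell))*)'
    // The closing tag that belongs to the wanted cell.
    . '</td>'
    . '~is';

$found = preg_match($pattern, $html, $m);

var_dump($found);         // int(1)
var_dump($m['content']);  // the whole nested table, both inner cells included
```

Each nested `<td>` is consumed together with its own `</td>`, so the final `</td>` in the pattern can only match the one that closes the outer cell.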
