Invalid HTML - Quoting Attributes
I have the following HTML:
<td width=140 style='width:105.0pt;padding:0cm 0cm 0cm 0cm'>
<p class=MsoNormal><span style='font-size:9.0pt;font-family:"Arial","sans-serif";
mso-fareast-font-family:"Times New Roman";color:#666666'>OCCUPANCY
TAX:</span></p>
</td>
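Some of the HTML attributes are not quoted, for example width=140 and class=MsoNormal.
Is there a PHP function that fixes this kind of problem? If not, what would be a smart way to clean up the HTML?
Thanks.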
I guess you could use a regular expression for this:
/\s([\w]{1,}=)((?!")[\w]{1,}(?!"))/g
\s match any white space character [\r\n\t\f ]
1st Capturing group ([\w]{1,}=)
    [\w]{1,} match a single character present in the list below
        Quantifier: {1,} Between 1 and unlimited times, as many times as possible, giving back as needed [greedy]
        \w match any word character [a-zA-Z0-9_]
    = matches the character = literally
2nd Capturing group ((?!")[\w]{1,}(?!"))
    (?!") Negative Lookahead - Assert that it is impossible to match the regex below
        " matches the character " literally
    [\w]{1,} match a single character present in the list below
        Quantifier: {1,} Between 1 and unlimited times, as many times as possible, giving back as needed [greedy]
        \w match any word character [a-zA-Z0-9_]
    (?!") Negative Lookahead - Assert that it is impossible to match the regex below
        " matches the character " literally
g modifier: global. All matches (don't return on first match)
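To see what the two capturing groups actually pick up, here is a small sketch that runs the pattern through preg_match_all. The sample tag is made up for illustration (it is not the HTML from the question), and the already-quoted title attribute is there only to show that it gets skipped.

```php
<?php
// The pattern from the answer; $sample is a hypothetical tag used only for illustration.
$pattern = '/\s([\w]{1,}=)((?!")[\w]{1,}(?!"))/';
$sample  = '<td width=140 class=MsoNormal title="x">';

// PREG_SET_ORDER gives one array per match: [full match, group 1, group 2].
preg_match_all($pattern, $sample, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] is the attribute name including "=", $m[2] is the unquoted value.
    printf("name: %s  value: %s\n", $m[1], $m[2]);
}
```

Only width and class are reported. title="x" does not match because \w cannot match the opening double quote.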
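The replacement itself can be implemented like this (PHP's preg_replace_callback replaces every match by default, so no g modifier is needed):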
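// $str holds the HTML from the question.
echo preg_replace_callback('/\s([\w]{1,}=)((?!")[\w]{1,}(?!"))/', function ($matches) {
    // $matches[1] is the attribute name plus "=", $matches[2] is the unquoted value.
    return ' ' . $matches[1] . '"' . $matches[2] . '"';
}, $str);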
which results in:
<td width="140" style='width:105.0pt;padding:0cm 0cm 0cm 0cm'>
<p class="MsoNormal"><span style='font-size:9.0pt;font-family:"Arial","sans-serif";
mso-fareast-font-family:"Times New Roman";color:#666666'>OCCUPANCY
TAX:</span></p>
</td>
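Note that this is a dirty example and it can certainly be cleaned up. For instance, the pattern is not limited to tags, so text of the form name=value in the page content would get quoted as well.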
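One possible way to tidy it up, as a rough sketch rather than a complete HTML cleaner: first match whole start tags, then quote bare values only inside those tags, consuming already-quoted values in one piece so their contents are never touched. The function name quoteBareAttributes and both patterns are my own assumptions, not something PHP ships with, and malformed markup (for example an unclosed quote) is not handled.

```php
<?php
// Hypothetical helper (not a PHP built-in): quotes unquoted attribute values inside start tags only.
function quoteBareAttributes(string $html): string
{
    // Outer pass: a whole start tag. Quoted values are skipped as units,
    // so a ">" inside a quoted value does not end the tag early.
    $tagPattern = '/<[a-zA-Z](?:"[^"]*"|\'[^\']*\'|[^\'">])*>/';

    // Inner pass: whitespace, attribute name and "=", then a double-quoted,
    // single-quoted, or bare value.
    $attrPattern = '/(\s[\w:-]+=)("[^"]*"|\'[^\']*\'|[^\s"\'>]+)/';

    $result = preg_replace_callback($tagPattern, function (array $tag) use ($attrPattern) {
        return preg_replace_callback($attrPattern, function (array $m) {
            $value = $m[2];
            // Already quoted: return the attribute unchanged.
            if ($value[0] === '"' || $value[0] === "'") {
                return $m[0];
            }
            // Bare value: wrap it in double quotes, keeping the original whitespace.
            return $m[1] . '"' . $value . '"';
        }, $tag[0]);
    }, $html);

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

// Made-up sample: the text "a=b c=d" sits outside any tag and should stay as it is.
$html = "<td width=140 style='width:105.0pt'><p class=MsoNormal>a=b c=d</p></td>";
echo quoteBareAttributes($html), "\n";
```

Unlike the one-line regex, this keeps the original whitespace in front of each attribute and leaves text between tags alone. If the markup is badly broken, a real HTML parser is probably the safer route.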