PHP DOMDocument echoing problem
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>-->
<div class="popular-video-image">
<a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>">
<img src="/images/topvideo/1.jpg" alt=""/>
</a>
<span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span>
</div>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
foreach ($dom->getElementsByTagName('a') as $node)
{
$node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$dom->formatOutput = true;
echo $dom->saveXml($dom->documentElement);
Output:
<html>
<body>
<div class="popular-video-image">
<a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;">
<img src="/images/topvideo/1.jpg" alt=""/></a>
<span class="popular-video-artist ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Like a G6</a></span>
</div>
</body>
</html>
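I don't want the html and body tags to be added. I also don't want the <lang> tags to be replaced with &lt;lang&gt;. And the &#13; at the end of the lines is unnecessary too.
I just want to get back the same content I passed in, with only the links modified.
Sorry for my bad English!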
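You are seeing &#13; at the end of each line because the HTML has Windows-style line endings (CR+LF). To get rid of them, convert them to Unix-style line endings (LF) before feeding the markup to DOMDocument: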
$content = preg_replace('/\r\n/', "\n", $content);
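The same normalization can be sketched with str_replace, which also catches a stray lone CR that the regex above would leave in place. This is only an illustration: the sample markup and the `$normalized` variable are made up, and it assumes PHP's DOM extension is enabled.

```php
<?php
// Hypothetical sample with Windows-style (CR+LF) line endings.
$content = "<div>\r\n<a href=\"video/x/\">x</a>\r\n</div>";

// Convert CR+LF first, then any remaining lone CR, to LF.
$normalized = str_replace(array("\r\n", "\r"), "\n", $content);

$dom = new DOMDocument;
$dom->loadHTML($normalized);

// No CR characters remain in the text nodes, so the serializer
// has nothing to emit as a carriage-return character reference.
$out = $dom->saveXML($dom->documentElement);
```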
saveXML takes an optional parameter that lets you specify which node to output.
$dom->saveXml($dom->documentElement->firstChild->firstChild);
This removes the html and body tags from the output.
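Note that `firstChild->firstChild` selects only the first node inside <body>. If the fragment has several top-level nodes, one option is to serialize each child of <body> in turn. The following is a sketch under that assumption; `innerBody` is a hypothetical helper name, not part of the DOM API, and the sample markup is made up.

```php
<?php
// Hypothetical helper: serialize every child of <body>, so the
// <html> and <body> wrappers added by loadHTML() are left out.
function innerBody(DOMDocument $dom)
{
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        // Passing a node to saveXML() serializes just that subtree.
        $out .= $dom->saveXML($child);
    }
    return $out;
}

$dom = new DOMDocument;
$dom->loadHTML('<div><a href="video/x/">x</a></div><p>tail</p>');
foreach ($dom->getElementsByTagName('a') as $node) {
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$result = innerBody($dom);
```

Looking up <body> by tag name, rather than by position, avoids depending on where the parser places leading comments or a doctype.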
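I guess the <html> and <body> tags are being added because you are using loadHTML. Try using loadXML instead.
As for <lang>, it has to be escaped, because otherwise the resulting XML would be invalid. If this causes problems for you, you should change your approach slightly and work with it rather than against it.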
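It may help to see that the escaping happens only on serialization: the DOM itself still holds the raw attribute value. A small sketch with made-up sample markup follows. Decoding the serialized string with htmlspecialchars_decode() is shown as one possible workaround; it assumes the output goes to a template engine that expects literal <lang ...> markers rather than to a strict XML consumer.

```php
<?php
$dom = new DOMDocument;
$dom->loadHTML('<a href="video/x/" title="<lang video_go_to=X>">x</a>');
$a = $dom->getElementsByTagName('a')->item(0);

// The in-memory attribute value is unescaped.
$title = $a->getAttribute('title');

// Serializing escapes "<" inside the attribute value
// (and usually ">" as well, depending on the libxml version).
$xml = $dom->saveXML($a);

// Decoding turns the entities back into literal characters.
// Caution: this decodes every such entity in the string,
// not only the ones inside title attributes.
$raw = htmlspecialchars_decode($xml);
```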
<?php
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>-->
<div class="popular-video-image">
<a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>">
<img src="/images/topvideo/1.jpg" alt=""/>
</a>
<span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span>
</div>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
foreach ($dom->getElementsByTagName('a') as $node)
{
$node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$dom->formatOutput = true;
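// Strip the doctype and the html/body wrappers, then turn the escaped
// angle brackets back into literal ones.
echo preg_replace(
'#^<!DOCTYPE.+?>#',
'',
str_replace(
array('<html>', '</html>', '<body>', '</body>', "\n\n", '&lt;', '&gt;'),
array('', '', '', '', '', '<', '>'),
$dom->saveHTML()
)
);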