
Problem with adding root path using PHP DOMDocument

I would like to add the site's root path to those anchor tags that do not have it, using PHP's DOMDocument. So far I have written a function that does this with str_replace, but for some links it adds the root path three or four times. What should I change in this function?

Problem: the root path gets added three or four times for every anchor tag, not just for some. The $HTML variable contains many anchor tags, more than 200 links. The same happens with images.

I know this is a very messy question, but I cannot figure out what I have missed.

function addRootPathToAnchor($HTML)
{
    $tmpHtml = '';
    $xml = new DOMDocument();
    $xml->validateOnParse = true;
    $xml->loadHTML($HTML);

   foreach ($xml->getElementsByTagName('a') as $a )
   {
      $href = $a->getAttribute('href');
      if(strpos($href,'www' > 0))
        continue;
      else
        $HTML = str_replace($href,"http://www.mysite.com/".$href,$HTML);  

   }

   return $HTML;
}

I see some problems in your code:

  1. The check that decides whether a URI already has a full root path (that is, whether it is a fully qualified URI).
  2. You're not resolving relative URLs against the base URL. Just prepending the root does not do the job.
  3. The function returns a DOMDocument object and not a string. I assume you don't want that, but I can't tell because your question does not say.

How to detect if a URL is a relative one

Relative URLs don't specify a protocol. So I would check for that to determine whether or not an href attribute is a fully qualified (absolute) URI:

$isRelative = (bool) !parse_url($url, PHP_URL_SCHEME);
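
To make the scheme check concrete, here is a minimal sketch that wraps it in a helper and runs it on a few sample hrefs. The function name isRelativeUrl and the sample URLs are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper wrapping the scheme check shown above.
function isRelativeUrl(string $url): bool
{
    // parse_url() yields null when the scheme component is missing and
    // false for a seriously malformed URL; both count as "relative" here.
    return !parse_url($url, PHP_URL_SCHEME);
}

var_dump(isRelativeUrl('http://www.example.com/a.html')); // bool(false)
var_dump(isRelativeUrl('images/logo.png'));               // bool(true)
var_dump(isRelativeUrl('/about.html'));                   // bool(true)
var_dump(isRelativeUrl('www.example.com/a.html'));        // bool(true)
```

Note the last case: an href that merely starts with "www" has no scheme, so it is treated as a relative path, which is why searching for "www" is not a reliable test.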

Resolving a relative URL to a base URL

However, this alone won't help you properly resolve a relative URL against the base URL. What you do is conceptually broken. How to resolve a relative URI against a base URL is specified in RFCs (RFC 1808 and RFC 3986). You can use an existing library to do the work for you; a working one is Net_URL2:

require_once('Net/URL2.php'); # or configure your autoloader

$baseUrl = 'http://www.example.com/test/images.html';

$hrefRelativeOrAbsolute = '...';

$baseUrl = new Net_URL2($baseUrl);

$urlAbsolute = (string) $baseUrl->resolve($hrefRelativeOrAbsolute);
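
One likely reason the root path shows up several times in the question's code is that str_replace rewrites every occurrence of an href in the whole HTML string, once per matching anchor, so a link that appears three times is prefixed three times. A minimal sketch of an alternative is to set the attribute on each DOM node and serialize the document afterwards. The function name absolutizeAnchors, the $resolve callback, and the naive stand-in resolver are assumptions for illustration; in practice the callback could delegate to Net_URL2's resolve().

```php
<?php
// Sketch: rewrite href attributes on the DOM nodes themselves instead of
// running str_replace over the whole HTML string.
// $resolve is a hypothetical callback mapping an href to an absolute URL.
function absolutizeAnchors(string $html, callable $resolve): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors without a link target
        }
        // Each node is visited exactly once, so nothing is prefixed twice.
        $a->setAttribute('href', $resolve($a->getAttribute('href')));
    }

    // Returns a string; loadHTML() adds doctype/html/body wrappers.
    return $doc->saveHTML();
}

// Deliberately naive stand-in resolver (assumption): prefixes scheme-less
// hrefs with a base. It does not implement RFC 3986 resolution.
$base = 'http://www.example.com/';
$resolve = function (string $href) use ($base): string {
    return parse_url($href, PHP_URL_SCHEME) ? $href : $base . ltrim($href, '/');
};

$html = '<p><a href="a.html">A</a> <a href="a.html">A again</a> '
      . '<a href="http://other.example.org/">B</a></p>';
echo absolutizeAnchors($html, $resolve);
```

The same loop can be repeated for img elements and their src attribute.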

Instead of if(strpos($href,'www' > 0)) you should use if(strpos($href,'www') !== false).

The > 0 was inside the function call (strpos()), so the comparison was applied to the needle rather than to the return value.
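
As a small illustration of why the strict !== false comparison matters, the sketch below prints what strpos returns for three sample hrefs. The helper name containsWww and the sample values are assumptions for illustration.

```php
<?php
// strpos() returns the offset of the first match, which can be 0,
// or false when the needle does not occur at all.
var_dump(strpos('www.example.com/a.html', 'www'));  // int(0)
var_dump(strpos('http://www.example.com/', 'www')); // int(7)
var_dump(strpos('images/logo.png', 'www'));         // bool(false)

// Hypothetical helper: a strict comparison keeps offset 0 from being
// mistaken for "not found", which a "> 0" test on the result would do.
function containsWww(string $href): bool
{
    return strpos($href, 'www') !== false;
}

var_dump(containsWww('www.example.com/a.html')); // bool(true)
var_dump(containsWww('images/logo.png'));        // bool(false)
```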
