
Automatic link creation using PHP without breaking the HTML tags

I want to convert plain-text links in my page content into active links using PHP. I have tried every script I could find. They all work, but they also convert URLs inside img src attributes: they convert links everywhere and break the HTML.

I found a good script that does exactly what I want, but it is in JavaScript. It is called jquery-linkify, and you can find it here: http://github.com/maranomynet/linkify/

The trick in that script is that it converts text links without breaking the HTML. I tried to port the script to PHP but failed.

I can't use the script on my website because other scripts there conflict with jQuery.

Could anyone rewrite this script in PHP, or at least guide me on how to do it?

Thanks.

First, parse the text with an HTML parser, using something like DOMDocument::loadHTML. Note that poor HTML can be hard to parse, and depending on the parser, you might get slightly different output in the browser after running such a function.

PHP's DOMDocument isn't very flexible in that regard. You may have better luck by parsing with other tools. But if you are working with valid HTML (and you should try to, if it's within your control), none of that is a concern.
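
To see how DOMDocument copes with a given piece of markup, one option is to collect the parser's complaints instead of letting them surface as PHP warnings. The following is a minimal sketch; the helper name loadHtmlLeniently() and the sample markup are assumptions made for illustration, and the messages reported depend on the installed libxml version.

```php
<?php
// Hypothetical helper (not from the original answer): parse HTML with
// DOMDocument and return the document together with libxml's messages.
function loadHtmlLeniently(string $html): array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
    }

    // Reset libxml's state so later code is not affected.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$dom, $messages];
}

// Sample input with unclosed tags; the parser repairs it as best it can.
[$dom, $messages] = loadHtmlLeniently('<div><p>unclosed <b>bold</div>');
foreach ($messages as $message) {
    echo $message, "\n";
}
echo $dom->saveHTML();
```

Comparing the saved output with the input shows what the parser changed, which is the difference a browser might render.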

After parsing the text, you need to look at the text nodes for links and replace them. Using a regular expression is the simplest way.

Here's a sample script that does just that:

<?php
function linkify($text)
{
  $re = "@\b(https?://)?(([0-9a-zA-Z_!~*'().&=+$%-]+:)?[0-9a-zA-Z_!~*'().&=+$%-]+\@)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-zA-Z_!~*'()-]+\.)*([0-9a-zA-Z][0-9a-zA-Z-]{0,61})?[0-9a-zA-Z]\.[a-zA-Z]{2,6})(:[0-9]{1,4})?((/[0-9a-zA-Z_!~*'().;?:\@&=+$,%#-]+)*/?)@";
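  // Record every match together with its byte offset in $text.
  preg_match_all($re, $text, $matches, PREG_OFFSET_CAPTURE);

  $matches = $matches[0];

  // Walk the matches from last to first so that earlier offsets stay valid
  // while the string grows.
  $i = count($matches);
  while ($i--)
  {
    $url = $matches[$i][0];
    // Bare host names such as "stackoverflow.com" need a scheme in the href.
    if (!preg_match('@^https?://@', $url))
      $url = 'http://'.$url;

    $text = substr_replace($text, '<a href="'.$url.'">'.$matches[$i][0].'</a>', $matches[$i][1], strlen($matches[$i][0]));
  }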

  return $text;
}

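$dom = new DOMDocument();
$dom->loadHTML('<b>stackoverflow.com</b> <a href="stackoverflow.com">test</a>');
$xpath = new DOMXPath($dom);

// Skip text that is already inside a link, so anchors are never nested.
foreach ($xpath->query('//text()[not(ancestor::a)]') as $text)
{
  $frag = $dom->createDocumentFragment();
  // Escape &, < and > first: appendXML() needs well-formed XML and rejects a bare "&".
  $frag->appendXML(linkify(htmlspecialchars($text->nodeValue, ENT_NOQUOTES)));
  $text->parentNode->replaceChild($frag, $text);
}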

echo $dom->saveHTML();
?>

I did not come up with that regular expression, and I cannot vouch for its accuracy. I also did not test the script, except for this above case. However, this should be more than enough to get you going.

Output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<body>
 <b><a href="http://stackoverflow.com">stackoverflow.com</a></b> 
 <a href="stackoverflow.com">test</a>
</body>
</html>

Note that saveHTML() adds the surrounding tags. If that's a problem, you can strip them out with substr().
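
As an alternative to trimming the string, saveHTML() also accepts a node argument, so the children of the body element can be serialized one by one. This is a minimal sketch under that assumption; bodyInnerHtml() is a hypothetical helper name, not part of the script above.

```php
<?php
// Hypothetical helper: return only the markup inside <body>, without the
// doctype and the <html>/<body> wrappers that loadHTML() adds.
function bodyInnerHtml(DOMDocument $dom): string
{
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $html = '';
    foreach ($body->childNodes as $child) {
        // Passing a node makes saveHTML() serialize just that subtree.
        $html .= $dom->saveHTML($child);
    }
    return $html;
}

$dom = new DOMDocument();
$dom->loadHTML('<b>stackoverflow.com</b> <a href="stackoverflow.com">test</a>');
echo bodyInnerHtml($dom);
```

This avoids hard-coding string offsets, which would break if the doctype emitted by the parser ever changed.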

Use an HTML parser, and search for URLs only inside text nodes.
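
That advice can be sketched with DOM calls alone, building the link elements as nodes rather than as markup strings. The function name linkifyTextNodes() and the deliberately simple URL pattern (http and https only) are assumptions for illustration; a production pattern would need more care, and non-ASCII input may need an explicit encoding hint for loadHTML().

```php
<?php
// Hypothetical helper: turn URLs in text nodes into links, leaving tag
// attributes (such as img src) and existing links untouched.
function linkifyTextNodes(string $html): string
{
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Only text nodes that are not already inside a link, script or style.
    $query = '//text()[not(ancestor::a) and not(ancestor::script) and not(ancestor::style)]';
    // Simplified pattern (an assumption): http(s) URLs up to whitespace, <, > or ".
    $pattern = '~(https?://[^\s<>"]+)~i';

    foreach ($xpath->query($query) as $textNode) {
        // With DELIM_CAPTURE, odd indexes hold the URLs, even indexes the text between.
        $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no URL in this text node
        }

        $frag = $dom->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                $a = $dom->createElement('a');
                $a->setAttribute('href', $part);
                $a->appendChild($dom->createTextNode($part));
                $frag->appendChild($a);
            } elseif ($part !== '') {
                $frag->appendChild($dom->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($frag, $textNode);
    }

    // Serialize only the children of <body>, without the wrapper tags.
    $out = '';
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }
    return $out;
}

echo linkifyTextNodes('<p>See http://example.com/page and <img src="http://example.com/i.jpg"></p>');
```

Because the links are created with createElement() and createTextNode(), characters such as & in the text are escaped by the serializer and never have to be valid XML on the way in.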

I think the trick is to keep track of the single (') and double (") quotes in your PHP code and to nest them correctly, so that you put '' inside "" or vice versa.

For example:

   <?PHP

   //old html tags
   echo "<h1>Header1</h1>";
   echo "<div>some text</div>";

   //your added links
   echo "<p><a href='link1.php'>Link1</a><br>";
   echo "<a href='link1.php'>Link1</a></p>";

   //old html tags
   echo "<h1>Another Header</h1>";
   echo "<div>some text</div>";

   ?>

I hope this helps.

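$text = 'Any text ... link  http://example123.com  and image <img src="http://exaple.com/image.jpg" />';
// Matches only "http://" followed by word characters and dots (no https, no path),
// and only when the characters on both sides are not double quotes, which is
// why the URL inside src="..." is left alone.
$text = preg_replace('!([^\"])(http:\/\/(?:[\w\.]+))([^\"])!', '\\1<a href="\\2">\\2</a>\\3', $text);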
