How to append querystring to every URL in a string using PHP and regexp

I am using PHP 5.6.40-0+deb8u5 on Linux.

I want to add a query string to every URL in a text string. It nearly works, but it never handles the last URL. What am I missing?

I tried the approach from "How to append to all urls in a string?", but it never handles the very last URL in the string.

<?php
    $message = '<h4>Hello there AGAIN . visit  <br />         
    href="http://www.my-domain.com/another-link/" ' ; 
    $message .= ' <br /> or href="http://sub-domain.my-domain.com/subdir/sub-sub-dir/" ';
    $message .= ' <br /> or href="https://www.my-domain.com?uid=hello" ';
    $message .= ' <br /> or href="http://my-domain.com" ';
    $message .= ' <br /> or href="https://my-domain.com" ';
    $message .= ' <br /> or href="http://my-domain.com/" ';
    $message .= ' <br /> or href="https://my-domain.com/" ';
    $message .= ' <br /> or href="http://subdomain.my-domain.com/" ';
    $message .= ' <br /> or href="https://subdomain.my-domain.com" ';
    $message .= ' <br /> or href="http://subdomain.my-domain.com/more-page" ';
    $message .= ' <br /> or "https://subdomain.my-domain.com/"  with no href at the beginning';
    $message .= ' <br /> or href="http://subdomain.my-domain.com/one-more-page/sub-page"  with some more text after it.  ';
    $message .= ' <br /> or href="http://last-one.my-domain.com/one-more-page/sub-page"  with some more text after it. </h4>';

    echo $message;

    function AppendCampaignToString($string) {
        $regex = '/(href="https?:\/\/)(\w*.?my-domain\.com[^"]*)("[^>]*?>/i';
        return preg_replace_callback($regex, '_appendCampaignToString', $string);
    }

    function _AppendCampaignToString($match) {
        $url = $match[2];
        if (strpos($url, '?') === false) {
            $url .= '?';
        }
        else {
            $url .= '&';            
        }
        $url .= "MyID=666888";
        return $match[1].$url  ;
    }

    echo "<hr>" .  AppendCampaignToString($message) . "<hr />" ;
?>

It works for every kind of URL, subdomain, and file path except the very last URL, no matter what type of URL the last one is. So

echo "<hr>" . AppendCampaignToString($message) . "<hr />";

gives:

Hello there AGAIN . visit
href="http://www.my-domain.com/another-link/?MyID=666888"
or href="http://www.my-domain.com/subdir/sub-sub-dir/?MyID=666888"
or href="https://www.my-domain.com?uid=hello&MyID=666888"
or href="http://my-domain.com?MyID=666888"
or href="https://my-domain.com?MyID=666888"
or href="http://my-domain.com/?MyID=666888"
or href="https://my-domain.com/?MyID=666888"
or href="http://subdomain.my-domain.com/?MyID=666888"
or href="https://subdomain.my-domain.com?MyID=666888"
or href="http://subdomain.my-domain.com/more-page?MyID=666888"
or "https://subdomain.my-domain.com/" with no href at the beginning
or href="http://subdomain.my-domain.com/one-more-page/sub-page?MyID=666888" with some more text after it.
or href="http://last-one.my-domain.com/one-more-page/sub-page" with some more text after it.

Your last domain contains a hyphen (-), which \w does not match, so you need to put the hyphen in a character class together with \w. This works:

(href="https?:\/\/)([\w-]*.?my-domain\.com[^"]*)("[^>]*?>)

https://regex101.com/r/etxiQI/2/

Also note that the regex in your question was missing a closing parenthesis, ).

Additionally, if my-domain is the top domain name, the . preceding it should be escaped as well, because an unescaped . matches any character. For example:

(href="https?:\/\/)([\w-]*\.?my-domain\.com[^"]*)("[^>]*?>)
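
To see the difference concretely, here is a minimal sketch that runs both patterns against the one URL that failed in the question. The $html sample string is made up for the check; the two patterns are the ones from the question (with the missing parenthesis added) and from this answer.

```php
<?php
// Hypothetical one-line input containing only the URL that failed in the question.
$html = 'href="http://last-one.my-domain.com/one-more-page/sub-page" >';

// Question's pattern (closing parenthesis added): \w cannot match the "-" in "last-one".
$original = '/(href="https?:\/\/)(\w*.?my-domain\.com[^"]*)("[^>]*?>)/i';

// This answer's pattern: hyphen allowed in the subdomain, dot escaped.
$fixed = '/(href="https?:\/\/)([\w-]*\.?my-domain\.com[^"]*)("[^>]*?>)/i';

var_dump(preg_match($original, $html)); // int(0)
var_dump(preg_match($fixed, $html));    // int(1)
```

Note that [\w-]*\.? still covers only a single subdomain label, so a host such as a.b.my-domain.com would need a repeated group instead.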

Although @user3783243 was faster than me, I am posting a pseudo-working script, because I spent a few minutes debugging this:

<?php
    $message = '<h4>Hello there AGAIN . visit  <br />         
    href="http://www.my-domain.com/another-link/" ' ;
    $message .= ' <br /> or href="http://sub-domain.my-domain.com/subdir/sub-sub-dir/" ';
    $message .= ' <br /> or href="https://www.my-domain.com?uid=hello" ';
    $message .= ' <br /> or href="http://my-domain.com" ';
    $message .= ' <br /> or href="https://my-domain.com" ';
    $message .= ' <br /> or href="http://my-domain.com/" ';
    $message .= ' <br /> or href="https://my-domain.com/" ';
    $message .= ' <br /> or href="http://subdomain.my-domain.com/" ';
    $message .= ' <br /> or href="https://subdomain.my-domain.com" ';
    $message .= ' <br /> or href="http://subdomain.my-domain.com/more-page" ';
    $message .= ' <br /> or "https://subdomain.my-domain.com/"  with no href at the beginning';
    $message .= ' <br /> or href="http://subdomain.my-domain.com/one-more-page/sub-page"  with some more text after it.  ';
    $message .= ' <br /> or href="http://last-one.my-domain.com/one-more-page/sub-page"  with some more text after it. </h4>';

    echo $message;

    function AppendCampaignToString($string) {
        $regex = '/(href="https?:\/\/)([a-z0-9-]*.?my-domain\.com[^"]*)"[^>]*?>/i';
        return preg_replace_callback($regex, '_appendCampaignToString', $string, -1);
    }

    function _AppendCampaignToString($match) {
        $url = $match[2];

        echo "MATCHED $url \n";
        if (strpos($url, '?') === false) {
            $url .= '?';
        }
        else {
            $url .= '&';
        }
        $url .= "MyID=666888";
        return $match[1].$url  ;
    }

    echo "<hr>" .  AppendCampaignToString($message) . "<hr />" ;
?>
  • I took out the last open parenthesis from the regex (also mentioned by @user3783243)
  • added a debug message in the callback, to see what's actually being matched
  • replaced \w in the subdomain match with [a-z0-9-], so that it matches letters, digits and -
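
One thing to watch in the script above: the pattern consumes the closing quote and everything up to the next >, but the callback returns only $match[1] and the URL, so that trailing text is dropped from the result. A possible variant, offered as a sketch rather than a definitive fix, stops the match just before the closing quote with a lookahead. The function name appendParamToLinks, its $param argument, and the sample $message are illustrative assumptions and not part of the original post.

```php
<?php
// Sketch: append a query parameter to every my-domain.com href in a string.
// appendParamToLinks and $param are hypothetical names used for illustration.
function appendParamToLinks($html, $param)
{
    // Group 1: href=" plus the scheme. Group 2: host (any number of subdomain
    // labels) and the rest of the URL. The lookahead (?=") requires the closing
    // quote without consuming it, so nothing after the URL is lost.
    $regex = '/(href="https?:\/\/)((?:[a-z0-9-]+\.)*my-domain\.com[^"]*)(?=")/i';

    return preg_replace_callback($regex, function ($match) use ($param) {
        $url = $match[2];
        // Use "&" when the URL already has a query string, "?" otherwise.
        $separator = (strpos($url, '?') === false) ? '?' : '&';
        return $match[1] . $url . $separator . $param;
    }, $html);
}

// Made-up sample input with a hyphenated subdomain and an existing query string.
$message = '<a href="http://last-one.my-domain.com/page">x</a> '
         . '<a href="https://www.my-domain.com?uid=hello">y</a>';

echo appendParamToLinks($message, 'MyID=666888'), "\n";
```

Passing the callback as a closure avoids the separate helper function; closures are available from PHP 5.3, so this should also run on the PHP 5.6 mentioned in the question.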
