URL parsing in JavaScript and the DOM

Question

I am writing a support chat application where I want text to be parsed for URLs. I have found answers to similar questions, but nothing that covers the following.

What I have:

function ReplaceUrlToAnchors(text) {
    // Match URLs that start with a scheme or with "www."
    var exp = /(\b(https?:\/\/|ftp:\/\/|file:\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
    return text.replace(exp, "<a href='$1' target='_blank'>$1</a>");
}

That pattern is a modified version of one I found on the internet. It includes www. in the first token, because not all URLs start with protocol://. However, www.google.com is then replaced with

<a href='www.google.com' target='_blank'>www.google.com</a>

which pulls up MySite.com/webchat/www.google.com, and I get a 404 (the href has no scheme, so the browser treats it as a relative URL).

That is my first problem. My second is this:

In my script for generating messages to the log, I am forced to do it in a hacky way:

var last = 0;
function UpdateChatWindow(msgArray) {

    var chat = $get("MessageLog");
    for (var i = 0; i < msgArray.length; i++) {
        var element = document.createElement("div");
        var linkified = ReplaceUrlToAnchors(msgArray[i]);
        element.setAttribute("id", last.toString());
        element.innerHTML = linkified;
        chat.appendChild(element);
        last = last + 1;
    }
}

To get the "linkified" string to render as HTML correctly, I have to use the non-standard .innerHTML property of the element. I would prefer a way where I could parse the string into tokens (text tokens and anchor tokens), call either createTextNode or createElement("a") for each one, and stitch them together with the DOM.

So question 1 is: how should I go about parsing www.site.com, or even site.com? And question 2 is: how could I do this using only the DOM?

Answer 1

Another thing you could do is this:

function ReplaceUrlToAnchors(text) {
    var exp = /(\b(https?:\/\/|ftp:\/\/|file:\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
    return text.replace(exp, function(_, url) {
        // Prefix bare "www." matches with a scheme so the href is absolute.
        return '<a href="' +
            (/^www\./i.test(url) ? "http://" + url : url) +
            '" target="_blank">' +
            url +
            '</a>';
    });
}

That is kind of like your solution, but it does the check for "www" URLs in the callback passed to ".replace()".
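The reason the "http://" prefix matters can be seen with the standard URL constructor, which resolves an address against a base the same way the browser resolves an href against the current page. This is only a small illustration; the page address below is a hypothetical one chosen to match the path mentioned in the question.

```javascript
// Hypothetical page address, modelled on the path in the question.
var pageUrl = "http://MySite.com/webchat/";

// Without a scheme, the href is treated as a relative path under the page.
var relative = new URL("www.google.com", pageUrl).href;

// With a scheme, the base is ignored and the href is absolute.
var absolute = new URL("http://www.google.com", pageUrl).href;

console.log(relative);
console.log(absolute);
```

The first value ends in /webchat/www.google.com, which is the address that produced the 404.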

Note that you won't be picking up "stackoverflow.com" or "newegg.com" or anything like that, which I understand may be unavoidable (and even desirable, given the false positives you'd pick up).
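For the second part of the question (doing this with DOM calls only), one possible approach is to walk the regex matches and emit text and link tokens, then build nodes from them. The following is a rough sketch rather than a tested drop-in: tokenizeUrls, appendLinkified, the token shape, and the optional doc parameter are all names made up for this example, and the pattern is the one from the code above.

```javascript
// Same character classes as the pattern above, without capture groups.
var URL_PATTERN = /\b(?:https?:\/\/|ftp:\/\/|file:\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/ig;

// Split text into tokens: { type: "text", value } or { type: "link", value, href }.
function tokenizeUrls(text) {
    var tokens = [];
    var lastIndex = 0;
    var match;
    URL_PATTERN.lastIndex = 0; // a global regex keeps state between exec() calls
    while ((match = URL_PATTERN.exec(text)) !== null) {
        if (match.index > lastIndex) {
            tokens.push({ type: "text", value: text.slice(lastIndex, match.index) });
        }
        var url = match[0];
        tokens.push({
            type: "link",
            value: url,
            // Bare "www." matches get a scheme so the href is absolute.
            href: /^www\./i.test(url) ? "http://" + url : url
        });
        lastIndex = match.index + url.length;
    }
    if (lastIndex < text.length) {
        tokens.push({ type: "text", value: text.slice(lastIndex) });
    }
    return tokens;
}

// Append one text node or anchor element per token to parent.
// doc is optional and falls back to the global document in a browser.
function appendLinkified(parent, text, doc) {
    doc = doc || document;
    var tokens = tokenizeUrls(text);
    for (var i = 0; i < tokens.length; i++) {
        var token = tokens[i];
        if (token.type === "link") {
            var anchor = doc.createElement("a");
            anchor.setAttribute("href", token.href);
            anchor.setAttribute("target", "_blank");
            anchor.appendChild(doc.createTextNode(token.value));
            parent.appendChild(anchor);
        } else {
            parent.appendChild(doc.createTextNode(token.value));
        }
    }
}
```

In UpdateChatWindow, the innerHTML assignment could then become appendLinkified(element, msgArray[i]). A side effect of this design is that the message text is never parsed as HTML, so any markup typed into a chat message is displayed as plain text.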

Answer 2

Here is what I came up with; perhaps someone has something better?

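function replaceUrlToAnchors(text) {
    // Step 1: give "www." URLs a scheme. Literal dots are escaped;
    // the final "|." alternative still matches any single character.
    var naked = /(\b(www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|](\.com|\.net|\.org|\.co\.uk|\.ca|.))/ig;
    text = text.replace(naked, "http://$1");

    // Step 2: wrap every URL that has a scheme in an anchor, showing it without the scheme.
    var exp = /(\b(https?:\/\/|ftp:\/\/|file:\/\/)([-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]))/ig;
    return text.replace(exp, "<a href='$1' target='_blank'>$3</a>");
}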

The first regex replaces www.google.com with http://www.google.com and is good enough for what I am doing. However, I will hold off marking this as the answer, because I would also like to make (www.) optional, but when I do that with (www.)? it replaces every word with http://word/.
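The every-word behaviour is expected: once the prefix is optional, the rest of the pattern only asks for one character from the allowed set followed by any character (the trailing "." alternative), so ordinary words qualify. One way around it, sketched here under assumptions, is to drop the prefix requirement but insist on a dot-separated host that ends in a known TLD. The function name addSchemeToNakedUrls is hypothetical, and the TLD list is simply the one from the pattern above, not a complete list.

```javascript
// Hypothetical helper: prefix bare hosts with http://, leaving plain words
// and URLs that already have a scheme untouched.
function addSchemeToNakedUrls(text) {
    // (^|\s)                    start only at the beginning of the text or after whitespace,
    //                           so the host inside "http://www..." is never matched
    // (?:[a-z0-9-]+\.)+         one or more dot-terminated labels, e.g. "www." or "google."
    // (?:com|net|org|co\.uk|ca) assumed TLD list (taken from the answer's pattern)
    // (?![a-z0-9-])             the TLD must not run on into a longer label
    var naked = /(^|\s)((?:[a-z0-9-]+\.)+(?:com|net|org|co\.uk|ca)(?![a-z0-9-]))/ig;
    return text.replace(naked, "$1http://$2");
}

console.log(addSchemeToNakedUrls("visit google.com or www.google.com/a today"));
```

The result can then be fed to the second regex in the function above, which wraps every URL that has a scheme in an anchor. The trade-off is that hosts with TLDs outside the list are not linked, which limits false positives at the cost of missing some real addresses.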

2 answers

Answer 1
1 Accepted 2011-05-25 17:19:02

Answer 2
0 2011-05-25 17:10:25
