
Regular expression to find URLs in a block of text (JavaScript)

I need a JavaScript regular expression that scans a block of plain text and returns the text with the URLs turned into links.

This is what I have:

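findLinks: function(s) {
          // Whitespace, then http:// or ftp://, then everything up to the first delimiter.
          var hlink = /\s(ht|f)tp:\/\/([^ \,\;\:\!\)\(\"\'\<\>\f\n\r\t\v])+/g;
          return s.replace(hlink, function($0) {
              // Drop the leading whitespace character matched by \s.
              var url = $0.substring(1);
              // Strip trailing periods so sentence punctuation is not part of the link.
              while (url.length > 0 && url.charAt(url.length - 1) == '.') url = url.substring(0, url.length - 1);

              return ' <a href="' + url + '">' + url + '</a>';
          });
      }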

The problem is that it only matches http://www.google.com and not google.com/adsense.

How could I accomplish both?

I use this as a reference all the time. The author covers 8 regexes you should know.

http://net.tutsplus.com/tutorials/other/8-regular-expressions-you-should-know/

Here is what he uses to match URLs:

/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/ 

He also breaks down what each part does. That is very useful for learning regexes, rather than just getting an answer that works for reasons you don't understand.
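One thing worth noting about the pattern above: it is anchored with ^ and $, so it checks whether a whole string looks like a URL rather than finding URLs inside a longer text. The sketch below illustrates this; the names urlPattern and isWholeUrl are made up for the example.

```javascript
// The pattern quoted above, unchanged. ^ and $ anchor it to the whole string.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;

// Hypothetical helper: true only if the entire string looks like a URL.
function isWholeUrl(str) {
  return urlPattern.test(str);
}

console.log(isWholeUrl('http://www.google.com'));       // true
console.log(isWholeUrl('google.com/adsense'));          // true
console.log(isWholeUrl('see google.com/adsense now'));  // false: URL is embedded in text
```

So it accepts both forms from the question, but only one candidate string at a time. To use it on a block of text you would first have to split the text into candidates, or drop the anchors and adapt the pattern.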

This is a non-trivial task. To match every URI that is valid according to the relevant RFCs you need a monumentally complex regular expression, and even then it won't filter out URIs with invalid top-level domains (e.g. http://brussels.sprout/). So you have to compromise. Determine what's important to you. For example: are false positives or false negatives more acceptable? Do you want to limit top-level domains to only those that currently exist? Do you allow non-Latin characters in matched URIs? Decide what you need your regular expression to do and design it accordingly, rather than blindly copying and pasting an example from the web.
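To make one of these trade-offs concrete, here is a rough sketch of the "limit top-level domains" option. Everything in it is an assumption for illustration: the names ALLOWED_TLDS, buildUrlPattern and findUrls, the three-entry list, and the ASCII-only host pattern. It favors false negatives over false positives.

```javascript
// Hypothetical, deliberately short allowlist; a real list would be far longer.
const ALLOWED_TLDS = ['com', 'org', 'net'];

// Build a pattern that only accepts hosts ending in one of the given TLDs.
// Assumes each TLD is plain letters, so no regex escaping is done.
function buildUrlPattern(tlds) {
  return new RegExp(
    '\\b(?:(?:https?|ftp):\\/\\/)?' +   // optional protocol
    '(?:[a-z0-9-]+\\.)+' +               // one or more labels, each followed by a dot
    '(?:' + tlds.join('|') + ')\\b' +    // allowlisted top-level domain
    '(?:\\/[^\\s<>"\']*)?',              // optional path up to whitespace or a quote
    'gi'
  );
}

// Return every substring of text that the allowlist pattern accepts.
function findUrls(text, tlds) {
  return text.match(buildUrlPattern(tlds)) || [];
}

console.log(findUrls('See http://brussels.sprout/ and google.com/adsense today.', ALLOWED_TLDS));
// [ 'google.com/adsense' ]
```

The invalid .sprout domain is rejected, at the cost of also rejecting every real domain whose TLD is missing from the list.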

You can make the protocol part optional:

/\s((ht|f)tp:\/\/)?([^ \,\;\:\!\)\(\"\'\<\>\f\n\r\t\v])+/g
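One caveat: once the protocol is optional, a character class that accepts almost anything will match ordinary words too, so the part after the protocol probably needs to look like a host name. The following is a minimal sketch under that assumption, not a complete URL matcher. The function name linkify and the host pattern (dot-separated ASCII labels ending in two or more letters) are illustrative choices, and bare domains are assumed to be reachable over http://.

```javascript
// Sketch: link both "http://www.google.com" and bare "google.com/adsense".
function linkify(text) {
  // Group 1: start of string or a whitespace character.
  // Group 2: optional http:// or ftp://.
  // Group 3: dotted host ending in 2+ letters, plus an optional path.
  var pattern = /(^|\s)((?:ht|f)tp:\/\/)?((?:[a-z0-9-]+\.)+[a-z]{2,}(?:\/[^\s<>"']*)?)/gi;
  return text.replace(pattern, function (match, lead, protocol, rest) {
    // Keep trailing sentence punctuation outside the link.
    var trailing = (rest.match(/[.,;:!?)]+$/) || [''])[0];
    var body = rest.substring(0, rest.length - trailing.length);
    // Assumption: a bare domain is given an http:// prefix in the href.
    var href = (protocol || 'http://') + body;
    return lead + '<a href="' + href + '">' + (protocol || '') + body + '</a>' + trailing;
  });
}

console.log(linkify('see google.com/adsense and http://www.google.com.'));
// see <a href="http://google.com/adsense">google.com/adsense</a> and <a href="http://www.google.com">http://www.google.com</a>.
```

This still produces false positives for anything shaped like a domain, such as file names like notes.txt, so it trades precision for coverage.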

Try this (it works for your example text):

\S+\.\S+
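A quick way to see what this pattern does and does not catch is to run it over a couple of strings. The helper name findDottedTokens is made up for this sketch.

```javascript
// \S+\.\S+ matches any whitespace-delimited token with a dot somewhere inside it.
function findDottedTokens(text) {
  return text.match(/\S+\.\S+/g) || [];
}

console.log(findDottedTokens('Visit http://www.google.com and google.com/adsense today'));
// [ 'http://www.google.com', 'google.com/adsense' ]

console.log(findDottedTokens('Pi is 3.14'));
// [ '3.14' ]
```

It finds both URL forms from the question, but it also matches anything else with an interior dot, such as decimal numbers, so it is only suitable when false positives are acceptable.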
