
Regex (JS): Match URLs without a protocol, ignore URLs that have one

Apologies for yet another regex URL matching question, but I haven't been able to find a solution in any of the other threads.

I want to run a replace() method on a string, with a pattern that matches all URLs without a protocol (http, https etc.) but ignores URLs that do have one.

So given this input:

www.google.com www.facebook.com
http://www.google.com
http://www.facebook.com

It would match www.google.com and www.facebook.com on the first line (without any surrounding whitespace), but ignore the other URLs on the second and third lines.

I thought about just looking for www and ignoring matches which have // as the preceding characters, which led me to this:

https://www.regex101.com/r/Y3rqxy/1

However, as you can see, the second match includes the preceding whitespace. As I want to replace the www with http://www, this whitespace buggers things up a little.

Any regex mandarins able to help me out on this one?

Mere seconds after posting this, one of my colleagues came up with a solution. It's a little wacky (thanks, JavaScript) but it works! This example assumes you want to add http:// to any URLs that are missing their protocol.

First you have to reverse the string you're going to run the .replace() method on:

string.split('').reverse().join('')

Then you can call the replace method with the following regex (note the backwards http://www!):

string.replace(/www(?!\/\/)/gi, 'www//:ptth')

Then you just reverse your string again:

string.split('').reverse().join('')

Any URLs in that string that were missing a protocol will now have one.
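Putting the three steps together, a minimal sketch might look like the following. The function names `reverse` and `addMissingProtocol` are made up for illustration, and since strings are immutable in JavaScript, each step's result has to be captured rather than expecting the original string to change in place.

```javascript
// Reverse a string character by character.
// (Hypothetical helper name; split('') is fine for plain ASCII URLs.)
function reverse(str) {
  return str.split('').reverse().join('');
}

// Prepend "http://" to every "www" that is not already preceded by "//".
// (Hypothetical function name, wrapping the three steps described above.)
function addMissingProtocol(text) {
  const reversed = reverse(text);
  // In the reversed text a URL with a protocol reads "...www//:ptth",
  // so a negative lookahead for "//" does the job of a lookbehind.
  const patched = reversed.replace(/www(?!\/\/)/gi, 'www//:ptth');
  return reverse(patched);
}

const input =
  'www.google.com www.facebook.com\nhttp://www.google.com\nhttp://www.facebook.com';
console.log(addMissingProtocol(input));
```

Worth noting that this matches any "www" not preceded by "//", so it only behaves well if "www" doesn't turn up elsewhere in the text.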

It's not going to win any awards for cleanliness, but it works!
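For comparison, engines that support lookbehind assertions (added to the language in ES2018) can express the "not preceded by //" check directly, without reversing anything. This is a different technique from the reversal trick, and the function name `addMissingProtocolLookbehind` is just an assumption for the example; it will throw a syntax error on older engines that lack lookbehind.

```javascript
// Prepend "http://" to every "www" that is not preceded by "//".
// Uses a negative lookbehind, so it needs an ES2018-capable engine.
// (Hypothetical function name, for illustration only.)
function addMissingProtocolLookbehind(text) {
  return text.replace(/(?<!\/\/)www/gi, 'http://www');
}

const sample =
  'www.google.com www.facebook.com\nhttp://www.google.com\nhttp://www.facebook.com';
console.log(addMissingProtocolLookbehind(sample));
```

Because the lookbehind is zero-width, the match covers only "www" itself, so no surrounding whitespace gets swallowed by the replacement.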
