Extract all links from a string with Google Apps Script

Question

I have a string variable that contains links (among other text), and I want to extract all links matching a certain pattern (for example, links containing the word 'case'). Is this possible?

The string looks something like this:

var string = 'here is some text line among the ones there will be links like https://stackoverflow.com/questions/40725199/extract-all-links-from-a-string-with-google-app-script?noredirect=1#comment68679843_40725199 and more';

As a workaround, I used the approach described in "extract links from document": I create a document with the string as its content and then extract the links from it. I would like to do this directly on the string instead.

Regards,

EDIT (to Ruben):

If I use:

var string = 'http://mangafox.me/manga/tales_of_demons_and_gods/c105/1.html here is some text line among the ones there will be links like https://stackoverflow.com/questions/40725199/extract-all-links-from-a-string-with-google-app-script?noredirect=1#comment68679843_40725199 and more ';

I get only the first link, printed twice (see screenshot).

And if I use:

var string = 'here is some text line among the ones there will be links like https://stackoverflow.com/questions/40725199/extract-all-links-from-a-string-with-google-app-script?noredirect=1#comment68679843_40725199 and more http://mangafox.me/manga/tales_of_demons_and_gods/c105/1.html ';

The same happens again (see screenshot).

Answer 1

Google Apps Script

function test2() {
  // The 'g' flag is required: without it, exec() restarts from the beginning
  // on every call and keeps returning the first link.
  var re = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'"".,<>?«»“”‘’]))/gi;
  var string = 'here is some text line among the ones there will be links like https://stackoverflow.com/questions/40725199/extract-all-links-from-a-string-with-google-app-script?noredirect=1#comment68679843_40725199 and more';
  var match;
  // Each exec() call returns the next match, or null when none are left.
  while ((match = re.exec(string)) !== null) {
    Logger.log(match[0]); // match[0] is the full URL; higher indexes are capture groups
  }
}

JavaScript

// Same pattern as above; the 'g' flag lets exec() step through every match.
var re = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'"".,<>?«»“”‘’]))/gi;
var string = 'here is some text line among the ones there will be links like https://stackoverflow.com/questions/40725199/extract-all-links-from-a-string-with-google-app-script?noredirect=1#comment68679843_40725199 and more here is some text line among the ones there will be links like https://stackoverflow.com/questions/40725199/extract-all-links-from-a-string-with-google-app-script?noredirect=1#comment68679843_40725199 and more';
var match;
while ((match = re.exec(string)) !== null) {
  console.log(match[0]); // full text of the current match
}
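The "first link twice" output reported in the question's edit is consistent with how exec() behaves when the pattern has no 'g' flag: every call starts again at the beginning of the string, and the returned array holds the full match at index 0 followed by the capture groups. The sketch below illustrates that difference. The short URL pattern and the example.* addresses are simplified placeholders chosen for the demonstration, not the pattern used above.

```javascript
// Placeholder text and a deliberately simple URL pattern (assumptions for this demo).
var text = 'first http://a.example/one then https://b.example/two end';

// Without 'g': exec() always searches from index 0, so it returns the first link.
var single = /(https?:\/\/\S+)/i;
var m = single.exec(text);
// m[0] is the whole match and m[1] is capture group 1; here both hold the same URL,
// which is why looping over the result array prints the first link twice.
var withoutG = [m[0], m[1]];

// With 'g': the regex remembers lastIndex, so each exec() call yields the next link.
var global = /https?:\/\/\S+/gi;
var withG = [];
var match;
while ((match = global.exec(text)) !== null) {
  withG.push(match[0]);
}

console.log(withoutG);
console.log(withG);
```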

Reference

RegularExpression to Extract Url For Javascript

Answer 2

If you're only getting the first match, I think you need the 'g' flag on the regular expression to capture all matches; each call to exec() will then return the next match. I'm using:

const re = /(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$])/igm;

let reResults; // s is the string to search
while ((reResults = re.exec(s)) !== null) { // each call finds the next match
  Logger.log(reResults[0]); // full text of the current match
}
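The question also asks for only those links that contain a given word such as 'case'. One way to do that, sketched here under some assumptions, is to collect every match first and then filter the list. The helper name extractLinks, the simplified URL pattern, and the example.* addresses are all hypothetical and are not taken from either answer. In Apps Script, console.log could be swapped for Logger.log.

```javascript
// Hypothetical helper: returns every URL in `text` whose address contains `keyword`.
function extractLinks(text, keyword) {
  // Deliberately simple pattern (an assumption): http(s) scheme plus non-space characters.
  var re = /https?:\/\/[^\s<>"']+/gi;
  // match() with the 'g' flag returns all matches, or null when there are none.
  var links = (text.match(re) || []).map(function (link) {
    // Drop punctuation that belongs to the sentence rather than to the URL.
    return link.replace(/[.,;:!?)]+$/, '');
  });
  if (!keyword) return links;
  var needle = keyword.toLowerCase();
  // Case-insensitive containment check on each link.
  return links.filter(function (link) {
    return link.toLowerCase().indexOf(needle) !== -1;
  });
}

var sample = 'see https://example.com/case/42 and http://example.org/other, plus https://example.net/Case-study.';
console.log(extractLinks(sample, 'case'));
```

Calling the helper without a keyword returns every link it finds, so the same function covers both the "all links" and the "links containing a word" cases.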


Answer 1: score 2, accepted, 2016-11-21 20:13:00


Answer 2: score 0, 2021-11-16 23:40:08
