
JavaScript Regex - Splitting a string into an array by the Regex pattern

Given an input field, I'm trying to use a regex to find all the URLs in its text and turn them into links. I want all the surrounding text to be retained, however.

So, for example, given the input "http://google.com hello this is my content", I want to split on the whitespace AFTER a match of this regex pattern from another Stack Overflow question (regexp = /(ftp|http|https)://(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(/|/([\w#!:.?+=&%@!-/]))?/), so that I end up with the array ['http://google.com', 'hello this is my content'].

Another example: "hello this is my content http://yahoo.com testing testing http://google.com" -> array of ['hello this is my content', 'http://yahoo.com', 'testing testing', 'http://google.com']

How can this be done? Any help is much appreciated!

First, turn all the groups in your regular expression into non-capturing groups ( (?:...) ), then wrap the whole regular expression in a single capturing group. When the separator passed to split contains a capturing group, the captured text is included in the resulting array, so the URLs are kept instead of being discarded. Use it to split the string like this:

var regex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*@)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%@!-/]))?)/;
var result = str.split(regex);
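To see why the outer capturing group matters, here is a minimal sketch with a much simpler separator. The sample string and the `\d+` pattern are made up for illustration and are not part of the URL regex above.

```javascript
// Illustrative input: letters separated by runs of digits (an assumption for this demo).
const text = "a1b22c";

// Without a capturing group, the separators are dropped from the result.
const withoutGroup = text.split(/\d+/);

// With a capturing group, each matched separator is kept as its own element.
const withGroup = text.split(/(\d+)/);

console.log(withoutGroup); // ["a", "b", "c"]
console.log(withGroup);    // ["a", "1", "b", "22", "c"]
```

The URL regex works the same way: the single outer group makes split keep each matched URL between the surrounding pieces of text.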

Example:

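var str = "hello this is my content http://yahoo.com testing testing http://google.com";
var regex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*@)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%@!-/]))?)/;
var result = str.split(regex);
// The text pieces keep their surrounding spaces, and a URL at the very end leaves a trailing "":
// ["hello this is my content ", "http://yahoo.com", " testing testing ", "http://google.com", ""]
console.log(result);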
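The raw split result still carries the spaces around each URL and may contain empty strings, so it does not quite match the arrays asked for in the question. A possible cleanup step, as a sketch: `splitByUrls` is a hypothetical helper name, and trimming plus dropping empty pieces is an assumption about what the asker wants.

```javascript
// Same pattern as in the answer: one outer capturing group, all inner groups non-capturing.
const urlRegex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*@)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%@!-/]))?)/;

// Hypothetical helper: split around URLs, trim each piece, drop empty pieces.
function splitByUrls(text) {
  return text
    .split(urlRegex)
    .map((piece) => piece.trim())
    .filter((piece) => piece.length > 0);
}

console.log(splitByUrls("http://google.com hello this is my content"));
// ["http://google.com", "hello this is my content"]

console.log(splitByUrls("hello this is my content http://yahoo.com testing testing http://google.com"));
// ["hello this is my content", "http://yahoo.com", "testing testing", "http://google.com"]
```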

You had a few unescaped slashes in your RegExp. Inside a regex literal, each / has to be written as \/.

var str = "hello this is my content http://yahoo.com testing testing http://google.com";
// The g flag makes match() return every URL; fall back to [] when nothing matches.
var captured = str.match(/(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!-/]))?/g) || [];
// Keep the space-separated words that were not matched as URLs.
var nonCaptured = str.split(' ').filter(v => captured.indexOf(v) === -1);
console.log(nonCaptured, captured);
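Separating URLs from words this way loses their positions, which matter if the end goal is to turn the URLs into links, as the question states. One way to keep the order is to map over the result of a capturing split, where the odd-indexed elements are the URLs. This is only a sketch: `escapeHtml` and `linkify` are hypothetical helper names, and the `<a>` markup is an assumption about the desired output.

```javascript
// One outer capturing group, so split() alternates: text, URL, text, URL, ...
const urlPattern = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*@)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%@!-/]))?)/;

// Hypothetical helper: escape characters that are special in HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap each URL in an anchor tag and keep all other text.
function linkify(text) {
  return text
    .split(urlPattern)
    .map((piece, i) => {
      const safe = escapeHtml(piece);
      // Odd indexes hold the captured URLs; even indexes hold the text between them.
      return i % 2 === 1 ? `<a href="${safe}">${safe}</a>` : safe;
    })
    .join("");
}

console.log(linkify("hello http://yahoo.com testing"));
```

Because nothing is trimmed or filtered here, the spacing of the original input is preserved around each link.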
