
JavaScript regex: split a string with new lines when text is surrounded by quotes

I want to use the string split method to split a string wherever a piece of text is quoted. For example, I want this string:

Some text  

"This is what
I want to catch"

Some more text

to become a string array such as:

0: "Some text"
1: "This is what↵I want to catch"
2: "↵↵Some more text"

To achieve that, I am calling:

inputText.split(/"((.+)|\s)+"/)

This doesn't work, as it creates the array:

0: "Some text"
1: "I want to catch"
2: "I want to catch"
3: "↵↵Some more text"

Any idea how to achieve what I want?

I cannot use look-behinds and look-aheads, because I still want the quotes to be part of the match. My goal is to split the string wherever text is surrounded by quotes, without having the quotes themselves in the array.

As desired, you may use this regex in split:

/\n*(?:"([^"\\]*(?:\\.[^"\\]*)*)")?\n+/

RegEx Explanation:

  • \n* : Match 0+ line breaks
  • (?: : Start non-capture group
    • " : Match opening "
    • ( : Start capture group #1
      • [^"\\]* : Match 0+ of any characters that are not " and not \
      • (?:\\.[^"\\]*)* : Match an escaped character (a \ followed by any character except a line break), followed by 0+ of any characters that are not " and not \. Repeat this group 0 or more times
    • ) : End capture group #1
    • " : Match closing "
  • )? : End non-capture group. ? makes this group optional
  • \n+ : Match 1+ line breaks

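Code:

const s = `Some text

"This is what
I want to catch"

Some more text`;

// split() splices the text of capture group #1 into the result array
const m = s.split(/\n*(?:"([^"\\]*(?:\\.[^"\\]*)*)")?\n+/);
console.log(m);
// ["Some text", "This is what\nI want to catch", "Some more text"]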
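One thing to watch for with this split: because the quoted group is optional, a line break that is not next to a quoted block still matches, and split then puts undefined in the slot for capture group #1. The following is a minimal sketch of that behaviour, assuming a made-up input and a hypothetical helper name splitQuoted (neither comes from the answer above):

```javascript
// Same regex as in the answer above.
const QUOTED_OR_BREAK = /\n*(?:"([^"\\]*(?:\\.[^"\\]*)*)")?\n+/;

// Hypothetical helper: drops the undefined placeholders that split()
// emits when the optional capture group did not take part in a match.
function splitQuoted(text) {
  return text.split(QUOTED_OR_BREAK).filter((part) => part !== undefined);
}

// Illustrative input (an assumption, not taken from the question).
const sample = 'a\nb\n"c\nd"\ne';

const raw = sample.split(QUOTED_OR_BREAK);
const cleaned = splitQuoted(sample);

console.log(raw);     // has an undefined entry between "a" and "b"
console.log(cleaned); // [ 'a', 'b', 'c\nd', 'e' ]
```

Filtering on undefined rather than on falsy values keeps legitimately empty pieces, such as the content of an empty pair of quotes.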


Alternatively, you may use this regex in JavaScript with match to pick out quoted strings, allowing escaped quotes inside them:

/"[^"\\]*(?:\\.[^"\\]*)*"|[^"\n]+/g

RegEx Explanation:

  • " : Match opening "
  • [^"\\]* : Match 0+ of any characters that are not " and not \
  • (?:\\.[^"\\]*)* : Match an escaped character (a \ followed by any character except a line break), followed by 0+ of any characters that are not " and not \. Repeat this group 0 or more times
  • " : Match closing "
  • | : OR
  • [^"\n]+ : Match 1+ characters that are neither " nor a line break, i.e. the unquoted text on a line
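To see what the escape-handling part of the pattern buys you, here is a small sketch that runs only the quoted-string alternative against a string containing \" sequences, next to a naive pattern for comparison. The input string and variable names are assumptions made for illustration:

```javascript
// The quoted-string alternative on its own.
const QUOTED = /"[^"\\]*(?:\\.[^"\\]*)*"/;

// Illustrative input (an assumption): a quoted string containing \" escapes.
const line = 'say "a \\"b\\" c" end';

const found = line.match(QUOTED);
console.log(found[0]); // "a \"b\" c"   (the escaped quotes do not end the match)

// For comparison, a naive pattern stops at the first escaped quote.
const naive = line.match(/"[^"]*"/);
console.log(naive[0]); // "a \"
```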

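Code:

const s = `Some text

"This is what
I want to catch"

Some more text`;

const m = s.match(/"[^"\\]*(?:\\.[^"\\]*)*"|[^"\n]+/g);
console.log(m);
// ["Some text", "\"This is what\nI want to catch\"", "Some more text"]
// Note: the quoted piece still includes its surrounding quotes.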
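With match, the quoted piece comes back with its surrounding quotes, which the question wanted to avoid. One possible way around that, sketched below, is to add a capture group around the quoted content and read it through matchAll. The extra group and the helper name piecesWithoutQuotes are additions for this sketch, not part of the answer:

```javascript
// Same alternation as above, plus a capture group around the quoted text.
const PIECE = /"([^"\\]*(?:\\.[^"\\]*)*)"|[^"\n]+/g;

// Hypothetical helper: returns group 1 for quoted pieces and the whole
// match for unquoted ones.
function piecesWithoutQuotes(text) {
  return Array.from(text.matchAll(PIECE), (m) =>
    m[1] !== undefined ? m[1] : m[0]
  );
}

const input = 'Some text\n\n"This is what\nI want to catch"\n\nSome more text';
console.log(piecesWithoutQuotes(input));
// [ 'Some text', 'This is what\nI want to catch', 'Some more text' ]
```

Checking m[1] against undefined, rather than for truthiness, means an empty pair of quotes yields an empty string instead of the literal two quote characters.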

You can simply split on the " character. You can then also strip the leading and trailing newline characters from each piece.

const s = `Some text

"This is what
I want to catch"

Some more text`;

console.log(s.split('"'));
// ["Some text\n\n", "This is what\nI want to catch", "\n\nSome more text"]
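The trimming step mentioned above could look roughly like this. It is a sketch only: the helper name splitOnQuotes is hypothetical, and the approach assumes the text contains no escaped quotes, since a plain split on " would cut at those too:

```javascript
// Hypothetical helper: split on double quotes, then strip leading and
// trailing line breaks from each piece and drop pieces that end up empty.
function splitOnQuotes(text) {
  return text
    .split('"')
    .map((part) => part.replace(/^\n+|\n+$/g, ""))
    .filter((part) => part.length > 0);
}

const text = 'Some text\n\n"This is what\nI want to catch"\n\nSome more text';
console.log(splitOnQuotes(text));
// [ 'Some text', 'This is what\nI want to catch', 'Some more text' ]
```

Only line breaks at the very start or end of a piece are removed, so the line break inside the quoted text is kept.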

