
Split string and keep new lines

I'm trying to split a string, ultimately into a 2D array, with a semicolon as the delimiter.

var str = `2;poisson
            poisson
           3; Fromage
           6;Monique`;

to

var arr = [[2, `poisson
               poisson`],
           [3, " Fromage"],
           [6, "Monique"]];

Each row of the array is in the format

[int, string that may start with white space and may end with possible new lines]

The first step would be via regex. However, using (\d+\;\s?)(.)+ doesn't capture values that continue past a new line (Regex101).

I'm a little confused as to how to proceed, as the newlines/carriage returns are important and I don't want to lose them. My RegEx Fu is weak today.

With JavaScript, you could use two capture groups:

\b(\d+);([^]+?)(?=\n\s*\d+;|$)

The pattern matches:

  • \b A word boundary
  • (\d+); Capture group 1: match 1+ digits, followed by a literal ;
  • ( Capture group 2
    • [^]+? Match any character, including newlines, 1+ times, as few times as possible
  • ) Close group
  • (?= Positive lookahead, assert that what is to the right is
    • \n\s*\d+;|$ Either a newline followed by optional whitespace chars and the first pattern, or the end of the string
  • ) Close lookahead
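
To illustrate why the attempt in the question stops at the line break while [^] does not, here is a small comparison. The sample string and variable names are assumptions made for this sketch. [^] is a JavaScript-specific idiom; [\s\S] or the s (dotAll) flag are common equivalents.

```javascript
// Hypothetical sample: one record whose value spans two lines.
const sample = "2;poisson\n    poisson";

// `.` does not match line terminators, so this match ends at the first line.
const dotMatch = sample.match(/\d+;.+/)[0];

// `[^]` (an empty negated class) matches any character, newlines included.
const anyMatch = sample.match(/\d+;[^]+/)[0];

// With the `s` (dotAll) flag, `.` also matches newlines.
const dotAllMatch = sample.match(/\d+;.+/s)[0];

console.log(JSON.stringify(dotMatch));    // "2;poisson"
console.log(JSON.stringify(anyMatch));    // "2;poisson\n    poisson"
console.log(JSON.stringify(dotAllMatch)); // "2;poisson\n    poisson"
```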

Regex demo

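// Sample input; the continuation line keeps its leading spaces.
const str = `2;poisson
    poisson
3; Fromage
6;Monique`;
const regex = /\b(\d+);([^]+?)(?=\n\s*\d+;|$)/g;
// m[1] is the number (as a string), m[2] is the text with its newlines intact.
console.log(Array.from(str.matchAll(regex), m => [m[1], m[2]]));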
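
The question asks for an integer in the first slot, whereas capture groups are always strings. A minimal sketch of that conversion follows; the helper name parseRecords and the sample input are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: returns [int, string] pairs from the raw text.
function parseRecords(text) {
  const regex = /\b(\d+);([^]+?)(?=\n\s*\d+;|$)/g;
  // Group 1 holds the digits, group 2 the value with its newlines intact.
  return Array.from(text.matchAll(regex), m => [Number(m[1]), m[2]]);
}

const input = "2;poisson\n    poisson\n3; Fromage\n6;Monique";
console.log(JSON.stringify(parseRecords(input)));
// [[2,"poisson\n    poisson"],[3," Fromage"],[6,"Monique"]]
```

Note that a continuation line that itself begins with digits followed by a semicolon would be treated as the start of a new record.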

Here is a short and sweet solution to get the result with two nested .split() calls:

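// Sample input; the continuation line starts with spaces, new records do not.
const str = `2;poisson
    poisson
3; Fromage
6;Monique`;
// Outer split: break at newlines not followed by a space. Inner split: break at ";".
let result = str.split(/\n(?! )/).map(line => line.split(/;/));
console.log(JSON.stringify(result));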

Output:

[["2","poisson\n    poisson"],["3"," Fromage"],["6","Monique"]]

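Explanation of the first split regex:

  • \n -- newline (possibly change to \r?\n to also support Windows newlines; [\r\n]+ would not work here, because it can backtrack to a lone \r and split in the middle of a \r\n that is followed by a space)
  • (?! ) -- negative lookahead for a space, so a newline followed by an indented continuation line is not a split point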
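
As a sketch of how this approach could handle Windows newlines and values that contain a semicolon themselves, the variant below uses \r?\n for the line break and cuts each record only at its first semicolon. The helper name splitRecords and the sample input are assumptions for illustration. It also assumes that every record contains a semicolon, that continuation lines start with a space, and that the input has no trailing newline.

```javascript
// Hypothetical helper (an assumption, not part of the original answer).
function splitRecords(text) {
  return text
    // Split at a line break (LF or CRLF) that is NOT followed by a space.
    .split(/\r?\n(?! )/)
    .map(record => {
      // Cut only at the first semicolon so later ones stay in the value.
      const i = record.indexOf(";");
      return [Number(record.slice(0, i)), record.slice(i + 1)];
    });
}

const input = "2;poisson\r\n    poisson\r\n3; Fromage;frais\r\n6;Monique";
console.log(JSON.stringify(splitRecords(input)));
// [[2,"poisson\r\n    poisson"],[3," Fromage;frais"],[6,"Monique"]]
```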
