Removing blank lines between only certain other matching lines
I am trying to remove blank lines between other lines that match a particular pattern. In my case, the pattern is simply that the line begins with a - character.
const orig = `
- line1

- line2

- line3

- line4

- line5
`.trim();
const actual = orig.replace(/((?:^|\n)-.*\n)\n(-)/g, '$1$2');
In the code above, I'm using a regex to match a - prefixed line, followed by an empty line, followed by another -.
I'm globally replacing the entire expression with the two capture groups, which omit the empty line between them. This sort of works like I expected it to, but it removes only every other empty line, and I don't know why.
Where I would have expected the code above to give me this:
- line1
- line2
- line3
- line4
- line5
...it actually gives me this:
- line1
- line2

- line3
- line4

- line5
Here is a fiddle that demonstrates the problem.
Question: What about the regex is causing this behavior?
Bonus: Is there a better way to do this (e.g. via split / reduce)? I would still like to know why the regex doesn't work, though.
The last - is part of the consuming pattern. Once (-) matches, the regex index is set after that -, so the next attempt cannot find a match there: the - in (?:^|\n)- cannot match a - that has already been consumed. You need to put it into a positive lookahead. Then, you need to use the m modifier to let ^ match at the start of each line, not just at the start of the string.
Use
/((?:^|\n)-.*\n)\n(?=-)/gm
See the regex demo. The replacement string is reduced to $1 since there is only one capturing group left.
Here is the fixed expression demo:
const orig = `
- line1

- line2

- line3

- line4

- line5
`.trim();
const actual = orig.replace(/((?:^|\n)-.*\n)\n(?=-)/gm, '$1');
document.getElementById('orig').innerText = orig;
document.getElementById('actual').innerText = actual;
ul { font-family: sans-serif; list-style: none; padding: 0; } li { display: inline-block; padding: 1em; vertical-align: top; }
<ul> <li><h3>Original</h3><pre id="orig"></pre></li> <li><h3>Expected</h3><pre>- line1<br />- line2<br />- line3<br />- line4<br />- line5</pre></li> <li><h3>Actual</h3><pre id="actual"></pre></li> </ul>
The reason for this behavior is that regex matches do not overlap. The pattern first consumes and matches:
- line 1

-
and replaces it with:
- line 1
-
It then continues traversing the string from the end of that match.
For this reason it does not match the next blank line, because
 line 2

- line 3
does not contain a match for your pattern. The next match for your pattern will be
<newline>
- line 3

-
which is replaced by:
<newline>
- line 3
-
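The walk-through above can be checked by listing where the original pattern actually matches. This is only a minimal sketch, assuming a runtime that supports String.prototype.matchAll (ES2020); the names sample, original and found are illustrative and not part of the question.

```javascript
// Rebuild the question's input: five "- lineN" entries separated by blank lines.
const sample = ['- line1', '- line2', '- line3', '- line4', '- line5'].join('\n\n');

// The original pattern from the question (no lookahead, no m flag).
const original = /((?:^|\n)-.*\n)\n(-)/g;

// Collect the start index and text of every non-overlapping match.
const found = [...sample.matchAll(original)].map(m => ({ index: m.index, text: m[0] }));

console.log(found.length);                  // 2
console.log(JSON.stringify(found[0].text)); // "- line1\n\n-"
console.log(JSON.stringify(found[1].text)); // "\n- line3\n\n-"
```

Only two of the four blank lines are matched. The blank lines after line2 and line4 are skipped because the - that would have to start their match was already consumed by the previous match.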
A way to solve this is to use lookaheads or lookbehinds, which allow conditional matching based on surrounding patterns without consuming those patterns.
We can modify your pattern slightly to use a lookahead that checks whether the next line adheres to the pattern:
const actual = orig.replace(/^(-.*\n)\n(?=-)/gm, '$1');
https://regex101.com/r/fPUkYh/4
I also changed ((?:^|\n)-.*\n)\n to ^(-.*\n)\n and added the m flag, because the start-of-line assertion ^ does not need to be inside the capturing group, and matching the \n alternative there pulls the preceding newline into the match.
This pattern could also be modified to match an arbitrary number of blank lines between lines matching the pattern:
/^(-.*\n)\n+(?=-)/gm
https://regex101.com/r/X7B7pi/2
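As a rough illustration of the \n+ variant, the sketch below runs it on a made-up input (messy, not from the question) that contains a run of two blank lines and an unrelated line. It also shows a lookbehind form of the same idea, which assumes an engine with ES2018 lookbehind support (recent Node.js and current browsers); both variable names are illustrative.

```javascript
// Hypothetical input: two blank lines between "- a" and "- b", plus a non-list line.
const messy = '- a\n\n\n- b\n\nnote\n\n- c';

// Lookahead variant from the answer: keep the "-" line, drop the blank run after it.
const tidy = messy.replace(/^(-.*\n)\n+(?=-)/gm, '$1');

// Lookbehind variant (assumes ES2018 lookbehind): match only the blank run itself.
const tidyLookbehind = messy.replace(/(?<=^-.*\n)\n+(?=-)/gm, '');

console.log(JSON.stringify(tidy));           // "- a\n- b\n\nnote\n\n- c"
console.log(tidy === tidyLookbehind);        // true
```

The blank lines around note are left alone because they are not both preceded and followed by a - line.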
This is easy enough when using the multi-line modifier //m:
( # (1 start), Stuff to write back
^ # BOL
- .*
\r? \n
) # (1 end)
\s* # Blank lines to remove
\r? \n
var orig_str = "- line1\n\n\n- line2\n\n- line3\n\n- line4\n\n- line5\n- line6";
var new_str = orig_str.replace(/(^-.*\r?\n)\s*\r?\n/mg, '$1');
console.log("Original\n--------\n" + orig_str + "\n");
console.log("New\n--------\n" + new_str);
Output
Original
--------
- line1


- line2

- line3

- line4

- line5
- line6

New
--------
- line1
- line2
- line3
- line4
- line5
- line6
If you only need to remove blank lines between - lines, just add an assertion at the end:
(^-.*\r?\n)\s*\r?\n(?=-)
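A small sketch of that last pattern on Windows-style line endings may help. The input string crlf is made up for illustration, and the expected result assumes JavaScript's default behavior where . matches neither \r nor \n.

```javascript
// Hypothetical CRLF input: a blank line between two "-" lines, then one before plain text.
const crlf = '- a\r\n\r\n- b\r\n\r\ntext';

// Pattern with the trailing lookahead: only blank lines followed by "-" are removed.
const cleaned = crlf.replace(/(^-.*\r?\n)\s*\r?\n(?=-)/mg, '$1');

console.log(JSON.stringify(cleaned)); // "- a\r\n- b\r\n\r\ntext"
```

The blank line before text survives because the lookahead (?=-) fails there.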
You can do it in the following way:
const orig = `
- line1

- line2

- line3

- line4

- line5
`.trim();
const actual = orig.replace(/(\-[^\n]*)([^-]*)(?=-)/g, '$1\n');
document.getElementById('orig').innerText = orig;
document.getElementById('actual').innerText = actual;
<ul> <li><h3>Original</h3><pre id="orig"></pre></li> <li><h3>Expected</h3><pre>- line1<br />- line2<br />- line3<br />- line4<br />- line5</pre></li> <li><h3>Actual</h3><pre id="actual"></pre></li> </ul>
See the regex demo.
Here is a shorter regex that includes the pattern you want to process:
const actual = orig.replace(/(-.*\n)\n/g, '$1');
This will give you what you need:
const actual = orig.replace(/\n\n|\r\r/g, "\n");
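For the bonus part of the question, here is a hedged sketch of a non-regex approach that uses split and filter (rather than reduce). The function name removeBlankLinesBetweenDashLines is hypothetical. It assumes \n line endings and drops a blank line only when the lines directly before and after it both start with -, so runs of several blank lines are left untouched.

```javascript
// Hypothetical helper: drop a blank line only when both neighbours start with "-".
function removeBlankLinesBetweenDashLines(text) {
  const lines = text.split('\n');
  return lines
    .filter((line, i) => {
      const isBlank = line === '';
      const prevIsDash = i > 0 && lines[i - 1].startsWith('-');
      const nextIsDash = i < lines.length - 1 && lines[i + 1].startsWith('-');
      // Keep every line except a blank one sandwiched between two "-" lines.
      return !(isBlank && prevIsDash && nextIsDash);
    })
    .join('\n');
}

const input = ['- line1', '- line2', '- line3', '- line4', '- line5'].join('\n\n');
console.log(removeBlankLinesBetweenDashLines(input));
```

Because each line is checked against the original array rather than a partially rewritten string, there is no consumed-character problem like the one in the original regex.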