Regex - match multiple lookahead conditions

I need to find a regex rule which finds newlines between bullet points and removes them. For instance:

• Here is some text

which flows to this separated by 2 newlines

• This is a new bullet point separated by 2 newlines

Should become:

• Here is some text which flows to this separated by 2 newlines

• This is a new bullet point separated by 2 newlines

Here's what I've tried:

•(.+)\K\n+(?(?=[^•])(?=.+\n+•))

Where my thinking is:

  1. Find a previous line which starts with •.
  2. Collect any characters up until one or more newlines and discard them. I'm now ready to match ahead and replace newlines based on some conditions.
  3. Look ahead and check that the next character after the newlines is not a bullet.
  4. If so, check that after all those characters, followed by one or more newlines, there is another bullet.

I think my problem is not properly understanding how to chain these conditions together in the positive lookahead, and I'm struggling to find any clear-cut answers or examples which deal with this kind of problem.

As ever, any help is greatly appreciated!

You could match 2 newlines and then assert that what is on the right does not start with a bullet but does contain a bullet after that.

^•.*\K\r?\n\r?\n(?=(?!•).*\r?\n\r?\n•)

In parts:

  • ^ Start of string (or of each line, if the multiline flag is enabled)
  • •.* Match a bullet and any char 0+ times except a newline
  • \K\r?\n\r?\n Forget what was matched and match 2 newlines
  • (?= Positive lookahead, assert what is on the right is
    • (?!•).* Negative lookahead, assert what is on the right is not a bullet, then match any char 0+ times except a newline
    • \r?\n\r?\n• Match 2 newlines followed by a bullet
  • ) Close positive lookahead
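
For anyone trying this outside a PCRE-style engine, here is a minimal Python sketch of the same idea. Python's built-in `re` module has no `\K`, so the sketch swaps in a capture group that the replacement writes back. The names `JOIN_BULLET` and `join_continuations` are made up for illustration, and replacing the newlines with a single space is an assumption based on the expected output in the question.

```python
import re

# Same pattern as above, except that Python's built-in `re` has no \K,
# so the bullet line is captured and written back by the replacement.
JOIN_BULLET = re.compile(
    r"^(•.*)"            # a line that starts with a bullet (captured)
    r"\r?\n\r?\n"        # the two newlines to remove
    r"(?="               # assert, without consuming:
    r"(?!•).*"           #   the next line does not start with a bullet
    r"\r?\n\r?\n•"       #   and is followed by two newlines and a bullet
    r")",
    re.MULTILINE,        # let ^ match at the start of every line
)


def join_continuations(text: str) -> str:
    # Assumption: the removed newlines are replaced by a single space.
    return JOIN_BULLET.sub(r"\1 ", text)


sample = (
    "• Here is some text\n\n"
    "which flows to this separated by 2 newlines\n\n"
    "• This is a new bullet point separated by 2 newlines"
)
print(join_continuations(sample))
```

Because the inner `(?!•)` sits inside the outer `(?=...)`, both conditions are checked from the same position without consuming anything, which is one way to chain lookahead conditions.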


If \R is supported to match a Unicode newline sequence, you could also use

^•.*\K\R{2}(?=(?!•).*\R{2}•)
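
Where `\R` is not available (Python's built-in `re` is one such engine), it can be approximated. The sketch below is only a rough stand-in: it covers `\r\n`, `\n` and a lone `\r`, not the other Unicode line breaks that `\R` matches in PCRE. All names are hypothetical, a capture group again stands in for `\K`, and the single-space replacement is an assumption.

```python
import re

# Approximation of \R for the three common line endings. The (?!\n) guard
# keeps a single "\r\n" from being counted as two separate breaks.
NL = r"(?:\r\n|\n|\r(?!\n))"
LINE = r"[^\r\n]*"       # rest of a line, never swallowing a stray \r
BOL = r"(?<![^\r\n])"    # start of text, or just after a line-break character

JOIN_BULLET_ANY_EOL = re.compile(
    BOL + r"(•" + LINE + r")"               # bullet line (captured)
    + NL + r"{2}"                           # the two line breaks to remove
    + r"(?=(?!•)" + LINE + NL + r"{2}•)"    # continuation line, then a bullet
)


def join_continuations_any_eol(text: str) -> str:
    # Assumption: the removed line breaks are replaced by a single space.
    return JOIN_BULLET_ANY_EOL.sub(r"\1 ", text)


crlf = "• Here is some text\r\n\r\nwhich flows to this\r\n\r\n• Next bullet"
print(repr(join_continuations_any_eol(crlf)))
```

The sketch uses `[^\r\n]*` rather than `.*` because `.` in Python matches `\r`, which would otherwise leave a carriage return in the joined line on Windows-style input.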
