What regex should I write to capture the indicated paragraph formats and skip the rest?

Question

I am trying to write a regex that captures either form of the following paragraphs, from 'DIAGNOSIS' up to (but not including) 'Board of pathologists', and ignores the rest. What is a good regex for this?

(The "" marks indicate the beginning and end of each paragraph and are not part of the wanted string.)

("THIS IS DIAGNOSIS..." and "diagnosis result" are sample texts for the sake of the question; the real data contains different content in their place.)

Paragraph format 1:

"

DIAGNOSIS:

A- THIS IS THE DIAGNOSIS, NO.1:

diagnosis results

B- THIS IS THE DIAGNOSIS, NO.2:

diagnosis result
another result

Board of pathologists: . . . . .

"

Paragraph format 2:

"

DIAGNOSIS:

THIS IS THE DIAGNOSIS:

diagnosis results

Board of pathologists:
. . . . . .

"

I used "DIAGNOSIS:(\\s*)((\\w*.\\s*)*)". I know that this captures almost everything after DIAGNOSIS, and my output shows that :) I couldn't find a better way to capture only the paragraphs.
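To illustrate the over-capture, here is a minimal Python sketch. The sample text is a hypothetical stand-in for the real data, and the pattern is the one above written as a Python raw string (so the backslashes are single); the original language is not stated, so Python is an assumption.

```python
import re

# Hypothetical sample text; only the DIAGNOSIS block is wanted.
REPORT = (
    "DIAGNOSIS:\n"
    "\n"
    "THIS IS THE DIAGNOSIS:\n"
    "\n"
    "diagnosis results\n"
    "\n"
    "Board of pathologists:\n"
    ". . . . . .\n"
)

# The pattern from the question, written as a Python raw string.
GREEDY = re.compile(r"DIAGNOSIS:(\s*)((\w*.\s*)*)")

match = GREEDY.search(REPORT)
captured = match.group(2)

# Each repetition of (\w*.\s*) consumes more text, and \s* also eats newlines,
# so nothing stops the match at the "Board of pathologists:" line.
print("Board of pathologists:" in captured)
print(match.end() == len(REPORT))
```

Both checks show that the match runs through the unwanted lines to the end of the text, because the pattern contains nothing that tells it where to stop.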

Answer 1

You could match ^DIAGNOSIS: from the start of the string.

Then you could repeatedly match the following lines that do not start with Board of pathologists: by using a negative lookahead: (?:(?!Board of pathologists:).*\r?\n)*

^DIAGNOSIS:\s*(?:\r?\n)(?:(?!Board of pathologists:).*\r?\n)*

Regex demo
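As a rough sketch of how the pattern above could be applied, here is a Python version. The sample report, the helper name extract_diagnosis, and the use of re.MULTILINE (so that ^ matches at the start of any line rather than only at the start of the whole text) are assumptions made for illustration, not part of the original answer.

```python
import re

# Hypothetical sample report modelled on "Paragraph format 1" from the question.
REPORT = (
    "Patient notes that should be ignored\n"
    "DIAGNOSIS:\n"
    "\n"
    "A- THIS IS THE DIAGNOSIS, NO.1:\n"
    "\n"
    "diagnosis results\n"
    "\n"
    "B- THIS IS THE DIAGNOSIS, NO.2:\n"
    "\n"
    "diagnosis result\n"
    "another result\n"
    "\n"
    "Board of pathologists: . . . . .\n"
    "Trailing text that should be ignored\n"
)

# Pattern from the answer; re.MULTILINE (an assumption) lets ^ match at any line start.
PATTERN = re.compile(
    r"^DIAGNOSIS:\s*(?:\r?\n)(?:(?!Board of pathologists:).*\r?\n)*",
    re.MULTILINE,
)


def extract_diagnosis(text):
    """Return the DIAGNOSIS block (heading included), or None if it is absent."""
    match = PATTERN.search(text)
    # strip() drops the blank lines left before the "Board of pathologists:" line.
    return match.group(0).strip() if match else None


block = extract_diagnosis(REPORT)
print(block)
```

The negative lookahead is checked at the start of every line, so the repetition stops as soon as a line begins with Board of pathologists:, and that line is left out of the match. The same pattern handles "Paragraph format 2", since it does not depend on how many lines sit between the two markers.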

Answer 1: 0, accepted, 2020-02-09 19:35:57