
Match Phrase before and after colon

I have the following string:

'FIELDS--> FIELD1: Random Sentence  \r\n FIELD2: \r\nSOURCEHINT--> FIELD3: 
 value.nested.value, FIELD4: 5.5.5.5, FIELD5: Longer Sentence, with more words-and punctation\r\n'

I want the following from the string above:

[FIELD1, Random Sentence]
[FIELD2, ]
[FIELD3, value.nested.value]
[FIELD4, 5.5.5.5]
[FIELD5, Longer Sentence, with more words-and punctation]

I still want the value if it is empty, and I want the full sentences. The number of fields may vary as well. This is similar to "Match word before and after colon", but in this case I want the full sentence instead of just the word. Additionally, the FIELD names can change, so they could be KEY3 instead of FIELD1.

I tried:

re.findall(r'(\w+) *:(?:(.*)?)', x)

It stops matching after the first match, so it only outputs FIELD1 and matches everything after it.

It seems you may use

r'(\w+) *: *(.*?)(?=\s*(?:\w+:|$))'

See the regex demo.

Details

  • (\w+) - Group 1: one or more word chars
  •  *: * - a colon enclosed with zero or more spaces on each side
  • (.*?) - Group 2: any chars other than line break chars, 0 or more repetitions, as few as possible, up to the first occurrence of
  • (?=\s*(?:\w+:|$)) - a positive lookahead that requires 0+ whitespaces followed with either 1+ word chars followed with : or an end of the string position.
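To try this out in Python, here is a minimal sketch. It assumes the \r\n sequences in the sample are real carriage return and newline characters and that the sample is a single string with no extra line break after "FIELD3: ". Under those assumptions, the pattern above appears to skip FIELD2 (the dot cannot cross the newline and SOURCEHINT--> is not followed by a colon) and to keep the trailing comma on FIELD3 and FIELD4. VARIANT_PATTERN and parse_fields are hypothetical names, and the variant is only one possible adjustment, not part of the original answer: it lets the lookahead skip an optional comma and also stop at a line break.

```python
import re

# Sample from the question, assuming \r\n are real CR/LF characters.
SAMPLE = (
    'FIELDS--> FIELD1: Random Sentence  \r\n FIELD2: \r\nSOURCEHINT--> FIELD3: '
    'value.nested.value, FIELD4: 5.5.5.5, FIELD5: Longer Sentence, '
    'with more words-and punctation\r\n'
)

# Pattern exactly as given in the answer.
ANSWER_PATTERN = re.compile(r'(\w+) *: *(.*?)(?=\s*(?:\w+:|$))')

# Hypothetical variant (an assumption, not from the answer):
#   ,?        lets the value end just before a separating comma
#   \s*[\r\n] lets the value end at a line break, so an empty value still matches
VARIANT_PATTERN = re.compile(
    r'(\w+) *: *(.*?)(?=,?\s*(?:\w+ *:|$)|\s*[\r\n])'
)

def parse_fields(text, pattern=VARIANT_PATTERN):
    """Return a list of [key, value] pairs found in text."""
    return [list(pair) for pair in pattern.findall(text)]

if __name__ == '__main__':
    for key, value in parse_fields(SAMPLE):
        print([key, value])
```

On the sample string the variant should return the five pairs listed in the question, including the empty value for FIELD2. Like the original pattern, it still treats any word followed by a colon inside a value as the start of a new key, so values that contain colons would need a different approach.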


 