
Regex non-greedy matching across newlines

I am trying to match the following:

str = "---title: Some Title\ndate: 01/01/2012---\n\nSome other stuff---\n\n"

And I would like to get:

"title: Some Title\ndate: 01/01/2012"

So, the regex I came up with was:

~r/---(.+)---(.+)/s

Unfortunately, it is being greedy and matching:

"title: Some Title\ndate: 01/01/2012---\n\nSome other stuff"

I also tried the non-greedy operator and that failed too:

~r/---(.+)---(.+)?.*/s

Any suggestions would be super helpful.

Thanks

Use the String#scan method, as shown below.

> str = "---title: Some Title\ndate: 01/01/2012---\n\nSome other stuff---\n\n"
> str.scan(/---([\s\S]+?)---/)[0][0]
=> "title: Some Title\ndate: 01/01/2012"

The output of scan above is a two-dimensional array because the pattern contains a capturing group. [\s\S]+? matches one or more whitespace or non-whitespace characters (that is, any character at all) non-greedily. Note that this pattern therefore also matches line breaks (\n, \r).
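As a quick check of the point above, the sketch below contrasts greedy and non-greedy matching on the question's string. It assumes Ruby, where the /m flag (rather than /s) makes . match newlines; the variable names are only illustrative.

```ruby
str = "---title: Some Title\ndate: 01/01/2012---\n\nSome other stuff---\n\n"

# scan returns one inner array per match, holding that match's captures.
matches = str.scan(/---([\s\S]+?)---/)
front_matter = matches[0][0]

# In Ruby, /m lets . match newlines, so .+? behaves like [\s\S]+? here.
lazy = str[/---(.+?)---/m, 1]

# Without the ?, .+ runs on to the last --- in the string.
greedy = str[/---(.+)---/m, 1]

p matches
p lazy
p greedy
```

The lazy and [\s\S]+? forms stop at the first closing ---, while the greedy form also swallows "Some other stuff", which is the behaviour described in the question.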

A more generic regex is:

(?:---)?(?<key>[a-z]+)\s*:\s*(?<value>(?!\\n).+?)(?:\\n|---|$)

It splits each match into a key and a value.
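The key/value idea can be sketched in Ruby as follows. This is an adaptation rather than the exact pattern above: it assumes the string contains real newline characters (so it uses \n where the pattern above has \\n), it swaps the trailing group for a lookahead, and the names pair and fields are made up for illustration.

```ruby
str = "---title: Some Title\ndate: 01/01/2012---\n\nSome other stuff---\n\n"

# key: lowercase letters; value: taken lazily up to a newline,
# a --- delimiter, or the end of the line (checked by the lookahead).
pair = /(?<key>[a-z]+)\s*:\s*(?<value>.+?)(?=\n|---|$)/

# scan yields one [key, value] array per match; to_h turns them into a Hash.
fields = str.scan(pair).to_h

p fields
```

"Some other stuff" is skipped because it contains no colon, so only the title and date pairs end up in the hash.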


---(?:(?!---).)*---

Try this. See demo.

https://regex101.com/r/fA6wE2/34
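For reference, a minimal Ruby sketch of this pattern, assuming a capture group is added around the inner part and the /m flag is set so that . also matches newlines (both are additions, not part of the pattern above):

```ruby
str = "---title: Some Title\ndate: 01/01/2012---\n\nSome other stuff---\n\n"

# (?!---). consumes one character only if a --- delimiter does not start there,
# so the repeated group stops at the first closing ---.
tempered = /---((?:(?!---).)*)---/m

inner = str[tempered, 1]
p inner
```

Unlike the lazy .+? form, the lookahead guarantees that the captured text can never contain a --- delimiter, even though the quantifier itself is greedy.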

The right way here is not to try to match the part you want to extract, but to match the part you want to throw away and use split.

str.split(/---\n*/)
#=> ["", "title: Some Title\ndate: 01/01/2012", "Some other stuff"]

str.split(/---\n*/)[1]
#=> "title: Some Title\ndate: 01/01/2012"

If you ultimately want the title and date strings, you may as well pull them out directly:

str.scan(/---title:\s+([^\n]+)\ndate:\s+(\d{2}\/\d{2}\/\d{4})/)
  #=> [["Some Title", "01/01/2012"]]

A Perl way to do it:

#!/usr/bin/perl
use Modern::Perl;

my $str = "---title: Some Title\ndate: 01/01/2012---\n\nSome other stuff---\n\n";
$str =~ s/---(.+?)---.*?$/$1/s;
say $str;

Output:

title: Some Title
date: 01/01/2012
