grep split match

Question

I need to parse a CHANGELOG in Keep a Changelog format with grep (or awk, etc., in shell/bash) and get the latest version (the first one after the [Unreleased] tag).

That means splitting the file on '\n## ', ignoring the first block ([Unreleased]), and taking the second (if it exists).

With Node.js, it's very easy and readable:

CHANGELOG.split(/\n## /)[2];

But I can't make it work with grep:

grep -zoP -m 1 "(\n##.*)(\n##.*)?(\n## )?" CHANGELOG.md

I can't get the regex to match a multiline group, even using (.|\n)+. I have been at this for a few days, trying again and again. Machine learning came up with ##(?:[^be]+[^#]*###)+[^#]*, but that looks too heavy for a simple "split into blocks on \n## ".

# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [1.0.0] - 2017-06-20
### Added
{...}

### Changed
{...}

### Removed
{...}

## [0.3.0] - 2015-12-03
{...}

I need to capture this block:

## [1.0.0] - 2017-06-20
### Added
{...}

### Changed
{...}

### Removed
{...}

UPDATE #1

I found one that works (see regex101.com) with (?=\n##.*?)(\n##.*?)(?=\n## |$); now I just need to print Match 2.

Any help? Thank you!

Answer 1

This perl one-liner does the job. It reads the file in "slurp" mode and prints the data you're looking for:

perl -0777 -ane '/## \[Unreleased]\R\R\K##[\s\S]+?(?=## \[\d)/ && print$&' logfile
## [1.0.0] - 2017-06-20
### Added
{...}

### Changed
{...}

### Removed
{...}

Explanation:

/                       # regex delimiter
    ## \[Unreleased]        # literally
    \R\R                    # 2 line breaks
    \K                      # forget everything matched up to this position
    ##[\s\S]+?              # 2 # followed by 1 or more of any character including newline, as few as possible
    (?=## \[\d)             # positive lookahead: make sure "## [digit" follows (the previous release)
/                       # regex delimiter

The quantifier is lazy (+?) so that the match stops at the first previous release rather than the last one in the file.

If this regex matches, print what was matched: print$&
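For comparison, the block-splitting idea from the question can also be sketched line by line in awk, without a multiline regex. This is a minimal sketch, not part of the original answer: it assumes [Unreleased] is always the first "## " heading in the file, and the function name latest_release_block and the sample file contents are made up for illustration.

```shell
#!/bin/sh
# Print the first release block after [Unreleased].
# Counts level-2 headings ("## ") and prints every line while the count is 2.
# "### ..." subheadings do not match because the third character is not a space.
latest_release_block() {
  awk '/^## / { n++ } n == 2' "$1"
}

# Small sample in the same layout as the changelog above (hypothetical content).
sample=$(mktemp)
cat > "$sample" << 'EOF'
# Changelog

## [Unreleased]

## [1.0.0] - 2017-06-20
### Added
- something

## [0.3.0] - 2015-12-03
- older
EOF

latest_release_block "$sample"
```

If the changelog has no release after [Unreleased], the counter never reaches 2 and nothing is printed.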

Answer 2

Thanks to @Toto, whose answer helped me get close to the solution.

Here's what I ended up with:

perl -0777 -ane '/## \[Unreleased][\s\S]+?\K(\n## [\s\S]+?)(?=\n## |$)/ && print$&' CHANGELOG.md
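The same regex idea can probably be carried back to grep, which is what the question originally asked for. The sketch below assumes GNU grep built with PCRE support (-P); the function name and sample file are hypothetical. Two details differ from the attempt in the question: -m 1 is dropped, because it limits the number of matching lines and with -z the whole file is a single "line", and \K is used to discard everything up to the wanted heading so that -o prints only the block.

```shell
#!/bin/sh
# Assumption: GNU grep with -P (PCRE) support.
# -z    treat the whole file as one NUL-terminated record, so .*? can span lines
# (?s)  let "." match newlines
# \K    drop the text matched so far ([Unreleased] and anything under it)
# \z    end of input, used when there is no older release below
latest_release_block_grep() {
  grep -zoP '(?s)## \[Unreleased\].*?\n\K## .*?(?=\n## |\z)' "$1" | tr -d '\0'
}

# Small sample in the same layout as the changelog above (hypothetical content).
sample=$(mktemp)
cat > "$sample" << 'EOF'
# Changelog

## [Unreleased]

## [1.0.0] - 2017-06-20
### Added
- something

## [0.3.0] - 2015-12-03
- older
EOF

latest_release_block_grep "$sample"
```

The trailing tr removes the NUL byte that grep -z appends after each match.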

Answer 3

Ed can do this.

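#!/bin/sh

# Work on a copy so the original changelog is untouched.
cp CHANGELOG.md stack

# First pass: go to the first line containing 2017 (the 1.0.0 heading in the
# sample) and write from there to the end of the file into new-changelog.txt.
# Use > rather than >> so a stale script from an earlier run is not appended to.
cat > extract.ed << EOF
/2017/
.,\$w new-changelog.txt
EOF

# Second pass: go to the first line containing 2015 (the 0.3.0 heading), step
# back one line, delete from there to the end, then save and quit.
cat > ex2.ed << EOF
/2015/
-1
.,\$d
wq
EOF

ed -s stack < extract.ed
ed -s new-changelog.txt < ex2.ed
rm -v ./extract.ed ./ex2.ed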
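The script above addresses the block by the years 2017 and 2015, which only fits the sample file. A variant that finds the boundaries from the headings themselves could look like the sketch below. It swaps ed for grep -n and sed, so it is a different technique rather than the answer's own method, and the function name and sample contents are hypothetical.

```shell
#!/bin/sh
# Print the first release block after [Unreleased], located by line number.
extract_latest_release() {
  file=$1
  # Line numbers of all "## [" headings except [Unreleased], one per line.
  headings=$(grep -n '^## \[' "$file" | grep -v '\[Unreleased\]' | cut -d: -f1)
  start=$(printf '%s\n' "$headings" | sed -n '1p')
  next=$(printf '%s\n' "$headings" | sed -n '2p')
  [ -n "$start" ] || return 1                  # no released version yet
  if [ -n "$next" ]; then
    sed -n "${start},$((next - 1))p" "$file"   # stop just before the next release
  else
    sed -n "${start},\$p" "$file"              # only one release: print to end of file
  fi
}

# Small sample in the same layout as the changelog above (hypothetical content).
sample=$(mktemp)
cat > "$sample" << 'EOF'
# Changelog

## [Unreleased]

## [1.0.0] - 2017-06-20
### Added
- something

## [0.3.0] - 2015-12-03
- older
EOF

extract_latest_release "$sample"
```

The function returns a non-zero status when the changelog contains no release heading besides [Unreleased], so a caller can tell "no release yet" apart from an empty block.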

3 answers

Answer 1
1 2020-07-30 08:59:25

Answer 2
1 accepted 2020-07-30 20:12:11

Answer 3
0 2020-07-29 09:27:20
