grep split and match

I need to parse a CHANGELOG in Keep a Changelog format with grep (or awk, etc., in shell/bash) and get the latest released version (the first one after the [Unreleased] tag).

In other words: split the file on '\n## ', skip the [Unreleased] block, and take the block that follows it (if it exists).

With Node.js it's easy and readable:

CHANGELOG.split(/\n## /)[2];

But I can't make it work with grep:

grep -zoP -m 1 "(\n##.*)(\n##.*)?(\n## )?" CHANGELOG.md

I can't get the regex to match a multiline group, even using (.|\n)+. After several days of trying again and again, the best that machine learning came up with was ##(?:[^be]+[^#]*###)+[^#]*, but that looks far too heavy for a simple "split into blocks on \n## ".

# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [1.0.0] - 2017-06-20
### Added
{...}

### Changed
{...}

### Removed
{...}

## [0.3.0] - 2015-12-03
{...}

I need to capture the block:

## [1.0.0] - 2017-06-20
### Added
{...}

### Changed
{...}

### Removed
{...}

UPDATE #1

I found a regex that works (see regex101.com): (?=\n##.*?)(\n##.*?)(?=\n## |$). Now I just need to print Match 2.

Any help? Thank you!

This perl one-liner does the job. It reads the file in "slurp" mode and prints the data you're looking for:

perl -0777 -ane '/## \[Unreleased]\R\R\K##[\s\S]+(?=## \[\d)/ && print$&' logfile
## [1.0.0] - 2017-06-20
### Added
{...}

### Changed
{...}

### Removed
{...}

Explanation:解释:

/                       # regex delimiter
    ## \[Unreleased]        # literally
    \R\R                    # two line breaks
    \K                      # forget all we have seen until this position
    ##[\s\S]+               # two # characters followed by one or more of any character, including newlines (greedy)
    (?=## \[\d)             # positive lookahead, make sure "## [digit" follows (the previous release)
/                       # regex delimiter

If the regex matches, print$& prints the matched text.

Thanks to @Toto's answer, which got me close to the solution.

Here's what I ended up with:

perl -0777 -ane '/## \[Unreleased][\s\S]+?\K(\n## [\s\S]+?)(?=\n## |$)/ && print$&' CHANGELOG.md

Ed can do this.

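#!/bin/sh

# Work on a copy so CHANGELOG.md itself is never modified.
cp CHANGELOG.md stack

# Script 1: go to the first line containing "2017" (the 1.0.0 heading in the
# sample) and write from there to the end of the file into new-changelog.txt.
# ">" rather than ">>" so that re-running does not append duplicate commands.
cat > extract.ed << EOF
/2017/
.,\$w new-changelog.txt
EOF

# Script 2: go to the first line containing "2015" (the 0.3.0 heading in the
# sample), step back one line, delete from there to the end, then save.
cat > ex2.ed << EOF
/2015/
-1
.,\$d
wq
EOF

ed -s stack < extract.ed
ed -s new-changelog.txt < ex2.ed
rm -v ./extract.ed ./ex2.ed ./stack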
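
The ed script relies on the literal years in the sample file. As an alternative, the "split on ## and take the block after [Unreleased]" idea from the question can be sketched in awk by counting heading lines. This is a minimal sketch, not part of the original answers: the function name latest_release_block is hypothetical, and it assumes that [Unreleased] is the first line starting with "## " and that release headings also start with "## " at the beginning of a line.

```shell
#!/bin/sh
# latest_release_block FILE  (hypothetical helper name)
# Prints the block that starts at the second line beginning with "## "
# and stops just before the third such line (or at end of file).
latest_release_block() {
    awk '
        /^## / { n++ }     # count level-2 headings; "### ..." does not match
        n == 3 { exit }    # the next release starts here: stop reading
        n == 2             # inside the second block: print the line
    ' "$1"
}
```

Calling latest_release_block CHANGELOG.md on the sample from the question should print the 1.0.0 block, including its trailing blank line. If the file has no release after [Unreleased], nothing is printed.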
