Formatting git log output with sed/awk/grep

Summary / 'gist of' version:

If I have a set of messages with subject [SUB] and body [BODY] like below, how can I add a newline after the subject only if [BODY] exists (and replace the placeholders with *)?

[SUB] some subject. [BODY] some body lines 
with newline chars and !@@# bunch of other *#@ chars
 without [(BODY)] or [(SUB)]... and more stuff
[SUB] Another subject. with no body [BODY] 
[SUB] another [BODY] some body.

I want this to be formatted like:

* some subject.

some body lines 
with newline chars and !@@# bunch of other *#@ chars
 without [(BODY)] or [(SUB)]... and more stuff
* Another subject. with no body 
* another 

some body.

What I really want to do:

So I am trying to auto-generate my CHANGELOG.md file from the git log output. The problem is, I need to put a newline char only if the body of the commit message is non-empty.

The current code looks like this (broken into two lines):

git log v0.1.0..v0.1.2 --no-merges --pretty=format:'* %s -- %cn | \
[%h](http://github.com/../../commit/%H) %n%b' | grep -v Minor | grep . >> CHANGELOG.md

and a sample output:

* Added run information display (0.1.2) -- ... | [f9b1f6c](http://github.com/../../commit/...) 
+ Added runs page to show a list of all the runs and run inforation, include sorting and global filtering.
+ Updated run information display panel on the run-info page
+ Changed the links and their names around.

* Update README.md -- abc | [2a90998](http://github.com/../../commit/...) 

* Update README.md -- xt | [00369bd](http://github.com/../../commit/...) 

As you can see, the lines starting with * are the commits, and the lines starting with + are just part of the body of the first commit. Right now it adds a %n (newline) in front of every body section regardless of whether it's empty or not. I want to add it ONLY if the body is non-empty (probably even after stripping whitespace).

How would I achieve this? My knowledge of sed and awk is almost non-existent, and trying to learn didn't help much.

(I can make sure all the code in the body is indented, so it won't confuse the list of commits with lists in the body.)


My Answer

I'm sure jthill's answer is correct (and maybe even a better way to do it), but while I was trying to figure out what he meant, I came up with this. Hope it will help me or someone else in the future.

I am pasting the full shell script that I used:

# $1 = new version tag, $2 = previous tag
mv CHANGELOG.md CHANGELOG.md.temp
printf '### Version %s \n\n' "$1" > CHANGELOG.md
git log "$2..$1" --no-merges --pretty=format:'[SUB]%s -- %cn | \
    [%h](http://github.com/<user>/<gitrepo>/commit/%H) [BODY]%b' | grep -v Minor | \
    sed '{:q;N;s/\s*\[BODY\][[:space:]]*\[SUB\]/\n\[SUB\]/;b q}' | \
    sed 's/\[SUB\]/* /g' | \
    sed 's/\[BODY\]/\n\n/' >> CHANGELOG.md
cat CHANGELOG.md.temp >> CHANGELOG.md
rm CHANGELOG.md.temp

I am basically prepending the new commit log to CHANGELOG.md using the temp file. Please feel free to suggest shorter versions of these 3 sed commands.
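One possible shorter version, offered only as a sketch: a single awk pass can stand in for the three sed stages. It reads the whole stream, splits it on the [SUB] marker, and prints a blank line before a body only when the body is non-empty after trimming. The name format_changelog is hypothetical, and the sketch assumes the literal strings [SUB] and [BODY] never occur inside a commit message.

```shell
# format_changelog: hypothetical helper; reads "[SUB]...[BODY]..." records on stdin.
format_changelog() {
    awk '
    { text = text $0 "\n" }                    # slurp the whole input into one string
    END {
        n = split(text, recs, /\[SUB\]/)       # recs[1] is whatever precedes the first [SUB]
        for (i = 2; i <= n; i++) {
            p = index(recs[i], "[BODY]")
            if (p == 0) { subj = recs[i]; body = "" }
            else { subj = substr(recs[i], 1, p - 1); body = substr(recs[i], p + 6) }
            gsub(/^[ \t\n]+|[ \t\n]+$/, "", subj)   # trim both ends of the subject
            gsub(/^[ \t\n]+|[ \t\n]+$/, "", body)   # trim both ends of the body
            print "* " subj
            if (body != "") print "\n" body         # blank line only before a real body
        }
    }'
}

printf '[SUB] a [BODY] b\nc\n[SUB] d [BODY] \n[SUB] e [BODY] f\n' | format_changelog
```

In the script above it would take the place of the three sed commands, for example: git log "$2..$1" --no-merges --pretty=format:'[SUB]%s [BODY]%b' | format_changelog >> CHANGELOG.md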

Tag your syntax in the git log output. This will handle inserting the newlines properly; the rest you know:

git log --pretty=tformat:'%s%xFF%x01%b%xFF%x02' \
| sed '1h;1!H;$!d;g              # buffer it all (see comments for details)
       s/\xFF\x01\xff\x02//g     # strip null bodies
       s/\xFF\x01/\n/g           # insert extra newline before the rest
       s/\xFF.//g                # cleanup
'

(edit: quote/escape typos)
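The 1h;1!H;$!d;g prefix in the command above is what lets the later substitutions match across line boundaries: it collects every input line in the hold space and only starts editing once the last line has been read. A minimal sketch of that idiom on toy input, with a hypothetical helper name (slurp_join) and a deliberately trivial substitution:

```shell
# slurp_join: hypothetical helper, only here to illustrate the buffering idiom.
slurp_join() {
    # 1h   first line: copy it into the hold space
    # 1!H  every later line: append it to the hold space (newline-separated)
    # $!d  not the last line yet: discard the pattern space and read on
    # g    last line: copy the whole buffer back into the pattern space
    # The s command then sees all lines at once and can match the embedded \n.
    sed '1h;1!H;$!d;g;s/\n/,/g'
}

printf 'a\nb\nc\n' | slurp_join   # prints: a,b,c
```

Because the whole log is held in memory, this suits changelog-sized input rather than very large streams.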

For the first file in your question, you could try the following:

awk -f r.awk input.txt

where input.txt is the input file, and r.awk is:

{
    line=line $0 ORS
}

END {
    while (getSub()) {
        getBody()
        print "* " subj
        if (body) {
            print ""
            print body
        }
    }
}

function getBody(ind) {
    ind=index(line,"[SUB]")
    if (ind) {
        body=substr(line,1,ind-1)
        line=substr(line,ind)
    }
    else
        body=line
    sub(/^[[:space:]]*/,"",body)
    sub(/[[:space:]]*$/,"",body)
}

function getSub(ind,ind2) {
    ind=index(line,"[SUB]")
    if (ind) {
        ind=ind+5
        ind2=index(line,"[BODY]")
        subj=substr(line, ind, ind2-ind)
        line=substr(line,ind2+6)
        return 1
    }
    else
        return 0
}

gives output:

*  some subject. 

some body lines 
with newline chars and !@@# bunch of other *#@ chars
 without [(BODY)] or [(SUB)]... and more stuff
*  Another subject. with no body 
*  another 

some body.

I wrestled with this way longer than expected, simply trying to get git log output with some sed tweaking of the commit message to format/extract our JIRA messages. Here is my solution:

logsheet = "!f() { git log --format='%h ^ %<(80,trunc)%s ^ A:%<(20,trunc)%an ^ D:%ad ' --no-merges --date=short $1 | sed -e 's/\\\\([A-Z]*-[0-9]*\\\\)/\\\\1 ^/'; }; f"

The escapes and the shell function with a ! were all needed because I had an arg as well as a pipe. :-)
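To see what the sed step does on its own, without the alias escaping, here is a small sketch run against a hand-written line. It assumes JIRA keys look like PROJ-42 (uppercase letters, a hyphen, digits) and uses a slightly stricter pattern than the alias, requiring at least one letter and one digit so that the hyphens in a date such as 2020-01-01 are left alone. The function name mark_jira_key and the sample line are made up for illustration.

```shell
# mark_jira_key: hypothetical helper; appends " ^" after the first JIRA-style key on each line.
mark_jira_key() {
    sed -e 's/\([A-Z][A-Z]*-[0-9][0-9]*\)/\1 ^/'
}

printf 'abc1234 ^ PROJ-42 fix login ^ A:Jane ^ D:2020-01-01\n' | mark_jira_key
# prints: abc1234 ^ PROJ-42 ^ fix login ^ A:Jane ^ D:2020-01-01
```

Lines without a key pass through unchanged, since nothing in them matches letters followed by a hyphen and digits.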
