
Regex match anything between multiple times from git log

I want to split the git log output into parts, so that I can access each commit with its hash and message separately.

This is the git log command:

git log --pretty=short --abbrev-commit -n 2 HEAD

Here is an example log:

commit bfb9bac
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Something awesome happened here

commit a4fad44
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Ooh, more awesomeness
    So many lines

What I have tried so far:

([a-f0-9]{7})\n(?:Author.+\n\n)([\s\S]+)(?=\ncommit)

Here is a link to RegExr: https://regexr.com/4d523

In the end it should look like this:

const result = commits.match(regex)

result[0][0] // bfb9bac
result[0][1] // Something awesome happened here

result[1][0] // a4fad44
result[1][1] // Ooh, more awesomeness\n    So many lines

It would also be okay to do this in two steps: first splitting the commits, and then splitting hash and message.

You can omit the use of [\s\S] by matching the rest of the line with .* and then repeating a pattern that matches a newline and asserts that the next line does not start with commit:

^commit ([a-f0-9]{7})\nAuthor.*\n+[ \t]+(.*(?:\n(?!commit).*)*)

Explanation

  • ^ Start of a line (the pattern is used with the m flag)
  • commit Match commit followed by a space
  • ([a-f0-9]{7}) Capture in group 1 exactly 7 characters from the character class
  • \nAuthor.* Match a newline, then Author and 0+ times any char except a newline
  • \n+[ \t]+ Match 1+ newlines followed by 1+ spaces or tabs
  • ( Capturing group 2
    • .* Match 0+ times any char except a newline
    • (?:\n(?!commit).*)* Repeat 0+ times: match a newline, assert that what is on the right is not commit, then match 0+ times any char except a newline
  • ) Close capturing group

Regex demo

const regex = /^commit ([a-f0-9]{7})\nAuthor.*\n+[ \t]+(.*(?:\n(?!commit).*)*)/gm;
const str = `commit bfb9bac
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Something awesome happened here

commit a4fad44
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Ooh, more awesomeness
    So many lines
`;
let m;

while ((m = regex.exec(str)) !== null) {
  // Guard against an infinite loop on a zero-length match
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  console.log("hash: " + m[1]);
  console.log("message: " + m[2]);
}
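To get the nested shape asked for in the question (result[0][0] for the hash, result[0][1] for the message), the two capture groups can be collected with String.prototype.matchAll. This is only a minimal sketch: the variable names commits and result are taken from the question, and the trimEnd() call is an added assumption to drop the trailing newline that group 2 picks up from the blank line after each message.

```javascript
const commits = `commit bfb9bac
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Something awesome happened here

commit a4fad44
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Ooh, more awesomeness
    So many lines
`;

const regex = /^commit ([a-f0-9]{7})\nAuthor.*\n+[ \t]+(.*(?:\n(?!commit).*)*)/gm;

// matchAll requires the g flag and yields one match object per commit.
// Each match is mapped to a [hash, message] pair.
const result = Array.from(commits.matchAll(regex), (m) => [m[1], m[2].trimEnd()]);

console.log(result[0][0]); // bfb9bac
console.log(result[0][1]); // Something awesome happened here
console.log(result[1][0]); // a4fad44
console.log(JSON.stringify(result[1][1])); // "Ooh, more awesomeness\n    So many lines"
```

Note that the indentation of the second message line is kept, because the pattern only skips the leading whitespace of the first line.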

You can use this regex to match each commit entry, capturing the SHA-1 in group 1 and the message in group 2:

^commit\s+(\S+)\n^Author:[\w\W]+?^\s+((?:(?!commit)[\w\W])+)

Regex explanation:

  • ^commit - Matches commit at the beginning of a line
  • \s+(\S+)\n - Matches one or more whitespace characters followed by the SHA-1 value, which is captured in group 1 by (\S+), followed by a newline \n
  • ^Author:[\w\W]+? - Matches Author at the start of a line, followed by a colon and then any character (including newlines) one or more times, as few times as possible
  • ^\s+ - Matches one or more whitespace characters from the beginning of a line; this is the point from which the next part of the regex starts capturing the message
  • ((?:(?!commit)[\w\W])+) - This expression (a tempered greedy token) captures any character, including newlines, using [\w\W], but stops as soon as it sees commit, and places the whole match in group 2
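One thing to watch for with the tempered token as written: the lookahead (?!commit) is checked at every position, so it also stops at the word commit inside a message. The sketch below illustrates this with a made-up message ("Fix commit parsing") and compares it against a hypothetical variant, (?!^commit\s), that only stops at a line beginning with commit. The variant, the helper name messages, and the sample message are assumptions for illustration and are not part of the original answer.

```javascript
const log = `commit bfb9bac
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Fix commit parsing

commit a4fad44
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Ooh, more awesomeness
`;

// Pattern from the answer: the lookahead rejects "commit" at any position.
const unanchored = /^commit\s+(\S+)\n^Author:[\w\W]+?^\s+((?:(?!commit)[\w\W])+)/mg;

// Hypothetical variant: only reject "commit" followed by whitespace at a line start.
const anchored = /^commit\s+(\S+)\n^Author:[\w\W]+?^\s+((?:(?!^commit\s)[\w\W])+)/mg;

// Collect the trimmed group 2 (message) of every match.
function messages(regex, text) {
  return [...text.matchAll(regex)].map((m) => m[2].trim());
}

const unanchoredMessages = messages(unanchored, log);
const anchoredMessages = messages(anchored, log);

console.log(unanchoredMessages); // first message is cut off before "commit"
console.log(anchoredMessages);   // first message is captured in full
```

With the sample log from the question neither message contains the word commit, so both patterns give the same result there.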

Regex Demo

Here is a JS code demo:

const str = `commit bfb9bac
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Something awesome happened here

commit a4fad44
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Ooh, more awesomeness
    So many lines`;

// m makes ^ match at every line start; g lets exec() step through all commits
const reg = /^commit\s+(\S+)\n^Author:[\w\W]+?^\s+((?:(?!commit)[\w\W])+)/mg;
let m;

while ((m = reg.exec(str)) !== null) {
  console.log("SHA1: " + m[1] + ", Message: " + m[2]);
}
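The two-step approach mentioned in the question (first split the commits, then split hash and message) also works without a single large pattern. The following is a rough sketch under a few assumptions: the log has the --pretty=short layout shown in the question, message lines are the only indented lines, and the names entries and parsed are made up for illustration.

```javascript
const log = `commit bfb9bac
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Something awesome happened here

commit a4fad44
Author: XXXXX XXXXXXXX <xxx.xxxxx@xxxxx.xxx>

    Ooh, more awesomeness
    So many lines
`;

// Step 1: split in front of every line that starts with "commit ".
// The lookahead keeps the "commit ..." line as part of each entry.
const entries = log.split(/^(?=commit )/m).filter((entry) => entry.trim() !== "");

// Step 2: take the hash from the first line and the message from the indented lines.
const parsed = entries.map((entry) => {
  const [header, ...rest] = entry.split("\n");
  const hash = header.replace(/^commit\s+/, "");
  const message = rest
    .filter((line) => /^[ \t]+\S/.test(line)) // keep only indented, non-empty lines
    .map((line) => line.trim())
    .join("\n");
  return { hash, message };
});

console.log(parsed[0].hash);    // bfb9bac
console.log(parsed[0].message); // Something awesome happened here
console.log(parsed[1].hash);    // a4fad44
console.log(JSON.stringify(parsed[1].message)); // "Ooh, more awesomeness\nSo many lines"
```

Unlike the single-regex versions, this strips the indentation from every message line, which may or may not be what you want.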
