How can I format the code in a multi-branch project?

So we have this git repository with hundreds of thousands of lines of code, and since I joined the project 2 years ago, the formatting has bugged me. It not only bugs me: as devs randomly "fix" the formatting, merges become a headache whenever the code formatting was applied on one side only. Reformatting the code is a two-minute task, but it results in merge conflict hell, too. I recently merged master into a long-living feature branch and tried:

  • format code in master, merge to feature branch: the 3-way merge tool meld gives me exactly the mess I mentioned above. It doesn't detect function boundaries. Really no fun to merge.
  • format code in master, format code in feature branch, merge master: now I still get 30 files with conflicts, but they are much easier to sort out.

Now I wonder if it's worth merging, as there are another 15 branches that will all need the exact same code reviews. As manual merging is error-prone, I wonder if there is some way of doing this without getting these merge conflicts.

Edit, Jun 2022

I'm just boosting the signal from Rufus' comment below:

https://github.com/emilio/clang-format-merge contains code that provides a merge driver, rather than clean and smudge filters. It looks likely to be useful though, especially for repositories that have never had standard formatting enforced.

Recipe with assumptions

(note: I have not tested any of this)

We'll assume the reformatter is in ~/Downloads/android-studio/bin/format.sh and [note: apparently this is a bad assumption!] that it reads stdin and writes stdout, and works on one file at a time. (It's possible, but very difficult, to make this work with something that needs more than one file at a time. You cannot use this recipe for this case, though. Git's basic filtering mechanism requires that each filter simply read stdin and write stdout. By default Git assumes the filter works, even if it exits with a failure status.)
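If the formatter can only rewrite a file in place, one possible workaround is a small adapter that turns it into a stdin-to-stdout filter. The following is a minimal sketch, not something from the original answer: `format_filter` and `FORMAT_CMD` are hypothetical names, and `FORMAT_CMD` is assumed to be a command that takes a single file path and edits that file. On failure the sketch emits the input unchanged, so a broken formatter does not silently empty the file.

```shell
#!/bin/sh
# Sketch of a stdin-to-stdout adapter for a formatter that can only
# rewrite a file in place. FORMAT_CMD is a placeholder (an assumption,
# not a real tool): set it to a command that takes one file path.
format_filter() {
    tmp=$(mktemp) || return 1
    bak=$(mktemp) || { rm -f "$tmp"; return 1; }
    cat > "$tmp"            # capture what Git sends on stdin
    cp "$tmp" "$bak"        # keep a pristine copy for the failure case
    # FORMAT_CMD is deliberately unquoted so it may carry arguments.
    if $FORMAT_CMD "$tmp" >/dev/null 2>&1; then
        cat "$tmp"          # formatter succeeded: emit formatted text
    else
        cat "$bak"          # formatter failed: emit the input unchanged
    fi
    rm -f "$tmp" "$bak"
}
```

Saved as a script that ends with a call to `format_filter`, this could serve as the command named in the filter definition. Note that some formatters choose a language from the file extension, which a bare temporary file does not have; that case would need a suffixed temporary name.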

Choose where to run the filter as well; here I've set it up as the "clean" filter only.

In ~/.gitconfig or .git/config, add the definition for the filter:

[filter "my-xyz-language-formatter"]
    clean = ~/Downloads/android-studio/bin/format.sh
    smudge = cat

(this assumes that running cat runs a filter that writes, to its stdout, its unchanged input; this is true on any Unix-like system).

Then, create a .gitattributes file if needed. It will apply to the directory you create it in, and all sub-directories, unless overridden in those sub-directories, so place it in the highest sensible location, usually the root of the repository, but sometimes underneath a source/ or src/ or whatever directory. Add line(s) to direct file(s) matching some pattern(s) through your formatter. We'll assume here that all files named *.xyz should be formatted:

*.xyz   filter=my-xyz-language-formatter
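To check that the attribute line matches the files you expect, `git check-attr` can report which filter Git would apply to a given path. The sketch below uses a throwaway repository; the paths `src/foo.xyz` and `README.txt` are made up for illustration.

```shell
#!/bin/sh
# Sketch: confirm which filter Git would apply to a path.
# The repository location and file names are made up for illustration.
repo=$(mktemp -d)
git init -q "$repo"
printf '*.xyz   filter=my-xyz-language-formatter\n' > "$repo/.gitattributes"

# check-attr reports the attribute value for each path; the paths do
# not need to exist yet.
attr_report=$(git -C "$repo" check-attr filter -- src/foo.xyz README.txt)
printf '%s\n' "$attr_report"
```

For a matching path the report should name the filter; for any other path it should say the attribute is unspecified.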

This filter will now apply to all extractions and insertions of *.xyz files. The gitattributes documentation talks about these being applied at check-out and check-in time, but that's not quite correct. Instead, a clean filter is applied whenever Git copies from work-tree to index (essentially, git add, which happens well before git commit unless you use git commit -a or similar flags). A smudge filter is applied whenever Git copies from index to work-tree (essentially, git checkout, but also some additional cases, such as git reset --hard).

Note that spinning up one filter for each file can be quite slow. There's a "long running filter process" protocol you can use if you have a lot of control over the filter, which can speed this up (especially on Windows). That's beyond the scope of this answer, though.

Running git merge normally does not use the filters (it works on the copies that are already in the index, which is outside the filtering step). However, adding -X renormalize to a standard merge will make git merge do the "virtual check-in and check-out" described below, so that it will apply the filters. This happens for all three commits involved in the merge (and in both directions, clean and smudge, so it's roughly 6x slower than for just one commit).

Description (see below)

Git itself is only partially helpful here.

Fundamentally, the problem is that Git is stupid and line-oriented: it runs git diff from the merge base commit to each tip commit. If one or both of these git diffs sees a lot of formatting changes, it considers those significant and worthy of applying to the base. It has no semantic knowledge of the input code.

(Since you can take over the entire merge process, you could write a smarter merge that does use semantic analysis. This is pretty difficult, though. The only system I know of that does this, or something approaching this, is Ira Baxter's commercial software, and I've never actually used that; I just understand the theory behind it.)

There is a solution that does not depend on making Git smarter. If you have a semantic analyzer that outputs consistently formatted code, regardless of the input form, you can feed all three versions (B for base, L for left or local or --ours, and R for right or remote or other or --theirs) into this formatter:

reformat < B > B.formatted
reformat < L > L.formatted
reformat < R > R.formatted

Now you can have Git merge all three formatted versions, rather than merging the original possibly-not-yet-formatted (but maybe formatted) versions.

The result of this merge will, of course, be re-formatted. But presumably this is what you'd like anyway.
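The idea of merging the three formatted versions can be tried by hand on a single file with git merge-file, which performs a three-way merge on plain files. This is only a sketch under stated assumptions: `merge_formatted` is a hypothetical helper name, and `reformat` is a toy stand-in that squeezes runs of spaces so the example can run without a real formatter.

```shell
#!/bin/sh
# Sketch: three-way merge of one file after normalizing all three
# versions. "reformat" stands for whatever stdin-to-stdout formatter
# you have; this toy stand-in (an assumption) squeezes runs of spaces.
reformat() { tr -s ' '; }

merge_formatted() {   # usage: merge_formatted BASE LOCAL REMOTE > result
    dir=$(mktemp -d) || return 1
    reformat < "$1" > "$dir/B.formatted"
    reformat < "$2" > "$dir/L.formatted"
    reformat < "$3" > "$dir/R.formatted"
    # git merge-file takes <current> <base> <other>; -p prints the
    # result instead of overwriting <current>. Its exit status is the
    # number of conflicts (0 means a clean merge).
    git merge-file -p "$dir/L.formatted" "$dir/B.formatted" "$dir/R.formatted"
    rc=$?
    rm -rf "$dir"
    return "$rc"
}
```

If L differs from B only in formatting while R carries a real change, the normalized L and B become identical and the merge should come out clean, containing just R's change in formatted form.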

The way to achieve this with Git's built-in tools is to use what it calls smudge and clean filters. A smudge filter is applied to files as they are extracted from the repository into the work-tree. A clean filter is applied to files whenever they go from the work-tree into the repository.

In this case, the smudge filter can be "do nothing to the data", preserving exactly what was committed. The clean filter can be the reformatter. Or, if you prefer, the smudge filter can be the reformatter, and the clean filter can be the reformatter again, or a no-op filter. You set this up in two places: in .gitattributes, by defining a filter for particular files by path name, and in .git/config or your main (user or system wide) .gitconfig, by defining the filter driver.

Once you have all that set up, you can run git merge -X renormalize. Git will extract the B, L, and R versions as usual, but then run them through a "virtual check-out and check-in" step, making three temporary commits [1], B.formatted and so on. It then does the merge using the three temporary commits, rather than from the original three commits.

The hard part is finding a reformatter that does just what you want / need. Some modern systems have them, e.g., gofmt or clang-format. If there's one that does what you need, it just becomes a matter of plugging all this together, and getting buy-in from the rest of your group that this reformatting is a good idea.


[1] Technically it just makes tree objects; there's no need for actual commits.

While torek probably got me on a good track, it did not help me to get the reformatting done across branches. The problem was that the filter was applied after git had added these

<<<< HEAD
bla foo 123
====
bla 123
>>>> otherBranch

blocks, so the filter would indent the conflict markers ... which is not good.
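One possible direction for the marker problem, untested against a real formatter, is to have the clean filter pass a file through untouched while it still contains conflict markers. In the sketch below, `guarded_clean` is a hypothetical name and `reformat` is a toy stand-in for the formatter. It assumes Git's default markers, which are seven characters long (the blocks above show them abbreviated).

```shell
#!/bin/sh
# Sketch of a guard for a clean filter: if the input still contains
# conflict markers, pass it through untouched instead of formatting it.
# "reformat" is a stand-in (assumption) for the real formatter; the toy
# version here squeezes runs of spaces.
reformat() { tr -s ' '; }

guarded_clean() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    # Git's default markers are 7 characters long at the start of a line.
    if grep -Eq '^(<{7} |={7}$|>{7} )' "$tmp"; then
        cat "$tmp"            # unresolved conflict: leave as-is
    else
        reformat < "$tmp"     # no markers: safe to format
    fi
    rm -f "$tmp"
}
```

The trade-off is that a source file which legitimately contains a line of seven equals signs would also skip formatting.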

While this probably has some solution, I went with a custom merge tool:

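#!/bin/bash
# Custom merge tool: normalize all versions with astyle, then open meld.

BASE=$1
LOCAL=$2
REMOTE=$3
MERGED=$4

# Only Java files are normalized; match on the file name suffix so that
# names merely containing ".java" (e.g. ".javascript") are skipped.
case "$BASE" in
    *.java)
        echo "Normalizing java file"
        # Quote the paths so names containing spaces survive.
        astyle "$BASE"
        astyle "$LOCAL"
        astyle "$REMOTE"
        astyle "$MERGED"
        ;;
esac

meld "$LOCAL" "$BASE" "$REMOTE" --output "$MERGED"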

configured in .gitconfig as:

[merge]
    tool = customMergeTool
[mergetool "customMergeTool"]
    cmd = /path/to/customMergeTool.sh \"$BASE\" \"$LOCAL\" \"$REMOTE\" \"$MERGED\"

With my approach, git would still detect conflicts, but when handled with my script, 40 of my 100 cases turned out to be free of merge conflicts, so torek's approach could probably speed things up there. However, I ran into serious issues merging the other 40 files, so I gave it up for now.
