
Merge two branches, splitting a directory

In one of our git repositories, we have two branches, each of which has worked on certain directories to the point where they're widely divergent [1]. We now want to merge the two branches, keeping both versions.

I've played around with renaming one directory such that they don't overlap on disk, but when I merge the branches git knows they both originally came from the same source, and helpfully "moves" files from one to the other, with concomitant merge conflicts.

I've also tried using git merge -s ours branchname followed by git checkout branchname -- directory/, but that appears to destroy the history of the "theirs" branch, making it look like the files suddenly appeared. Ideally, I'd like to preserve the ability to make modifications to the files in the pre-merge branches, with later merges being able to find the correct version of each file.

Is there a way to tell git to merge two branches but keep certain files/directories as "separate", despite their shared origin? In other words, is there a way to split the history of a file such that git knows it moved in one branch but not in the other?


[1] These are documentation/test directories, so standard concerns about code duplication are minimal to non-existent.

There is bad news, and good news, here.

Git doesn't care about the history of path names in commits (to some extent anyway: there are secret care-y bits in the way pack files work, for instance, and there's what I am about to mention as well). Each commit records only content: the bits inside the files, and the names under which it should put those contents. Aside from parent IDs, each commit is completely independent of any commits before or (eventually) after it. As such, "files" don't have any history at all.

Obviously, though, files do have history, because if you diff two commits (which is what git show does when showing a commit), you see a patch from "previous version of foo" to "new version of foo", and you can do things like "git blame foo" to look at the history.

Git reconciles these two opposites by constructing a history every time you ask for one, using the contents. If you run git show, or git log -p, to see what changed, git reconstructs a history right then and there, based on the contents.

In terms of finding files that were moved / renamed, git uses one or more of several tricks, depending on how you direct it. You can tell git diff (including most commands that get diffs, which includes merge operations) not to check at all. This is the fastest method.

You can tell it to use a mostly-fast (but still O(n^2)) algorithm that looks only at path names that appear in just one of the two commits that the diff is comparing. This is the default method for merge (and you can make it your default method for diff by setting diff.renames, or you can request it with the -M option; diff.renameLimit only caps how many candidate files the detection will consider).
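As a minimal sketch of the difference between skipping detection and using the mostly-fast method, the following session builds a throwaway repository and diffs one renaming commit both ways. The paths docs/guide.txt and docs/manual.txt, and the user identity, are made up for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository; every path and identity here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a 50-line file.
mkdir docs
i=1
while [ "$i" -le 50 ]; do
  echo "line $i"
  i=$((i + 1))
done > docs/guide.txt
git add docs
git commit -q -m "add guide"

# Rename it. The commit stores only the new snapshot, not the fact of a rename.
git mv docs/guide.txt docs/manual.txt
git commit -q -m "rename guide to manual"

# Detection off: the change is reported as one deletion plus one addition.
plain=$(git diff --no-renames --name-status HEAD~1 HEAD)

# Detection on (-M): the two paths are paired up as a rename.
detected=$(git diff -M --name-status HEAD~1 HEAD)

printf '%s\n' "--no-renames:" "$plain" "-M:" "$detected"
```

The same two commits yield either a D/A pair or a single R line, which is the sense in which the rename is computed at diff time rather than stored.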

Or, you can tell it to use a slow or even very slow method, with --find-copies (aka -C) or --find-copies-harder.
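To illustrate the gap between the slow and very slow methods, here is a small sketch (hypothetical file names again) in which a file is copied while the original is left untouched. Plain -C only considers files modified in the same commit as copy sources, so only --find-copies-harder pairs the two.

```shell
#!/bin/sh
set -e

# Throwaway repository; every path and identity here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a 50-line file.
mkdir docs
i=1
while [ "$i" -le 50 ]; do
  echo "line $i"
  i=$((i + 1))
done > docs/guide.txt
git add docs
git commit -q -m "add guide"

# Copy the file; the original stays unmodified in this commit.
cp docs/guide.txt docs/guide-copy.txt
git add docs/guide-copy.txt
git commit -q -m "copy guide"

# -C looks for copy sources only among files changed in the same commit,
# so the new file shows up as a plain addition.
quick=$(git diff -C --name-status HEAD~1 HEAD)

# --find-copies-harder also inspects unmodified files as possible sources.
harder=$(git diff --find-copies-harder --name-status HEAD~1 HEAD)

printf '%s\n' "-C:" "$quick" "--find-copies-harder:" "$harder"
```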

The default mostly-fast method does use path names, while the very-slow method does not. Both do still rely on content as well, though. In particular, files are considered "the same", in terms of copy or rename detection, if they're "at least 50% similar", or whatever other similarity ratio you choose with your -M and/or -C arguments to diff.
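The effect of the similarity ratio can be sketched as follows. This example, with made-up paths, renames a 100-line file and rewrites its last 30 lines, so the similarity lands somewhere between the two thresholds tried; the exact score git computes is not assumed here.

```shell
#!/bin/sh
set -e

# Throwaway repository; every path and identity here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a 100-line file.
mkdir docs
i=1
while [ "$i" -le 100 ]; do
  echo "line $i"
  i=$((i + 1))
done > docs/guide.txt
git add docs
git commit -q -m "add guide"

# Rename the file and rewrite its last 30 lines in the same commit.
git mv docs/guide.txt docs/manual.txt
i=1
while [ "$i" -le 100 ]; do
  if [ "$i" -le 70 ]; then echo "line $i"; else echo "changed $i"; fi
  i=$((i + 1))
done > docs/manual.txt
git commit -q -am "rename and rewrite part of the guide"

# At the default 50% threshold the pair still counts as a rename.
loose=$(git diff -M50% --name-status HEAD~1 HEAD)

# At 90% the files are too different, so git reports a delete and an add.
strict=$(git diff -M90% --name-status HEAD~1 HEAD)

printf '%s\n' "-M50%:" "$loose" "-M90%:" "$strict"
```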

This is both the good news and the bad news. Essentially, every time you get git to compare two commits (including any future merges that look back at these for their merge-bases), git will find some renames, and not find some other renames and/or copies, depending on the flags you give it and on content similarity. You can fuss with the detection threshold during merge (-X rename-threshold=<n> rather than -M), but the controls here are pretty crude.
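A rough sketch of the merge-time behaviour described in the question: one branch moves a directory, the other edits a file at the old path, and the merge carries the edit over to the moved file. Branch names (side-a, side-b), the base tag, and all paths are hypothetical; the threshold is spelled out only to show the syntax, since 50% is already the default.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name and identity here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Shared starting point with a 50-line file.
mkdir docs
i=1
while [ "$i" -le 50 ]; do
  echo "line $i"
  i=$((i + 1))
done > docs/guide.txt
git add docs
git commit -q -m "add guide"
git tag base

# side-a moves the whole directory.
git checkout -q -b side-a base
git mv docs docs-a
git commit -q -m "side-a: move docs to docs-a"

# side-b edits the file at its original path.
git checkout -q -b side-b base
echo "added on side-b" >> docs/guide.txt
git commit -q -am "side-b: edit guide"

# Merging side-b into side-a: rename detection pairs docs/guide.txt with
# docs-a/guide.txt, so side-b's edit is applied to the moved file.
git checkout -q side-a
git merge -q --no-edit -X rename-threshold=50% side-b

merged_tail=$(tail -n 1 docs-a/guide.txt)
echo "last line of docs-a/guide.txt: $merged_tail"
```

Nothing in either branch records the move; the pairing is recomputed from the merge base each time, which is why future merges may or may not find it depending on how far the contents drift.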

(Note that git blame, and git log --follow, also do this kind of name-and-content based matching when trying to discover renames. The algorithm for git log --follow only works when moving backwards in time, from current path to previous, so it fails when combined with --reverse.)
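For instance, in a sketch repository with one rename (hypothetical paths), git log limited to the new path stops at the renaming commit, while --follow walks back across it:

```shell
#!/bin/sh
set -e

# Throwaway repository; every path and identity here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a 50-line file, then rename it in a second commit.
mkdir docs
i=1
while [ "$i" -le 50 ]; do
  echo "line $i"
  i=$((i + 1))
done > docs/guide.txt
git add docs
git commit -q -m "add guide"
git mv docs/guide.txt docs/manual.txt
git commit -q -m "rename guide to manual"

# Count the commits reported for the new path, without and with --follow.
without_follow=$(git log --format=%h -- docs/manual.txt | grep -c '')
with_follow=$(git log --follow --format=%h -- docs/manual.txt | grep -c '')

echo "commits without --follow: $without_follow"
echo "commits with --follow:    $with_follow"
```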
