
Merge conflicts between branches when merging to qc and master separately

We branch off master (we don't use a dev branch, don't ask why). When QA makes a merge request for a branch, we merge it into the qc branch; when they're done testing, we merge it into master. We never merge qc into master or rebase branches from master onto qc, though we may fast-forward the branch to master (through git merge --ff-only master or git rebase -i); generally just merging across is fine.

However, branches (based on tasks determined by the manager) can be started at different bases, merged into qc in a different order, and merged back into master in a different order than that.

The problem is that we often have to change the same line of code for different features (we're unable to completely avoid this due to the way the code was originally structured and which changes are requested), so we get merge conflicts almost every week. When things are merged in different orders to different branches, we can end up fixing the same conflicts over and over again, merging different changes to the same line, or having to spread new lines from different branches across different lines. Whew.

To eliminate identical merge conflict fixes, we tried rebasing the branch merged into qc (which may be based on a master commit going way back), but that somehow duplicated the commits on qc and master. We're fine having the commits re-applied to master and losing the ones on qc, since we can just merge it back into qc (we merge master into qc whenever it's updated).

How do we avoid these duplicate commits and conflicts, and is there any better way to manage this?

Someone suggested one thing: git rebase --onto master qc branch. Would this be feasible for conflicting branches that need to go from qc to master, to prevent redundant conflict fixes?

Rebasing duplicates commits because that's what rebasing is: it is an automated series of git cherry-pick operations, and git cherry-pick is a commit-copier.

Each object's identity is its hash: 251654c5f6f256fe6e23c2c85f1a70594aae00d4 for instance. The hash value is a checksum of the contents of the object being hashed. In the case of a commit object, the contents are:

  • the name, email, and time stamp for the commit's author;
  • the name, email, and time stamp for the committer;
  • the identity of the top level source tree for the commit (i.e., a snapshot of every file associated with this commit, represented by the ID of a tree object);
  • the list of parent commit IDs (more commit-object hashes); and
  • the log message for the commit.

Typically, the reason you cherry-pick a commit is to make a slightly altered version—in fact, if you make a bit-for-bit exactly-the-same copy you just get the original commit's ID back, since the hash of the same set of input bits is the same value. With a cherry-picked commit, though, you usually have a slightly different source tree, and pretty much always have a different parent ID.
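To see that the ID really is just a hash of those contents, here is a minimal sketch. It assumes a POSIX shell with git installed, and the throwaway repository, file name, and identity are made up for illustration. It prints the raw commit object and then re-hashes the same bytes:

```shell
#!/bin/sh
# Sketch: a commit's ID is the hash of the commit object's own contents.
# The repository, file name, and identity below are hypothetical.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"

# Raw contents of the commit object: tree, author, committer, log message.
# (This is a root commit, so there is no "parent" line.)
git cat-file -p HEAD

# Feed the same bytes back in as a commit object and hash them.
commit_id=$(git rev-parse HEAD)
rehashed_id=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
echo "commit id: $commit_id"
echo "rehashed:  $rehashed_id"
```

The two printed IDs should match, which is why changing any one field (the parent, the tree, a time stamp) necessarily yields a different commit.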

Consider this (ever so slightly modified, to defeat spam-email scanners) example commit:

tree 3b2ac3530e713fc93aa93525bed9623679f99173
parent d2628061aa3b38b9f5dbdcd9136711a5a4ac3a1a
author Chris Torek <chris.torek@somewhere.com> 1459821791 -0700
committer Chris Torek <chris.torek@somewhere.com> 1459821791 -0700

distributed: clarify GUID uniqueness

Add the phrase {Doppelg\"anger commit} in a side-note.
Git and Mercurial allow them as long as they never meet.

I can copy this commit by first doing a git diff against its parent commit (to show what I changed):

diff --git a/distributed.tex b/distributed.tex
index 40fb7fd..6d8854b 100644
--- a/distributed.tex
+++ b/distributed.tex
@@ -41,8 +41,18 @@ to discover and exchange commits
 whenever you direct the system to synchronize your clone
 with a peer.
 In order to make this work correctly,
-these GUIDs really must be globally unique
-(across all repositories).
+these GUIDs really must be globally unique.\sidenote
+{More specifically, they must be unique
+among all clones of a given repository,
+\emph{including forks that may rejoin in the future}.
+This is a somewhat weaker requirement than true global uniqueness.
+For instance, if Alice makes a commit,
+but then destroys it without ever sharing it with anyone else,
+the destroyed commit is allowed to have the same GUID
+as some future commit,
+or a commit in an unrelated repository.
+You can think of this as allowing Doppelg\"anger commits:
+they may share a GUID only as long as they never meet.}
 It would not do
 for Bob to create a \emph{different} commit (in \git)
 or changeset (in \mercurial)

Now that I have the diff, I can check out some other commit, apply the diff as a patch, and make a new commit from the result, re-using the log message. I will get a new commit, with the same log message and the same effect on file distributed.tex, but (probably) different tree and (definitely) different parent, and a new committer time stamp, and therefore a different hash.

That's a single cherry-pick: compare a commit to its parent (to convert it to a changeset), apply that change elsewhere, and make a new commit from the result. Suppose we repeat this operation for a chain of commits:
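The single-commit copy described above can be done by hand, roughly as in the following sketch. It assumes a POSIX shell with git installed; the branch names base and side and the file names are hypothetical. In practice git cherry-pick does all of this (with a proper three-way merge rather than a plain patch) in one step.

```shell
#!/bin/sh
# Sketch: a hand-made cherry-pick (diff against parent, apply, re-commit).
# Branch and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/base

printf 'one\ntwo\nthree\n' > notes.txt
git add notes.txt
git commit -q -m "base"

# The original commit, made on a side branch.
git checkout -q -b side
printf 'one\ntwo\nthree\nfour\n' > notes.txt
git commit -q -am "add line four"
original=$(git rev-parse HEAD)

# Advance the other branch so the copy will have a different parent.
git checkout -q base
echo other > other.txt
git add other.txt
git commit -q -m "unrelated work"

# 1. turn the commit into a changeset, 2. apply it here,
# 3. commit the result, re-using the original log message and author.
git diff "$original^" "$original" | git apply --index
git commit -q -C "$original"
copy=$(git rev-parse HEAD)

echo "original: $original"
echo "copy:     $copy"
```

The copy carries the same message and the same change to notes.txt, but its parent and tree differ, so its hash differs too.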

... <- B3 <- B4 <- B5   <-- branch-X
  \
   F6 <- F7 <- F8       <-- feature-Y

Let's say we copy F6 through F8. Call the copy of F6, F6' to distinguish it from the original. Make the parent of F6' be B5, and the parent of F7' be F6', and the parent of F8' be F7', so that we get:

                      F6' <- F7' <- F8'
                     /
... <- B3 <- B4 <- B5   <-- branch-X
  \
   F6 <- F7 <- F8       <-- feature-Y

Now let's do one last thing. Let's remove the label feature-Y, which is currently stuck on commit F8, and paste it on F8' instead:

                      F6' <- F7' <- F8'   <-- feature-Y
                     /
... <- B3 <- B4 <- B5   <-- branch-X
  \
   F6 <- F7 <- F8       [abandoned]

Et voilà, we have just rebased (i.e., copied) some commits that were on feature-Y. The new copies are now on branch feature-Y (during the copying process they were on no branch, or rather, on the special anonymous branch you have when you are in "detached HEAD" mode) and the originals are ... well, still there.

I marked them [abandoned] here, but if there are other branch or tag labels that will let you find commit F8, you will still be able to see all the original commits. The git rebase command moves the feature-Y label, but does not look to see whether some other label keeps the commits visible.
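The whole copy-then-move-the-label sequence can be reproduced in a throwaway repository. This is only a sketch, assuming a POSIX shell with git installed; commit_file is a hypothetical helper, and the branch and commit names simply mirror the diagram.

```shell
#!/bin/sh
# Sketch: rebase copies F6..F8 onto B5 and moves the feature-Y label;
# the original commits still exist afterwards.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch-X

commit_file B3
git checkout -q -b feature-Y
commit_file F6
commit_file F7
commit_file F8
git checkout -q branch-X
commit_file B4
commit_file B5

old_tip=$(git rev-parse feature-Y)       # the original F8
git rebase -q branch-X feature-Y         # copy F6..F8 onto B5, move the label
new_tip=$(git rev-parse feature-Y)       # the copy, F8'

# Naming the old tip explicitly makes the [abandoned] chain show up too.
git log --oneline --graph branch-X feature-Y "$old_tip"
```

The graph should show two parallel chains with the same log messages: the copies on top of B5, and the originals still hanging off B3.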


Edit: the same is true if some of the copied-and-then-abandoned commits are visible via a merge commit, e.g.:

                       F6' <- F7' <- F8'   <-- feature-Y
                      /
.... <- B3 <- B4 <- B5   <-- branch-X
\  \
 \  F6 <- F7 <- F8       [abandoned]
  \               \
   C2 <--- C3 <-- C4     <-- branch-C

Here branch-C makes commits C2 through C4 visible, but C4 is a merge commit which makes F8 visible. So now both F8 (via C4 from branch-C) and F8' (via feature-Y) will show up when you browse the repository.

This sort of thing controls when and why you should or should not rebase. Since rebase copies commits, you will need to be sure of one of two things: that the originals are really going to be abandoned after all, or that it's OK for the originals to be preserved elsewhere. [End edit.]
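One way to check whether the originals would really be abandoned is to ask git which branches still contain them. The following is a sketch under the same assumptions as before (POSIX shell, git installed, hypothetical commit_file helper), using a smaller version of the graph in which branch-C has merged the original feature commit:

```shell
#!/bin/sh
# Sketch: after a rebase, find the branches that still reach the originals.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch-X

commit_file B3
git checkout -q -b feature-Y
commit_file F6
git checkout -q -b branch-C branch-X
commit_file C2
git merge -q --no-edit feature-Y     # merge commit: F6 is now reachable from branch-C
git checkout -q branch-X
commit_file B4

old_tip=$(git rev-parse feature-Y)
git rebase -q branch-X feature-Y     # feature-Y now points at the copy

# Which branches still make the original commit visible?
holders=$(git branch --format='%(refname:short)' --contains "$old_tip")
echo "branches still containing the original: $holders"
```

Here the answer is branch-C, so the original and its copy will both show up in the history.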


The other thing that git rebase automates for you is choosing which commits to copy. Why did we decide to copy, specifically, F6 through F8? Why include those three and exclude all others? This is where the arguments to git rebase come in. Let's return now to this suggestion:

 git rebase --onto master qc branch 

This is the fully-specified form of git rebase. The very last argument, branch in this case, makes git rebase start by doing git checkout branch. That is, the extra argument is just passed to git checkout. After that, the extra argument is of no more use, so we could do:

git checkout branch && git rebase --onto master qc

instead.

The --onto commit part of the command specifies where the copies go. More specifically, we copied F6 to F6' with its new parent being B5. If you use --onto you are specifying this target commit. (Using a branch name, in this case master, just means to use the tip commit on that branch.)

The one remaining argument, qc in our case, is what git rebase calls <upstream>.

If we did not specify --onto, the <upstream> argument would have two roles: it would name the target commit, and it would provide git rebase the identity of a commit that it should specifically not copy (and then not copy even more commits, based on that ID). When we use --onto, though, we have already specified the target commit, so the remaining argument has just one remaining role: to specify commits to avoid copying.
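To make the split between "where the copies go" and "what not to copy" concrete, here is a small sketch. It assumes a POSIX shell with git installed; the branch names topic and feature and the commit_file helper are hypothetical and do not come from the question's repository.

```shell
#!/bin/sh
# Sketch: with --onto, the target (master) and the exclusion point (topic)
# are given separately.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

commit_file M1
git checkout -q -b topic
commit_file T1
git checkout -q -b feature          # feature starts on top of topic
commit_file F1
commit_file F2
git checkout -q master
commit_file M2

# Copy the commits on feature that are not on topic (F1, F2),
# and place the copies on top of master's tip (M2). T1 is left behind.
git rebase -q --onto master topic feature

git log --oneline --graph master topic feature
```

After this, feature consists of M1, M2, F1', F2', and the file from T1 is absent from its work-tree.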

The way this "commits to avoid copying" thing works can be a bit tricky to wrap your head around. We specify one particular commit, but rebase uses a trick here that git itself uses all the time. When we point git to some commit, we can ask it to look at just that commit:

$ git rev-parse c96a
c96af44caf036cf9f04d77c6146086a4ee422ceb

or we can ask it to look at that commit and all its parent commits (aka its ancestry ):

$ git rev-list c96a
c96af44caf036cf9f04d77c6146086a4ee422ceb
bca8213e6fda6fdf92b3a5fbc4ee59f755e04a9c
86ac01b3e1d771033e93af93366ace1b586d7c74
bc3fa7a3571231e334f6f06e5610e04227ef1f0b

Here rev-parse does the one-commit thing while rev-list does the ancestry variant by default, although rev-list is tremendously flexible and is at the heart, or at least the liver or spleen, of many git commands.
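The difference is easy to see in a scratch repository. A minimal sketch, assuming a POSIX shell with git installed (the three commits and the commit_file helper are hypothetical):

```shell
#!/bin/sh
# Sketch: rev-parse names one commit; rev-list walks that commit's ancestry.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
commit_file A
commit_file B
commit_file C

git rev-parse HEAD      # one hash: the commit itself
git rev-list HEAD       # three hashes: the commit and all its ancestors

one=$(git rev-parse HEAD | wc -l | tr -d ' ')
all=$(git rev-list --count HEAD)
echo "rev-parse lines: $one, rev-list commits: $all"
```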

Git's rebase selects its <upstream> argument with ancestry to get commits to exclude, while selecting the current branch (after checking it out if you give it that one extra argument) with ancestry to get commits to include. Whatever remains, after excluding the exclusion set, are the commits to copy.
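You can preview this selection before rebasing with the two-dot range syntax, which means "reachable from the right side, minus everything reachable from the left side". The sketch below assumes a POSIX shell with git installed and re-uses the hypothetical branch-X / feature-Y layout from earlier. It is only an approximation of what rebase does: rebase additionally skips merge commits and commits whose changes are already present upstream.

```shell
#!/bin/sh
# Sketch: preview roughly which commits a rebase would copy.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch-X

commit_file B3
git checkout -q -b feature-Y
commit_file F6
commit_file F7
commit_file F8
git checkout -q branch-X
commit_file B4
commit_file B5

# Include feature-Y's ancestry, exclude branch-X's ancestry.
# --reverse lists them oldest first, the order in which they would be copied.
to_copy=$(git log --reverse --format=%s branch-X..feature-Y)
echo "$to_copy"
```

For the question's command, the corresponding preview would be git log qc..branch, run before the rebase.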

Using the --onto form releases <upstream> from its double-duty role, so that you can be more selective when you choose which commits to avoid copying. You will still be copying commits, though, with all that this entails. In particular, if you use rebase to copy (and then forget about the originals of) commits that other people are using, they must also rebase any work they have done that depends on the originals.
