
Git: How to maintain the same history of commits in two different branches

I am using Git Bash to merge and commit.

Let me explain the basic structure first. We have origin/dev, which we pull and start working on. After the changes are done, we push them to origin/dev.

Then, to merge dev into qa using Git Bash, I do the following:

git checkout qa

# pull the latest changes from origin/qa (there are parallel origin/dev and origin/uat branches as well)
git pull

# check out the changes from dev into my local qa working tree; they are then
# committed and pushed to origin/qa by the commands below
git checkout dev -- directorynameToCheckoutCodeFrom

git commit
git push

This is the process we normally follow whenever we merge between any two environments.

My issue is this: I make 5 commits for 5 issues in dev, each with its own commit ID. When I merge from dev to qa, I commit all five changes at once, so I get a single commit ID that contains all of the changes. The same happens when merging into uat.

Is there any way to maintain the same history across the different environments? The real problem is that we might merge into qa 4-5 times in 10 days, whereas we would like to keep uat stable and merge into it only once a month. In that case, if we commit all the changes from qa to uat as one commit, the separate history that exists in qa will be lost. Is there any way to tackle this?

I have gone through some posts online but was unable to understand them. What I understood is that the only way is to commit as frequently as we do in the dev environment: for each issue, merge into dev, then qa, then uat. Is my understanding correct that this is the only way to preserve the same history?

There is not a history of commits. There are only commits; the commits are the history.

Each commit is uniquely identified by a hash ID. That hash ID is the true name of the commit, as it were. If you have that commit, you have that hash ID. If you have that hash ID, you have that commit. Read out the big ugly hash ID and see if it's in your database of "all the commits that I have in this repository": ie, see if Git knows it. If so, you have that commit. For instance, b5101f929789889c2e536d915698f58d5c5c6b7a is a valid hash ID: it's a commit in the Git repository for Git. If you have that hash ID in your Git repository, you have that commit.

People don't normally type in, or use, these hash IDs at all. Git uses them, but Git is a computer program, not a human. Humans don't do well with these things (I have to cut and paste the above hash ID or I'll get it wrong), so humans use a different way to get started. Humans use branch names. But many different Git repositories all have master, and this master doesn't always (or ever!) mean that big ugly hash ID I typed in above. So a name like master is specific to one particular Git repository, while hash IDs are not.
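As a rough illustration of the name-versus-hash distinction, the sketch below builds a throwaway repository (the temporary directory, user name, and file are assumptions made up for the example) and asks Git what the name master currently means and whether a given hash ID is present in the object database.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"             # throwaway repository for the example
git init -q
git symbolic-ref HEAD refs/heads/master     # pin the branch name regardless of local defaults
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"
git config commit.gpgsign false

echo "hello" > README.md
git add README.md
git commit -q -m "first commit"

# A branch name is a per-repository label; rev-parse turns it into a hash ID.
tip=$(git rev-parse master)
echo "master currently means $tip"

# "Do I have this commit?" is a lookup in the object database.
if git cat-file -e "$tip^{commit}"; then have_commit=yes; else have_commit=no; fi
echo "have commit $tip: $have_commit"

# A well-formed hash ID that is not in this repository is simply not found.
other=0123456789abcdef0123456789abcdef01234567
if git cat-file -e "$other" 2>/dev/null; then have_other=yes; else have_other=no; fi
echo "have commit $other: $have_other"
```

The same two commands work in a real clone: substitute any branch name for master and any hash ID copied from elsewhere.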

Now, every commit stores some stuff. What a commit stores includes a snapshot of all the files that go with that commit, so that you can get it back out later. It also includes the name and email address of the person who made that commit, so that you can tell who to praise or blame. 😀 It includes a log message: why the person who made the commit says they made that commit. But—here's the first tricky part—almost every commit also includes at least one hash ID , which is the commit that comes before this particular commit.

So, if you have b5101f929789889c2e536d915698f58d5c5c6b7a , then what you have is this:

$ git cat-file -p b5101f929789889c2e536d915698f58d5c5c6b7a | sed 's/@/ /'
tree 3f109f9d1abd310a06dc7409176a4380f16aa5f2
parent a562a119833b7202d5c9b9069d1abb40c1f9b59a
author Junio C Hamano <gitster pobox.com> 1548795295 -0800
committer Junio C Hamano <gitster pobox.com> 1548795295 -0800

Fourth batch after 2.20

Signed-off-by: Junio C Hamano <gitster pobox.com>

(The tree line represents the saved snapshot that goes with this commit. You can ignore this here.) The parent line gives the hash ID of the commit that comes before b5101f929789889c2e536d915698f58d5c5c6b7a .

If you have b5101f929789889c2e536d915698f58d5c5c6b7a you almost certainly also have a562a119833b7202d5c9b9069d1abb40c1f9b59a . The history for the later commit is the earlier commit.

If we replace each of these big ugly hash IDs with a single uppercase letter,[1] we can draw this sort of history a lot more easily:

... <-F <-G <-H

where H is the last commit in a long chain of commits. Since H holds G 's hash ID, we don't need to write down G 's big ugly hash ID, we can just write down H 's hash. We use that to have Git find G 's ID, inside H itself. If we want F , we use H to find G to find F 's ID, which lets Git retrieve F .

But we still have to write down that last hash ID. This is where branch names come in. Branch names like master act as our way of saving the hash ID of the last commit.

To make a new commit, we have Git save the hash ID of H in our new commit. We have Git save a snapshot and our name and email address and all the rest of that as well—"the rest" includes a time stamp, the precise second when we had Git do all this. Now Git computes the actual hash ID of all of this data, including the time stamp. The commit is now saved in our database of all commits, and Git has given us a new hash ID I :

...--F--G--H   <-- master
            \
             I

We have Git automatically write I 's hash ID into our name master :

...--F--G--H--I   <-- master

and we've added new history, which retains all the existing history.
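A minimal sketch of that sequence, in a scratch repository: the commit helper function and the one-file-per-commit layout are assumptions made for the demonstration, with the commit messages standing in for the letters in the drawing.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
old_tip=$(git rev-parse master)          # the name master holds H's hash ID

commit I                                 # the new commit stores H's hash ID as its parent
new_tip=$(git rev-parse master)          # the name master now holds I's hash ID
parent_of_new=$(git rev-parse "master^")

# Walk backwards from the tip, showing each commit and its parent.
git log --format='%h %s (parent: %p)' master
```

Nothing about F, G, or H changed; the only thing that moved is the hash ID stored under the name master.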


[1] Of course, if we only used one uppercase letter like this, we'd run out of the ability to create commits, anywhere in the world, after creating just 26 commits. That's why Git's hash IDs are so big. They hold 160 bits, so the number of possible commits or other objects is 2^160, or 1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976. As it turns out, this isn't really enough, and Git will probably move to a larger hash that can hold 2^96, or 79,228,162,514,264,337,593,543,950,336, times as many objects. While the first number is astronomically large, there are specific attacks that are troublesome, so a 256-bit hash is a good idea. See "How does the newly found SHA-1 collision affect Git?"


This tells you how to have the same history

History is the commits. To have the same history in two branches, you need both branch names to point to the same commit:

...--F--G--H--I   <-- master, dev

Now the history in master is: Starting at I , show I , then move back to H and show H , then move back to G ... Likewise, the history in dev is: Starting at I , show I , then move back to H ...
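To see this in a scratch repository (the commit helper and file names below are invented for the example), create a second branch name at the same commit and compare what each name walks back through:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H; commit I
git branch dev                           # a second name for the same commit I

# Both names resolve to one hash ID, so both walks visit identical commits.
master_history=$(git log --format=%H master)
dev_history=$(git log --format=%H dev)
git log --oneline --decorate master
```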

Of course, that's not quite what you want. What you want is to have history that diverges , then converges again . That's what branches are really about:

...--F--G--H   <-- master
            \
             I   <-- dev

Here the history in dev starts (ends?) at I , then goes back to H , and then G , and so on. The history in master starts (ends?) at H , goes back to G , and so on. As we add more commits, we add more history, and if we do it like this:

             K--L   <-- master
            /
...--F--G--H
            \
             I--J   <-- dev

then the history of the two branches diverges . Now master starts at L and works backwards, while dev starts at J and works backwards. There are two commits on dev that are not on master , and two commits that are on master that are not on dev , and then everything from H on back is on both branches.

This divergence—the commits that are not on some branch—is where the lines of work diverge. The branch names still only remember one commit each , specifically the tip or last commit of each line of development. Git will start at this commit, by the saved hash ID, and use that commit's saved parent hash ID to walk backwards, one commit at a time. Where the lines rejoin, the history rejoins. That's all there is in a repository, except for the next section.
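The divergence can be measured directly. This sketch rebuilds the drawing above in a temporary repository (helper function and file names are assumptions) and counts the commits reachable from one branch tip but not the other, using Git's two-dot range syntax.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
git branch dev                           # both names point to H
commit K; commit L                       # master moves on to L
git checkout -q dev
commit I; commit J                       # dev moves on to J

# "A..B" means: commits reachable from B that are not reachable from A.
only_on_dev=$(git rev-list --count master..dev)      # I and J
only_on_master=$(git rev-list --count dev..master)   # K and L
echo "only on dev: $only_on_dev, only on master: $only_on_master"

git log --graph --oneline master dev     # draws both lines back to the shared part
```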

Merges combine history (and snapshots)

What you can do now is make a merge commit . The main way to make a merge commit is using the git merge command. This has two parts:

  • combining work , where Git figures out what has changed in each line of development; and
  • making a merge commit , which is a commit with exactly one special feature.

To make a merge, you start by picking one branch tip. You run git checkout master or git checkout dev here. Whichever one you pick, that's the commit you have out now, and Git attaches the special name HEAD to that branch name to remember which one you picked:

             K--L   <-- master (HEAD)
            /
...--F--G--H
            \
             I--J   <-- dev

Now you run git merge and give it an identifier to choose the commit to merge . If you're on master = L , you'll want to use dev = J as the commit to merge:

git merge dev         # or git merge --no-ff dev

Git will now walk the graph as usual to find the best shared commit—the best commit that's on both branches, to use as a starting point for this merge. Here, that's commit H , where the two branches first diverge.

Now Git will compare the snapshot saved with commit H —the merge base—to the one in your current commit L . Whatever is different , you must have changed on master . Git puts those changes into one list:

git diff --find-renames <hash-of-H> <hash-of-L>   # what we changed

Git repeats this but with their commit J :

git diff --find-renames <hash-of-H> <hash-of-J>   # what they changed

Now Git combines the two sets of changes . Whatever we changed, we want to keep changed. Whatever they changed, we want to use those changes too. If they changed README.md and we did not, we'll take their change. If we changed a file and they didn't, we'll take our change. If we both changed the same file, Git will try to combine those changes. If Git succeeds, we have a combined change for that file.
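The inputs to that combining step can be inspected before any merge is run. In this sketch (scratch repository; the commit helper and file names are made up for illustration), git merge-base reports the shared starting point and the two diffs list what each side changed relative to it.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
h=$(git rev-parse master)
git branch dev
commit K; commit L
git checkout -q dev
commit I; commit J
git checkout -q master

base=$(git merge-base master dev)        # the best shared commit: H
ours=$(git diff --find-renames --name-only "$base" master)   # what we changed
theirs=$(git diff --find-renames --name-only "$base" dev)    # what they changed
echo "merge base: $base"
echo "changed on master:"; echo "$ours"
echo "changed on dev:";    echo "$theirs"
```

Because each side touched different files here, the two change sets combine without conflicts.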

In any case, Git now takes all of the combined changes and applies them to the snapshot in H . If there were no conflicts, Git automatically makes a new commit from the result. If there were conflicts, Git still applies the combined changes to H , but leaves us with the messy result, and we have to fix it up and do the final commit ourselves; but let's assume there were no conflicts.

Git now makes a new commit with one special feature. Instead of just remembering our previous commit L , Git has this merge commit remember two previous commits, L and J :

             K--L   <-- master (HEAD)
            /    \
...--F--G--H      M
            \    /
             I--J   <-- dev

Then, as always, Git updates our current branch to remember the new commit's hash ID:

             K--L
            /    \
...--F--G--H      M   <-- master (HEAD)
            \    /
             I--J   <-- dev

Note that if we do the merge by running git checkout dev; git merge master, Git would do the same two diffs and produce the same merged snapshot. The resulting merge commit would record its two parents in the opposite order (J first, then L), and Git would write its hash ID into dev rather than into master.

In any case, if we now ask about the history of master , Git will start at M . Then it will walk back to both L and J and show both of them. (It has to pick one to show first, and git log has a lot of flags to help you choose which one to show first.) Then it will walk back from whichever one it picked first, so that it now has to show both K and J , or both L and I . Then it will walk back from whichever one it picked to show.

In most cases Git shows all the children before any of the parents, ie, eventually, it will have shown all four of I , J , K , and L and have only H to show. So from here, Git will show H , then G , and so on—there's now just one chain to walk back, one commit at a time. But be aware that when you traverse back from a merge, you run into the which commit to show next problem.
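The two-parent structure and the traversal choices can be checked in a scratch repository. The helper function, file names, and the merge message "M" below are assumptions chosen to match the drawing.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
git branch dev
commit K; commit L
l=$(git rev-parse master)
git checkout -q dev
commit I; commit J
j=$(git rev-parse dev)

git checkout -q master
git merge -q -m "M" dev                  # histories diverged, so this makes merge commit M

first_parent=$(git rev-parse "master^1")    # the commit we were on: L
second_parent=$(git rev-parse "master^2")   # the commit we merged: J

git log --graph --oneline master         # shows both lines of development

# One of git log's ordering flags: follow only first parents from each merge.
first_parent_line=$(git log --first-parent --format=%s master | xargs)
echo "$first_parent_line"
```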

git merge does not always make a merge commit

Suppose you have this history:

...--F--G--H   <-- master
            \
             I--J   <-- dev

That is, there's no divergence , dev is merely strictly ahead of master . You do git checkout master to select commit H :

...--F--G--H   <-- master (HEAD)
            \
             I--J   <-- dev

and then git merge dev to combine the work you've done since the merge base with the work they did since the merge base.

The merge base is the best shared commit. That is, we start at H and keep going back as needed, and also start at dev and keep going back as needed, until we reach a common starting point. So from J we go back to I and to H , and from H we just sit quietly at H until J goes back here.

The merge base, in other words, is the current commit . If Git ran:

git diff --find-renames <hash-of-H> <hash-of-H>

there would be no changes . The act of combining no changes (from H to H via master ) with some changes (from H to J via dev ), then applying those changes to H , is just going to be whatever is in J . Git says: well, that was too easy and instead of making a new commit, it just moves the name master forwards , in the opposite of the usual backwards direction. (In fact, Git really did work backwards—from J to I to H —in order to figure this out. It just remembers that it started from J .) So what you get here, by default, is this:

...--F--G--H
            \
             I--J   <-- dev, master (HEAD)

When Git is able to slide a label like master forward like this, it calls that operation a fast-forward . When you do this with git merge itself, Git calls it a fast-forward merge , but it's not really a merge at all. What Git really did was to check out commit J , and make master point to J .
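A small sketch of the fast-forward case, assuming Git's default merge settings and a throwaway repository (helper and file names invented for the example): after git merge, master holds the same hash ID as dev and no new commit exists.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
git checkout -q -b dev
commit I; commit J                       # dev is strictly ahead of master

git checkout -q master
before=$(git rev-parse master)           # H
total_before=$(git rev-list --all --count)
git merge -q dev                         # nothing to combine: Git slides master forward
after=$(git rev-parse master)            # J
total_after=$(git rev-list --all --count)

# The tip of master has no second parent, so it is not a merge commit.
if git rev-parse -q --verify "master^2" >/dev/null; then
  has_second_parent=yes
else
  has_second_parent=no
fi
echo "master moved from $before to $after; merge commit: $has_second_parent"
```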

In many cases, this is OK! The history is now: For master, start at J and walk back. For dev, start at J and walk back. If that's all you need and care about, that's fine. But if you want a real merge commit (so that you can tell master and dev apart later, for instance), you can tell Git, with git merge --no-ff dev: Even if you can do a fast-forward instead of a merge, do a real merge anyway. Git will go ahead and compare H to H, and then compare H to J, and combine the changes and make a new commit:

...--F--G--H------K   <-- master (HEAD)
            \    /
             I--J   <-- dev

Now you get a real merge commit K , with two parents as required to be a merge commit. The first parent is H as usual, and the second is J , as is usual for a merge commit. The history of master now includes the history of dev , but remains different from the history of dev , because the history of dev doesn't include commit K .
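The forced merge can be sketched the same way (scratch repository; helper, file names, and the merge message "K" are assumptions that mirror the drawing). With --no-ff, master gains a commit that dev does not have, and that commit has H and J as its two parents.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
h=$(git rev-parse master)
git checkout -q -b dev
commit I; commit J
j=$(git rev-parse dev)

git checkout -q master
git merge -q --no-ff -m "K" dev          # a fast-forward is possible, but make a real merge

p1=$(git rev-parse "master^1")           # first parent: H
p2=$(git rev-parse "master^2")           # second parent: J
git log --graph --oneline master
```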

Note that if you now switch back to dev and make more commits, the result looks like this:

...--F--G--H------K   <-- master
            \    /
             I--J--L--M--N   <-- dev (HEAD)

You can now git checkout master and git merge dev again. This time you won't need --no-ff because there is a commit that's on master that's not on dev, namely K, and of course there are commits on dev that are not on master, namely L, M, and N. The merge base this time is shared commit J (not H: H is also shared, but J is better). So Git will combine changes by doing:

git diff --find-renames <hash-of-J> <hash-of-K>   # what did we change?
git diff --find-renames <hash-of-J> <hash-of-N>   # what did they change?

What did we change from J to K ? (That's an exercise for you, the reader.)

Assuming Git is able to combine the changes on its own, this merge operation will succeed, producing:

...--F--G--H------K--------O   <-- master (HEAD)
            \    /        /
             I--J--L--M--N   <-- dev

where new merge commit O combines the J -vs- K changes with the J -vs- N changes. The history of master will start at O and will include N and M and L and K and J and I and H and so on. The history of dev will start at N and include M and L and J (not K !) and I and H and so on. Git always works backwards , from child to parent. Merges let / make Git work backwards along both lines at the same time (but shown to you one at a time, in some order depending on arguments you supply to git log ).
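Continuing the same scenario as a hedged sketch (temporary repository; the helper, file names, and merge messages "K" and "O" are assumptions matching the drawings), the second merge picks J as its merge base, and afterwards K is part of master's history but not dev's.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit F; commit G; commit H
git checkout -q -b dev
commit I; commit J
j=$(git rev-parse dev)
git checkout -q master
git merge -q --no-ff -m "K" dev          # first (forced) merge commit K
k=$(git rev-parse master)

git checkout -q dev
commit L; commit M; commit N             # more work on dev
git checkout -q master

base=$(git merge-base master dev)        # J this time, not H
git merge -q -m "O" dev                  # diverged, so no --no-ff is needed

# K is reachable from master but not from dev.
if git merge-base --is-ancestor "$k" dev; then k_on_dev=yes; else k_on_dev=no; fi
master_count=$(git rev-list --count master)   # F G H I J K L M N O
dev_count=$(git rev-list --count dev)         # F G H I J L M N
git log --graph --oneline master
```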

You can try:

git checkout qa

git merge dev --no-ff

git push

git merge dev --no-ff is mostly used to bring all of the dev branch's commits into qa along with their history.
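As a local-only sketch of that flow (no remote or git push involved; the branch layout, helper function, and issue-N file names are assumptions modelled on the question), five separate commits on dev are promoted to qa and later to uat with --no-ff merges, and the same five commit IDs show up in all three histories.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/dev     # start on a branch named dev
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one uniquely named file and commit it with message $1.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit base
git branch qa                            # qa and uat start from the same commit
git branch uat
for n in 1 2 3 4 5; do commit "issue-$n"; done   # five separate commits on dev

git checkout -q qa
git merge -q --no-ff -m "Merge dev into qa" dev

git checkout -q uat
git merge -q --no-ff -m "Merge qa into uat" qa   # e.g. the once-a-month promotion

# The issue commits keep their original hash IDs in every branch that merged them.
dev_ids=$(git log --format=%H --grep='^issue-' dev | sort)
uat_ids=$(git log --format=%H --grep='^issue-' uat | sort)
issues_in_uat=$(git log --format=%s uat | grep -c '^issue-')
git log --graph --oneline uat
```

Each promotion adds exactly one merge commit on top of the commits being promoted, so uat can be merged rarely without collapsing the individual issue commits.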

In the process you describe, you want to 'merge' changes from an individual directory within a repo. This is contrary to how git works, and that is why you're having trouble keeping a good history.

It's important to understand that what you're doing is not really a merge[1]. A merge commit has two (or more) parent commits, and in that way the full history is preserved. To be fair, git has a tendency to be "flexible" to the point of inconsistency in how it uses certain terms; there are operations it calls "merging" that don't result in merge commits. But even with those operations, you merge the entire content - not an individual directory.
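The difference is visible in a scratch repository. This sketch reproduces the question's git checkout dev -- directory step (the src directory, file names, and commit messages are invented for the example) and then checks what the resulting qa commit records.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

mkdir src
echo "base" > src/app.txt
git add src
git commit -q -m "base"
git branch qa

for n in 1 2 3 4 5; do                   # five separate commits on dev
  echo "fix $n" > "src/issue-$n.txt"
  git add src
  git commit -q -m "issue-$n"
done

git checkout -q qa
git checkout dev -- src                  # copies dev's src/ into the index and work tree
git commit -q -m "Promote src from dev"

qa_count=$(git rev-list --count qa)      # base plus one new commit
if git rev-parse -q --verify "qa^2" >/dev/null; then is_merge=yes; else is_merge=no; fi
if git merge-base --is-ancestor dev qa; then dev_in_qa=yes; else dev_in_qa=no; fi
if git diff --quiet dev qa -- src; then same_content=yes; else same_content=no; fi
echo "commits on qa: $qa_count, merge: $is_merge, dev history in qa: $dev_in_qa, same files: $same_content"
```

The files end up identical, but the new qa commit has a single parent, so none of the five dev commits are reachable from qa.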

If you have distinct modules - or, however you might describe them, different content in different directories - that change independently (which certainly applies if you promote them between branches/environments separately), they should be in separate repos. I suppose if it helps you could gather them up as submodules of a 'parent' repo, to have the ability to clone from a single url or whatever. But beyond that, if this type of separation isn't acceptable for some reason, you may need to consider whether git is the best tool to meet your particular source control requirements.


[1] I could also argue semantics about merging due to the fact that if both dev and qa had changes, the changes from qa would be overwritten and lost - which is not typically what is desired in a merge. But you would then probably argue that changes always flow from dev to qa, so it's not applicable; and anyway, git does sometimes describe the clobbering of one branch from another as a merge (i.e. the "ours" merge strategy).
