How can I make git history reflective of development reality when merging upstream changes onto a shared branch?

Question

I work with a few other developers on a fork of a popular public git repository. Periodically we need to merge upstream changes into our fork. How can I do this while keeping the history reflective of development reality? For example, after I merge upstream changes, I want it to be easy for one of our developers to sync to a point on our fork that exactly reflects the state of the world N months ago, when our development team was sending commits and running tests. That way we can, e.g., run the same tests and expect the same results as N months ago, and easily binary-search through the history to narrow down when late-discovered bugs were introduced.

A few things I've tried:

It appears that the default git merge will end up interleaving upstream changes with our own changes, based on the dates of the commits, which then makes it near impossible to sync to a true state of the world as it happened while we were developing.
By using merge with --squash, the git history correctly reflects the development reality (easy to sync before and after the merge of upstream changes, and end up in same state as really happened N months ago), however it seems that --squash tells git to forget all of the individual commits, which then makes any future merging of new upstream changes extremely difficult due to gargantuan conflicts.
I also tried --no-ff but that didn't seem to help at all. Still get interleaving of upstream history with our history.

Am I trying to do something that git was never intended for? One strategy that may work going forward is to merge upstream changes really often (like every day or every week), so that development reality stays extremely similar between upstream and us. But now that we've gone a few months without merging upstream, it would be nice if there were a way to get the nice --squash behavior without giving up git's knowledge of the individual commits. I considered manually doing many smaller --squash merges, like one for each ~week of upstream history. I'm guessing this will make it easier to merge future upstream changes, but it's pretty painful. Any strong reason to do it this way or not do it this way? Any better ideas?

Oh actually one more thought: maybe I could rebase upstream's changes onto our branch, which if I understand correctly will achieve the "reflects state of development reality" that I want (can someone confirm this to be true?), however I'd still like to squash things a bit so that my team's vastly smaller number of commits aren't overwhelmed by the upstream commits, which makes inspecting and understanding our branch history much more difficult. In that case, does it make sense to rebase+squash, or is there no real advantage over pure squash?

Answer 1

Using a rebase workflow might be what you are looking for here. To understand how rebasing works, consider the following simple branch diagram:

remote: A -- B
               \
local:           C

Here, we start with a local branch, branched off the remote, which has added a new commit C but has not yet pushed it. Of course, in reality, others on your team may be pushing new commits of their own to the remote, so let's do that too:

remote: A -- B -- D
               \
local:           C

Appreciate now that you can't simply push your local branch to the remote. One way to deal with this would be to merge the remote into your local branch. But that would create a merge commit, leaving you with a non-linear history that, over time, could become difficult to read. There is an alternative here: you may use git rebase. If you ran the following commands:

git fetch origin
git rebase origin/yourBranch yourBranch

then you would be left with the following branch diagram:

remote: A -- B -- D
                    \
local:                C'

In other words, your local branch would be completely linear, and you would just be able to do a simple git push to integrate your changes into the remote. Such a push would actually just be fast-forwarding the remote, laying down your new commits.
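To watch the diagrams above play out, here is a minimal sketch that rebuilds them in a throwaway directory. The repository names (remote.git, teammate, local), the file names, the demo identities, and the make_commit helper are all made up for illustration; only yourBranch comes from the commands above.

```shell
#!/bin/sh
# Throwaway reproduction of the diagrams; every name here is hypothetical.
set -e

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"

# Hypothetical helper: write one file and commit it in the given clone.
make_commit() {  # usage: make_commit <clone> <message>
    echo "$2" > "$demo/$1/$2.txt"
    git -C "$demo/$1" add "$2.txt"
    git -C "$demo/$1" commit -q -m "$2"
}

# A teammate creates yourBranch with commits A and B and publishes it.
git clone -q "$demo/remote.git" "$demo/teammate" 2>/dev/null
git -C "$demo/teammate" config user.name "Demo Teammate"
git -C "$demo/teammate" config user.email teammate@example.com
git -C "$demo/teammate" checkout -q -b yourBranch
make_commit teammate A
make_commit teammate B
git -C "$demo/teammate" push -q origin yourBranch

# You clone the branch and add commit C locally, without pushing.
git clone -q -b yourBranch "$demo/remote.git" "$demo/local"
git -C "$demo/local" config user.name "Demo Local"
git -C "$demo/local" config user.email local@example.com
make_commit local C

# Meanwhile the teammate pushes commit D, so the remote moves ahead.
make_commit teammate D
git -C "$demo/teammate" push -q origin yourBranch

# Fetch, then replay C on top of D instead of merging.
git -C "$demo/local" fetch -q origin
git -C "$demo/local" rebase -q origin/yourBranch

# History is now a straight line, newest first: C, D, B, A.
git -C "$demo/local" log --format=%s

# The push is a plain fast-forward of the remote branch.
git -C "$demo/local" push -q origin yourBranch
```

Because C and D touch different files in this sketch, the rebase applies cleanly; with overlapping changes you would resolve conflicts once per replayed commit.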

If you can find a way to follow a rebase workflow, then your remote branch would always be linear, and, to some extent, you would be able to go back in time to any exact point in your development process.
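As a rough illustration of "going back in time" on a linear branch, the sketch below builds a tiny repository with three dated commits and checks out the newest one that existed before a chosen date. The dates, commit messages, and the dated_commit helper are invented for the example. One caveat: --before filters on the committer date, and a rebase gives the replayed commits new committer dates, so tagging known-good states may be a more dependable marker than dates alone.

```shell
#!/bin/sh
# Hypothetical demo repository; dates and messages are made up.
set -e

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email demo@example.com

# Hypothetical helper: create an empty commit with a fixed timestamp.
dated_commit() {  # usage: dated_commit <date> <message>
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git -C "$repo" commit -q --allow-empty -m "$2"
}

dated_commit "2018-03-01 12:00:00 +0000" "tests pass"
dated_commit "2018-04-01 12:00:00 +0000" "feature added"
dated_commit "2018-06-01 12:00:00 +0000" "later work"

# Newest commit on the branch whose committer date is before 1 May 2018.
snapshot=$(git -C "$repo" rev-list -1 --before="2018-05-01 00:00:00 +0000" HEAD)

# Check it out (detached HEAD) to rerun tests against that state.
git -C "$repo" checkout -q "$snapshot"
git -C "$repo" log -1 --format=%s   # prints: feature added
```

On a linear history like this, git bisect can then walk between a known-good snapshot and the current tip one commit at a time.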
