
Can I delete the history and keep the sha of recent git commits?

I have a 4-year-old repository which realistically doesn't need more than a year of history.

Basically, assume my commit hashes look like I, and what I logically want is II:

I)   0-1-2-3-4-5-6-7-8-9-a-b-c-d-e-f
II)  a-b-c-d-e-f

I've seen advice to truncate history a bit like this:

git log --after="2018-12-28" --until="2018-12-31"   # find a good quiet period
git checkout --orphan trunc 4bf6824d4
git commit -m "Truncated history to 2018-12-29: a low activity point"
git fetch --all
git rebase --onto trunc 4bf6824d4 upstream/master
# OR
git cherry-pick 4bf6824d4..upstream/master

Then remove the upstream master and replace it with trunc renamed to master, or force-push. I ran this scenario and got III instead, though I thought I could at least roughly get IV.

III) g-h-i-j-k-l (h-l are b'-c'-d'-e'-f')
IV)  g-b-c-d-e-f (like editing the parent of b only)

Is this impossible because you can't edit any given SHA's lineage?
How can I make it likely that everyone's in-progress work will merge with the replacement branch… are they all going to have to pull the new branch and rebase a range of their own work onto it?
Also, why did rebasing give me a few points where conflicts occurred and I had to merge with theirs?

The short answer is no.

The Object Database section of the Git User Manual explains why.

In fact, all the information needed to represent the history of a project is stored in objects with such names. In each case the name is calculated by taking the SHA-1 hash of the contents of the object. The SHA-1 hash is a cryptographic hash function …

  • A “commit” object ties such directory hierarchies together into a directed acyclic graph of revisions — each commit contains the object name of exactly one tree designating the directory hierarchy at the time of the commit. In addition, a commit refers to “parent” commit objects that describe the history of how we arrived at that directory hierarchy.

By turning your commit a into a root commit, i.e., changing its parent from 9 to none, you necessarily and unavoidably change its hash (its object name), which is a function of the commit's tree, parent or parents, author, committer, and commit message, as defined in the Commit Object section of the Git User Manual.

As you can see, a commit is defined by:

  • a tree: The SHA-1 name of a tree object (as defined below), representing the contents of a directory at a certain point in time.

  • parent(s): The SHA-1 name(s) of some number of commits which represent the immediately previous step(s) in the history of the project. The example above has one parent; merge commits may have more than one. A commit with no parents is called a "root" commit, and represents the initial revision of a project. Each project must have at least one root. A project can also have multiple roots, though that isn't common (or necessarily a good idea).

  • an author: The name of the person responsible for this change, together with its date.

  • a committer: The name of the person who actually created the commit, with the date it was done. This may be different from the author, for example, if the author was someone who wrote a patch and emailed it to the person who used it to create the commit.

  • a comment describing this commit.
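To make the list above concrete, here is a minimal Python sketch of how a commit's name can be derived from exactly those fields. It assumes a classic SHA-1 repository and the usual object layout (a "<type> <size>" header, a NUL byte, then the body). The author identity, the timestamp, and the 40-character parent id are hypothetical values chosen only for illustration.

```python
import hashlib

def git_object_id(kind: str, body: bytes) -> str:
    # Object name = SHA-1 over "<kind> <size>", a NUL byte, then the body.
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree, parents, author, committer, message) -> bytes:
    # Serialize the fields in the order a commit object stores them.
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # zero lines for a root commit
    lines += [f"author {author}", f"committer {committer}", "", message]
    return ("\n".join(lines) + "\n").encode()

# Hypothetical inputs, for illustration only.
EMPTY_TREE = git_object_id("tree", b"")          # a tree with no entries
WHO = "A U Thor <author@example.com> 1546041600 +0000"
PARENT_9 = "9" * 40                              # stand-in for commit 9's id

with_parent = git_object_id(
    "commit", commit_body(EMPTY_TREE, [PARENT_9], WHO, WHO, "a"))
as_root = git_object_id(
    "commit", commit_body(EMPTY_TREE, [], WHO, WHO, "a"))

print(with_parent)
print(as_root)
print(with_parent != as_root)   # same tree, author, message; only the parent differs
```

The two commits share the tree, author, committer, and message. Only the parent line differs, and that alone is enough to produce a different name.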

Changing a to a' knocks over the first domino in a chain that forces b', c', d', e', and f', because each commit's parent line, and therefore its hash, has changed.
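The domino effect can be simulated without a real repository. The sketch below builds the 0…f chain from the question, then rebuilds a…f with a as a root commit, and lists which commits end up with a different name. It assumes SHA-1 object names; the shared empty tree and the single author/committer line are hypothetical simplifications so that the parent is the only thing that varies.

```python
import hashlib

# Hypothetical fixed inputs: every commit uses the same (empty) tree and identity,
# so any change in a commit's name can only come from its parent line.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "A U Thor <author@example.com> 1546041600 +0000"

def commit_id(parent, message):
    lines = [f"tree {EMPTY_TREE}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [f"author {WHO}", f"committer {WHO}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    return hashlib.sha1(b"commit %d\0" % len(body) + body).hexdigest()

def build_chain(messages, root_parent=None):
    # Each commit's parent is the id of the commit built just before it.
    ids, parent = {}, root_parent
    for m in messages:
        parent = commit_id(parent, m)
        ids[m] = parent
    return ids

original = build_chain(list("0123456789abcdef"))   # history I
truncated = build_chain(list("abcdef"))            # a rebuilt as a root commit

changed = [m for m in "abcdef" if original[m] != truncated[m]]
print(changed)   # every commit from a onward gets a new name
```

Passing root_parent=original["9"] to build_chain reproduces the original names for a through f, which shows that the names are preserved only when the ancestry is preserved too.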
