
Git: changing committers info

I'm using this script to modify commits:

rm -rf repo

echo "cloning $1"
git clone "$1" repo

cd repo
git checkout dev

echo "setting remote origin to $2"
git remote set-url origin "$2"

array=( 'email1@gmail.com' 'email2@gmail.com' )
for OLD_EMAIL in "${array[@]}"
do
  echo $OLD_EMAIL
  git filter-branch -f --env-filter '
  CORRECT_NAME="New name"
  CORRECT_EMAIL="new@email.com"
  if [ "$GIT_COMMITTER_EMAIL" = '$OLD_EMAIL' ]
  then
      export GIT_COMMITTER_NAME="$CORRECT_NAME"
      export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
  fi
  if [ "$GIT_AUTHOR_EMAIL" = '$OLD_EMAIL' ]
  then
      export GIT_AUTHOR_NAME="$CORRECT_NAME"
      export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
  fi
  ' --tag-name-filter cat -- --tags
done
echo "Authors list:"
git log --format='%cE' | sort -u
echo -n "Push to destination (y/n)? "
read answer
if echo "$answer" | grep -iq "^y" ;then
    git push
else
    echo Aborted
fi

cd ../

It pulls data from the first repo, modifies the committers' info and pushes to the second repo.

The problem arises if someone commits directly to the second repo. How do I apply those changes to the first repo?

If I'm understanding your question correctly (after reading the comments), your repo currently looks something like this:

[Image: initial state]

The commits in the first repo (a-d) have been modified to create the alternate commits (a'-d'), which were pushed into a second repo and then had additional commits added (e-g).

Re-editing Your History

Because you don't have a 1:1 relationship between the identity information in the two repos, attempting to modify a'-d' with filter-branch in order to restore the original history, while theoretically possible, requires a method that positively identifies the 'original commit' without the one piece of information that positively identifies a commit (its hash).

A commit is basically made up of a few pieces of information:

  1. The hash of the tree
  2. The hash(es) of the commit's parent(s)
  3. The author's identity information
  4. The timestamp of the authoring
  5. The committer's identity information
  6. The timestamp of the commit
  7. The commit message
  8. The size of all that information

All this is hashed to create the unique identifier for your commit. Having altered 2, 3, 5, and 8, we're left with the tree, the timestamps, and the commit message, none of which is necessarily unique.
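To make the list above concrete, here is a minimal sketch that builds a throwaway repository and inspects one commit object. The identity values are placeholders, and it assumes nothing beyond a stock Git installation.

```shell
# Throwaway repository with a single commit (identity values are placeholders)
demo=$(mktemp -d) && cd "$demo" && git init -q
export GIT_AUTHOR_NAME='Demo' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo' GIT_COMMITTER_EMAIL='demo@example.com'
git commit -q --allow-empty -m 'first commit'

# Raw commit object: tree, parent lines (none for a root commit), author and
# committer with their timestamps, then the message
git cat-file -p HEAD

# Size in bytes of that content; it becomes part of the "commit <size>" header
git cat-file -s HEAD

# hash-object prepends the header and hashes the result, reproducing the commit ID
computed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
actual=$(git rev-parse HEAD)
echo "$computed"
echo "$actual"
```

Changing any one of those fields, including the committer's email, changes the content being hashed, which is why the rewritten commits no longer share IDs with the originals.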

Odds are you could get a decent match from just comparing the tree and one of the timestamps, so let's write a little pseudo-code for that scenario.

# create a variable to hold the information from the current commit
# ($TREE and $AUTHOR_TIMESTAMP are placeholders for its tree hash and author timestamp)
pseudoidentifier="$TREE $AUTHOR_TIMESTAMP"

# go to the first repo
cd /path/to/firstrepo

# output the log | grep to search | sed to remove everything from the delimiter on
oldhash=$(git log --format='%H~%T %at' | grep -F "~$pseudoidentifier" | sed 's/~.*$//')

# get the original identity using custom formatted show commands
CORRECT_NAME=$(git show -s --format='%cn' "$oldhash")
CORRECT_EMAIL=$(git show -s --format='%ce' "$oldhash")

# go to the second repo
cd /path/to/secondrepo

export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"

Unfortunately, this would be slow to write and difficult and time-consuming to test, probably requiring you to re-run the entire thing multiple times. Since your ultimate goal is to re-unite the code, there are several other options that will likely cause a lot less headache and be a lot faster, especially if you do need to keep the second repo with the identity updates intact.

Alternate Methods

Without a common history, you can still bring the two into sync using somewhat more manual means. Here are three methods I would recommend in this situation.

A little pre-work

Before we begin, we can check whether the code at d and d' is indeed identical. We can do this by using the git show command:

$ git show -q --format="%T" d
a017285da45ec06fc744815f33a2e22627f4a799
$ git show -q --format="%T" d'
a017285da45ec06fc744815f33a2e22627f4a799

This command will output the tree object the commit points to. If the two trees match, you're dealing with identical code. It is entirely possible to perform the following procedures without a matching code base, but you're likely to have to resolve conflicts in that situation. This step really just tells you how easily the two will come together.
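The check above can be tried out in a scratch repository. In this sketch, git commit-tree stands in for the filter-branch rewrite (an assumption made to keep the example short), and the names and emails are the placeholders from the question.

```shell
demo=$(mktemp -d) && cd "$demo" && git init -q
export GIT_AUTHOR_NAME='Old Name' GIT_AUTHOR_EMAIL='email1@gmail.com'
export GIT_COMMITTER_NAME='Old Name' GIT_COMMITTER_EMAIL='email1@gmail.com'
echo 'hello' > file.txt && git add file.txt && git commit -q -m 'd'
d=$(git rev-parse HEAD)

# Same tree and message but a different identity: a stand-in for the rewritten d'
tree=$(git rev-parse "$d^{tree}")
d_prime=$(GIT_AUTHOR_NAME='New name' GIT_AUTHOR_EMAIL='new@email.com' \
          GIT_COMMITTER_NAME='New name' GIT_COMMITTER_EMAIL='new@email.com' \
          git commit-tree -m 'd' "$tree")

# The commit IDs differ, but both commits point at the same tree
tree_d=$(git show -s --format='%T' "$d")
tree_d_prime=$(git show -s --format='%T' "$d_prime")
echo "d:  $d -> tree $tree_d"
echo "d': $d_prime -> tree $tree_d_prime"
```

The sketch uses -s (--no-patch) to suppress the diff output.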

The Cherry-Pick Method

If the repo you originally used to modify the commits is intact, you can fetch the branches from both into a single repo and attempt to use cherry-pick to copy the commits.

git checkout <branch at d>
git cherry-pick d'...g

(Note that the syntax is 3 dots.) This will apply the changes from each commit after (but not including) d' up to and including g onto d, creating new commits e'-g'.
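Here is a rough, self-contained simulation of that procedure. Both histories live in one scratch repo, an orphan branch plays the role of the second repo, and the branch and tag names (first, second, d-prime) are made up for the example. It uses the two-dot range, which selects the same commits as the three-dot form whenever d' is an ancestor of g.

```shell
demo=$(mktemp -d) && cd "$demo" && git init -q
export GIT_AUTHOR_NAME='Old Name' GIT_AUTHOR_EMAIL='email1@gmail.com'
export GIT_COMMITTER_NAME='Old Name' GIT_COMMITTER_EMAIL='email1@gmail.com'

# "First repo" history: a, then d
echo a > file.txt && git add file.txt && git commit -q -m 'a'
git branch -M first
echo d >> file.txt && git commit -q -am 'd'

# "Second repo" history: d' has the same tree as d but shares no ancestry with it
export GIT_AUTHOR_NAME='New name' GIT_AUTHOR_EMAIL='new@email.com'
export GIT_COMMITTER_NAME='New name' GIT_COMMITTER_EMAIL='new@email.com'
git checkout -q --orphan second
git commit -q -m "d'"
git tag d-prime
echo e > e.txt && git add e.txt && git commit -q -m 'e'
echo g > g.txt && git add g.txt && git commit -q -m 'g'

# Copy everything after d' (here e and g) onto d
git checkout -q first
git cherry-pick d-prime..second
git log --format='%h %an: %s' first
```

The cherry-picked commits keep their original author, so the log shows which identity wrote each change.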

[Image: history after the cherry-pick]

The Patch Method

If you don't have an easy way to bring the changes from both branches into a single repo, you can create a series of patches for the commits on the second repo and apply them to the first.

In the second repo

git checkout <branch of g>
git format-patch --output-directory <dir> d'...g

(Again, the syntax is 3 dots.) This will output a patch file for each commit after (and not including) d' up to and including g. Then copy these files to where you can get at them from the first repo to apply the patches.

In the first repo

git checkout <branch of d>
git am /path/to/patches/*

You'll end up in the same place as with the cherry-pick method.
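The same round trip can be sketched with two genuinely separate scratch repos. The directory names (first, second, patches) and the d-prime tag are invented for the example, and the second repo is seeded by copying the file rather than by running the filter-branch script.

```shell
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME='New name' GIT_AUTHOR_EMAIL='new@email.com'
export GIT_COMMITTER_NAME='New name' GIT_COMMITTER_EMAIL='new@email.com'

# First repo, ending at d
git init -q first
( cd first && echo d > file.txt && git add file.txt && git commit -q -m 'd' )

# Second repo: same content at d', unrelated history, plus one extra commit e
git init -q second
cp first/file.txt second/
(
  cd second &&
  git add file.txt && git commit -q -m "d'" && git tag d-prime &&
  echo e > e.txt && git add e.txt && git commit -q -m 'e' &&
  # One patch file per commit after d'
  git format-patch -q --output-directory "$work/patches" d-prime..HEAD
)

# Apply the patches on top of d in the first repo
( cd first && git am -q "$work"/patches/*.patch )
git -C first log --format='%h %an: %s'
```

Because git am recreates each commit from the patch's author, date, and message, the first repo ends up with the same content as the second even though the commit IDs differ.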

[Image: history after applying the patches]

Create a Graft

If there are a lot of conflicts and you don't need to keep the altered identity information, you can also use git replace to perform a graft.

git replace --graft e d

This will create a copy of commit e with d as the parent (e') and add a reference that tells Git to use the e' commit whenever it attempts to access e, effectively making d the common ancestor for both and allowing you to perform a traditional merge (h).
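As a hedged illustration of the graft, the sketch below again keeps both histories in one scratch repo, with an orphan branch standing in for the second repo. The extra commit x on the first branch is an assumption added so that the final merge produces a real merge commit instead of a fast-forward.

```shell
demo=$(mktemp -d) && cd "$demo" && git init -q
export GIT_AUTHOR_NAME='New name' GIT_AUTHOR_EMAIL='new@email.com'
export GIT_COMMITTER_NAME='New name' GIT_COMMITTER_EMAIL='new@email.com'

# First history, ending at d
echo d > file.txt && git add file.txt && git commit -q -m 'd'
git branch -M first
d=$(git rev-parse HEAD)

# Second history: d' (same tree as d, no shared ancestry), then e and g
git checkout -q --orphan second
git commit -q -m "d'"
echo e > e.txt && git add e.txt && git commit -q -m 'e'
e=$(git rev-parse HEAD)
echo g > g.txt && git add g.txt && git commit -q -m 'g'

# Hypothetical later work on the first history
git checkout -q first
echo x > x.txt && git add x.txt && git commit -q -m 'x'

# Make Git treat e as if its parent were d; d becomes the common ancestor
git replace --graft "$e" "$d"
base=$(git merge-base first second)

# An ordinary merge (h) is now possible
git merge -q -m 'h' second
git log --graph --oneline
```

The replacement lives under refs/replace/, so it only takes effect in other clones if those refs are shared as well.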


Then what?

Keeping two repos without a common history in sync will consistently cause you problems like this, and they will get worse as the two slowly diverge (for example, as you resolve conflicts). Over time, all of these methods will require more and more resources to maintain the two repos.

I would recommend that once the two repos are synchronized, you pick one of them and use that one exclusively from then on. If you require two remotes, just push that repo to both of them. You can then easily use any of the many tried-and-true workflows to maintain the two repos.
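One possible way to push a single repo to two remotes is to give origin two push URLs. This is only one of several setups and not necessarily what was meant above. The bare repos one.git and two.git are local stand-ins for the real remotes.

```shell
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME='New name' GIT_AUTHOR_EMAIL='new@email.com'
export GIT_COMMITTER_NAME='New name' GIT_COMMITTER_EMAIL='new@email.com'

# Two bare repositories standing in for the two remotes
git init -q --bare one.git
git init -q --bare two.git

# The single repo you keep working in
git init -q repo && cd repo
git commit -q --allow-empty -m 'synchronized history'

# Once any push URL is set, only push URLs are used for pushing, so list both
git remote add origin "$work/one.git"
git remote set-url --add --push origin "$work/one.git"
git remote set-url --add --push origin "$work/two.git"

# A single push now updates both remotes
git push -q origin HEAD:refs/heads/dev
git remote -v
```

Fetches still come from the first URL only, which suits a setup where one repo is the single source of truth.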

If this is not an option, I'd recommend being meticulous about frequently checking the trees at the heads of your two repos to verify that they are bit-for-bit identical.

You have two options to get this done:

  1. If you trust the users, you can have them change their email (either for this Git repo only or, by adding --global, for all repos):
 git config user.email email@server.com
  2. If you want to enforce it, use a pre-commit Git hook that you add to the second repository, and have them all pull the new update. More about this can be found here and here.
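A minimal sketch of such a hook follows, assuming new@email.com (the corrected address from the question's script) is the only address to accept. The rest is demo scaffolding in a scratch repo.

```shell
demo=$(mktemp -d) && cd "$demo" && git init -q
# Make sure the identity comes from git config, not from the environment
unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
git config user.name 'New name'

# Install a pre-commit hook that rejects commits from any other author email
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
expected='new@email.com'
# GIT_AUTHOR_IDENT looks like "Name <email> timestamp timezone"
actual=$(git var GIT_AUTHOR_IDENT | sed 's/^.*<\(.*\)>.*$/\1/')
if [ "$actual" != "$expected" ]; then
    echo "commit rejected: author email is '$actual', expected '$expected'" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# The first commit uses the wrong email, the second the expected one
git config user.email email1@gmail.com
if git commit -q --allow-empty -m 'wrong email'; then first_try=accepted; else first_try=rejected; fi
git config user.email new@email.com
if git commit -q --allow-empty -m 'right email'; then second_try=accepted; else second_try=rejected; fi
echo "$first_try $second_try"
```

Keep in mind that files under .git/hooks are not transferred by clone or pull, so each user would have to install the hook in their own clone (for example from a script tracked in the repo).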
