
How do I track Git commits for a time period using GitPython

I want to know how to track whether a branch in Git has been pushed or not. Basically, I want to find all branches created from a develop branch (not from the repo) and be able to check whether any of those branches have been pushed (after some changes) or not.

So far, using GitPython (details here, and here), I could figure out the following:

  from git import Repo

  repo = Repo('directory_of_repo')  # working tree has develop checked out, not master
  paths = set()
  # compare the current commit against where develop pointed 8 days ago
  for item in repo.head.commit.diff('develop@{8 days ago}'):
      if 'a_certain_directory' in item.a_path:
          paths.add(item.a_path)

Here my repo is pointing to develop. Now I am not sure whether I should use HEAD@{8 days ago} or develop@{8 days ago} (P.S. the number of days can vary). In the 8-day example, using HEAD gives 86 unique paths, while using develop gives 15.

What I am looking for is a way to find all the paths that have been changed (i.e. some file inside them was updated) during a certain period of time (for example, the last 8 days) on the develop branch. Which one should I use (HEAD or develop) to track changes over a given time period on develop?

I want to find all branches from a master branch (not repo)

This does not make sense, because branches do not branch from branches.

... and be able to check if any of those branches are pushed (after some changes) or not.

This question could make sense, if rephrased somewhat.

The key to both of these is to understand what branches are, or are not. But before you can do that, you have to realize that not everyone means the same thing by the word branch. In fact, someone can say the word branch more than once in the same sentence and mean two or more different things.

(Related: What exactly do we mean by "branch"?)

What really matters, in any repository, is not the branches. The commits are what matter. Branches are just how you find commits. In this particular case, what I mean by the word branch is a branch name, such as master or develop. These names are specific to this one repository. A clone of this repository, in some other Git, perhaps on some other computer or on some cloud server, has its own branch names, independent of your own.

When you connect two Git repositories to each other, using git fetch or git push, one Git sends commits to the other Git, which receives them. The receiving Git can see some or all of the sending Git's names (branch names, tag names, and other names), but what really matters are the commits. Having sent or received some commits, though, we're left with the problem of finding the commits.

Every commit has a unique hash ID. This hash ID is big and ugly and impossible for humans to remember. Fortunately, each commit remembers some set of previous commit hash IDs, usually exactly one. Git calls this the parent commit. The child commit remembers the hash ID of its parent. When you make a new commit, Git assigns the new commit a new, unique hash ID, and puts the hash ID of the commit you were using, just now, into the new child commit as the child's parent. (And now you're using the child commit.)

Of course, that parent commit is probably itself the child of some previous commit. So the parent remembers its parent (the grandparent of the child you just created), and that commit remembers its parent, and so on. The result is a long, backwards-pointing chain, in which the last commit is perhaps the most interesting:

... <-F <-G <-H

Here H stands for the hash ID of the last commit. Because H holds the hash ID of its parent G, we can use H to find G. Meanwhile, G holds the hash ID of its parent F, so we can use G to find F, and so on. These backwards-pointing arrows mean that we only need to have Git remember for us the hash ID of the last commit in the chain.

This is what branch names are for. They hold the hash ID of the last commit in the chain:

...--F--G--H   <-- master, branch2, branch3

Note that here, all three names identify commit H.

If we git checkout master and then make a new commit, it will get some big ugly hash ID that we'll call I. New commit I will point back to existing commit H as its parent:

...--F--G--H
            \
             I

and now, because we picked master as our branch to git checkout, Git will update the name master to hold the hash ID of new commit I:

...--F--G--H   <-- branch2, branch3
            \
             I   <-- master (HEAD)

The attached (HEAD) is the way for us (and Git) to know which branch name to move when we make new commits. The other two branch names, branch2 and branch3, have not changed. If we git checkout branch3 we get:

...--F--G--H   <-- branch2, branch3 (HEAD)
            \
             I   <-- master

and if we now make a new commit, we get:

             J   <-- branch3 (HEAD)
            /
...--F--G--H   <-- branch2
            \
             I   <-- master

That's almost all there is to it: branch names are just pointers, pointing to commits.
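The pointer picture above can be mimicked in a few lines of plain Python. This is only a toy model, not how Git stores anything: the class names `Commit` and `ToyRepo` are made up for illustration, each commit has at most one parent, and single letters stand in for real hash IDs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """A toy commit: its own ID plus the ID of its single parent (None for the root)."""
    id: str
    parent: Optional[str]


class ToyRepo:
    """Hypothetical stand-in for a repository: commits plus movable branch names."""

    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}
        self.branches: Dict[str, str] = {}  # branch name -> ID of the last commit
        self.head = "master"                # the branch name HEAD is attached to

    def checkout(self, branch: str) -> None:
        self.head = branch

    def branch(self, name: str) -> None:
        # A new branch name simply points at the commit the current branch points at.
        self.branches[name] = self.branches[self.head]

    def commit(self, new_id: str) -> None:
        # The new commit records the current tip as its parent,
        # then only the checked-out branch name moves forward.
        self.commits[new_id] = Commit(new_id, self.branches.get(self.head))
        self.branches[self.head] = new_id

    def history(self, branch: str) -> List[str]:
        # Walk the backwards-pointing chain from the branch tip to the root.
        ids: List[str] = []
        current = self.branches[branch]
        while current is not None:
            ids.append(current)
            current = self.commits[current].parent
        return ids


repo = ToyRepo()
for commit_id in "FGH":
    repo.commit(commit_id)        # ...--F--G--H   <-- master
repo.branch("branch2")
repo.branch("branch3")            # all three names identify H
repo.commit("I")                  # master moves to I
repo.checkout("branch3")
repo.commit("J")                  # branch3 moves to J; branch2 stays at H
print(repo.branches)
print(repo.history("branch3"))
```

Replaying the steps from the diagrams leaves master at I, branch2 at H and branch3 at J, and walking back from branch3 visits J, H, G, F.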

If we have our Git call up some other Git over the Internet-phone, our Git can tell their Git: Hey, I have commit I, do you have it? If they say no, our Git can give them commit I. All Gits in the universe will agree that commit I gets commit I's hash ID, and no other commit gets this hash ID. So they just have to exchange the hash IDs first: the actual contents of the commit (the snapshot of all files) can go later if needed (and can be compressed down to just what the other Git really needs), and it's just the hash IDs that matter here.

Once we have given them I (and H too if they need that, and G too if needed), they may have something that looks like this:

...--F   <-- master
      \
       G--H--I

That is, they have a name, master, pointing to their existing commit F, which has the same hash ID as our existing F and is therefore the same commit with the same files. And now they have G-H-I too, with I pointing back to H, H pointing back to G, and G pointing back to F. But they have no name by which to find commit I.

So, our Git, having sent them commit I (and any earlier commits required), will now send them a polite request: Please, if it's OK, change your name master to point to commit I. It is up to them to decide whether or not to obey this polite request. If they do obey, they will stuff the raw hash ID of commit I, whatever that is, into their name master.

So:

... be able to check if any of those branches are pushed

This still isn't a sensible thing to do as phrased. But what we can do is call them (this other Git) up, ask them about the hash ID in their name master, and compare it to the hash ID in our name master. Are these the same hash IDs? If so, we're in sync. If not, we're not.

Exactly how we're out of sync, we won't know. We'll just know if we are in sync or not. That's probably the question you wanted here. (If you want to know exactly how we're out of sync, if we are out of sync, that's a more difficult question.)

So, to answer this new and different question, we should call up their Git, have them list their branch names and contained raw hash IDs, and compare those to our branch names and contained raw hash IDs. They'll match, or not; or perhaps we'll have branch names they don't and vice versa.
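The comparison just described boils down to comparing two name-to-hash tables. Here is a minimal sketch, assuming both sides have already been collected into plain dicts; the function name `compare_branches` and the short sample hashes are made up for illustration.

```python
from typing import Dict, List


def compare_branches(ours: Dict[str, str], theirs: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify branch names by comparing the hash IDs each repository holds."""
    shared = ours.keys() & theirs.keys()
    return {
        "in_sync": sorted(n for n in shared if ours[n] == theirs[n]),
        "out_of_sync": sorted(n for n in shared if ours[n] != theirs[n]),
        "only_ours": sorted(ours.keys() - theirs.keys()),
        "only_theirs": sorted(theirs.keys() - ours.keys()),
    }


# Placeholder hash IDs; real ones are 40 hexadecimal characters.
ours = {"master": "a1b2c3", "develop": "d4e5f6", "topic": "778899"}
theirs = {"master": "a1b2c3", "develop": "000111", "hotfix": "222333"}
print(compare_branches(ours, theirs))
```

Note that "out_of_sync" says only that the two hash IDs differ. It does not say which side is ahead, which matches the caveat above.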

Before you do any of this, though, consider one last feature of git fetch (implemented by Git anyway: this may or may not be in your Python library, depending on how precisely it mimics Git or if it uses Git directly). I can use git fetch to have my Git connect to your Git:

git remote add my-name-for-you <url-for-your-git>
git fetch my-name-for-you

When I do this, your Git tells my Git all of its branch and tag names. My Git then lets me pick which names I like (the default is that I like all of them), and it gets the last commit from each of your branch names from you, and any earlier commits I need as well, so that I have all of your commits. Then, in my Git, it creates or updates remote-tracking names for each of your branch names:

  • my my-name-for-you/master holds the hash ID that your master holds;
  • my my-name-for-you/develop holds the hash ID that your develop holds;
  • ... and so on, for every branch name you have.

So instead of calling your Git up again every time, I can just use my Git's memory of your Git's branch names.

If my Git's memory is out of date, I just run git fetch my-name-for-you. My Git calls up your Git, updates my memory of your names, and obtains all of your commits.

If I'm giving you commits (if I run git push my-name-for-you master), I'll send you the commits and ask your Git to set your master. Your Git will either obey, or say no and tell me a little bit about why it said no. If your Git obeys, my Git will update my my-name-for-you/master to remember that your master now stores the same hash ID I just sent you.

So, in general, instead of connecting to some other Git, you just inspect the hash IDs of your own origin/* names. The name origin is the default name for the default remote, created when you first made your Git repository by cloning some other Git repository. If necessary, you can run git fetch origin before checking the origin/* names.
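With GitPython, inspecting the origin/* names might look like the sketch below. It assumes each local branch is pushed under the same name on origin, and the function name `out_of_sync_with_remote` is made up for illustration. The function accepts any objects shaped like GitPython's `Head` (`.name`, `.commit.hexsha`) and `RemoteReference` (`.remote_head`, `.commit.hexsha`), so the comparison itself does not need a repository.

```python
from typing import Dict, Iterable, List


def out_of_sync_with_remote(heads: Iterable, remote_refs: Iterable) -> List[str]:
    """Return local branch names whose hash ID differs from the remembered
    remote-tracking hash ID, or that have no remote-tracking name at all."""
    remembered: Dict[str, str] = {
        ref.remote_head: ref.commit.hexsha
        for ref in remote_refs
        if ref.remote_head != "HEAD"  # skip the symbolic origin/HEAD entry
    }
    return sorted(
        head.name
        for head in heads
        if remembered.get(head.name) != head.commit.hexsha
    )


# With GitPython installed, the inputs could come from a real repository
# ('directory_of_repo' is the placeholder path used in the question):
#
#   from git import Repo
#   repo = Repo('directory_of_repo')
#   repo.remotes.origin.fetch()   # optional: refresh the origin/* names first
#   print(out_of_sync_with_remote(repo.heads, repo.remotes.origin.refs))
```

A name in the result means only that the two hash IDs differ or that no origin/ counterpart exists. Whether the local branch is ahead, behind, or diverged is the more difficult question mentioned earlier.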

(For some special tool purposes, it may sometimes be better to use git ls-remote instead of git fetch. This obtains the names and hash IDs, which is the first step of an actual git fetch, but then just prints them out and stops, instead of going on to do the rest of the git fetch work. The downside is that you'll probably eventually need to git fetch, but the upside is that you get a picture that's accurate for the moment, without waiting for git fetch to work. This moment may not be very long, depending on how active the other Git is.)
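If you go the git ls-remote route, its output is one line per name: a hash ID, whitespace, then the full reference name. A small parser such as the following sketch turns that into a branch-name table; the function name `parse_ls_remote_heads` and the sample hashes are made up for illustration.

```python
from typing import Dict


def parse_ls_remote_heads(output: str) -> Dict[str, str]:
    """Map branch name -> hash ID from `git ls-remote` style output,
    keeping only names under refs/heads/."""
    prefix = "refs/heads/"
    branches: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        hash_id, ref = line.split(None, 1)
        if ref.startswith(prefix):
            branches[ref[len(prefix):]] = hash_id
    return branches


# Placeholder output; real hash IDs are 40 hexadecimal characters.
sample = (
    "a1b2c3\trefs/heads/master\n"
    "d4e5f6\trefs/heads/feature/x\n"
    "778899\trefs/tags/v1.0\n"
)
print(parse_ls_remote_heads(sample))

# With GitPython installed, the text could come from the remote itself:
#
#   from git import Repo
#   output = Repo('directory_of_repo').git.ls_remote('--heads', 'origin')
```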

Using git log with the --since option together with --name-only, we can list every file changed since a given time:

  from datetime import date, timedelta
  from git import Git

  def _get_the_changed_components(self):
      # self.repo_directory is a working tree with `develop` checked out
      g = Git(self.repo_directory)
      # self.time_period is the number of days to look back
      since = date.today() - timedelta(days=self.time_period)
      # --pretty=tformat: suppresses commit headers so only file names remain
      loginfo = g.log('--since={}'.format(since), '--pretty=tformat:', '--name-only')
      for path in loginfo.split('\n'):
          if path:  # skip blank separator lines
              self.paths.add(path)
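The question also wanted only the paths under a certain directory. Below is a minimal sketch of that post-processing step, kept separate from the Git call so it can be tried on plain text. The function name `changed_paths` is made up for illustration, and `a_certain_directory` is the placeholder from the question.

```python
from typing import Set


def changed_paths(log_output: str, directory: str = "") -> Set[str]:
    """Collect unique file paths from `git log --name-only` style output.

    If `directory` is given, keep only paths that have it as one of their
    directory components (the file name itself is not considered).
    """
    paths: Set[str] = set()
    for line in log_output.splitlines():
        path = line.strip()
        if not path:
            continue  # blank separator lines between commits
        if directory and directory not in path.split("/")[:-1]:
            continue
        paths.add(path)
    return paths


sample = "src/a_certain_directory/x.py\n\nREADME.md\nsrc/a_certain_directory/x.py\n"
print(changed_paths(sample, "a_certain_directory"))

# With GitPython installed, the text could come from the log call above.
# Naming `develop` explicitly walks that branch regardless of what is checked out:
#
#   from git import Git
#   output = Git('directory_of_repo').log(
#       '--since=8 days ago', '--pretty=tformat:', '--name-only', 'develop')
```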
