
How can I get a list of all pull requests for a repo through the GitHub API?

I want to obtain a list of all pull requests on a repo through the GitHub API.

I've followed the instructions at http://developer.github.com/v3/pulls/ but when I query /repos/:owner/:repo/pulls it consistently returns fewer pull requests than are displayed on the website.

For example, when I query the torvalds/linux repo I get 9 open pull requests (there are 14 on the website). If I add ?state=closed I get a different set of 11 closed pull requests (the website shows around 20).

Does anyone know where this discrepancy arises, and if there's any way to get a complete list of pull requests for a repo through the API?

You can get all pull requests (closed, open, merged) through the state parameter.

Just set state=all in the GET query, like this:

https://api.github.com/repos/:owner/:repo/pulls?state=all

For more info: check the Parameters table at https://developer.github.com/v3/pulls/#list-pull-requests

Edit: As per Tomáš Votruba's comment:

The default value is per_page=30 and the maximum is per_page=100. To get more than 100 results, you need to call the endpoint multiple times: "&page=1", "&page=2", and so on.
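The page-by-page loop described in that comment could be sketched as follows, using only the Python standard library. The names `fetch_page_from_github` and `fetch_all_pulls` are made up for this illustration, and the sketch assumes unauthenticated access, which GitHub rate-limits; a real script would probably send a token.

```python
import json
import urllib.request


def fetch_page_from_github(owner, repo, page, per_page=100):
    """Fetch one page of pull requests (any state) from the REST API."""
    url = (f"https://api.github.com/repos/{owner}/{repo}/pulls"
           f"?state=all&per_page={per_page}&page={page}")
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fetch_all_pulls(owner, repo, fetch_page=fetch_page_from_github):
    """Request page 1, 2, 3, ... until an empty page comes back."""
    pulls = []
    page = 1
    while True:
        batch = fetch_page(owner, repo, page)
        if not batch:  # an empty list means we are past the last page
            break
        pulls.extend(batch)
        page += 1
    return pulls
```

Calling `fetch_all_pulls("torvalds", "linux")` would then walk every page. The `fetch_page` argument is injectable only so the loop can be exercised without network access.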

PyGithub ( https://github.com/PyGithub/PyGithub ), a Python library to access the GitHub API v3, enables you to get paginated resources.

For example,

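from github import Github

# Replace the placeholders with your own token and repository.
g = Github(login_or_token="YOUR_TOKEN", per_page=100)
r = g.get_repo("OWNER/REPO")  # full name, or the numeric repository ID

for pull in r.get_pulls(state='all'):
    print(pull.number, pull.title)  # each item is a PullRequest object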

See the documentation ( http://pygithub.readthedocs.io/en/latest/index.html ).

  1. If you want to retrieve all pull requests (or commits, comments, issues, etc.), you have to use pagination. https://developer.github.com/v3/#pagination

  2. The GET request "pulls" returns only open pull requests by default.

  3. If you want to get all pull requests, either set the parameter state to all, or use issues.

Extra information

If you need other data from GitHub, such as issues, you can identify pull requests among the issues and then retrieve each pull request, whether it is closed or open. Fetching the pull request itself also gives you a few more attributes (mergeable, merged, merge-commit-sha, number of commits, etc.). If an issue is a pull request, it will contain a pull_request attribute; otherwise, it is just an issue.

From the API: https://developer.github.com/v3/pulls/#labels-assignees-and-milestones

"Every pull request is an issue, but not every issue is a pull request. For this reason, “shared” actions for both features, like manipulating assignees, labels and milestones, are provided within the Issues API."

Edit: I just found that issues behaves similarly to pull requests, so you need to set the state parameter to all to retrieve everything.
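As a rough illustration of telling the two apart, the sketch below splits items returned by the Issues API into plain issues and pull requests by checking for the `pull_request` key. The function name and the sample data are invented for the example; only the presence of that key is taken from the API.

```python
def split_issues_and_pulls(items):
    """Separate Issues API results into (issues, pull_requests).

    Items that are pull requests carry a "pull_request" key;
    plain issues do not.
    """
    issues, pulls = [], []
    for item in items:
        if "pull_request" in item:
            pulls.append(item)
        else:
            issues.append(item)
    return issues, pulls


# Hypothetical, heavily trimmed sample of an Issues API response.
sample = [
    {"number": 10, "title": "Crash on start"},
    {"number": 11, "title": "Fix crash", "pull_request": {"url": "..."}},
]
issues, pulls = split_issues_and_pulls(sample)
```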

There is a way to get a complete list and you're doing it. What are you using to communicate with the API? I suspect you may not be doing something correctly. For example (there are only 13 open pull requests currently), using my API wrapper (github3.py) I get all of the open pull requests. An example of how to do it without my wrapper in Python is:

import requests
r = requests.get('https://api.github.com/repos/torvalds/linux/pulls')
len(r.json()) == 13

and I can also get that result (vaguely) in cURL by counting the results myself: curl https://api.github.com/repos/torvalds/linux/pulls .

If, however, you run into a repository with more than 30 pull requests (the default page size), that is an entirely different issue, but it is almost certainly not what you're encountering now.

The search API should help: https://help.github.com/enterprise/2.2/user/articles/searching-issues/

q = repo:org/name is:pr ...
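One way such a query might be turned into a request URL is sketched below. The helper name `build_pr_search_url` is hypothetical, and the sketch assumes the REST search endpoint `/search/issues`; note that the search API caps the total number of results it will return, so it may not suit very large repositories.

```python
from urllib.parse import urlencode


def build_pr_search_url(repo_full_name, state=None, page=1, per_page=100):
    """Build a search URL for pull requests in one repository.

    state may be None (any state) or a qualifier such as "open",
    "closed" or "merged", which is added as is:<state>.
    """
    qualifiers = [f"repo:{repo_full_name}", "is:pr"]
    if state is not None:
        qualifiers.append(f"is:{state}")
    query = urlencode({
        "q": " ".join(qualifiers),
        "per_page": per_page,
        "page": page,
    })
    return f"https://api.github.com/search/issues?{query}"


url = build_pr_search_url("torvalds/linux")
```

The matching pull requests are found under the `items` key of the JSON response, and the same `page` and `per_page` parameters apply as for the pulls endpoint.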

GitHub provides a "Link" header which specifies the previous, next and last URLs to fetch the values from. For example, a Link header in the response looks like this:

<https://api.github.com/repos/:owner/:repo/pulls?state=all&page=2>; rel="next", <https://api.github.com/repos/:owner/:repo/pulls?state=all&page=15>; rel="last"

The URL marked rel="next" points to the next set of values.
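A small sketch of reading that header in Python, using a regular expression rather than a full header parser. The function name `parse_link_header` is made up here; if you use the requests library, its `response.links` attribute exposes similar information.

```python
import re

# Matches one <url>; rel="name" entry of a Link header.
LINK_PATTERN = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def parse_link_header(header):
    """Return a dict mapping each rel value (e.g. "next") to its URL."""
    return {rel: url for url, rel in LINK_PATTERN.findall(header)}


header = (
    '<https://api.github.com/repos/:owner/:repo/pulls?state=all&page=2>; rel="next", '
    '<https://api.github.com/repos/:owner/:repo/pulls?state=all&page=15>; rel="last"'
)
links = parse_link_header(header)
```

A client can keep requesting `links["next"]` until the header no longer contains a `next` entry.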

You can also use the GraphQL API v4 to request the pull requests for a repo. If you don't specify the states field, pull requests in every state are returned (up to the first: 100 limit per request):

{
  repository(name: "material-ui", owner: "mui-org") {
    pullRequests(first: 100, orderBy: {field: CREATED_AT, direction: DESC}) {
      totalCount
      nodes {
        title
        state
        author {
          login
        }
        createdAt
      }
    }
  }
}

Try it in the explorer
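Since a single GraphQL request returns at most 100 nodes, fetching every pull request means following the cursor. The sketch below shows one possible loop in Python. `run_query` is a hypothetical callable that you supply: it is assumed to POST the query and variables to the GraphQL endpoint with your token and return the `data` part of the response.

```python
QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequests(first: 100, after: $cursor,
                 orderBy: {field: CREATED_AT, direction: DESC}) {
      pageInfo { hasNextPage endCursor }
      nodes { title state createdAt author { login } }
    }
  }
}
"""


def fetch_all_pull_requests(run_query, owner, name):
    """Follow pageInfo.endCursor until hasNextPage is false."""
    nodes = []
    cursor = None  # None requests the first page
    while True:
        data = run_query(QUERY, {"owner": owner, "name": name, "cursor": cursor})
        connection = data["repository"]["pullRequests"]
        nodes.extend(connection["nodes"])
        if not connection["pageInfo"]["hasNextPage"]:
            return nodes
        cursor = connection["pageInfo"]["endCursor"]
```

Passing the transport in as a function keeps the paging logic independent of whichever HTTP library and authentication scheme you use.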

With GitHub's new official CLI (command line interface):

gh pr list --repo OWNER/REPO

which would produce something like:

Showing 2 of 2 pull requests in OWNER/REPO

#62  Doing something    that-weird-branch-name
#58  My PR title        wasnt-inspired-branch

See additional details and options and installation instructions.

Here's a snippet of Python code that retrieves information of all pull requests from a specific GitHub repository and parses it into a nice DataFrame:

import pandas as pd

organization = 'pvlib'
repository = 'pvlib-python'
state = 'all'  # other options include 'closed' or 'open'

page = 1  # initialize page number to 1 (first page)
dfs = []  # create empty list to hold individual dataframes
# Note: looping is necessary because each request returns at most 30 entries by default
while True:
    url = f"https://api.github.com/repos/{organization}/{repository}/pulls?" \
        f"state={state}&page={page}"
    dfi = pd.read_json(url)
    if dfi.empty:
        break
    dfs.append(dfi)  # add dataframe to list of dataframes
    page += 1  # Advance onto the next page

df = pd.concat(dfs, axis='rows', ignore_index=True)

# Create a new column with usernames
df['username'] = pd.json_normalize(df['user'])['login']
