
How to scan GitHub repositories?

Is there a way to build a Node.js app that scans remote GitHub repositories? I need to extract a specific file (e.g. the README) from each remote GitHub repository I have access to and download it to a specific folder. Or should the Node.js app clone each repo first?

You can clone any of a GitHub user's public repositories with Node.js. Note that the GitHub API requires a User-Agent header on every request.

Dependencies: request (npm package), child_process (built into Node.js)

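const request = require("request");
const cProcess = require("child_process");

const g_username = "afulsamet";
const u_agent = "Test User Agent";

// List the user's public repositories; the GitHub API rejects requests without a User-Agent.
request.get(`https://api.github.com/users/${g_username}/repos`, { headers: { "User-Agent": u_agent } }, function (err, res, body) {
    if (err) throw err;
    if (res.statusCode !== 200) throw new Error(`GitHub API returned ${res.statusCode}`);
    JSON.parse(body).forEach(x => {
        // git clone {clone_url} {folder_name}; clone_url is the HTTPS URL (GitHub no longer serves git:// URLs)
        cProcess.spawn("git", ["clone", x.clone_url, x.name], { stdio: "inherit" });
    });
});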
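The repository listing is paginated (30 entries per page by default, at most 100), so the single request above can miss repositories for larger accounts. And if only one file is needed, a shallow clone avoids downloading the full history. A minimal sketch of both ideas follows, assuming Node.js 18+ for the global fetch(); the helper names reposPageUrl, shallowCloneArgs and cloneAll are made up for illustration.

```javascript
const { spawn } = require("child_process");
const path = require("path");

// Hypothetical helper: URL for one page of a user's public repositories.
function reposPageUrl(username, page, perPage = 100) {
  return `https://api.github.com/users/${encodeURIComponent(username)}/repos?per_page=${perPage}&page=${page}`;
}

// Hypothetical helper: git arguments for a shallow clone (latest commit only)
// into targetDir/<repo name>.
function shallowCloneArgs(repo, targetDir) {
  return ["clone", "--depth", "1", repo.clone_url, path.join(targetDir, repo.name)];
}

// Walk the pages until an empty one comes back, cloning every repository found.
// Assumes Node.js 18+ (global fetch) and git on the PATH.
async function cloneAll(username, targetDir, userAgent = "Test User Agent") {
  for (let page = 1; ; page++) {
    const res = await fetch(reposPageUrl(username, page), {
      headers: { "User-Agent": userAgent },
    });
    if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
    const repos = await res.json();
    if (repos.length === 0) break;
    for (const repo of repos) {
      spawn("git", shallowCloneArgs(repo, targetDir), { stdio: "inherit" });
    }
  }
}
```

Calling cloneAll("afulsamet", "repos") would start one git process per repository; for accounts with many repositories you would probably want to run the clones one at a time instead.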

If you just need one file from each repo, and the repos are all public, you can simply make an HTTPS request to the raw file URL, which has this format: https://raw.githubusercontent.com/{username}/{repo}/{branch}/{pathtofile}. A simple example:

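const https = require('https'); // the URL is https://, which the plain http module rejects

// The {branch} segment must match a branch that exists in the repo (often main or master).
https.get('https://raw.githubusercontent.com/nodejs/node/master/README.md', function(response) {
  // do something with response, pipe to another file etc.
});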
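To cover the "download them to a specific folder" part of the question, the response can be piped into a file. This is only a sketch under the URL format above: rawFileUrl and downloadRawFile are hypothetical names, and the "<repo>-<filename>" naming scheme is an assumption chosen so that README files from different repos do not overwrite each other.

```javascript
const fs = require("fs");
const path = require("path");
const https = require("https");

// Hypothetical helper: build the raw-content URL from its four parts.
function rawFileUrl(username, repo, branch, pathToFile) {
  return `https://raw.githubusercontent.com/${username}/${repo}/${branch}/${pathToFile}`;
}

// Hypothetical helper: download one file into destDir as "<repo>-<filename>".
// Resolves with the path of the written file; rejects on any non-200 status.
function downloadRawFile(username, repo, branch, pathToFile, destDir) {
  return new Promise((resolve, reject) => {
    https.get(rawFileUrl(username, repo, branch, pathToFile), (response) => {
      if (response.statusCode !== 200) {
        response.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status ${response.statusCode}`));
        return;
      }
      fs.mkdirSync(destDir, { recursive: true });
      const target = path.join(destDir, `${repo}-${path.basename(pathToFile)}`);
      const file = fs.createWriteStream(target);
      response.pipe(file);
      file.on("finish", () => resolve(target));
      file.on("error", reject);
    }).on("error", reject);
  });
}
```

For example, downloadRawFile("nodejs", "node", "master", "README.md", "readmes") would try to write readmes/node-README.md, provided that branch exists in the repository.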

Use the GitHub API to get the URL of a repository's README file. Call the function with the repository in the format owner/repo. This example uses the Python requests package:

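import requests

headers = {'Accept': 'application/vnd.github+json'}  # assumed headers; add an Authorization token here for private repos

def get_readmeurl(repo):
  readmeurl = 'https://api.github.com/repos/' + repo + '/readme'
  readmecontent = requests.get(readmeurl, headers=headers)
  readmecontent.raise_for_status()  # fail loudly on 404 or rate limiting
  return readmecontent.json()['download_url']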
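Since the question asks about Node.js, the same README endpoint can be called from JavaScript. The following is a rough equivalent rather than a tested port: it assumes Node.js 18+ for the global fetch(), the function names are invented for illustration, and the optional token parameter is an assumption for reaching private repositories.

```javascript
// Hypothetical helper: README endpoint for a repository given as "owner/repo".
function readmeApiUrl(repo) {
  return `https://api.github.com/repos/${repo}/readme`;
}

// Hypothetical helper: request headers. The token is optional and only
// matters for private repositories or higher rate limits.
function apiHeaders(token) {
  const headers = {
    "User-Agent": "Test User Agent",
    Accept: "application/vnd.github+json",
  };
  if (token) headers.Authorization = `Bearer ${token}`;
  return headers;
}

// Resolve with the README's download_url, as the Python function does.
// Assumes Node.js 18+ (global fetch).
async function getReadmeUrl(repo, token) {
  const res = await fetch(readmeApiUrl(repo), { headers: apiHeaders(token) });
  if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
  const body = await res.json();
  return body.download_url;
}
```

One advantage of this endpoint over a hard-coded raw URL is that you do not need to know the default branch or the exact file name of the README in advance.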


 