
Randomly sample GitHub repositories

I'm looking for a way to randomly sample repos from GitHub. The goal is to perform some data analysis on the sample.

What I would like to do is sample by the repository's id: draw an int between 0 and 2.7 million and find the associated repo. Once I have the username/repo-name, I'll use the API to get details.

The problem is that I do not know how to search by repo id. Any suggestions? I'm open to web scraping or Python solutions.

You can use Python to access the GitHub v3 API (see "Most suitable python library for Github API v3").

You can also list public GitHub repos starting from a given id: GET /repositories takes a since parameter, the integer ID of the last repository that you've seen, and returns repositories with larger ids. That provides a roundabout way to reach a repo by its id.
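As a rough sketch of that idea, the snippet below draws a random id and asks GET /repositories for the first public repo at or after it. The helper names (build_url, fetch_page, sample_repo) and the User-Agent string are made up for illustration, and the 2.7 million upper bound is simply the figure from the question, so treat it as an assumption to adjust.

```python
import json
import random
import urllib.request

API_URL = "https://api.github.com/repositories"
MAX_REPO_ID = 2_700_000  # assumption: upper bound quoted in the question


def build_url(since):
    """Return the listing URL for public repositories with id > since."""
    return f"{API_URL}?since={since}"


def fetch_page(since):
    """Fetch one page of public repositories whose id is greater than `since`."""
    request = urllib.request.Request(
        build_url(since),
        # "repo-sampler" is a placeholder User-Agent chosen for this sketch.
        headers={"Accept": "application/vnd.github+json", "User-Agent": "repo-sampler"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def sample_repo(rng=random, fetch=fetch_page, max_id=MAX_REPO_ID):
    """Pick a random id and return the first repository listed from it, or None."""
    target = rng.randint(1, max_id)
    # Use since=target - 1 so a repository with id == target is included.
    page = fetch(target - 1)
    return page[0] if page else None
```

Calling repo = sample_repo() and reading repo["full_name"] should give the username/repo-name pair to pass to the rest of the API. Ids that belong to private or deleted repos are skipped by the listing, so the repo returned may have a larger id than the one drawn, which makes the sample only approximately uniform. Unauthenticated requests are also rate limited, so a large sample will likely need an access token.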
