
Python: append JSON to a JSON file in a while loop

I'm trying to get all users' information from the GitHub API using the Python Requests library. Here is my code:

import requests
import json

url = 'https://api.github.com/users'
token = "my_token"
headers = {'Authorization': 'token %s' % token}

r = requests.get(url, headers=headers)
users = r.json()
with open('users.json', 'w') as outfile:
    json.dump(users, outfile)

So far I can dump the first page of users into a JSON file. I can also find the 'next' page's URL:

next_url = r.links['next'].get('url')
r2 = requests.get(next_url, headers=headers)
users2 = r2.json()

Since I don't know how many pages there are yet, how can I append the 2nd, 3rd, ... pages to 'users.json' sequentially in a while loop, as fast as possible?

Thanks!

Append the data you get from each requests query to a list and move on to the next query.

Once you have all of the data you want, concatenate it and write it to a file (or build it into an object) in one go. You could also use threading to run multiple queries in parallel, but the API will most likely rate-limit you. A sketch of this approach is shown below.
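Here is roughly what the collect-then-dump approach could look like, reusing the url, token, and headers from the question (a minimal sketch, not a tested implementation):

import requests
import json

url = 'https://api.github.com/users'
token = "my_token"
headers = {'Authorization': 'token %s' % token}

all_users = []
while url:
    r = requests.get(url, headers=headers)
    r.raise_for_status()            # fail fast on auth errors or rate limiting
    all_users.extend(r.json())      # each page is a JSON array of user objects
    # GitHub omits the 'next' link relation once you reach the last page
    url = r.links.get('next', {}).get('url')

# a single write at the end keeps users.json one valid JSON array
with open('users.json', 'w') as outfile:
    json.dump(all_users, outfile)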

First, you need to open the file in 'a' (append) mode, otherwise each subsequent write will overwrite everything. Note that appending one json.dump per page means the file holds several JSON documents (one per line below), not one big array:

import requests
import json

url = 'https://api.github.com/users'
token = "my_token"
headers = {'Authorization': 'token %s' % token}

with open('users.json', 'a') as outfile:
    while True:
        r = requests.get(url, headers=headers)
        users = r.json()
        json.dump(users, outfile)
        outfile.write('\n')  # newline-separate pages so the file can be parsed line by line
        # GitHub omits the 'next' link relation on the last page
        url = r.links.get('next', {}).get('url')
        if not url:
            break
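To read the results back later, parse one JSON array per line instead of loading the whole file as a single document (a sketch, assuming the newline-separated format written above):

import json

users = []
with open('users.json') as infile:
    for line in infile:
        users.extend(json.loads(line))

print(len(users))  # total number of user objects across all pages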
