python requests json from url

I am using Python to scrape a URL, as in the code below:

import requests
from bs4 import BeautifulSoup
import json

n_index = 10
base_link = 'http://xxx.xxx./getinfo?range=10&district_id=1&index='
for i in range (1,n_index+1):
    link = base_link+str(i)
    r = requests.get(link)
    pid = r.json()
    print (pid)

It returns ten results, like this:

{'product_info': [{'pid': '1', 'product_type': '2'}]}
{'product_info': [{'pid': '2', 'product_type': '2'}]}
{'product_info': [{'pid': '3', 'product_type': '2'}]}
{'product_info': [{'pid': '4', 'product_type': '2'}]}
{'product_info': [{'pid': '5', 'product_type': '2'}]}
{'product_info': [{'pid': '6', 'product_type': '2'}]}
{'product_info': [{'pid': '7', 'product_type': '2'}]}
{'product_info': [{'pid': '8', 'product_type': '2'}]}
{'product_info': [{'pid': '9', 'product_type': '2'}]}
{'product_info': [{'pid': '10', 'product_type': '2'}]}

I then want to save the resulting 10 lines to a JSON file, as in the code below:

with open('sylist.json', 'w') as outfile:
    json.dump(r.json(), outfile, indent=4)

But only one result is saved to the local JSON file. Can anyone help me resolve this? Thanks a lot.

Typically, you can write each result inside the loop while opening the file only once, instead of opening and closing it on every iteration. Try the following:

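with open('sylist.json', 'a+') as outfile:
    for i in range(1, n_index+1):
        link = base_link+str(i)
        r = requests.get(link)
        # json.dump() writes to the file itself and returns None, so build the
        # string with json.dumps() and write it as one line per result
        outfile.write("{}\n".format(json.dumps(r.json())))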
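When results are written one per line, the file holds a sequence of JSON objects (often called JSON Lines) rather than a single JSON document, so calling `json.load` on the whole file will fail. Below is a minimal sketch of the round trip. The `records` list is hypothetical data shaped like the output in the question, and the file name `sylist.jsonl` is chosen only for illustration.

```python
import json

# Hypothetical stand-in for the parsed responses (r.json()) collected in the loop,
# shaped like the output shown in the question.
records = [{"product_info": [{"pid": str(i), "product_type": "2"}]} for i in range(1, 4)]

# Write one compact JSON object per line. json.dumps returns a string,
# whereas json.dump writes to the file itself and returns None.
with open("sylist.jsonl", "w", encoding="utf-8") as outfile:
    for record in records:
        outfile.write(json.dumps(record) + "\n")

# Read the file back, parsing each non-empty line separately.
with open("sylist.jsonl", encoding="utf-8") as infile:
    loaded = [json.loads(line) for line in infile if line.strip()]

print(len(loaded))
```

If a single JSON document is needed instead, collecting the results in a list and dumping it once after the loop is the simpler route.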

Let me extend Frank's answer a bit. You are sending the request inside the for loop, which means that on every iteration the values of r and pid are overwritten. As a result, when you dump to the output file after the loop has finished, r.json() holds only the contents of the very last iteration/request. I would suggest applying one of the following to address your issue:

  1. Move the writing step inside the for loop (or move the loop inside the with block, as suggested in the answer by Frank AK).
  2. Instead of overwriting pid each time, append it to a list inside the for loop and dump the list once after the loop:

     my_list = []
     for i in range(1, n_index+1):
         link = base_link+str(i)
         r = requests.get(link)
         pid = r.json()
         my_list.append(pid)

     with open('sylist.json', 'w') as outfile:
         json.dump(my_list, outfile, indent=4)
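If a flat list of products is more convenient than a list of whole responses, the same idea can be taken one step further. The following is only a sketch: it assumes every response carries a `product_info` list, as in the output shown in the question, and the names `collect_products` and `fake_fetch` are made up for illustration. The fetch function is passed in as a parameter so that the collection logic can be tried without network access.

```python
import json


def collect_products(fetch, base_link, n_index):
    """Call fetch(link) for index 1..n_index and merge the 'product_info' lists."""
    products = []
    for i in range(1, n_index + 1):
        data = fetch(base_link + str(i))
        # Assumption: each response is a dict with a 'product_info' list.
        products.extend(data.get("product_info", []))
    return products


def fake_fetch(link):
    """Hypothetical stand-in for requests.get(link).json(); no network access."""
    index = link.rsplit("=", 1)[-1]
    return {"product_info": [{"pid": index, "product_type": "2"}]}


base_link = "http://xxx.xxx./getinfo?range=10&district_id=1&index="
products = collect_products(fake_fetch, base_link, 10)

# Write once, after the loop, so the file is a single valid JSON document.
with open("sylist.json", "w") as outfile:
    json.dump(products, outfile, indent=4)
```

Against the real endpoint, something like `lambda link: requests.get(link).json()` could be passed as `fetch` in place of `fake_fetch`.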
