Loop for pagination params in requests.get

I want to parse vacancies, and my goal is to parse the vacancies of just one company.

import requests
from tqdm import tqdm_notebook
import pandas as pd
r = requests.get('https://api.hh.ru/vacancies?employer_id=80').json() 
r

If I do so, I get only 20 vacancies by default (page 0), though there are 488:

'found': 488

and

'page': 0,
'pages': 25,
'per_page': 20

I can write a loop:

vac = []
for i in tqdm_notebook(range(0, 25)):
    vac.append(requests.get("https://api.hh.ru/vacancies?employer_id=80", params={'page': i}).json())

But I get just 25 vacancies (one for every page). Or I can do:

vac = []
for j in tqdm_notebook(range(0, 20)):
    for i in tqdm_notebook(range(0, 500)):
        vac.append(requests.get("https://api.hh.ru/vacancies?employer_id=80", params={'page': i, 'per_page': j}).json())

But this is a very expensive way: we repeat a lot of requests. How can I fix it?

You will need to manually set the page and per_page parameters, per the API's documentation. However, you don't need a loop for the per_page parameter. It should be a static number (20):

vac = []
for i in tqdm_notebook(range(0, 25)):
    vac.append(requests.get("https://api.hh.ru/vacancies?employer_id=80", params={'page': i, 'per_page':20}).json())
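
Note that each element appended in the loop above is a whole page response, not a single vacancy, which is why the list ends up with 25 entries. A minimal sketch of flattening those responses, assuming each page keeps its vacancies in a list under an `items` key (that key is not shown in the question's output, so treat it as an assumption; `flatten_items` is a hypothetical helper name):

```python
def flatten_items(responses, key="items"):
    """Collect the per-page lists stored under `key` into one flat list.

    `key` defaults to "items", which is an assumption about the response
    layout; adjust it if the API names the list differently.
    """
    vacancies = []
    for page in responses:
        # .get() tolerates a page that lacks the key (e.g. an error payload)
        vacancies.extend(page.get(key, []))
    return vacancies
```

With the `vac` list built above, `flatten_items(vac)` should then hold one entry per vacancy, and passing the result to `pd.DataFrame(...)` is one way to get a table out of it.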

Also, consider making the range of pages dynamic, based on the first page of the pagination results.
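
One way this could look, as a sketch rather than a definitive implementation: request page 0 first, read its `pages` field (the one shown in the question's output), and only then loop over the remaining pages. The names `fetch_page` and `fetch_all` are hypothetical, and the `items` key holding the vacancies is an assumption about the response layout.

```python
URL = "https://api.hh.ru/vacancies"


def fetch_page(page, per_page=20, employer_id=80):
    """Request one page of results and return the decoded JSON dict."""
    # Imported here so the paging logic below can be exercised with a
    # stand-in fetcher even where requests is not installed.
    import requests

    resp = requests.get(
        URL,
        params={"employer_id": employer_id, "page": page, "per_page": per_page},
    )
    resp.raise_for_status()  # fail loudly on HTTP errors
    return resp.json()


def fetch_all(fetch=fetch_page):
    """Fetch page 0, read the page count from it, then fetch the rest."""
    first = fetch(0)
    # Assumption: each response keeps its vacancies in an "items" list.
    vacancies = list(first.get("items", []))
    # "pages" comes from the first response, so nothing is hard-coded.
    for page in range(1, first["pages"]):
        vacancies.extend(fetch(page).get("items", []))
    return vacancies
```

Calling `fetch_all()` issues one request per page and no more, so the loop keeps working if the employer's vacancy count changes. The `fetch` argument exists only so the paging logic can be tried without touching the network.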
