Looping over pagination parameters in requests.get

Question

I want to parse vacancies. My goal is to parse the vacancies of just one company.

import requests
from tqdm import tqdm_notebook
import pandas as pd
r = requests.get('https://api.hh.ru/vacancies?employer_id=80').json() 
r

If I do this, I get only 20 vacancies by default (page 0), although there are 488:

'found': 488

and

'page': 0,
'pages': 25,
'per_page': 20

I can write a loop:

vac = []
for i in tqdm_notebook(range(0, 25)):
    vac.append(requests.get("https://api.hh.ru/vacancies?employer_id=80", params={'page': i}).json())

But I get just 25 vacancies (one for every page). Or I can do this:

vac = []
for j in tqdm_notebook(range(0, 20)):
    for i in tqdm_notebook(range(0, 500)):
        vac.append(requests.get("https://api.hh.ru/vacancies?employer_id=80", params={'page': i, 'per_page': j}).json())

But this is a very expensive approach that repeats a lot of requests. How can I fix it?

Answer 1

You will need to set the page and per_page parameters manually, per the API's documentation. However, you don't need a loop for the per_page parameter; it should be a static number (20):

vac = []
for i in tqdm_notebook(range(0, 25)):
    vac.append(requests.get("https://api.hh.ru/vacancies?employer_id=80", params={'page': i, 'per_page':20}).json())
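
Note that each element of vac built this way is one decoded page of results, not one vacancy, which is likely why the list has 25 entries. A minimal sketch of flattening the pages follows. It assumes each page keeps its vacancies in a list under an 'items' key; that key is not shown in the question's output, so check it against the actual response. The helper name flatten_items is hypothetical.

```python
def flatten_items(pages, key="items"):
    """Collect the per-page result lists into one flat list of vacancies."""
    vacancies = []
    for page in pages:
        # Assumption: each decoded page stores its vacancies under `key`.
        # Pages without that key contribute nothing instead of raising.
        vacancies.extend(page.get(key, []))
    return vacancies
```

If the assumption holds, flatten_items(vac) would give one entry per vacancy rather than one per page.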

Also, consider making the range of pages to iterate over dynamic, based on the first page of pagination results.
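
One way to make the page range dynamic, as a rough sketch: request page 0 first, read the 'pages' field that appears in the question's output, and only then request the remaining pages. The function names fetch_page and fetch_all_pages are hypothetical, and the timeout value is an arbitrary choice, not something the API requires.

```python
import requests

API_URL = "https://api.hh.ru/vacancies"


def fetch_page(page, employer_id=80, per_page=20):
    """Request a single page of results and return the decoded JSON dict."""
    response = requests.get(
        API_URL,
        params={"employer_id": employer_id, "page": page, "per_page": per_page},
        timeout=10,  # arbitrary value chosen for this sketch
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def fetch_all_pages(fetch=fetch_page):
    """Read the page count from page 0, then fetch the remaining pages."""
    first = fetch(0)
    pages = [first]
    # 'pages' is the total page count reported by the API (25 in the question).
    for page in range(1, first["pages"]):
        pages.append(fetch(page))
    return pages
```

Calling fetch_all_pages() returns a list of page dicts, like vac above, without hard-coding 25. The fetch argument exists only so the loop can be exercised with a stand-in function instead of real HTTP requests.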

Answer 1: accepted, 2018-02-17 15:53:35
