How do I loop through multiple pages of an API request and append the results?

Question

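I am using the Indeed API from Rapid API to collect job data. The provided code snippet returns only one page of results. I would like to know how to set up a for loop that iterates over multiple pages and appends the results together.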

import requests

url = "https://indeed11.p.rapidapi.com/"


payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

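As the code above shows, the key "page" is set to the value 1. How would I parameterize that value, and how would I construct the for loop so that each page's results are appended?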

Answer 1

You can paginate by updating the "page" entry of the payload inside a for loop over range():

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}
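
results = []  # one parsed response body per page
for page in range(1, 11):  # pages 1 through 10
    payload["page"] = page  # overwrite the page number before each request
    response = requests.post(url, json=payload, headers=headers)
    response.raise_for_status()  # stop on an HTTP error instead of storing a bad page
    results.append(response.json())  # assumes the API returns a JSON body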
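
If each page comes back as a list of records, the per-page results can be merged into one flat list. The sketch below is only an illustration: `collect_pages` and `fake_fetch` are hypothetical names, and the assumption that a page is a list is mine, since the actual response shape of the Indeed API is not shown in this thread. The request itself is passed in as a function so the merging logic can be tried without network access.

```python
from typing import Callable, Iterable


def collect_pages(fetch_page: Callable[[int], list], pages: Iterable[int]) -> list:
    """Call fetch_page for each page number and merge the results into one list."""
    combined = []
    for page in pages:
        # extend() adds the records themselves, not the page list as a single item
        combined.extend(fetch_page(page))
    return combined


def fake_fetch(page: int) -> list:
    """Hypothetical stand-in for the real request: two made-up records per page."""
    return [f"job-{page}-{n}" for n in range(2)]


all_jobs = collect_pages(fake_fetch, range(1, 4))  # pages 1, 2 and 3
print(len(all_jobs))  # 6
```

With the real API, `fetch_page` could be a small function that posts `{**payload, "page": page}` and returns `response.json()`, provided the body really is a list.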

Answer 2

You can try this:

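# url, payload and headers are the ones defined in the question
max_page = 100
result = {}  # maps page number -> response object
for i in range(1, max_page + 1):
    payload.update({"page": i})
    try:
        result[i] = requests.request("POST", url, json=payload, headers=headers)
    except requests.RequestException:
        # skip pages whose request fails (timeout, connection error, ...)
        continue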
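
The dictionary built above holds raw response objects, so a further step is needed to get at the data. A minimal sketch of that step follows, relying only on the `ok` attribute and `json()` method that `requests.Response` provides. `parse_successful` and `StubResponse` are hypothetical names; the stub exists only so the example runs without calling the API.

```python
def parse_successful(result: dict) -> dict:
    """Return {page: parsed JSON} for responses whose status indicates success."""
    return {page: resp.json() for page, resp in sorted(result.items()) if resp.ok}


class StubResponse:
    """Hypothetical stand-in mimicking the two requests.Response members used here."""

    def __init__(self, status_code, data):
        self.status_code = status_code
        self._data = data

    @property
    def ok(self):
        # requests.Response.ok is likewise True for status codes below 400
        return self.status_code < 400

    def json(self):
        return self._data


result = {
    1: StubResponse(200, ["a"]),
    2: StubResponse(500, None),  # failed page, dropped by parse_successful
    3: StubResponse(200, ["b"]),
}
parsed = parse_successful(result)
print(parsed)  # {1: ['a'], 3: ['b']}
```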

Answer 3

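I think you can do this with a while loop. To make it work, you need code that detects when there are no more pages to read, but that should be feasible. Here is what I would do: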

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

responses = []
while not no_more_pages(): # no_more_pages() is a placeholder for code that detects when there are no more pages to read
    responses.append(requests.request("POST", url, json=payload, headers=headers))
    payload['page'] += 1

Once the loop finishes, you can use the responses list to access the data.
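
One possible way to fill in the placeholder is to stop as soon as a page comes back empty. That stopping rule is an assumption, not documented behavior of this API: it may signal the last page differently, for example with an error status or a total-count field. In the sketch below, `fetch_until_empty`, `fake_fetch` and the `max_pages` safety cap are all hypothetical, and the request is injected as a function so the loop can be tested offline.

```python
from typing import Callable


def fetch_until_empty(fetch_page: Callable[[int], list], start: int = 1,
                      max_pages: int = 100) -> list:
    """Collect pages until one is empty or max_pages pages have been requested."""
    pages = []
    page = start
    while page < start + max_pages:  # cap guards against an endless loop
        data = fetch_page(page)
        if not data:  # assumed end-of-results signal: an empty page
            break
        pages.append(data)
        page += 1
    return pages


def fake_fetch(page: int) -> list:
    """Hypothetical stand-in: three pages of results, then nothing."""
    return [f"job-{page}"] if page <= 3 else []


pages = fetch_until_empty(fake_fetch)
print(len(pages))  # 3
```

The cap matters because an API that keeps returning the last page, rather than an empty one, would otherwise never end the loop.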
