Python loop: requests.get() only returns the first iteration

Question

I'm trying to scrape a table from multiple webpages and store the results in a list. Instead, the list contains the results from the first webpage three times.

import pandas as pd
import requests
from bs4 import BeautifulSoup

dflist = []
for i in range(1,4):
    s = requests.Session()
    res = requests.get(r'http://www.ironman.com/triathlon/events/americas/ironman/world-championship/results.aspx?p=' + str(i) + 'race=worldchampionship&rd=20181013&agegroup=Pro&sex=M&y=2018&ps=20#axzz5VRWzxmt3')
    soup = BeautifulSoup(res.content,'lxml')
    table = soup.find_all('table')
    dfs = pd.read_html(str(table))
    dflist.append(dfs)
    s.close()

print(dflist)

Answer 1

You left out the & after '?p=' + str(i), so your requests all have p set to ${NUMBER}race=worldchampionship, which ironman.com presumably can't make sense of and just ignores. Insert a & at the beginning of 'race=worldchampionship'.
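To see how the missing & changes the query string, here is a small check using only the standard library's urllib.parse. It is a sketch, not part of the original code: the `query_params` helper is a hypothetical name, and the URL is shortened to two parameters for readability.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical helper for illustration: return the query string as a dict.
def query_params(url):
    return parse_qs(urlsplit(url).query)

base = "http://www.ironman.com/triathlon/events/americas/ironman/world-championship/results.aspx"
i = 2

# Original concatenation: no '&' between the page number and 'race='.
buggy = base + "?p=" + str(i) + "race=worldchampionship&rd=20181013"
# Corrected concatenation: '&' separates the two parameters.
fixed = base + "?p=" + str(i) + "&race=worldchampionship&rd=20181013"

print(query_params(buggy))
print(query_params(fixed))
```

In the first URL, p parses as '2race=worldchampionship' and there is no race parameter at all. In the second, p is '2' and race is 'worldchampionship'.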

To prevent this sort of mistake in the future, you can pass the URL's query parameters as a dict to the params keyword argument like so:

    params = {
        "p": i,
        "race": "worldchampionship",
        "rd": "20181013", 
        "agegroup": "Pro",
        "sex": "M",
        "y": "2018",
        "ps": "20",
    }

    res = requests.get(r'http://www.ironman.com/triathlon/events/americas/ironman/world-championship/results.aspx#axzz5VRWzxmt3', params=params)
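If you want to confirm what requests will send before hitting the site, you can prepare the request without sending it and inspect the resulting URL. This is a minimal sketch: `build_results_url` and `BASE_URL` are hypothetical names introduced here, and the `#axzz5VRWzxmt3` fragment is omitted because fragments are not sent to the server.

```python
import requests

BASE_URL = "http://www.ironman.com/triathlon/events/americas/ironman/world-championship/results.aspx"

# Hypothetical helper for illustration: build the URL for one results page
# without performing any network request.
def build_results_url(page):
    params = {
        "p": page,
        "race": "worldchampionship",
        "rd": "20181013",
        "agegroup": "Pro",
        "sex": "M",
        "y": "2018",
        "ps": "20",
    }
    # Request(...).prepare() encodes params into the query string,
    # the same way requests.get(url, params=params) would.
    prepared = requests.Request("GET", BASE_URL, params=params).prepare()
    return prepared.url

urls = [build_results_url(page) for page in range(1, 4)]
for url in urls:
    print(url)
```

Each printed URL should carry a different p value, with every parameter separated by &.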

Answer 1 (2 votes, accepted) 2018-10-31 22:10:14
