[英]Python requests.get() loop returns nothing
當試圖抓取本網站的多個頁面時,我沒有得到任何內容作為回報。 我通常會檢查以確保我創建的所有lists
的長度相等,但所有列表都返回len = 0
。
我用過類似的代碼來抓取其他網站,那么為什么這段代碼不能正常工作呢?
我嘗試過的一些解決方案,但沒有達到我的目的: requests.Session()
解決方案如本答案中所建議, .json
建議。
for page in range(100, 350):
page = requests.get("https://www.ghanaweb.com/GhanaHomePage/election2012/parliament.constituency.php?ID=" + str(page) + "&res=pm")
page.encoding = page.apparent_encoding
if not page:
pass
else:
soup = BeautifulSoup(page.text, 'html.parser')
ghana_tbody = soup.find_all('tbody')
sleep(randint(2,10))
for container in ghana_tbody:
#### CANDIDATES ####
candidate = container.find_all('div', class_='can par')
for data in candidate:
cand = data.find('h4')
for info in cand:
if cand is not None:
can2 = info.get_text()
can.append(can2)
#### PARTY NAMES ####
partyn = container.find_all('h5')
for data in partyn:
if partyn is not None:
partyn2 = data.get_text()
pty_n.append(partyn2)
#### CANDIDATE VOTES ####
votec = container.find_all('td', class_='votes')
for data in votec:
if votec is not None:
votec2 = data.get_text()
cv1.append(votec2)
#### CANDIDATE VOTE SHARE ####
cansh = container.find_all('td', class_='percent')
for data in cansh:
if cansh is not None:
cansh2 = data.get_text()
cvs1.append(cansh2)
#### TOTAL VOTES ####`
tfoot = soup.find_all('tr', class_='total')
for footer in tfoot:
fvote = footer.find_all('td', class_='votes')
for data in fvote:
if fvote is not None:
fvote2 = data.get_text()
fvoteindiv = [fvote2]
fvotelist = fvoteindiv * (len(pty_n) - len(vot1))
vot1.extend(fvotelist)
在此先感謝您的幫助!
我做了一些簡化的改變。 需要改變的主要變化是:
ghana_tbody = soup.find_all('table', class_='canResults')
can2 = info # not info.get_text()
我只針對第 112 頁對此進行了測試; 人生如此短暫。
import requests
from bs4 import BeautifulSoup
from random import randint
from time import sleep
can = []
pty_n = []
cv1 = []
cvs1 = []
vot1 = []
START_PAGE = 112
END_PAGE = 112
for page in range(START_PAGE, END_PAGE + 1):
page = requests.get("https://www.ghanaweb.com/GhanaHomePage/election2012/parliament.constituency.php?ID=112&res=pm")
page.encoding = page.apparent_encoding
if not page:
pass
else:
soup = BeautifulSoup(page.text, 'html.parser')
ghana_tbody = soup.find_all('table', class_='canResults')
sleep(randint(2,10))
for container in ghana_tbody:
#### CANDIDATES ####
candidate = container.find_all('div', class_='can par')
for data in candidate:
cand = data.find('h4')
for info in cand:
can2 = info # not info.get_text()
can.append(can2)
#### PARTY NAMES ####
partyn = container.find_all('h5')
for data in partyn:
partyn2 = data.get_text()
pty_n.append(partyn2)
#### CANDIDATE VOTES ####
votec = container.find_all('td', class_='votes')
for data in votec:
votec2 = data.get_text()
cv1.append(votec2)
#### CANDIDATE VOTE SHARE ####
cansh = container.find_all('td', class_='percent')
for data in cansh:
cansh2 = data.get_text()
cvs1.append(cansh2)
#### TOTAL VOTES ####`
tfoot = soup.find_all('tr', class_='total')
for footer in tfoot:
fvote = footer.find_all('td', class_='votes')
for data in fvote:
fvote2 = data.get_text()
fvoteindiv = [fvote2]
fvotelist = fvoteindiv * (len(pty_n) - len(vot1))
vot1.extend(fvotelist)
print('can = ', can)
print('pty_n = ', pty_n)
print('cv1 = ', cv1)
print('cvs1 = ', cvs1)
print('vot1 = ', vot1)
印刷:
can = ['Kwadwo Baah Agyemang', 'Daniel Osei', 'Anyang - Kusi Samuel', 'Mary Awusi']
pty_n = ['NPP', 'NDC', 'IND', 'IND']
cv1 = ['14,966', '9,709', '8,648', '969', '34292']
cvs1 = ['43.64', '28.31', '25.22', '2.83', '\xa0']
vot1 = ['34292', '34292', '34292', '34292']
請務必先將 START_PAGE 和 END_PAGE 分別更改為 100 和 350。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.