[英]how to scrape website in which page url is not changed but the next button add data below the same url page
[英]How to add a loop to scrape the next page of a website
我下面的代碼有效,但我希望它做同樣的事情,但是對於 URL 變量的下一頁,這將通過根據頁面添加數字 1、2、3 來完成。
該代碼實質上是抓取具有各種視頻縮略圖的網站,然后返回每個視頻的鏈接。 我希望它為每個可用頁面執行此操作
from bs4 import BeautifulSoup
import requests
import re
import urllib.request
from urllib.request import Request, urlopen
URL = "domain.com/"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
endof = soup.find_all('div',class_="th-image")
links = [a['href'] for a in soup.find_all('a', href=True)]
endoflinks = links[8:-8]
index = 0
for a in endoflinks:
index+=1
dwnlink = "domain.com"+ endoflinks[index]
r = requests.get(dwnlink)
f = open("output.txt", "a")
print(r.url, file=f)
f.close()
這應該可以幫助您開始:
URL = "domain.com/"
for i in list(range(0,10)):
print("domain.com/"+str(i))
r = requests.get(URL+str(i))
f = open("output.txt", "a")
print(r.url, file=f)
f.close()
domain.com/0
domain.com/1
domain.com/2
domain.com/3
domain.com/4
domain.com/5
domain.com/6
domain.com/7
domain.com/8
domain.com/9
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.