
Sending data fields through requests.post

from bs4 import BeautifulSoup as bs  # import the required libraries
import requests

# initial URL, which contains the form where we can enter our preferences
urls1 = "https://www.makemytrip.com/hotels/"

# the data parameters to send
data = {'checkin': '08152020',
        'city': 'CTGOI',
        'checkout': '08162020',
        'roomStayQualifier': '2e0e',
        'locusId': 'CTGOI',
        'country': 'IN',
        'locusType': 'city',
        'searchText': 'Goa, India',
        'visitorId': '5c68c2fb-0551-4ef2-8dae-1a55bb744e66'}
req = requests.post(urls1, data=data, headers={'User-Agent': 'XYZ/3.0'})
page_soup = bs(req.content, "html.parser")
print(page_soup)

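What I actually want is to scrape the hotels that match the data fields above. That is why I send the data parameters to the initial URL with requests.post: I expect the response object to contain the next page, listing the hotels that meet these criteria.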

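The site you are scraping performs its search with the GET method.

It also uses a different URL for the hotel search: https://www.makemytrip.com/hotels/hotel-listing/

With your example modified slightly to send a GET request instead of a POST request, we are able to get the hotel listing results.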

import requests
from bs4 import BeautifulSoup as bs

# A browser-like User-Agent header seems to be required for this site.
headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"}

data = {'checkin': '08192020',
        'city': 'CTGOI',
        'checkout': '08202020',
        'roomStayQualifier': '2e0e',
        'locusId': 'CTGOI',
        'country': 'IN',
        'locusType': 'city',
        'searchText': 'Goa, India',
        'visitorId': 'aaab4f61-2069-4033-bb97-0791f0f70'}

url = 'https://www.makemytrip.com/hotels/hotel-listing/'

# Passing the search dictionary as params makes requests encode it into the
# URL's query string, which is the form makemytrip.com understands.
# The timeout guards against makemytrip.com not responding.
req = requests.get(url, params=data, headers=headers, timeout=5)

page_soup = bs(req.content, 'html.parser')

# Find all the divs in the result with a class name of "listingRow".
listing_results = page_soup.find_all('div', class_='listingRow')

# Loop through the results to pull more details out of each listing.
for listing in listing_results:
    name = listing.find("p", itemprop="name")
    if name is not None:  # skip rows that have no name element
        print(name.get_text())
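
To see what the params argument contributes without contacting the site, here is a minimal sketch that builds the same kind of URL with only the standard library. The helper name build_search_url is hypothetical, and the sketch assumes the site accepts ordinary form-style query encoding; the query string that requests produces from params should be equivalent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url, params):
    """Append params to base_url as a URL-encoded query string.

    Hypothetical helper for illustration only; requests does the
    equivalent step internally when given params=...
    """
    return base_url + "?" + urlencode(params)


url = build_search_url(
    "https://www.makemytrip.com/hotels/hotel-listing/",
    {"checkin": "08192020", "city": "CTGOI", "searchText": "Goa, India"},
)
print(url)

# Decoding the query string recovers the original search fields.
print(parse_qs(urlsplit(url).query))
```

With a POST request the same dictionary would travel in the request body instead, which is why the site's GET-based search never saw it.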
