UnicodeEncodeError: 'ascii' codec can't encode characters in position 30-31: ordinal not in range(128)
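I am currently learning web scraping, for testing purposes only! I don't know why this error occurs. Could you look at my code, tell me what I did wrong, and help me fix it?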
from urllib.request import urlopen
from bs4 import BeautifulSoup as bs
from urllib.request import HTTPError
import sys

# Fetch the search results page (the destination parameter contains non-ASCII characters)
html = urlopen("https://www.expedia.co.kr/Hotel-Search?destination=서울&startDate=2019.06.06&endDate=2019.06.07&rooms=1&adults=2")
soup = bs(html, "html.parser")
section = soup.find_all(class_="cf flex-1up flex-listing flex-theme-light cols-nested")
card = soup.find_all(class_="flex-card")
infoprice = soup.find_all(class_="flex-content info-and-price MULTICITYVICINITY avgPerNight")
rows = soup.find_all(class_="flex-area-primary")
hotelinfo = soup.find_all('ul', class_="hotel-info")
hotelTitles = soup.find_all('li', class_="hotelTitle")

# Print the name of each hotel found on the page
for hotelTitle in hotelTitles:
    hotellist = hotelTitle.find('h4', class_="hotelName fakeLink")
    h = hotellist.get_text().strip()
    print(h)
Why not use requests instead:
import requests
from bs4 import BeautifulSoup

html = requests.get("https://www.expedia.co.kr/Hotel-Search?destination=서울&startDate=2019.06.06&endDate=2019.06.07&rooms=1&adults=2")
soup = BeautifulSoup(html.content, 'html.parser')
I find that it avoids possible encoding problems. In your case, the rest of the code stays the same.
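If you would rather stay with urllib, the likely cause is that urlopen sends the URL unchanged and the request line has to be ASCII, so the raw Korean characters in the destination parameter cannot be encoded. A minimal sketch of one workaround is to percent-encode the query string before opening the URL. The helper name build_search_url is made up for illustration; the parameter names and values are the ones from the question.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.expedia.co.kr/Hotel-Search"


def build_search_url(destination, start_date, end_date, rooms=1, adults=2):
    """Return the search URL with every query value percent-encoded as UTF-8.

    Hypothetical helper, not part of any library.
    """
    params = {
        "destination": destination,
        "startDate": start_date,
        "endDate": end_date,
        "rooms": rooms,
        "adults": adults,
    }
    # urlencode percent-encodes non-ASCII characters, so the result is pure ASCII
    return BASE_URL + "?" + urlencode(params)


url = build_search_url("서울", "2019.06.06", "2019.06.07")
print(url)
```

The resulting string contains only ASCII characters, so it can be passed to urlopen without triggering the encoding step that fails in the question.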
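You can mimic the POST request that the page makes, using requests. You get back a JSON response containing all the hotel data. See a sample JSON response here.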
import requests

headers = {'User-Agent' : 'Mozilla/5.0', 'Referer' : 'https://www.expedia.co.kr/Hotel-Search?destination=%EC%84%9C%E'}
r = requests.post("https://www.expedia.co.kr/Hotel-Search-Data?responsive=true&destination=%EC%84%9C%EC%9A%B8&startDate=2019.06.06&endDate=2019.06.07&rooms=1&adults=2&timezoneOffset=3600000&langid=1042&hsrIdentifier=HSR&?1555393986866", headers = headers, data = '').json()

# Print the name of every hotel in the JSON response
for hotel in r['searchResults']['retailHotelModels']:
    print(hotel['retailHotelInfoModel']['hotelName'])
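The loop above assumes every key is present in the response. If the structure might vary, a more defensive sketch could use dict.get with defaults. The function name extract_hotel_names and the sample payload are hypothetical; only the key names come from the snippet above.

```python
def extract_hotel_names(payload):
    """Collect hotel names from a decoded JSON payload, skipping incomplete entries.

    Hypothetical helper; it assumes the nesting used in the answer's loop.
    """
    models = payload.get("searchResults", {}).get("retailHotelModels", [])
    names = []
    for hotel in models:
        name = hotel.get("retailHotelInfoModel", {}).get("hotelName")
        if name:
            names.append(name)
    return names


# Made-up stand-in for the decoded response, not real Expedia data
sample = {
    "searchResults": {
        "retailHotelModels": [
            {"retailHotelInfoModel": {"hotelName": "Example Hotel A"}},
            {"retailHotelInfoModel": {}},  # entry without a name is skipped
            {"retailHotelInfoModel": {"hotelName": "Example Hotel B"}},
        ]
    }
}

print(extract_hotel_names(sample))
```

With the real request, you would pass the decoded response (r in the snippet above) instead of sample.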