
Python Web Scraper print issue

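I have written a web scraper in Python. At the end I want to print "Bakerloo: " followed by the info downloaded from the website, as you can see in the code, but the output always shows only the info from the website and drops the "Bakerloo: " string. I have not been able to find a fix.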

import urllib
import urllib.request
from bs4 import BeautifulSoup
import sys

url = 'https://tfl.gov.uk/tube-dlr-overground/status/'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page,"html.parser")

try:
   bakerlooInfo = (soup.find('li',{"class":"rainbow-list-item bakerloo "}).find_all('span')[2].text)
except:
   bakerlooInfo = (soup.find('li',{"class":"rainbow-list-item bakerloo disrupted expandable "}).find_all('span')[2].text)

bakerloo = bakerlooInfo.replace('\n','')
print("Bakerloo     : " + bakerloo)

I would use a CSS selector instead, targeting the element that has the disruption-summary class:

import requests
from bs4 import BeautifulSoup

url = 'https://tfl.gov.uk/tube-dlr-overground/status/'
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")

service = soup.select_one('li.bakerloo .disruption-summary').get_text(strip=True)
print("Bakerloo: " + service)

Prints:

Bakerloo: Good service

(This uses requests in place of urllib.)
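
The thread never establishes why the prefix seemed to vanish. One possible explanation, offered here only as an assumption, is that the scraped text contained a carriage return as well as newlines: replace('\n','') leaves '\r' in place, and many terminals move the cursor back to the start of the line on '\r', so the characters that follow overwrite the prefix. The sketch below uses a made-up raw string (not real TfL output) to compare that approach with stripping all surrounding whitespace, which is what get_text(strip=True) does to each text fragment.

```python
# Hypothetical scraped text; the real page content is not shown in the thread.
raw = "\r\n        Good service\r\n"

# The question's approach: only "\n" is removed, so "\r" and the padding remain.
newline_only = raw.replace("\n", "")

# Stripping removes every leading and trailing whitespace character.
stripped = raw.strip()

# repr() makes the otherwise invisible control characters visible.
print(repr("Bakerloo: " + newline_only))
print(repr("Bakerloo: " + stripped))
```

Printing with repr() is a quick way to check whether the value really contains control characters before deciding how to clean it.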


Note that if you just want to list all stations with their disruption summaries, do it this way:

import requests
from bs4 import BeautifulSoup

url = 'https://tfl.gov.uk/tube-dlr-overground/status/'
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")

for station in soup.select("#rainbow-list-tube-dlr-overground-tflrail-tram ul li"):
    station_name = station.select_one(".service-name").get_text(strip=True)
    service_info = station.select_one(".disruption-summary").get_text(strip=True)

    print(station_name + ": " + service_info)

Prints:

Bakerloo: Good service
Central: Good service
Circle: Good service
District: Good service
Hammersmith & City: Good service
Jubilee: Good service
Metropolitan: Good service
Northern: Good service
Piccadilly: Good service
Victoria: Good service
Waterloo & City: Good service
London Overground: Good service
TfL Rail: Good service
DLR: Good service
Tram: Good service
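
One caveat: select_one returns None when nothing matches, so the loop above would raise AttributeError on any list item that lacks one of the two elements. A minimal sketch of a guarded version follows. The HTML is a made-up fragment that only imitates the class names used above, not the real TfL markup, and line_statuses is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Made-up fragment imitating the class names above; not the real TfL markup.
SAMPLE_HTML = """
<ul>
  <li class="rainbow-list-item bakerloo">
    <span class="service-name">Bakerloo</span>
    <span class="disruption-summary"> Good service </span>
  </li>
  <li class="rainbow-list-item central">
    <span class="service-name">Central</span>
  </li>
</ul>
"""


def line_statuses(html):
    """Return (name, status) pairs, skipping items that lack either element."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.select("li.rainbow-list-item"):
        name = item.select_one(".service-name")
        status = item.select_one(".disruption-summary")
        if name is None or status is None:
            continue  # select_one returns None when nothing matches
        results.append((name.get_text(strip=True), status.get_text(strip=True)))
    return results


for name, status in line_statuses(SAMPLE_HTML):
    print(name + ": " + status)
```

Working from an inline fragment keeps the example independent of the live page, whose markup may have changed since the answer was written.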

