I am currently working on a Python project in which the script visits a website (https://service.berlin.de/dienstleistung/120686/), clicks the link "Termin berlinweit suchen und buchen", and then keeps refreshing the page (after a specified time) until there is a change on the webpage. A change is detected by comparing the hash of the page before and after the refresh. If there has been a change, I should receive an email. The problem is that there have been clear changes to the site, but I do not receive an email. The code is a working example.
I have tried:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time, hashlib, smtplib, ssl, requests
driver = webdriver.Firefox(executable_path=r'C:\Users\Me\AppData\Local\Programs\Python\Python37\geckodriver.exe') # Loads Geckodriver.exe
driver.get("https://service.berlin.de/dienstleistung/120686/") # Loads initial page
appointmentPageLink = WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "/html[1]/body[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[4]/div[3]/div[1]/div[2]/div[9]/div[1]/p[1]/a[1]")))
driver.execute_script("arguments[0].click();", appointmentPageLink) # Clicks the link for appointments
while True:
    currentHash = hashlib.sha256(driver.page_source).hexdigest() # Get hash
    time.sleep(100) # Wait
    driver.refresh() # Refresh page
    newHash = hashlib.sha256(driver.page_source).hexdigest() # Get new hash to compare
    if newHash == currentHash: # Time to compare hashes!
        continue # If the hashes are the same, continue
    else: # If the hashes are different, send email
        port = 587 # For starttls
        smtp_server = "smtp.gmail.com"
        sender_email = "OMITTED" # Enter your address
        receiver_email = "OMITTED" # Enter receiver address
        password = "OMITTED" # Enter sender email password
        message = """\
Subject: New change detected for Anmeldung!
Visit https://service.berlin.de/dienstleistung/120686/ now!""" # Add a message
        context = ssl.create_default_context() # Send the email!
        with smtplib.SMTP(smtp_server, port) as server:
            server.ehlo() # Can be omitted
            server.starttls(context=context)
            server.ehlo() # Can be omitted
            server.login(sender_email, password)
            server.sendmail(sender_email, receiver_email, message)
            server.quit()
Error Message:
Traceback (most recent call last):
  File "C:/Users/Me/PycharmProjects/ServiceBerlin/ServiceBEMonitor.py", line 14, in <module>
    currentHash = hashlib.sha256(driver.page_source).hexdigest() # Get hash
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 2780: ordinal not in range(128)
In one iteration of your while loop, you send a GET request to your desired URL using requests (which has no relation to selenium) and store the result in appointmentPage. You then calculate its hash, refresh the driver, and calculate the hash of the same appointmentPage, which has not been modified at all: driver.refresh() refreshes your driver, not appointmentPage, which is an HTTP response from the requests library. Hence currentHash is always equal to newHash within one iteration. The values of newHash and currentHash probably change from one iteration to the next, but they are always equal within a single iteration of your while loop, and so no mail is sent.
Now to solve your problem, we first need to get the source code of the page inside your driver, then refresh the page and get the source code again and check their respective hashes. So maybe the following code can work:
while True:
    # hashlib needs bytes, so encode the page source before hashing
    currentHash = hashlib.sha256(driver.page_source.encode("utf-8")).hexdigest()
    time.sleep(100)
    driver.refresh()
    newHash = hashlib.sha256(driver.page_source.encode("utf-8")).hexdigest()
    if newHash == currentHash: # Time to compare hashes!
        continue # If the hashes are the same, keep waiting
    else:
        pass # If the hashes are different, send the mail here
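The UnicodeEncodeError in the question comes from passing text containing "ü" straight to hashlib, which works on bytes. As a minimal sketch of one way to structure this, the hashing and comparing can be pulled into small helpers. The names page_hash and wait_for_change, and the max_checks parameter, are hypothetical and not part of the original code; the page source and the refresh action are passed in as callables so the logic can be tried without a browser.

```python
import hashlib
import time


def page_hash(source):
    """Return the SHA-256 hex digest of a page's source text."""
    # hashlib works on bytes, so encode the text explicitly.
    # UTF-8 can represent non-ASCII characters such as "ü".
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def wait_for_change(get_source, refresh, interval=100.0, max_checks=None):
    """Refresh repeatedly until the page hash differs from the first one seen.

    get_source: callable returning the current page source as text
    refresh:    callable that reloads the page
    interval:   seconds to wait before each refresh
    max_checks: optional limit on refreshes (hypothetical helper parameter);
                None means keep checking until a change is seen
    Returns True if a change was detected, False if max_checks ran out.
    """
    baseline = page_hash(get_source())
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        refresh()
        checks += 1
        if page_hash(get_source()) != baseline:
            return True
    return False
```

With the driver from the question, this could be called as wait_for_change(lambda: driver.page_source, driver.refresh), and the email sent once it returns True. Note that hashing the whole page source assumes the page is otherwise identical between refreshes; if it is not, hashing only the relevant element's text may be more reliable.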