I am scraping numerous similar webpages; however, the same information lies beneath different XPaths on some of the pages.
Here are the two XPaths that I am trying to alternate between.
city_e = WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.XPATH, "//div/h4"))
)
alternative_name_e = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//*[@id='SFbizctc53fcd34ec260b1442c7bd7b4']/div/div[1]"))
)
FULL CODE BELOW:
results = []
for i in range(6):
    links = WebDriverWait(driver, 60).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".SFpne a"))
    )
    links[i].click()
    time.sleep(4)
    name_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//div/h3"))
    )
    alternative_name_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//*[@id='SFbizctc53fcd34ec260b1442c7bd7b4']/div/div[1]"))
    )
    city_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//div/h4"))
    )
    jobTitle_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//div[@itemprop='contactPoint']/div"))
    )
    address_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//div[@itemprop='contactPoint']/address"))
    )
    cell_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//div/a[@class='SFbizctcphn']"))
    )
    email_e = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//*[@id='SFbizpne0']/div[3]/div/a"))
    )
    full_member_content = {
        'Member_Name': [alternative_name_e.text],
        'member_Name': [name_e.text],
        'member_City': [city_e.text],
        'member_JobTitle': [jobTitle_e.text],
        'member_Address': [address_e.text],
        'member_Cell': [cell_e.text],
        'member_Email': [email_e.text]
    }
    results.append(full_member_content)
    time.sleep(4)
    driver.back()
print(results)
Just curious whether there is a try/except or an if statement that I could add to accomplish this.
Second challenge: if this is an easy fix, I am also curious whether I could re-run city_e = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div/h4"))) when more than one h4 tag appears.
Thank you!
You can't wait for the non-existence of an element. For your first question, you're looking at:
Forcing a sleep:
time.sleep(4)
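Then doing something similar to what you did with links, but with find_elements, which returns an empty list instead of raising when nothing matches (an EC condition on its own is only a callable, so it has no length to check):
exists = driver.find_elements(By.XPATH, "your selector")
Before doing anything else, check the length of exists. If it is greater than 0, the element is there; otherwise, use the other XPath.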
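The length check described above can be sketched as a small helper. This is only a sketch under assumptions: `first_text` is a hypothetical name, not part of Selenium, and it relies on `driver.find_elements(by, value)` returning an empty list when nothing matches. It tries each locator in order and returns the text of the first match.

```python
def first_text(driver, locators, default=""):
    """Return the text of the first element found by any locator, tried in order.

    `locators` is a list of (by, value) tuples, e.g. (By.XPATH, "//div/h4").
    `first_text` is a hypothetical helper, not a Selenium API.
    """
    for by, value in locators:
        # find_elements returns an empty list (no exception) when nothing matches
        matches = driver.find_elements(by, value)
        if len(matches) > 0:
            return matches[0].text
    # None of the locators matched on this page
    return default


# Possible usage inside the loop, after the forced sleep
# (By comes from selenium.webdriver.common.by):
# city_or_name = first_text(driver, [
#     (By.XPATH, "//div/h4"),
#     (By.XPATH, "//*[@id='SFbizctc53fcd34ec260b1442c7bd7b4']/div/div[1]"),
# ])
```

Because the helper never waits, it should only be called once the page has had time to load, which is what the sleep is for.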
For your bonus question, do the same thing for the h4 tag: locate all matching elements and go over them one by one, just like you're doing now with links.
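For the h4 case, a sketch along the same lines might collect the text of every match instead of only the first. Again, `all_texts` is a hypothetical helper name, and skipping empty strings is an assumption about what you want in the results.

```python
def all_texts(driver, by, value):
    """Return the non-empty, stripped text of every element matching the locator.

    `all_texts` is a hypothetical helper, not a Selenium API.
    """
    texts = []
    # Go over the matches one by one, like the loop over links
    for element in driver.find_elements(by, value):
        text = element.text.strip()
        if text:  # skip elements with no visible text (an assumption)
            texts.append(text)
    return texts


# Possible usage when building the result dictionary
# (By comes from selenium.webdriver.common.by):
# 'member_City': all_texts(driver, By.XPATH, "//div/h4"),
```

With this, 'member_City' holds one entry per h4 on the page, and an empty list when there is none.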