I am extracting the Group ID from an internal site.
The URLs are read from a CSV file located on my desktop.
The code below extracts the Group ID without any problem as long as every URL is valid.
But I want this code to run to the end even when there are invalid URLs in the CSV file, and for those rows it should write "invalid url" in my output XLS file on the desktop.
Below is my code:
from selenium import webdriver
import pandas as pd
import time
import os

c = 1
user = os.getlogin()
path = "C:/Users/" + user + "/Desktop/groupid.csv"
path1 = "C:/Users/" + user + "/Desktop/groupid.xlsx"
print(path)

reader = pd.read_csv(path)
driver = webdriver.Chrome('C:/chromedriver.exe')
driver.maximize_window()

reader['groupid'] = ''
for line in reader['URL']:
    print(line)
    driver.get(line)
    if c == 1:
        time.sleep(60)
    time.sleep(5)
    groupid = driver.find_element_by_xpath('//*[@id="Xpath"]').text
    print(groupid)
    # .loc avoids pandas' chained-assignment warning when writing back
    reader.loc[reader['URL'] == line, 'groupid'] = groupid
    c = c + 1

reader.to_excel(path1)
print("extraction Complete")
Edit: I'm not very familiar with pandas, but if you want an "invalid URL" column, can't you use the same approach that you're using for "groupid"?
reader['invalid_url'] = 'No'
reader['groupid'] = ''
for line in reader['URL']:
    try:
        driver.get(line)
    except WhateverExceptionYouNeedToHandle:
        reader.loc[reader['URL'] == line, 'invalid_url'] = 'Yes'
        c += 1
        continue
    ...
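If most of the "invalid URLs" are simply malformed strings in the CSV, you can also flag them before ever calling `driver.get`, which avoids waiting on a browser error. This is a sketch using only the standard library; the helper name `is_valid_url` is my own, not part of the question's code:

```python
from urllib.parse import urlparse

def is_valid_url(url):
    """Return True if the value looks like an absolute http(s) URL."""
    if not isinstance(url, str):
        return False
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

print(is_valid_url("https://example.com/group/123"))  # True
print(is_valid_url("not a url"))                      # False
```

In the loop you would then skip straight to marking the row `'Yes'` whenever `is_valid_url(line)` is false. Note this only catches malformed URLs, not well-formed ones that fail to load.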
Since you did not say which error is raised, or at which point, it is hard to tell you exactly what you should do.
But I will assume that you are facing a NoSuchElementException raised by Selenium:
from selenium.common.exceptions import NoSuchElementException

for line in reader['URL']:
    print(line)
    driver.get(line)
    # ...
    try:
        groupid = driver.find_element_by_xpath(
            '//*[@id="Xpath"]'
        ).text
    except NoSuchElementException:
        print("Could not find element by xpath. Maybe a bad URL?")
        c += 1
        # Tell Python to go to the next element in the loop
        continue
    print(groupid)
    # ...
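Whichever exception you catch, the write-back into the output column works the same way with `DataFrame.loc`. Here is a Selenium-free sketch with made-up data (the `results` dict stands in for the scraping step) showing how the "invalid url" marker ends up in the column that is later written to Excel:

```python
import pandas as pd

# Hypothetical rows standing in for the CSV; no browser needed for this demo.
reader = pd.DataFrame({"URL": ["https://ok.example/1", "bad-url"]})
reader["groupid"] = ""

# Pretend extraction results: URLs missing from the dict "failed".
results = {"https://ok.example/1": "G-42"}

for line in reader["URL"]:
    # .loc writes into the original frame, unlike chained indexing
    reader.loc[reader["URL"] == line, "groupid"] = results.get(line, "invalid url")

print(reader)
```

Calling `reader.to_excel(path1)` on this frame would then carry the "invalid url" text into the spreadsheet.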