I am trying to scrape this website. Each page lists about ten opportunities, and each one has its own title and details. I want to collect all of this information. I have written Python code that locates the other tags I need, but I can't get the text of the paragraphs that hold the description.
Here is my code:
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# `driver` (a Selenium WebDriver) and `website_name` are set up earlier (not shown).
base_url = "https://www.enabel.be/content/enabel-tenders"
driver.get(base_url)
# Wait until the first tender block is visible.
WebDriverWait(driver, 10).until(EC.visibility_of_element_located(
    (By.XPATH, "//*[@id='block-views-tenders-block']/div/div/div[@class='view-content']/div")))
current_page_tag = driver.find_element(
    By.XPATH, "//*[@id='block-views-tenders-block']/div/div/div[3]/ul/li[2]").text.strip()
all_divs = driver.find_elements(
    By.XPATH, "//*[@id='block-views-tenders-block']/div/div/div[@class='view-content']/div")
for each_div in all_divs:
    singleData = {
        # could not detect
        "language": 107,
        # means open
        "status": 0,
        "op_link": "",
        "website": website_name,
        "close_date": '',
        # means not available
        "organization": website_name,
        "description": "",
        "title": '',
        "checksum": "",
        # means not available
        "country": '',
        "published_date": ''
    }
    singleData['title'] = each_div.find_element(
        By.XPATH, ".//span[@class='title-accr no-transform']").text.strip()
    singleData['country'] = each_div.find_element(
        By.XPATH, ".//div[1]/div/div/div[@class ='field-items']/div").text.strip()
    close_date = each_div.find_element(By.XPATH, ".//div//div[1]/div").text.strip()
    # description always returns me empty text.
    description = each_div.find_element(By.XPATH, ".//div/div[2]/div[3]/div[2]/div/p").text.strip()
    download = each_div.find_elements(By.XPATH, './/div//div[2]/div[4]/div[2]//a')
    # Collect the href of every attachment link.
    download_file_link = []
    for eachfile in download:
        download_file_link.append(eachfile.get_attribute('href'))
My code gets the title, country, deadline, and attachments, but not the description. It returns an empty string, even though the text is visible when I look at the website.
Can anyone help me find the cause and a solution? Thanks in advance.
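Not every opportunity has a description, so use a try/except to catch the ones that don't. Note that .text only returns text that is rendered visibly, so it can come back empty if the description sits in a collapsed section; reading the innerHTML attribute does not have that restriction. Some descriptions contain <a> tags, so you might need to remove them.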
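from selenium.common.exceptions import NoSuchElementException

for each_div in all_divs:
    try:
        # First <p> inside the block that follows the "Description" label.
        description = each_div.find_element(By.XPATH, ".//div[contains(text(),'Description')]/parent::div/div[2]//p[1]").get_attribute('innerHTML')
        print(description)
    except NoSuchElementException:
        # This opportunity has no description paragraph.
        print('none')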
Outputs
This is the annual publication of information on recipients of funds for the TVET Project.
none
At the latest 14 calendar days before the final date for receipt of tenders (up to 4th January 2021), tenderers may ask questions about the tender documents and the contract in accordance with Art. 64 of the Law of 17 June 2016. Questions shall be addressed in writing to:
Pour tout besoin d'information complémentaire, veuillez contacter: <a href="mailto:adama.dianda@enabel.be">adama.dianda@enabel.be</a>
none
none
none
Marché relatif à la fourniture, l’installation, la mise en marche et formation des utilisateurs et techniciens chargé de la maintenance des équipements de Laboratoire destinés au CERMES.
Pour tout besoin d'information complémentaire, veuillez contacter: <a href="mailto:adama.dianda@enabel.be">adama.dianda@enabel.be</a>
Tenders should request the price schedule in xls from Ms. Eva Matovu. email: <a href="mailto:eva.matovu@enabel.be">eva.matovu@enabel.be</a>
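Since innerHTML keeps markup such as the mailto links in the output above, you may want to strip the tags before storing the text. Here is a minimal sketch using only the standard library; `strip_tags` and `_TextExtractor` are hypothetical helper names, not part of Selenium, and this assumes the fragments are small and reasonably well-formed.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html_fragment):
    """Return the text of an HTML fragment with all tags removed."""
    parser = _TextExtractor()
    parser.feed(html_fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()
```

You would then call something like `strip_tags(description)` on the innerHTML string before saving it.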
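To get every paragraph of the description as plain text (without the HTML tags), you could use find_elements and read textContent instead: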
for each_div in all_divs:
    # description always returns me empty text.
    try:
        description = each_div.find_elements(By.XPATH, ".//div[contains(text(),'Description')]/parent::div/div[2]//p")
        for desc in description:
            print(desc.get_attribute('textContent'))
    except:
        print('none')
Outputs
This is the annual publication of information on recipients of funds for the TVET Project.
At the latest 14 calendar days before the final date for receipt of tenders (up to 4th January 2021), tenderers may ask questions about the tender documents and the contract in accordance with Art. 64 of the Law of 17 June 2016. Questions shall be addressed in writing to:
Françoise MUSHIMIYIMANA, National Expert in Contractualization & Administration _National ECA (francoise.mushimiyimana@enabel.be ), with copy to
denise.nsanga@enabel.be
evariste.sibomana@enabel.be
They shall be answered in the order received. The complete overview of questions asked shall be available as of at the latest 7 calendar days before the final date for receipt of tenders at the address mentioned above.
Pour tout besoin d'information complémentaire, veuillez contacter: adama.dianda@enabel.be
Marché relatif à la fourniture, l’installation, la mise en marche et formation des utilisateurs et techniciens chargé de la maintenance des équipements de Laboratoire destinés au CERMES.
Pour tout besoin d'information complémentaire, veuillez contacter: adama.dianda@enabel.be
Tenders should request the price schedule in xls from Ms. Eva Matovu. email: eva.matovu@enabel.be
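If you want a single string for the "description" field rather than one print per paragraph, you could join the paragraph texts yourself. This is only a sketch: `join_paragraphs` is a hypothetical helper, and collapsing whitespace and dropping empty paragraphs are my assumptions about what you want stored.

```python
def join_paragraphs(paragraph_texts):
    """Join paragraph strings into one description, one paragraph per line.

    Runs of whitespace inside a paragraph are collapsed to a single space,
    and paragraphs that are empty after cleaning are skipped.
    """
    cleaned = (" ".join(text.split()) for text in paragraph_texts)
    return "\n".join(text for text in cleaned if text)
```

Inside the loop it could be used roughly like this: `singleData['description'] = join_paragraphs(p.get_attribute('textContent') for p in description)`. Because find_elements returns an empty list when nothing matches, an opportunity without a description simply ends up with an empty string.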