The a elements with this class ("class":"contenu") contain the href link. I tried .get('href') but it didn't work. Here is my full code:
browser.get("https://www.usine-digitale.fr/annuaire-start-up/")
# Wait 20 seconds for page to load
timeout = 20
try:
    WebDriverWait(browser, timeout).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='texteContenu3']")))
except TimeoutException:
    print("Timed out waiting for page to load")
    browser.quit()
soup = BeautifulSoup(browser.page_source, "html.parser")
product_items = soup.find_all("a",{"class":"contenu"})
for item in product_items:
    item_url = item.find("a",{"class":"contenu"}).get('href')
    print(item_url)
    from csv import writer
    def AddToCSV(List):
        with open("Output.csv", "a+", newline='') as output_file:
            csv_writer = writer(output_file)
            csv_writer.writerow(List)
    # this can be used within your for loop
    row_list = [item_url]
    AddToCSV(row_list)
browser.quit()
I am getting this error message:
item_url = item.find("a",{"class":"contenu"}).get('href')
AttributeError: 'NoneType' object has no attribute 'get'
But when I run the single line in the Python shell, I get the href:
>>> soup.find("a",{"class":"contenu"}).get('href')
'/annuaire-start-up/ausha-by-icreo,941436'
Why is it not working in my full code?
You have already called find_all(), so every item in the result is already a matching a element. Just iterate over the results and get the href from each one.
soup = BeautifulSoup(browser.page_source, "html.parser")
product_items = soup.find_all("a",{"class":"contenu"})
for item in product_items:
    item_url = item.get('href')
    print(item_url)
Output:
/annuaire-start-up/telegrafik,941441
/annuaire-start-up/ausha-by-icreo,941436
/annuaire-start-up/gamersorigin,962251
/annuaire-start-up/fabulabox,962231
/annuaire-start-up/nyctale,962226
/annuaire-start-up/lizee,962221
/annuaire-start-up/isybot,961726
/annuaire-start-up/sarus-technologies,961716
/annuaire-start-up/beeldi,941426
/annuaire-start-up/energie-ip,961706
/annuaire-start-up/easyblue,961421
/annuaire-start-up/braam,940806
/annuaire-start-up/spareka,961311
/annuaire-start-up/chance,961306
/annuaire-start-up/cosmoz,961296
/annuaire-start-up/adrenalead,961291
/annuaire-start-up/demand-side-instruments,940786
/annuaire-start-up/ividata,960926
/annuaire-start-up/tekyn,960921
/annuaire-start-up/siga,960916
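The hrefs printed above are relative paths, and the question also writes each one to a CSV file. Here is a minimal sketch of that step, assuming the hrefs are already collected in a list (the two sample values are copied from the output above). The helper names to_absolute and save_urls are hypothetical, and resolving the paths against the directory URL with urljoin is an assumption about how the site serves them. Unlike the question's code, this opens the file once in "w" mode instead of reopening it in "a+" mode for every row.

```python
import csv
from urllib.parse import urljoin

# Page URL taken from the question; used as the base for relative hrefs.
BASE_URL = "https://www.usine-digitale.fr/annuaire-start-up/"


def to_absolute(hrefs, base=BASE_URL):
    """Resolve each relative href against the page URL (hypothetical helper)."""
    return [urljoin(base, href) for href in hrefs]


def save_urls(urls, path="Output.csv"):
    """Write one URL per row, opening the file only once (hypothetical helper)."""
    with open(path, "w", newline="", encoding="utf-8") as output_file:
        csv_writer = csv.writer(output_file)
        for url in urls:
            csv_writer.writerow([url])


# Sample hrefs copied from the output above.
sample_hrefs = [
    "/annuaire-start-up/telegrafik,941441",
    "/annuaire-start-up/ausha-by-icreo,941436",
]
absolute_urls = to_absolute(sample_hrefs)
```

Calling save_urls(absolute_urls) after the loop would then write the whole list in one pass.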
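In your example that works, you are searching for an a element in soup. In your code, you find all the matching a elements in soup, then unnecessarily search for another one in link (item in your loop). Since find() only looks at the descendants of the element it is called on, and these a elements do not contain another a element, it returns None, which is why calling .get on the result raises the AttributeError. Change link.find... to link.get('href').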
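The difference can be reproduced without a browser. The sketch below is only an illustration: the HTML snippet is hypothetical markup shaped like the links on the directory page, not the real page source, and it assumes the bs4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: each link is an <a class="contenu"> element
# with no further <a> element nested inside it.
html = """
<div>
  <a class="contenu" href="/annuaire-start-up/telegrafik,941441">Telegrafik</a>
  <a class="contenu" href="/annuaire-start-up/ausha-by-icreo,941436">Ausha</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a", {"class": "contenu"})

# find() searches only the descendants of the tag it is called on.
# There is no <a> inside the first link, so the result is None, and
# calling .get('href') on it would raise the AttributeError from the question.
nested = links[0].find("a", {"class": "contenu"})

# Each element returned by find_all() is already the <a> tag,
# so the href can be read from it directly.
hrefs = [link.get("href") for link in links]
```

Here nested is None, while hrefs holds the two paths from the snippet.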