
Not getting href link with Selenium, Python, and bs4

Elements with this class ("class": "contenu") contain the href link. I tried .get('href') but it didn't work. Here is my full code:

    browser.get("https://www.usine-digitale.fr/annuaire-start-up/")

    # Wait 20 seconds for page to load
    timeout = 20
    try:
        WebDriverWait(browser, timeout).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='texteContenu3']")))
    except TimeoutException:
        print("Timed out waiting for page to load")
        browser.quit()

    soup = BeautifulSoup(browser.page_source, "html.parser")

    product_items = soup.find_all("a",{"class":"contenu"})
    for item in product_items:
        item_url = item.find("a",{"class":"contenu"}).get('href')
        print(item_url)


        from csv import writer
        def AddToCSV(List):
                with open("Output.csv", "a+", newline='') as output_file:
                    csv_writer = writer(output_file)
                    csv_writer.writerow(List)


        # this can be used within your for loop
        row_list = [item_url]
        AddToCSV(row_list)


    browser.quit() 

I am getting this error message:

    item_url = item.find("a",{"class":"contenu"}).get('href')
AttributeError: 'NoneType' object has no attribute 'get'

But when I run the single line of code in the Python shell, I do get the href:

>>> soup.find("a",{"class":"contenu"}).get('href')
'/annuaire-start-up/ausha-by-icreo,941436'

Why is it not working in my full code?

You have already called find_all(). Just iterate over the results and get the href from each element.

soup = BeautifulSoup(browser.page_source, "html.parser")
product_items = soup.find_all("a", {"class": "contenu"})
for item in product_items:
    item_url = item.get('href')
    print(item_url)

Output :

/annuaire-start-up/telegrafik,941441
/annuaire-start-up/ausha-by-icreo,941436
/annuaire-start-up/gamersorigin,962251
/annuaire-start-up/fabulabox,962231
/annuaire-start-up/nyctale,962226
/annuaire-start-up/lizee,962221
/annuaire-start-up/isybot,961726
/annuaire-start-up/sarus-technologies,961716
/annuaire-start-up/beeldi,941426
/annuaire-start-up/energie-ip,961706
/annuaire-start-up/easyblue,961421
/annuaire-start-up/braam,940806
/annuaire-start-up/spareka,961311
/annuaire-start-up/chance,961306
/annuaire-start-up/cosmoz,961296
/annuaire-start-up/adrenalead,961291
/annuaire-start-up/demand-side-instruments,940786
/annuaire-start-up/ividata,960926
/annuaire-start-up/tekyn,960921
/annuaire-start-up/siga,960916

In the example that works, you are searching for an a element in soup. In your full code, you find all the matching a elements in soup, then unnecessarily search for another a element inside each item, which returns None. Change item.find(...) to item.get('href').
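
For completeness, here is a minimal sketch of the corrected full script. It assumes Chrome as the WebDriver (the question does not show which browser is used); the URL, the contenu class, the explicit wait, and the Output.csv file all come from the question above.

    from csv import writer

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait


    def add_to_csv(row):
        # Append one row to Output.csv
        with open("Output.csv", "a+", newline='') as output_file:
            writer(output_file).writerow(row)


    browser = webdriver.Chrome()  # assumption: Chrome; use your own driver setup
    browser.get("https://www.usine-digitale.fr/annuaire-start-up/")

    # Wait up to 20 seconds for the directory listing to load
    timeout = 20
    try:
        WebDriverWait(browser, timeout).until(
            EC.visibility_of_element_located((By.XPATH, "//div[@class='texteContenu3']"))
        )
        soup = BeautifulSoup(browser.page_source, "html.parser")
        # Each <a class="contenu"> returned by find_all() is already the link,
        # so read its href directly instead of calling find() again
        for item in soup.find_all("a", {"class": "contenu"}):
            item_url = item.get('href')
            print(item_url)
            add_to_csv([item_url])
    except TimeoutException:
        print("Timed out waiting for page to load")
    finally:
        browser.quit()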
