Using Selenium in Python to capture the links on a web page

I'm trying to capture the links of a webpage using Selenium in Python. My initial code is:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd
import time
from tqdm import tqdm
from selenium.common.exceptions import NoSuchElementException
# The original post omits driver creation; Chrome is assumed here.
driver = webdriver.Chrome()
driver.get('https://www.lovecrave.com/shop/')

Then I located all 12 products on the page with:

perso_flist = driver.find_elements_by_xpath("//p[@class='excerpt']")

Next, I want to capture the link for each product with:

listOflinks = []
for i in perso_flist:
    link_1=i.find_elements_by_xpath(".//a[@href[1]]")
    listOflinks.append(link_1)
print(listOflinks)

And my output looks like:

print(listOflinks)  # 12 EMPTY VALUES
[[], [], [], [], [], [], [], [], [], [], [], []]

What is wrong with my code? I'd appreciate your help.

I am making some assumptions about the XPath //p[@class='excerpt']. If the code below does not work, please add an HTML example of the element.

You can get a list of link elements by changing the locator to:

perso_flist = driver.find_elements_by_xpath("//li//a[@class='full-link']")

Then loop through the list and read each link with element.get_attribute():

listOflinks = []
for i in perso_flist:
    link_1=i.get_attribute("href")
    listOflinks.append(link_1)
print(listOflinks)

Basically, you loop through the a tags and read the href attribute of each.
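
A likely reason the original loop returned empty lists is that the link is not nested inside the p element, so a relative search starting from p has nothing to find. The sketch below illustrates this with the standard library's ElementTree instead of a browser. The markup in SNIPPET is hypothetical (an assumption about how one product tile is laid out, inferred from the XPaths used in the answers); the real page may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for a single product tile; the real page may differ.
SNIPPET = """
<ul>
  <li>
    <p class="excerpt">Short product description</p>
    <a class="full-link" href="https://www.lovecrave.com/products/vesper/">View</a>
  </li>
</ul>
""".strip()

root = ET.fromstring(SNIPPET)

# Same starting point as the question: the <p class="excerpt"> element.
excerpt = root.find(".//p[@class='excerpt']")

# Searching inside the <p> finds nothing, because the <a> is its sibling,
# not its descendant.
links_inside_p = excerpt.findall(".//a[@href]")

# Searching from the enclosing <li> does find the link.
links_in_li = [a.get("href") for a in root.findall(".//li/a[@class='full-link']")]

print(links_inside_p)  # []
print(links_in_li)     # ['https://www.lovecrave.com/products/vesper/']
```

If the real markup matches this assumption, locating the a elements directly, as above, avoids the problem.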

hrefs=[x.get_attribute("href") for x in driver.find_elements_by_xpath("//p[@class='excerpt']/following-sibling::a[1]")]
print(hrefs)

Alternatively, use the XPath //li/a[@class='full-link'].

Output:

['https://www.lovecrave.com/products/duet-pro/',
 'https://www.lovecrave.com/products/vesper/',
 'https://www.lovecrave.com/products/wink/',
 'https://www.lovecrave.com/products/duet/',
 'https://www.lovecrave.com/products/duet-flex/',
 'https://www.lovecrave.com/products/flex/',
 'https://www.lovecrave.com/products/pocket-vibe/',
 'https://www.lovecrave.com/products/bullet/',
 'https://www.lovecrave.com/products/cuffs/',
 'https://www.lovecrave.com/shop/gift-card/',
 'https://www.lovecrave.com/shop/leather-case/',
 'https://www.lovecrave.com/shop/vesper-replacement-charger/']
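
The snippets in this thread call find_elements_by_xpath, which later Selenium 4 releases removed in favor of find_elements(By.XPATH, ...). A minimal sketch of the same link collection in that style follows. collect_hrefs is a hypothetical helper name, not part of Selenium, and skipping empty or repeated href values is an added convenience rather than something the answers do.

```python
def collect_hrefs(driver, xpath, by="xpath"):
    """Return the non-empty, de-duplicated href values of elements matching xpath."""
    # "xpath" is the string value of selenium.webdriver.common.by.By.XPATH,
    # so callers can pass By.XPATH for `by` with the same effect.
    elements = driver.find_elements(by, xpath)

    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        # Skip elements without an href and links already collected.
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs
```

With a live driver this could be called as collect_hrefs(driver, "//li/a[@class='full-link']"), assuming the page uses that class on its product links.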
