
Xpath Not Selecting Correct Element Using Splinter/Selenium Python 3

Not sure if I'm making a stupid mistake here, I've searched all over but I can't figure this one out. I'd really appreciate the help.

I'm trying to make a scraper to scrape Google Map Pack data. I'm using Splinter to do so. I've managed to select the div of each map pack item but I want to then iterate through and select the title (and other elements) of each of the divs.

However, when I try to do that it always selects the title of the first element even though I am running the find_by_xpath on the individual element.

Here's my code:

from splinter import Browser
from selenium import webdriver
import time

chrome_options = webdriver.ChromeOptions()
browser = Browser('chrome', options=chrome_options)


browser.visit("https://google.com")

browser.fill('q', 'roofing laredo tx')
# Find and click the 'search' button
time.sleep(5)
button = browser.find_by_name('btnK')
# Interact with elements
button.click()
time.sleep(5)
maps_elements = browser.find_by_xpath("//div[contains(@class,'VkpGBb')]")

for map_element in maps_elements:
    # print(map_element.text)
    title = map_element.find_by_xpath("//div[contains(@class,'dbg0pd')]/span").text
    print(title)

So what I want is:

JJ Flores Roofing & Construction
HBC Roofing
McAllen Valley Roofing Co

but instead I get:

JJ Flores Roofing & Construction
JJ Flores Roofing & Construction
JJ Flores Roofing & Construction

Change your code from:

maps_elements = browser.find_by_xpath("//div[contains(@class,'VkpGBb')]")

for map_element in maps_elements:
    # print(map_element.text)
    title = maps_elements.find_by_xpath("//div[contains(@class,'dbg0pd')]/span").text
    print(title)

to

title_elements = browser.find_by_xpath("//div[contains(@class,'dbg0pd')]/span")

for title_element in title_elements:
    title = title_element.text
    print(title)

Edit:

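You got repeated results because, inside the loop, an XPath that starts with // searches from the document root, not from map_element, so it always returns the first title on the page. To select children of the current element the expression has to be relative, i.e. start with ./ (or .// to match descendants at any depth). The relative XPath below still did not work for me, which may be a Splinter bug, although ./div only matches direct children, so it would also fail if the title div is nested deeper. In any case, try a CSS selector instead: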

for map_element in maps_elements: 
    # relative XPath attempt; this did not work for me
    #title = map_element.find_by_xpath("./div[contains(@class,'dbg0pd')]/span")
    title = map_element.find_by_css("div[class*='dbg0pd'] > span").text
    print(title)
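
To see the root-versus-relative difference without opening a browser, here is a minimal offline sketch using the standard library's xml.etree.ElementTree rather than Splinter. The markup is a hypothetical, simplified stand-in for the map pack (only the class names and titles come from the question, and the extra <a> wrapper is an assumption). ElementTree has no contains(), so the classes are matched exactly.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: three map pack cards, each title nested inside an <a>.
HTML = """
<body>
  <div class="VkpGBb"><a><div class="dbg0pd"><span>JJ Flores Roofing &amp; Construction</span></div></a></div>
  <div class="VkpGBb"><a><div class="dbg0pd"><span>HBC Roofing</span></div></a></div>
  <div class="VkpGBb"><a><div class="dbg0pd"><span>McAllen Valley Roofing Co</span></div></a></div>
</body>
"""

root = ET.fromstring(HTML)
cards = root.findall(".//div[@class='VkpGBb']")

TITLE = ".//div[@class='dbg0pd']/span"

# Searching from the document root on every iteration (the effect of a
# leading // in the browser) ignores the card and returns the first title.
from_root = [root.find(TITLE).text for _ in cards]

# Searching relative to each card returns that card's own title.
from_card = [card.find(TITLE).text for card in cards]

# "./div" only looks at direct children, so in this markup it finds nothing.
direct_child = [card.find("./div[@class='dbg0pd']/span") for card in cards]

print(from_root)
print(from_card)
print(direct_child)
```

If the real page nests the title the same way, the Splinter equivalent would presumably be map_element.find_by_xpath(".//div[contains(@class,'dbg0pd')]/span"), but I have not verified that against the live markup.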

There is a typo in the variable name: remove the s from maps_elements so that the call is made on the loop variable map_element:

title = maps_elements.....
#title = map_element.....

This works because the title elements are selected from the browser once, before the loop starts, so the loop only has to read the text of each one. Nothing is searched for again inside the loop, so there is no document-wide XPath that could keep returning the first match.

title_elements = browser.find_by_xpath("//div[contains(@class,'dbg0pd')]/span")

for title_element in title_elements:
    title = title_element.text
    print(title)

