Web scraping with Selenium in Python: finding elements by XPath or id returns an empty list

I am trying to scrape a list of email addresses from my User Explorer page in Google Analytics.

I obtained the item's XPath:

//*[@id="ID-explorer-table-dataTable-key-0-0"]/div

But no matter which of these I try:

driver.find_elements_by_xpath('//*[@id="ID-explorer-table-dataTable-key-0-0"]/div')

or

driver.find_elements_by_xpath('//*[@id="ID-reportContainer"]')

or

driver.find_elements_by_id(r"ID-explorer-table-dataTable-key-0-0")

it returns an empty list.

Can anyone tell me where I have gone wrong?

I also tried using:

html = driver.page_source

but of course I couldn't find the list of emails there either.

If this doesn't work, is there a way to automate Ctrl+A, copy all the displayed text into a Python string, and then use re.findall() to find the email addresses?
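The re.findall() idea can be sketched without touching the clipboard. The example below is a minimal illustration, not a definitive implementation: `extract_emails` and `EMAIL_PATTERN` are hypothetical names, and the pattern is deliberately simple, so it will miss some valid addresses. It assumes the page text is already in a string. In Selenium 4 that string could come from something like `driver.find_element(By.TAG_NAME, "body").text`, which only covers the frame the driver is currently switched to.

```python
import re

# Deliberately simple pattern (an assumption for illustration);
# it does not cover every valid email address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return email-like substrings of text, in order of first appearance."""
    seen = set()
    emails = []
    for match in EMAIL_PATTERN.findall(text):
        # Skip addresses that were already collected.
        if match not in seen:
            seen.add(match)
            emails.append(match)
    return emails
```

Calling `extract_emails(page_text)` on the visible text of the report would then give a de-duplicated list of candidate addresses.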

email = driver.find_element_by_xpath('//*[@id="ID-explorer-table-dataTable-key-0-0"]/div')

print("email", email.get_attribute("innerHTML"))

Thanks to @Guy for the help!

The problem was related to iframes. The following worked and detected which frame the item I need belongs to:

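# Collect every iframe in the top-level document.
iframelist = driver.find_elements_by_tag_name('iframe')
for i, frame in enumerate(iframelist):
    # Search inside this frame for the target element.
    driver.switch_to.frame(frame)
    if driver.find_elements_by_xpath('//*[@id="ID-explorer-table-dataTable-key-0-0"]/div'):
        print('it is item {}'.format(i))
        break
    # Not in this frame: return to the top-level document before trying the next one.
    driver.switch_to.default_content()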
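The same frame search can be wrapped in a reusable function. This is a sketch under stated assumptions rather than the original poster's code: `find_frame_index` is a hypothetical helper, and it uses the Selenium 4 style `find_elements(by, value)` call, since the `find_elements_by_*` helpers were removed in later Selenium 4 releases. The locator strings are written out so the block has no imports. They are the values of `By.TAG_NAME` and `By.XPATH`, which would normally be imported from `selenium.webdriver.common.by`.

```python
# Locator strategy strings; these equal By.TAG_NAME and By.XPATH in Selenium 4.
TAG_NAME = "tag name"
XPATH = "xpath"


def find_frame_index(driver, xpath):
    """Return the index of the first top-level iframe containing xpath.

    On success the driver stays switched into that frame, so later
    find_element calls run inside it. If no frame matches, the driver is
    left on the top-level document and None is returned.
    """
    driver.switch_to.default_content()
    frames = driver.find_elements(TAG_NAME, "iframe")
    for index, frame in enumerate(frames):
        driver.switch_to.frame(frame)
        if driver.find_elements(XPATH, xpath):
            return index
        # No match here: go back up before trying the next frame.
        driver.switch_to.default_content()
    return None
```

With a live driver, `find_frame_index(driver, '//*[@id="ID-explorer-table-dataTable-key-0-0"]/div')` would leave the driver inside the matching frame, ready for the element lookup. The sketch only checks top-level iframes, not nested ones, and does not wait for frames that load late.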
