Using Python Selenium to download a PDF: unable to retrieve the URL in an embedded frame

Question

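I am using Selenium to navigate a web page, with the aim of retrieving the source URL of a PDF so that I can download it. I have been able to log in to the website and go to the page that loads the PDF, but I am having trouble getting the embedded URL. I'm not a programmer or anything, so please forgive any lack of detail. My code is: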

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

import time

PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
myUsername="xxxx"
myPassword="xxxx"
driver.get("www.xxxxxx.com")

#login
driver.find_element_by_xpath("//*[@id='tbUserName']").send_keys(myUsername);
driver.find_element_by_xpath("//*[@id='tbPassword']").send_keys(myPassword);
driver.find_element_by_xpath("//*[@id='ctl00_cp_Content_spLogin']").click()
time.sleep(2)

#select report
driver.get("www.xxxxxx.com")
driver.find_element_by_xpath("//*[@id='Repeater_ReportCategory_ctl00_LinkButton_ReportCategory']").click()
time.sleep(2)

# //*[@id="Repeater_QuizType_ctl00_LinkButton_QuizTypeLink"]
driver.find_element_by_xpath("//*[@id='Repeater_AdditionalReports_ctl06_LinkButton_AdditionalReportName']").click()
time.sleep(2)

#sort the report options
driver.find_element_by_xpath("//*[@id='RadioButton_WordCountSortByWCHighToLow']").click()
driver.find_element_by_xpath("//*[@id='mButton_Next']").click()
time.sleep(2)

#get the pdf url
mydata = driver.switch_to.frame("mBottomFrame").get_attribute("src")
print("url: ",mydata)

On the page I want, the PDF file is embedded. When I inspect the embedded PDF, the details I get are:

<embed id="plugin" type="application/x-google-chrome-pdf" src="https://z11reports.renlearn.co.uk/JSRPT0238PR/jsfileserver/reports/1125/f06db71a2a424474bad540778d952816.pdf" stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/5224347f-1444-4cbe-8eba-603149c683c0" headers="Cache-Control: private
    Content-Length: 12574
    Content-Type: application/pdf
    Date: Fri, 19 Feb 2021 11:27:06 GMT
    Server: Microsoft-IIS/8.5
    X-AspNet-Version: 4.0.30319
    X-Powered-By: ASP.NET
    " background-color="0xFF525659" top-toolbar-height="56" javascript="allow" top-level-url="undefined">

When I inspect the page outside the embedded PDF, I get the following (I read up on frames, which is why I changed my code the way I did):

<frameset id="mFrameset" rows="85,*" framespacing="0" style="border:0px;" frameborder="yes" onload="ResizeWindow()">
<frame id="mTopFrame" style="margin:0px;" scrolling="no" src="ReportsController.rli?OK=70ef071f-29ed-4b3e-8fb9-ebaa02297e6e">
<frame id="mBottomFrame" style="margin:0px;" scrolling="auto" src="https://z11reports.renlearn.co.uk/JSRPT0233PR/jsfileserver/reports/1105/6dc259000b5c469283e8ab41ca151c21.pdf" cd_frame_id_="cca6f2eb4da021471005d2ad897038a5">
<noframes>
<body>
<p>
This page uses frames, but your browser doesn't support them.
</p>
</body>
</noframes>
</frameset>

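I want to retrieve the source URL of the PDF shown in these two snippets (the URLs will no longer exist; they seem to be deleted shortly after the report is run). When I run my code, I get this error: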

Traceback (most recent call last):
  File "c:/Users/me/Documents/emailreport/emailreport.py", line 33, in <module>
    mydata = driver.switch_to.frame("mBottomFrame").get_attribute("src")
  File "C:\Users\aoifereid7\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\switch_to.py", line 87, in frame
    raise NoSuchFrameException(frame_reference)
selenium.common.exceptions.NoSuchFrameException: Message: mBottomFrame

Any help would be greatly appreciated, as I have been working on this for hours and getting nowhere. Thanks.

Answer 1

If it is in a nested iframe, switch to the outer frame first and then work your way down.

driver.switch_to.frame("mTopFrame")
#get the pdf url
mydata = driver.find_element_by_id("mBottom").get_attribute("src")
print("url: ",mydata)
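
For frames that really are nested, the "outer first, then down" idea can be sketched roughly as below. This is only an illustration under assumptions: `switch_to_nested_frame` is a hypothetical helper name, the frame ids passed to it are placeholders, and it relies on Selenium 4's `find_element(by, value)` call. In the markup posted in the question the two frames appear side by side in one frameset, so this only applies when one frame actually sits inside another.

```python
def switch_to_nested_frame(driver, frame_ids):
    """Enter each frame in frame_ids in order, outermost first.

    `driver` is expected to be a Selenium WebDriver (assumption).
    """
    # Start from the top-level document so the path is resolved the same way every time.
    driver.switch_to.default_content()
    for frame_id in frame_ids:
        # "id" is the locator strategy string that selenium's By.ID stands for.
        frame = driver.find_element("id", frame_id)
        # switch_to.frame() changes the driver's context and returns None,
        # so nothing such as .get_attribute() can be chained onto it.
        driver.switch_to.frame(frame)
```

With a live driver, a call such as `switch_to_nested_frame(driver, ["outerFrame", "innerFrame"])` (both ids hypothetical) would leave the driver inside the inner frame.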

Answer 2

Solved it using the following:

mydata = driver.find_element_by_xpath('//*[@id="mBottomFrame"]').get_attribute("src")

That got the URL, and I didn't need to select the frame first. Thanks for your help.
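
To go from the URL to a saved file, something along these lines might work. It is a minimal sketch, not a tested solution for this site: it assumes Selenium 4's `find_element(by, value)` call, and it assumes the report server will accept the browser's session cookies, which nothing in the question confirms. `frame_src`, `cookie_header` and `download_pdf` are hypothetical helper names, and `report.pdf` is a placeholder file name.

```python
import urllib.request
from pathlib import Path


def frame_src(driver, frame_id="mBottomFrame"):
    """Read the src attribute from the <frame> element without switching into it."""
    # "id" is the locator strategy string that selenium's By.ID stands for.
    return driver.find_element("id", frame_id).get_attribute("src")


def cookie_header(driver):
    """Build a Cookie header value from the browser session's cookies."""
    # get_cookies() only returns cookies visible to the current page's domain,
    # so this may be empty or insufficient if the PDF lives on another host.
    return "; ".join(f"{c['name']}={c['value']}" for c in driver.get_cookies())


def download_pdf(url, cookies, dest):
    """Fetch url, sending the Cookie header if there is one, and save it to dest."""
    headers = {"Cookie": cookies} if cookies else {}
    request = urllib.request.Request(url, headers=headers)
    target = Path(dest)
    with urllib.request.urlopen(request, timeout=30) as response:
        target.write_bytes(response.read())
    return target
```

With a logged-in `driver` sitting on the report page, `download_pdf(frame_src(driver), cookie_header(driver), "report.pdf")` would attempt the download. Since the report URLs seem to expire quickly, it is probably best to run it straight after the page loads.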

Answer 1
0 2021-02-20 00:49:08

Answer 2
0 2021-02-20 11:55:51
