Scraping a web page with a JavaScript-rendered table using Selenium WebDriver

Question

The source HTML of the page I am trying to scrape

I am trying to scrape a web table that is rendered by JavaScript, using Selenium WebDriver:

driver.get("http://xxxxx:xxxxxxxx@xxxxxx-xxxxxx.grid.xxxxxx.com/Windchill/app/#ptc1/comp/queue.table");
driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
List<WebElement> k = driver.findElements(By.xpath("//*[@id='queue.table']"));
System.out.println(k.size());
System.out.println(k.get(0).getText());

k.size() returns 1, and when I call getText() it returns only some of the entries from the table.

The actual table and its entries: the total number of rows is 135.

After running the code, I get the following output:

              Queue Management
 Loading...

 Name
 Type
 Status
 Enabled
 Group
 Total Entries
 Waiting Entries
 Severe/Failed Entries
 DeleteCompletedWorkItemsQueue
 Process
 Started
 Enabled
 Default
 0
 0
 0
 DeliveryStatusOnStartup
 Process
 Started
 Enabled
 Default
 0
 0
 0
 DTODeliverablesQueue
 Process
 Started
 Enabled
 Default
 0
 0
 0
 DTOOffPeakQueue
 Process
 Started
 Enabled
 Default
 0
 0
 0
Loading.........

I get only 25 entries of the table and the rest are missing. I cannot understand why I am getting "Loading.....".

Answer 1

I think that with List<WebElement> k=driver.findElements(By.xpath("//*[@id='queue.table']")); we end up building a list with too many unwanted items in it. Rather, I feel it would be more effective to get hold of the nodes within the <td> tags, which contain the intended values, and save those into the list. Next we can iterate over the list and use either the getText() method or the getAttribute() method to retrieve the text, as follows:

driver.get("http://xxxxx:xxxxxxxx@xxxxxx-xxxxxx.grid.xxxxxx.com/Windchill/app/#ptc1/comp/queue.table");
driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
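// Select every row inside the table rather than the table element as a whole
List<WebElement> rows = driver.findElements(By.xpath("//*[@id='queue.table']//tr"));
System.out.println(rows.size());
for (WebElement row : rows) {
    // innerHTML returns the row's markup, including its <td> cells
    String innerHtml = row.getAttribute("innerHTML");
    System.out.println("Value from Table is : " + innerHtml);
}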
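
Separately from the row iteration, the "Loading..." lines in the question's output suggest that the table may still be fetching rows when the text is read. This is a guess based on the output, not something confirmed for this page. Under that assumption, one option is to poll the row count until it stops changing before reading the rows. The sketch below is plain Java with no Selenium dependency. The class name StableRowCount and the method waitForStableCount are hypothetical helpers, not part of Selenium.

```java
import java.util.function.IntSupplier;

// Hypothetical helper (not a Selenium API): polls a count until it stops changing.
final class StableRowCount {

    /**
     * Calls countSupplier repeatedly, sleeping pollIntervalMillis between calls,
     * until the value has stayed the same for stablePolls consecutive polls or
     * timeoutMillis has elapsed. Returns the last observed count in either case.
     */
    static int waitForStableCount(IntSupplier countSupplier,
                                  int stablePolls,
                                  long pollIntervalMillis,
                                  long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int last = countSupplier.getAsInt();
        int unchanged = 0;
        while (unchanged < stablePolls && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting early.
                Thread.currentThread().interrupt();
                break;
            }
            int current = countSupplier.getAsInt();
            if (current == last) {
                unchanged++;
            } else {
                // The count changed, so restart the stability counter.
                unchanged = 0;
                last = current;
            }
        }
        return last;
    }
}
```

With Selenium, the supplier could wrap the lookup already used in this answer, for example () -> driver.findElements(By.xpath("//*[@id='queue.table']//tr")).size(). If the page only fetches further rows when the table is scrolled, polling alone would not bring them in, and the count would simply settle at whatever has been rendered.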

Answer 1
0 2017-09-21 11:56:06
