Scraping a webpage with a table rendered by JavaScript using Selenium WebDriver
The source HTML of the page I am trying to scrape
I am trying to scrape a web table that is rendered by JavaScript, using Selenium WebDriver.
driver.get("http://xxxxx:xxxxxxxx@xxxxxx-xxxxxx.grid.xxxxxx.com/Windchill/app/#ptc1/comp/queue.table");
driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
List<WebElement> k=driver.findElements(By.xpath("//*[@id='queue.table']"));
System.out.println(k.size());
System.out.println(k.get(0).getText());
k.size() returns 1, and when I call getText() it returns only some of the entries from the table.
The actual table has 135 rows in total.
After running the code, I get the following output:
Queue Management
Loading...
Name
Type
Status
Enabled
Group
Total Entries
Waiting Entries
Severe/Failed Entries
DeleteCompletedWorkItemsQueue
Process
Started
Enabled
Default
0
0
0
DeliveryStatusOnStartup
Process
Started
Enabled
Default
0
0
0
DTODeliverablesQueue
Process
Started
Enabled
Default
0
0
0
DTOOffPeakQueue
Process
Started
Enabled
Default
0
0
0
Loading.........
I get only 25 entries of the table and the rest are missing. I am unable to understand why I am getting "Loading.....".
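One way to check exactly how many rows getText() returned is to group the flat output above into rows. The following is only a minimal sketch, not part of the original code. It assumes each cell arrives on its own line, the header starts at "Name" and has eight columns, no cell is empty, and any line beginning with "Loading" is a placeholder rather than data. The class name QueueTableText and the method toRows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class QueueTableText {

    // Groups the flat text into rows of columnCount cells, starting at the
    // line equal to firstHeader. The first row returned is the header row.
    // Assumption: no cell is empty and no real cell starts with "Loading".
    public static List<List<String>> toRows(String text, String firstHeader, int columnCount) {
        List<String> cells = new ArrayList<>();
        boolean inTable = false;
        for (String line : text.split("\\R")) {
            String cell = line.trim();
            if (cell.isEmpty() || cell.startsWith("Loading")) {
                continue; // skip blank lines and "Loading..." placeholders
            }
            if (!inTable && cell.equals(firstHeader)) {
                inTable = true; // everything before the header is page chrome
            }
            if (inTable) {
                cells.add(cell);
            }
        }
        List<List<String>> rows = new ArrayList<>();
        // A trailing incomplete row (fewer than columnCount cells) is dropped.
        for (int i = 0; i + columnCount <= cells.size(); i += columnCount) {
            rows.add(new ArrayList<>(cells.subList(i, i + columnCount)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the output shown above.
        String sample = String.join("\n",
            "Queue Management", "Loading...",
            "Name", "Type", "Status", "Enabled", "Group",
            "Total Entries", "Waiting Entries", "Severe/Failed Entries",
            "DTODeliverablesQueue", "Process", "Started", "Enabled", "Default", "0", "0", "0",
            "DTOOffPeakQueue", "Process", "Started", "Enabled", "Default", "0", "0", "0",
            "Loading.........");
        List<List<String>> rows = toRows(sample, "Name", 8);
        System.out.println("Data rows: " + (rows.size() - 1)); // prints "Data rows: 2"
        for (List<String> row : rows.subList(1, rows.size())) {
            System.out.println(row.get(0) + " -> " + row.get(2));
        }
    }
}
```

In the real script, the string returned by k.get(0).getText() would be passed in place of the sample. The number of data rows can then be compared with the 135 rows expected.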
I think that by using List<WebElement> k=driver.findElements(By.xpath("//*[@id='queue.table']")); we end up with a list that contains too many unwanted items. Rather, I feel it would be more effective to get hold of the nodes within the <td> tags, which contain the intended values, and save them into the list. Next we can iterate over the list and use either the getText() method or the getAttribute() method to retrieve the text, as follows:
driver.get("http://xxxxx:xxxxxxxx@xxxxxx-xxxxxx.grid.xxxxxx.com/Windchill/app/#ptc1/comp/queue.table");
driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
List<WebElement> k = driver.findElements(By.xpath("//*[@id='queue.table']//tr"));
System.out.println(k.size());
for (WebElement my_element : k)
{
    String innerhtml = my_element.getAttribute("innerHTML");
    System.out.println("Value from Table is : " + innerhtml);
}