
Scraping a webpage with a JavaScript-rendered table using Selenium WebDriver

The source HTML of the page I am trying to scrape

I am trying to scrape a web table that is rendered by JavaScript, using Selenium WebDriver:

driver.get("http://xxxxx:xxxxxxxx@xxxxxx-xxxxxx.grid.xxxxxx.com/Windchill/app/#ptc1/comp/queue.table");
driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
List<WebElement> k=driver.findElements(By.xpath("//*[@id='queue.table']"));
System.out.println(k.size());
System.out.println(k.get(0).getText());

k.size() returns 1, and when I call getText() it returns only some of the entries from the table.

The actual table has 135 rows in total.

After running the code, I get the following output:

              Queue Management
 Loading...

 Name
 Type
 Status
 Enabled
 Group
 Total Entries
 Waiting Entries
 Severe/Failed Entries
 DeleteCompletedWorkItemsQueue
 Process
 Started
 Enabled
 Default
 0
 0
 0
 DeliveryStatusOnStartup
 Process
 Started
 Enabled
 Default
 0
 0
 0
 DTODeliverablesQueue
 Process
 Started
 Enabled
 Default
 0
 0
 0
 DTOOffPeakQueue
 Process
 Started
 Enabled
 Default
 0
 0
 0
Loading.........

I get only 25 of the table's entries and the rest are missing. I do not understand why I am getting "Loading.....".

I think that by using List<WebElement> k = driver.findElements(By.xpath("//*[@id='queue.table']")); we are building a list that contains too many unwanted items. Instead, I feel it would be more effective to get hold of the nodes that hold the intended values in their <td> cells and save those into the list. We can then iterate over the list and use either the getText() method or the getAttribute() method to retrieve the text, as follows:

driver.get("http://xxxxx:xxxxxxxx@xxxxxx-xxxxxx.grid.xxxxxx.com/Windchill/app/#ptc1/comp/queue.table");
driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
List<WebElement> k = driver.findElements(By.xpath("//*[@id='queue.table']//tr"));
System.out.println(k.size());
for (WebElement my_element : k) {
    String innerhtml = my_element.getAttribute("innerHTML");
    System.out.println("Value from Table is : " + innerhtml);
}
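
If the table still stops at 25 rows with "Loading..." after them, one possible cause (an assumption; the post does not confirm it) is that the grid renders rows lazily and only fetches more when asked, for example on scroll. A minimal sketch of a "keep loading until the row count stops growing" loop is shown below. The class name LazyTableLoader and the method loadUntilStable are hypothetical helpers, not part of Selenium, and the browser is replaced by a fake counter so the snippet runs on its own.

```java
import java.util.function.IntSupplier;

public class LazyTableLoader {

    /**
     * Asks the page to load more rows until the row count stops growing
     * or maxRounds is reached, then returns the last observed row count.
     * rowCount and loadMore are supplied by the caller (hypothetical hooks).
     */
    public static int loadUntilStable(IntSupplier rowCount, Runnable loadMore,
                                      int maxRounds, long pauseMillis) {
        int previous = rowCount.getAsInt();
        for (int round = 0; round < maxRounds; round++) {
            loadMore.run();                      // e.g. scroll the last row into view
            try {
                Thread.sleep(pauseMillis);       // give the page time to render
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return previous;                 // stop early if interrupted
            }
            int current = rowCount.getAsInt();
            if (current == previous) {
                return current;                  // nothing new appeared; assume done
            }
            previous = current;
        }
        return previous;                         // gave up after maxRounds
    }

    public static void main(String[] args) {
        // Stand-in for the browser: a fake table that reveals
        // 25 more rows per request, up to 135 rows in total.
        int[] rendered = {25};
        int total = loadUntilStable(
                () -> rendered[0],
                () -> rendered[0] = Math.min(135, rendered[0] + 25),
                20, 0L);
        System.out.println(total);               // prints 135
    }
}
```

With Selenium, rowCount could be something like () -> driver.findElements(By.xpath("//*[@id='queue.table']//tr")).size(), and loadMore could scroll the last row into view through ((JavascriptExecutor) driver).executeScript("arguments[0].scrollIntoView(true);", lastRow). Whether scrolling actually triggers loading on this Windchill page is untested here, and a stable count only suggests, rather than proves, that all 135 rows have been rendered.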
