Extracting table data from a webpage using Selenium WebDriver

I am using Selenium WebDriver (in Eclipse) to automate a web app, but the requirement now is to capture the table data displayed on one of its HTML pages. I tried the solutions given here, here, and on a few other websites, but our page seems to render its table in a somewhat different way.

[attached image]

I tried to get the values using the div class names:

String Text = driver.findElements(By.xpath("//div[@class='ag-row ag-row-even ag-row-level-0']//tr")).get(0).getText();

However, this did not work; an index out of bounds exception is thrown.

From what I can see, you have a custom-built table. Judging by the HTML excerpt in the attached image, the structure is something like:

<div class="ag-body-container" ...>
    <div class="row_1_class" ...>
        <div class="column_1_class" ...>
        <div class="column_2_class" ...>
        <div class="column_3_class" ...>
        <div class="column_4_class" ...>
        ... etc
    <div class="row_2_class" ...>
        <div class="column_1_class" ...>
        <div class="column_2_class" ...>
        <div class="column_3_class" ...>
        <div class="column_4_class" ...>
        ... etc

But your XPath assumes that you have table rows (and, I'm guessing, table cells after that):

By.xpath("//div[@class='ag-row ag-row-even ag-row-level-0']//tr")

so the list comes back empty. Note that findElements does not throw NoSuchElementException when nothing matches; it returns an empty list, and calling get(0) on that empty list is what throws the index out of bounds exception.
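To see the mismatch without a browser, here is a minimal sketch that runs the same kind of XPath against a small, hypothetical copy of the markup using the JDK's built-in javax.xml.xpath (not Selenium). The class name, the sample cell values other than WW_SALES, and the exact class strings are assumptions made for illustration; the real page will differ.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmptyMatchDemo {
    // Hypothetical markup modelled on the div structure sketched above; not the real page.
    static final String GRID =
        "<div class='ag-body-container'>"
      + "<div class='ag-row ag-row-even ag-row-level-0'>"
      + "<div class='ag-cell'>WW_SALES</div><div class='ag-cell'>42</div>"
      + "</div>"
      + "<div class='ag-row ag-row-odd ag-row-level-0'>"
      + "<div class='ag-cell'>EU_SALES</div><div class='ag-cell'>17</div>"
      + "</div>"
      + "</div>";

    // Returns how many nodes the given XPath expression selects in the sample markup.
    public static int countMatches(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(GRID)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // There are no tr elements in a div-based grid, so nothing matches.
        System.out.println(countMatches("//div[@class='ag-row ag-row-even ag-row-level-0']//tr")); // prints 0
        // Asking for the child divs of the same row finds its two cells.
        System.out.println(countMatches("//div[@class='ag-row ag-row-even ag-row-level-0']/div")); // prints 2
    }
}
```

A count of zero here corresponds to the empty list that findElements returns in Selenium, which is why the subsequent get(0) fails.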

Now, I'm not sure what data you're trying to extract from that table, but your best bet is to get all the rows based on the class attribute and then, for each row, get all the column data based, again, on the class attribute (or you can even use the col attribute for these).

EDIT: To get all the elements, you could take all the rows and then, for each row, get all the column data:

import java.util.ArrayList;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;

//Get all the rows from the table
List<WebElement> rows = driver.findElements(By.xpath("//div[contains(@class, 'ag-row')]"));

//Initialize a new array list to store the text
List<String> tableData = new ArrayList<String>();

//For each row, get the column data and store it in the tableData list
for (int i = 0; i < rows.size(); i++) {
    //Since you also have some span tags inside (and maybe something else)
    //we first get the div columns; the leading "." keeps the search inside this row
    List<WebElement> tableCells = rows.get(i).findElements(By.xpath(".//div[contains(@class, 'ag-cell')]"));
    //Store the text of every cell in the row, not just the first one
    for (WebElement tableCell : tableCells) {
        tableData.add(tableCell.getText());
    }
}
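One detail of per-row lookups that is easy to miss: an XPath that starts with // is evaluated from the document root even when it is run against a single row element, whereas .// stays inside that element. The sketch below illustrates the difference with the JDK's javax.xml.xpath on a hypothetical copy of the markup (not Selenium and not the real page); the class name and sample values are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ScopedXPathDemo {
    // Hypothetical two-row grid: each row holds two cells.
    static final String GRID =
        "<div class='ag-body-container'>"
      + "<div class='ag-row ag-row-even'>"
      + "<div class='ag-cell'>WW_SALES</div><div class='ag-cell'>42</div>"
      + "</div>"
      + "<div class='ag-row ag-row-odd'>"
      + "<div class='ag-cell'>EU_SALES</div><div class='ag-cell'>17</div>"
      + "</div>"
      + "</div>";

    // Evaluates cellExpression with the first row as the context node and counts the matches.
    public static int countFromFirstRow(String cellExpression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(GRID)));
            Node firstRow = (Node) xpath.evaluate(
                "(//div[contains(@class, 'ag-row')])[1]", doc, XPathConstants.NODE);
            NodeList cells = (NodeList) xpath.evaluate(
                cellExpression, firstRow, XPathConstants.NODESET);
            return cells.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "//" ignores the row and finds the cells of every row.
        System.out.println(countFromFirstRow("//div[contains(@class, 'ag-cell')]"));  // prints 4
        // ".//" is limited to the descendants of the first row.
        System.out.println(countFromFirstRow(".//div[contains(@class, 'ag-cell')]")); // prints 2
    }
}
```

The same scoping rule applies when calling findElements on a WebElement, so a row-level lookup would normally use the .// form.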

You could also store your data in a two-dimensional array (or something of that sort) and afterwards access the data by row and column position.
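As a rough sketch of that idea, the class below keeps one list of cell texts per row and looks values up by zero-based row and column. TableGrid, its method names, and the sample values are hypothetical; in an actual Selenium run each row's list would be filled from the getText() results of that row's cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TableGrid {
    // Outer list = rows, inner list = the cell texts of one row.
    private final List<List<String>> rows = new ArrayList<>();

    // Appends one row of cell texts (copied, so later changes to the argument have no effect).
    public void addRow(List<String> cellTexts) {
        rows.add(new ArrayList<>(cellTexts));
    }

    // Zero-based lookup by row and column position.
    public String cell(int row, int column) {
        return rows.get(row).get(column);
    }

    public int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        TableGrid grid = new TableGrid();
        grid.addRow(Arrays.asList("WW_SALES", "42")); // hypothetical values
        grid.addRow(Arrays.asList("EU_SALES", "17"));
        System.out.println(grid.cell(1, 0)); // prints EU_SALES
    }
}
```

A list of lists is used rather than a String[][] because the number of rows and cells is not known until the page has been read.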

I'm not sure, but your WebElement list is probably empty, which is why you get the index out of bounds exception.
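If that is the cause, a small guard makes the failure explicit instead of surfacing as an exception. The helper below is only a sketch in plain Java; SafeFirst and first are hypothetical names, and the empty list stands in for whatever findElements returned.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class SafeFirst {
    // Returns the first element, or Optional.empty() when the list has no elements.
    public static <T> Optional<T> first(List<T> items) {
        return items.isEmpty() ? Optional.empty() : Optional.ofNullable(items.get(0));
    }

    public static void main(String[] args) {
        // Stands in for an empty result list; no browser is involved here.
        List<String> noMatches = Collections.emptyList();
        System.out.println(first(noMatches).orElse("no matching element")); // prints no matching element
    }
}
```

Checking the list before indexing also makes it easier to tell a wrong locator apart from a page that has not finished rendering.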

If you are trying to get the values from the entire WW_SALES row, I suppose find_elements should point at the parent div, class="ag-row ag-row-even ag-row-level-0".

This is only my guess, based on the description and the attached image.

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please cite this site's URL or the original address.