
How can I "deserialize" an HTML table's data into a two-dimensional array?

I need to perform data validation of two tables using Selenium.

Given a properly marked-up HTML table filled with data:

<table>
    <tbody>
        <tr>
            <td>A</td>
            <td>B</td>
            <td>C</td>
        </tr>

        <tr>
            <td>1</td>
            <td>2</td>
            <td>3</td>
        </tr>
    </tbody>
</table>

I want to "deserialize" this table (gather its data) into a two-dimensional array (String[][]) using Selenium. The reason is that I have another HTML table (on a different web page) that supposedly contains the same data, and I need to validate the data between the two tables.

I have considered many ways to solve this problem, but iterative cell-by-cell data gathering (locating cells using either the getTable() or getText() method) is not a viable one, since it takes an enormous amount of time to get through a big table on an overloaded web page.

JavaScript injection (using the getEval() method) is not available in my case, since the table resides in an <iframe> whose origin (base URL) differs from that of the main page, and the same-origin policy does not allow it.

Any ideas on how to solve this problem?

You could use JAXB to deserialize the HTML text into a plain Java object hierarchy and then construct a 2D array from those objects.

Another option: parse the text as XML into an org.w3c.dom.Document and use XPath in Java to find and iterate over the elements.
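A minimal sketch of the XPath option, using only the XML APIs bundled with the JDK. It assumes the table markup has already been fetched as a string and is well-formed XML, as in the sample from the question; real-world HTML often is not and would need tidying first. The class name TableParser and the methods parseTable and sameData are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: turns well-formed table markup into a String[][].
final class TableParser {

    // Parses the markup as XML and collects the text of every cell, row by row.
    static String[][] parseTable(String markup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select every row anywhere in the document.
            NodeList rows = (NodeList) xpath.evaluate("//tr", doc, XPathConstants.NODESET);
            String[][] table = new String[rows.getLength()][];
            for (int r = 0; r < rows.getLength(); r++) {
                // Select the cells relative to the current row, in document order.
                NodeList cells = (NodeList) xpath.evaluate(
                        "td | th", rows.item(r), XPathConstants.NODESET);
                table[r] = new String[cells.getLength()];
                for (int c = 0; c < cells.getLength(); c++) {
                    table[r][c] = cells.item(c).getTextContent().trim();
                }
            }
            return table;
        } catch (Exception e) {
            // Assumption: any parsing failure means the input was not well-formed XML.
            throw new IllegalArgumentException("Markup could not be parsed as XML", e);
        }
    }

    // Compares two parsed tables cell by cell.
    static boolean sameData(String[][] first, String[][] second) {
        return Arrays.deepEquals(first, second);
    }
}
```

With both tables parsed this way, TableParser.sameData(tableA, tableB) gives a single yes/no answer. Since each page's markup is fetched once and parsed locally, this avoids the per-cell round trips described in the question.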
