
How to parse the bcompare XML report using DOM/SAX or any other parser in Java

In our project, we want to generate an Excel report from an XML file that lists all the differences between two folders.

I tried to get the full path of each file from the XML nodes, but I am confused by the node names, because all the parent nodes (foldercomp) have the same name.

Also, can an XSD be created for this XML format? Inner classes with the same class name are not accepted in the XSD complex types.

Can you please help me with this?

Below is the bcompare XML report:

<?xml version="1.0" encoding="utf-8"?>
<bcreport created="16-11-2017 20:20:54">
<foldercomp>
    <ltpath>E:\compare\CODE1</ltpath>
    <rtpath>E:\compare\CODE2</rtpath>
    <mode>Differences</mode>
    <foldercomp>
        <lt>
            <name>Dir1</name>
            <size>696</size>
        </lt>
        <rt>
            <name>Dir1</name>
            <size>846</size>
        </rt>
        <foldercomp>
            <lt>
                <name>Dir3</name>
                <size>424</size>
            </lt>
            <rt>
                <name>Dir3</name>
                <size>431</size>
            </rt>
            <foldercomp>
                <lt>
                    <name>Dir4</name>
                    <size>281</size>
                </lt>
                <rt>
                    <name>Dir4</name>       <!-- E:\compare\CODE2\Dir1\Dir3\Dir4  -->
                    <size>288</size>
                </rt>
                <filecomp status="rtnewer">
                    <lt>
                        <name>File5 (2).txt</name>  <!-- E:\compare\CODE1\Dir1\Dir3\Dir4\File5 (2).txt -->
                        <size>139</size>
                    </lt>
                    <rt>
                        <name>File5 (2).txt</name>   <!-- E:\compare\CODE2\Dir1\Dir3\Dir4\File5 (2).txt -->
                        <size>146</size>
                    </rt>
                </filecomp>
            </foldercomp>
        </foldercomp>
        <filecomp status="rtonly">
            <rt>
                <name>File1 (1).txt</name>  <!-- E:\compare\CODE2\Dir1\File1 (1).txt -->
                <size>143</size>
            </rt>
        </filecomp>
    </foldercomp>
    <foldercomp>
        <lt>
            <name>Dir2</name>
            <size>286</size>
        </lt>
        <rt>
            <name>Dir2</name>
            <size>296</size>
        </rt>
        <filecomp status="rtnewer">
            <lt>
                <name>File2.txt</name>   <!-- E:\compare\CODE1\Dir2\File2.txt -->
                <size>143</size>
            </lt>
            <rt>
                <name>File2.txt</name>   <!-- E:\compare\CODE2\Dir2\File2.txt -->
                <size>153</size>
            </rt>
        </filecomp>
    </foldercomp>
    <filecomp status="rtnewer">
        <lt>
            <name>File1 (2).txt</name>   <!-- E:\compare\CODE1\File1 (2).txt -->
            <size>132</size>
        </lt>
        <rt>
            <name>File1 (2).txt</name>  <!-- E:\compare\CODE2\File1 (2).txt -->
            <size>139</size>
        </rt>
    </filecomp>
    <filecomp status="rtnewer">
        <lt>
            <name>File1 (3).txt</name>   <!-- E:\compare\CODE1\File1 (3).txt -->
            <size>144</size>
        </lt>
        <rt>
            <name>File1 (3).txt</name>  <!-- E:\compare\CODE2\File1 (3).txt -->
            <size>150</size>
        </rt>
    </filecomp>
</foldercomp>
</bcreport>

Here lt is CODE1 and rt is CODE2; the foldercomp and filecomp tags describe differences in folders and files, respectively.

The XML output makes sense to me. What I understand from it is:

Dir1
 |
 +-- Dir3
 |    |
 |    +-- Dir4
 |         |
 |         +-- File5 (2).txt
(etc)

A folder may contain more folders, and a folder may contain many files. Each comparison has a left item and a right item. So what you need to do is rethink your parsing strategy. Each foldercomp can have several children (its own lt and rt entries plus any nested foldercomp and filecomp elements), and each filecomp has up to two children, lt and rt (an rtonly entry has only rt).
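Because the nesting is recursive, one way to recover full paths is a recursive DOM walk that carries the left and right directory down from parent to child. The following is only a minimal sketch under some assumptions: the class and method names (`BcReportDomWalker`, `walk`, `parse`) are made up for illustration, the root foldercomp is assumed to carry ltpath and rtpath as in the report above, and a missing side is printed as `-`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of Beyond Compare or any library.
class BcReportDomWalker {

    /** First direct child element with the given tag name, or null. */
    static Element child(Element parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && tag.equals(((Element) n).getTagName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Trimmed text of a direct child element, or null if it is absent. */
    static String text(Element parent, String tag) {
        Element c = child(parent, tag);
        return c == null ? null : c.getTextContent().trim();
    }

    /** Full path of one side ("lt" or "rt") of a comparison, or null if absent. */
    static String sidePath(Element comp, String side, String dir) {
        Element s = child(comp, side);
        String name = (s == null) ? null : text(s, "name");
        return (dir == null || name == null) ? null : dir + "\\" + name;
    }

    static String show(String path) {
        return path == null ? "-" : path;
    }

    /** Recursively collects "status: leftPath | rightPath" for every filecomp. */
    static void walk(Element folder, String ltDir, String rtDir, List<String> out) {
        for (Node n = folder.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) continue;
            Element e = (Element) n;
            String tag = e.getTagName();
            if (!tag.equals("foldercomp") && !tag.equals("filecomp")) continue;
            String lt = sidePath(e, "lt", ltDir);
            String rt = sidePath(e, "rt", rtDir);
            if (tag.equals("foldercomp")) {
                walk(e, lt, rt, out);   // the folder's own path becomes the children's directory
            } else {
                out.add(e.getAttribute("status") + ": " + show(lt) + " | " + show(rt));
            }
        }
    }

    /** Parses a report held in a string and returns one line per filecomp. */
    static List<String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = child(doc.getDocumentElement(), "foldercomp");
            List<String> out = new ArrayList<>();
            walk(root, text(root, "ltpath"), text(root, "rtpath"), out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse report", e);
        }
    }
}
```

Calling `BcReportDomWalker.parse(reportXml)` returns one line per filecomp. For a report stored on disk, the same walk could start from `DocumentBuilder.parse(File)` instead of a string.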

If I were you, I would use the foldercomp and filecomp open tags to increment both the row and the column by 1, and the matching close tags to decrement the column while again incrementing the row by 1. The lt and rt open tags would increment the row only (not the column), and their close tags would be ignored. I would print folder and file names in bold and leave the differences in normal weight.
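To make the row and column bookkeeping concrete, here is a rough SAX sketch that follows those counting rules literally and records each name as a `row,column,side,name` string instead of writing to a spreadsheet. The class name `GridLayoutHandler`, the `layout` helper and the string cell format are assumptions made for illustration; writing the cells to Excel is left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: computes a (row, column) position for every <name>.
class GridLayoutHandler extends DefaultHandler {
    final List<String> cells = new ArrayList<>();
    private int row = 0;
    private int col = 0;
    private String side;          // "lt" or "rt", set by the most recent open tag
    private StringBuilder text;   // non-null only while inside <name>

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        switch (qName) {
            case "foldercomp":
            case "filecomp":
                row++;            // open tag: one row down, one column to the right
                col++;
                break;
            case "lt":
            case "rt":
                row++;            // each side gets its own row, same column
                side = qName;
                break;
            case "name":
                text = new StringBuilder();
                break;
            default:
                break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        switch (qName) {
            case "foldercomp":
            case "filecomp":
                col--;            // close tag: back one column, one row down
                row++;
                break;
            case "name":
                cells.add(row + "," + col + "," + side + "," + text.toString().trim());
                text = null;
                break;
            default:
                break;            // lt and rt close tags are ignored
        }
    }

    /** Parses a report held in a string and returns the recorded cells. */
    static List<String> layout(String xml) {
        GridLayoutHandler handler = new GridLayoutHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse report", e);
        }
        return handler.cells;
    }
}
```

Because both the comparison tags and the lt/rt tags advance the row, this literal reading leaves some rows empty; whether that spacing is wanted in the final sheet is a layout choice.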

The status attribute on filecomp gives you an idea of the nature of the difference. So if it is rtonly, the file exists only on the right side, which means it was added. I would use green for that, and so on.
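A small lookup like the one below could turn the status attribute into a highlight color. This is only an illustration: the color names are arbitrary, `StatusStyle` and `colorFor` are hypothetical names, and only rtonly and rtnewer actually appear in the report above; ltonly and ltnewer are assumed to be their left-hand counterparts.

```java
// Hypothetical mapping from a filecomp status to a highlight color name.
class StatusStyle {
    static String colorFor(String status) {
        if (status == null) return "NONE";
        switch (status) {
            case "rtonly":            // present only on the right: added
                return "GREEN";
            case "ltonly":            // assumed: present only on the left, i.e. removed
                return "RED";
            case "rtnewer":           // right copy is newer: modified
            case "ltnewer":           // assumed counterpart of rtnewer
                return "YELLOW";
            default:
                return "NONE";        // unknown status: no highlight
        }
    }
}
```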

It shouldn't be too hard to implement this with a SAX parser.

I hope that makes sense.

EDIT:

If you need example code for a SAX parser, here it is.

I have given you the lead, but I won't do your job for you. Sorry.

EDIT2:

Using a SAX parser is really easy. Check out the documentation and the example above.

Think of parsing XML with a SAX parser as a switch/case statement: when the current tag is one thing, do what you need to do; when it is something else, do whatever is needed there, and so on. You may need to keep the context as well.

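switch (tag) {
    case "foldercomp":
        // handle a folder comparison
        break;
    case "filecomp":
        // handle a file comparison
        break;
    case "rt":
        // handle the right-hand item
        break;
    case "lt":
        // handle the left-hand item
        break;
    default:
        break;
}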
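As a sketch of what "keeping the context" could look like, the handler below pushes one entry per open foldercomp or filecomp onto a stack, so every name can be joined to its parent's path even though all the parents share one tag name. The names `PathTrackingHandler`, `Comp` and `collect` are invented for this example, and it assumes, as in the report above, that a folder's lt and rt entries appear before its nested children.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: resolves the full left/right path of every filecomp.
class PathTrackingHandler extends DefaultHandler {
    /** One open foldercomp or filecomp; a null path means "absent on that side". */
    private static final class Comp {
        String status, ltDir, rtDir, lt, rt;
    }

    final List<String> rows = new ArrayList<>();
    private final Deque<Comp> open = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    private String side;

    private static String join(String dir, String name) {
        return dir == null ? null : dir + "\\" + name;
    }

    private static String show(String path) {
        return path == null ? "-" : path;
    }

    @Override
    public void startElement(String uri, String local, String tag, Attributes atts) {
        text.setLength(0);                 // start collecting text for this element
        switch (tag) {
            case "foldercomp":
            case "filecomp": {
                Comp parent = open.peek(); // null for the root foldercomp
                Comp c = new Comp();
                c.status = atts.getValue("status");
                if (parent != null) {
                    c.ltDir = parent.lt;   // parent's own path is this entry's directory
                    c.rtDir = parent.rt;
                }
                open.push(c);
                break;
            }
            case "lt":
            case "rt":
                side = tag;
                break;
            default:
                break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String tag) {
        Comp c = open.peek();
        String value = text.toString().trim();
        switch (tag) {
            case "ltpath": c.lt = value; break;   // root folder on the left
            case "rtpath": c.rt = value; break;   // root folder on the right
            case "name":
                if ("lt".equals(side)) c.lt = join(c.ltDir, value);
                else c.rt = join(c.rtDir, value);
                break;
            case "filecomp":
                rows.add(c.status + ": " + show(c.lt) + " | " + show(c.rt));
                open.pop();
                break;
            case "foldercomp":
                open.pop();
                break;
            default:
                break;
        }
    }

    /** Parses a report held in a string and returns one line per filecomp. */
    static List<String> collect(String xml) {
        PathTrackingHandler handler = new PathTrackingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse report", e);
        }
        return handler.rows;
    }
}
```

Each returned line holds the status and both paths, which could then be written as one spreadsheet row.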

Try it yourself and see. If you run into a wall while implementing it, fellow Stack Overflow users, including myself, will be pleased to help. But you need to try first.

Cheers.
