
Create Tree Data Structure from Table in Java

Given the following CSV, which represents a generalization hierarchy (think zip-code anonymization: e.g., in the second step the zip code 46072 becomes 460**):

A1, A*, *
A2, A*, *
A3, A*, *
B1, B*, *
B2, B*, *
B3, B*, *
B4, B*, *

I first parse it into an array of arrays.
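
The parsing step is not shown in the question, so the following is only a minimal sketch of what it might look like. The class name CsvHierarchyParser and the method parse are hypothetical, and the sketch assumes plain comma-separated cells without quoting or escaped commas.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the original post): turns CSV text into a 2D array.
class CsvHierarchyParser {

    // Splits the text into lines and each line into trimmed cells.
    // Assumes simple CSV: no quoted fields, no commas inside values.
    static String[][] parse(String csv) {
        List<String[]> rows = new ArrayList<>();
        for (String line : csv.split("\\R")) {   // \R matches any line break
            if (line.trim().isEmpty()) {
                continue;                        // skip blank lines
            }
            String[] cells = line.split(",");
            for (int i = 0; i < cells.length; i++) {
                cells[i] = cells[i].trim();      // drop the space after each comma
            }
            rows.add(cells);
        }
        return rows.toArray(new String[0][]);
    }
}
```

Calling CsvHierarchyParser.parse on the CSV above would give a 7 x 3 array with "A1" at [0][0] and "*" at [0][2].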

I would now like to turn this into a tree representation:

                *
        /                \
      A*                 B*
   /  |   \         /   |   |   \
 A1   A2   A3     B1   B2   B3  B4

As you can see, it is a tree in which each node can have an arbitrary number of children.

I have the following classes:

Table, TableRow, TableCell as well as Tree and Node. Obviously a table has multiple rows, which in turn have multiple cells. A tree has a root node, and a node has various operations such as addChild(node), getParent(), getChildren() etc.
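
The question does not show the Node class itself, so here is a rough sketch of what such a node might look like. Only the method names addChild, getParent and getChildren come from the question; the generic parameter, the getValue accessor and the fields are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// A rudimentary generic tree node. Method names follow the question;
// everything else here is assumed.
class Node<T> {
    private final T value;
    private Node<T> parent;                        // null for the root
    private final List<Node<T>> children = new ArrayList<>();

    Node(T value) {
        this.value = value;
    }

    T getValue() {
        return value;
    }

    Node<T> getParent() {
        return parent;
    }

    // Read-only view, so callers have to go through addChild.
    List<Node<T>> getChildren() {
        return Collections.unmodifiableList(children);
    }

    // Links the child to this node in both directions.
    void addChild(Node<T> child) {
        child.parent = this;
        children.add(child);
    }
}
```

Because children live in a list, a node can hold any number of them, which is what the tree above needs.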

I'm trying to figure out how to iterate over my table in order to build the tree illustrated above. So far I've only driven myself into confusion...

Help is much appreciated!

OK. So, I'm basing my answer on these assumptions:

  1. The rightmost column of the matrix has the same value in every row.
  2. All rows have exactly the same number of columns.
  3. Equal values in the same column are in consecutive rows. That is, there will not be an "A*" row followed by a "B*" row and then another "A*" row. The two "A*" rows have to be consecutive.
  4. In the leftmost column, all values are unique.
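
If you are not sure your input satisfies these assumptions, a check along the following lines could run before building the tree. This is only a sketch; the class HierarchyAssumptions and its method holds are hypothetical names, not part of any of your classes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical validator for the four assumptions listed above.
class HierarchyAssumptions {

    static boolean holds(String[][] data) {
        if (data.length == 0 || data[0].length == 0) {
            return false;                          // nothing to build a tree from
        }
        int cols = data[0].length;
        for (String[] row : data) {
            if (row.length != cols) {
                return false;                      // assumption 2: equal row lengths
            }
        }
        for (int c = 0; c < cols; c++) {
            Set<String> runStarts = new HashSet<>(); // values that already began a run
            for (int r = 0; r < data.length; r++) {
                String v = data[r][c];
                boolean sameAsPrev = r > 0 && v.equals(data[r - 1][c]);
                if (c == cols - 1 && r > 0 && !sameAsPrev) {
                    return false;                  // assumption 1: one root value
                }
                if (c == 0 && sameAsPrev) {
                    return false;                  // assumption 4: unique leaves
                }
                if (!sameAsPrev && !runStarts.add(v)) {
                    return false;                  // assumption 3: value reappears later
                }
            }
        }
        return true;
    }
}
```

A non-adjacent duplicate in the leftmost column is caught by the assumption 3 check, so the assumption 4 check only needs to look at adjacent rows.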

I did not know exactly what your Table, Tree and Node classes can or cannot do. So I worked with a basic 2D array (which you also said is what you have after parsing), and used just a rudimentary Node as my tree structure.

The idea is to work recursively, starting from the full matrix. Recursion and trees go together well...

A tree is defined as having the value in the rightmost column as its root value. Its children are created the same way, by eliminating the rightmost column and cutting the matrix into pieces such that each piece's rightmost column holds a single, identical value. The matrix

A1, A*, *
A2, A*, *
A3, A*, *
B1, B*, *
B2, B*, *
B3, B*, *
B4, B*, *

is split into the value "*" and the two sub-matrices:

A1, A*
A2, A*
A3, A*

B1, B*
B2, B*
B3, B*
B4, B*

The same is done for the A* matrix, and its sub-matrices are the single-cell matrices A1, A2 and A3, at which point the recursion ends.
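
To make the splitting step concrete on its own, here is a small sketch that only computes the row ranges for one column, without building any nodes. RangeSplitter and split are hypothetical names, and the sketch relies on assumption 3 (equal values sit in consecutive rows).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds the runs of equal values in one column.
class RangeSplitter {

    // Returns {firstRow, lastRow} pairs (both inclusive), one per run of
    // equal values in column col, looking only at rows firstRow..lastRow.
    static List<int[]> split(String[][] data, int firstRow, int lastRow, int col) {
        List<int[]> ranges = new ArrayList<>();
        int start = firstRow;
        for (int row = firstRow + 1; row <= lastRow; row++) {
            if (!data[row][col].equals(data[start][col])) {
                ranges.add(new int[] { start, row - 1 }); // the run ended on the previous row
                start = row;
            }
        }
        ranges.add(new int[] { start, lastRow });         // the last run always ends at lastRow
        return ranges;
    }
}
```

For the example matrix, splitting rows 0 to 6 on column 1 gives the ranges {0, 2} and {3, 6}, which are the A* and B* sub-matrices. Splitting rows 0 to 2 on column 0 gives three single-row ranges.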

So, assuming you created a class that represents a hierarchy builder and it has a 2D array called data in it, you'll have a nice public method without parameters that calls a "dirty" private method whose parameters represent the boundaries of the current sub-matrix.

public Node<String> createTree() {
    return this.createTree(0,data.length-1,data[0].length-1);
}

In general, the arguments passed to the private method would be the top row, bottom row, leftmost column and rightmost column. But since in this case a sub-matrix always starts from column 0, we don't need to pass the leftmost column as a parameter.

And this is your private createTree method:

private Node<String> createTree( int firstRow, int lastRow, int lastCol) {

    // Recursion end. If we are at the leftmost column, just return a simple
    // node with the value at the current row and column.
    if ( lastCol == 0 ) {
        return new Node<String>( data[firstRow][0] );
    }

    // Create a node with the value of the top-right cell in our range.
    Node<String> result = new Node<String>(data[firstRow][lastCol]);

    // The next column from the right will have the values for the child nodes.
    // Split it into ranges (start row -> end row) and recursively build
    // the tree over the sub-matrix that goes column 0 -> lastCol-1 over each
    // range of rows.

    int childFirstRow = firstRow;
    String childVal = data[firstRow][lastCol-1];

    for( int candidateRow = firstRow; candidateRow <= lastRow; candidateRow ++ ) {
        // If the next value in the column is different from what we had so far, it's
        // the end of a row range, build the child tree, and mark this row as
        // the beginning of the next range.
        if ( ! data[candidateRow][lastCol-1].equals(childVal) ) {
            result.addChild(createTree( childFirstRow, candidateRow - 1, lastCol - 1));
            childFirstRow = candidateRow;
            childVal = data[childFirstRow][lastCol-1];
        }
        // In the special case of the last row, it's always the end of a range.
        if ( candidateRow == lastRow ) {
            result.addChild(createTree(childFirstRow,lastRow,lastCol - 1));
        }
    }

    return result;
}
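
To try the approach end to end, something like the following self-contained sketch could be used. It is a condensed restatement of the recursion described above, not a drop-in for your classes: HierarchyBuilderSketch, its nested Node and the render helper are all assumed names, and the loop is folded so that the end of the row range is handled by the same branch as a change of value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical end-to-end sketch of the recursive builder.
class HierarchyBuilderSketch {

    // Rudimentary node; stands in for your own Node class.
    static class Node<T> {
        final T value;
        final List<Node<T>> children = new ArrayList<>();
        Node(T value) { this.value = value; }
        void addChild(Node<T> child) { children.add(child); }
    }

    private final String[][] data;

    HierarchyBuilderSketch(String[][] data) {
        this.data = data;
    }

    Node<String> createTree() {
        return createTree(0, data.length - 1, data[0].length - 1);
    }

    private Node<String> createTree(int firstRow, int lastRow, int lastCol) {
        if (lastCol == 0) {
            return new Node<>(data[firstRow][0]);  // leaf: leftmost column reached
        }
        Node<String> result = new Node<>(data[firstRow][lastCol]);
        int childFirstRow = firstRow;
        // Runs one step past lastRow so the final range is closed by the same branch.
        for (int row = firstRow + 1; row <= lastRow + 1; row++) {
            boolean endOfRange = row > lastRow
                    || !data[row][lastCol - 1].equals(data[childFirstRow][lastCol - 1]);
            if (endOfRange) {
                result.addChild(createTree(childFirstRow, row - 1, lastCol - 1));
                childFirstRow = row;
            }
        }
        return result;
    }

    // Renders the tree as an indented outline, two spaces per level.
    static String render(Node<String> node, String indent) {
        StringBuilder sb = new StringBuilder(indent).append(node.value).append('\n');
        for (Node<String> child : node.children) {
            sb.append(render(child, indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[][] data = {
            {"A1", "A*", "*"}, {"A2", "A*", "*"}, {"A3", "A*", "*"},
            {"B1", "B*", "*"}, {"B2", "B*", "*"}, {"B3", "B*", "*"}, {"B4", "B*", "*"}
        };
        System.out.print(render(new HierarchyBuilderSketch(data).createTree(), ""));
    }
}
```

Running main on the example matrix prints "*" at the top level, "A*" and "B*" indented by two spaces, and the leaves A1 to A3 and B1 to B4 indented by four spaces under their respective parents.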
