How to parse a tree structure saved in an Excel file using Apache POI

All,

Good morning!

I have an Excel file with data laid out as follows, and I'm trying to parse it using POI:

A           
    B       
        C   
            D1
            D2
        F   
            G1
            G2
            G3
        M   
            S1
    R       
        T   
    U       
L           
    X       
        Y   
            Z

Is it possible to generate output like the following?

A
A-->B
A-->B-->C
A-->B-->C-->D1
A-->B-->C-->D2
A-->B-->F
A-->B-->F-->G1
A-->B-->F-->G2
A-->B-->F-->G3
A-->B-->M
A-->B-->M-->S1
A-->R
A-->R-->T
A-->U
L
L-->X
L-->X-->Y
L-->X-->Y-->Z

I have been trying for quite some time but haven't figured out the logic.

Thanks

Solution in Java, using Apache POI:

import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ParseTreeDemo
{
    // Number of columns (tree levels) to scan in each row
    private static final int NUM_COLUMNS = 4;

    public static void main(String[] args)
    {
        // try-with-resources closes the workbook and the stream,
        // even if an exception is thrown
        try (FileInputStream file = new FileInputStream(new File("Test.xlsx"));
             XSSFWorkbook workbook = new XSSFWorkbook(file))
        {
            Sheet sheet = workbook.getSheetAt(0);

            // Column marker: the column of the most recently read value
            int currColMarker = -1;
            // Current path from the root down to the most recent value
            List<String> list = new ArrayList<>();

            // Iterate through the rows one by one
            for (Row row : sheet)
            {
                for (int currCol = 0; currCol < NUM_COLUMNS; currCol++)
                {
                    Cell cell = row.getCell(currCol);
                    if (cell == null)
                        continue;

                    // CellType is the enum used by recent POI releases
                    // (assumed 4.0 or later); older 3.x releases compared
                    // against Cell.CELL_TYPE_STRING instead
                    if (cell.getCellType() == CellType.STRING) {

                        if (currCol > currColMarker) {

                            // A farther column: simply append and
                            // update the column marker
                            currColMarker = currCol;
                            list.add(cell.getStringCellValue());
                        }
                        else if (currCol == currColMarker) {

                            // Same level as the column marker: remove the
                            // old value at this level before appending
                            list.remove(list.size() - 1);
                            list.add(cell.getStringCellValue());
                        }
                        else {

                            // A 'nearer' column: remove the values at or
                            // beyond this level before appending. Clearing
                            // the tail in place avoids building a chain of
                            // nested subList views.
                            currColMarker = currCol;
                            list.subList(currCol, list.size()).clear();
                            list.add(cell.getStringCellValue());
                        }
                    }
                }

                // Display the current path
                System.out.println(String.join("-->", list));
            }
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

Output:

A
A-->B
A-->B-->C
A-->B-->C-->D1
A-->B-->C-->D2
A-->B-->F
A-->B-->F-->G1
A-->B-->F-->G2
A-->B-->F-->G3
A-->B-->M
A-->B-->M-->S1
A-->R
A-->R-->T
A-->U
L
L-->X
L-->X-->Y
L-->X-->Y-->Z

The idea:

  • Use a 'column marker' to keep track of the column of the most recent value
  • If the new value is in a column farther right than the marker, append it
  • If it is in the same column, remove the last value, then append
  • If it is in a column farther left, remove all current values at or beyond that column, then append

Note: Test.xlsx contains the values as stated in the question.
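
The steps above do not depend on POI itself. As a minimal sketch, assuming each spreadsheet row is represented as a String[] with null for empty cells (the class name ColumnMarkerSketch and the method paths are made up for this illustration), the same logic could look like this:

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnMarkerSketch {

    /** Returns one "A-->B-->C" path per row; null entries stand for empty cells. */
    static List<String> paths(String[][] rows) {
        List<String> path = new ArrayList<>();   // current root-to-node path
        List<String> result = new ArrayList<>();
        int marker = -1;                         // column of the last value read

        for (String[] row : rows) {
            for (int col = 0; col < row.length; col++) {
                String value = row[col];
                if (value == null) {
                    continue;                    // empty cell, nothing to do
                }
                if (col > marker) {
                    // Farther column: append
                    path.add(value);
                } else if (col == marker) {
                    // Same column: replace the last value
                    path.set(path.size() - 1, value);
                } else {
                    // Nearer column: drop values at or beyond it, then append
                    path.subList(col, path.size()).clear();
                    path.add(value);
                }
                marker = col;
            }
            result.add(String.join("-->", path));
        }
        return result;
    }

    public static void main(String[] args) {
        // A shortened version of the tree from the question
        String[][] rows = {
            {"A", null, null, null},
            {null, "B", null, null},
            {null, null, "C", null},
            {null, null, null, "D1"},
            {null, null, null, "D2"},
            {null, null, "F", null},
            {null, "R", null, null},
            {"L", null, null, null},
        };
        paths(rows).forEach(System.out::println);
    }
}
```

Like the POI version, this sketch assumes a well-formed tree in which a child is always exactly one column to the right of its parent; a row that skips a level would leave the path and the column index out of step.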

If the listed data is in a variable called data, the following will work in Tcl:

proc merge {a b} {
    set res {}
    foreach ac [split $a {}] bc [split $b {}] {
        if {![string is space $ac] && [string is space -strict $bc]} {
            append res $ac
        } else {
            append res $bc
        }
    }
    set res
}

set current {}
foreach line [split [string trim $data] \n] {
    set current [merge $current [string trimright $line]]
    puts [join $current -->]
}

I originally went with a pseudo-stack approach, but it seemed simpler to "merge" each new line with the accumulated line (current), so that non-blank text in the new line overwrites text in the accumulated line, and the accumulated line is truncated if the new line is shorter (after trimming its trailing whitespace).

Once I had the merged line, I could take advantage of the fact that (most) strings in Tcl are also lists, and print it as a string formed by joining the words with "-->" tokens.

Documentation: append, foreach, if, proc, puts, set, split, string
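
For comparison with the Java answer, the merge idea might be sketched in Java roughly as follows. This is an illustrative port, not part of the original answer: MergeLinesSketch, merge and paths are hypothetical names, the input is assumed to be the indented text from the question (not an Excel file), and labels are assumed to contain no whitespace.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeLinesSketch {

    /**
     * Overlays next onto current: a blank in next keeps the non-blank
     * character of current at that position; anything else in next wins.
     * The result is never longer than next, which truncates deeper levels.
     */
    static String merge(String current, String next) {
        StringBuilder res = new StringBuilder();
        for (int i = 0; i < next.length(); i++) {
            char n = next.charAt(i);
            boolean keepOld = Character.isWhitespace(n)
                    && i < current.length()
                    && !Character.isWhitespace(current.charAt(i));
            res.append(keepOld ? current.charAt(i) : n);
        }
        return res.toString();
    }

    /** Returns one "A-->B-->C" path per input line. */
    static List<String> paths(String data) {
        List<String> result = new ArrayList<>();
        String current = "";
        for (String line : data.trim().split("\n")) {
            // Drop trailing whitespace so a shorter line truncates the path
            current = merge(current, line.replaceAll("\\s+$", ""));
            // Java strings are not lists, so split on whitespace explicitly
            result.add(String.join("-->", current.trim().split("\\s+")));
        }
        return result;
    }

    public static void main(String[] args) {
        String data = "A\n    B\n        C\n    R\nL";
        paths(data).forEach(System.out::println);
    }
}
```

Where the Tcl version relies on a string doubling as a list, the Java sketch has to split the merged line on whitespace before joining it with "-->".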
