Apache POI skips rows that have never been updated

Question

While processing an Excel file with Apache POI, I noticed that it skips certain empty rows. After a lot of trial and error, I found that Apache POI only reads rows whose cells have been updated at some point.

I have written a short program to check whether a row in an XLSX (XSSF model) file is empty. This is my input Excel file:

[Image: input Excel file]

private static boolean isRowEmpty(Row row) {
        boolean isRowEmpty = true;
        if (row != null) {
            for (Cell cell : row) {
                if (cell != null) {
                    System.out.println("Row:" + cell.getRowIndex() + " Col:"
                            + cell.getColumnIndex() + " Len:"
                            + cell.toString().length());
                    isRowEmpty = false;
                } else {
                    System.out.println("Cell is Null at Row "+row.getRowNum());
                }
            }
        } else {
            System.out.println("Row is Null");
        }
        return isRowEmpty;
}

for (Row row : sheet) {
    if (isRowEmpty(row)) {
        System.out.println("Empty row at " + row.getRowNum());
    }
}

OUTPUT

Row:0 Col:0 Len:1
Row:2 Col:0 Len:1
Row:3 Col:0 Len:1
Row:4 Col:0 Len:1
Row:5 Col:0 Len:1
Row:6 Col:1 Len:1
Row:7 Col:0 Len:1
Row:8 Col:2 Len:1

In cell A5, I entered a space, which Apache POI detects. As you can see from the output, it does not process row 2 (rownum 1).

Is there a workaround so that it gives the following output?

Row:0 Col:0 Len:1
Empty Row at 1
Row:2 Col:0 Len:1
Row:3 Col:0 Len:1
Empty Row at 4
Row:5 Col:0 Len:1
Row:6 Col:1 Len:1
Row:7 Col:0 Len:1
Row:8 Col:2 Len:1

Thanks!

UPDATE 1

Using (cell != null && StringUtils.isNotBlank(cell.toString())) instead of (cell != null) gives me the following output:

Row:0 Col:0 Len:1
Row:2 Col:0 Len:1
Row:3 Col:0 Len:1
Cell is Null for Row 4
Empty row at 4
Row:5 Col:0 Len:1
Row:6 Col:1 Len:1
Row:7 Col:0 Len:1
Row:8 Col:2 Len:1

Answer 1

This is entirely as expected, as explained in the documentation!

The iterators are there to make it easy to grab the rows and cells that have content in them (plus a few others that Excel has, seemingly at random, still included in the file...).
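To illustrate why the iterator skips row 1 in the question, here is a minimal sketch that models the sheet as a sparse map holding only the rows that are physically defined. The class `SparseRowsDemo` and its methods are hypothetical stand-ins, not POI classes, and the row numbers simply mirror the question's output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

class SparseRowsDemo {
    // Hypothetical stand-in for a sheet: only rows that were ever written have an entry.
    static SortedMap<Integer, String> sampleSheet() {
        SortedMap<Integer, String> rows = new TreeMap<>();
        for (int rowNum : new int[] {0, 2, 3, 4, 5, 6, 7, 8}) {
            rows.put(rowNum, "x");
        }
        return rows;
    }

    // Iterator-style traversal: visits only the rows that are stored.
    static List<Integer> rowsSeenByIterator(SortedMap<Integer, String> rows) {
        return new ArrayList<>(rows.keySet());
    }

    // Index-style traversal: checks every row number, so gaps become visible.
    static List<Integer> missingRows(SortedMap<Integer, String> rows) {
        List<Integer> missing = new ArrayList<>();
        if (rows.isEmpty()) {
            return missing;
        }
        for (int rowNum = rows.firstKey(); rowNum <= rows.lastKey(); rowNum++) {
            if (!rows.containsKey(rowNum)) {
                missing.add(rowNum);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> rows = sampleSheet();
        System.out.println("Iterator sees rows: " + rowsSeenByIterator(rows));
        System.out.println("Never-defined rows: " + missingRows(rows));
    }
}
```

Row 4 is present in the map because a space was typed into A5, so only row 1 shows up as never defined. Whether a whitespace-only row should also count as empty is a separate check on the cell contents.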

If you want to fetch every row and cell, whether or not they are defined, then you need to follow the advice in the documentation and loop by row and cell number, e.g.:

// Decide which rows to process (15 and 1400 are arbitrary example bounds)
int rowStart = Math.min(15, sheet.getFirstRowNum());
int rowEnd = Math.max(1400, sheet.getLastRowNum());

// getLastRowNum() is a 0-based index, so include rowEnd itself
for (int rowNum = rowStart; rowNum <= rowEnd; rowNum++) {
   Row r = sheet.getRow(rowNum);
   if (r == null) {
      // Handle there being no cells defined for this row
      continue;
   }

   // Decide how many columns to fetch for this row
   // (MY_MINIMUM_COLUMN_COUNT is a constant you define yourself)
   int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);

   for (int cn = 0; cn < lastColumn; cn++) {
      // Newer POI versions spell this Row.MissingCellPolicy.RETURN_BLANK_AS_NULL
      Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
      if (c == null) {
         // The spreadsheet is empty in this cell
      } else {
         // Do something useful with the cell's contents
      }
   }
}
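Applied to the question, the same index-based loop can be combined with the blank check from UPDATE 1 to report both never-defined rows and whitespace-only rows. The sketch below uses plain maps as a stand-in for the sheet so that it runs without POI on the classpath. `EmptyRowReport`, `report`, `sampleSheet` and the cell values are all hypothetical, chosen only to reproduce the desired output. With POI itself, the lookups would go through `sheet.getRow(rowNum)` and `row.getCell(cn)` instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class EmptyRowReport {
    // A cell counts as blank if it is missing or holds only whitespace.
    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Stand-in for a sheet: row number -> (column number -> cell text).
    static List<String> report(Map<Integer, Map<Integer, String>> sheet, int lastRowNum) {
        List<String> lines = new ArrayList<>();
        for (int rowNum = 0; rowNum <= lastRowNum; rowNum++) {
            Map<Integer, String> row = sheet.get(rowNum); // null if the row was never defined
            boolean empty = true;
            if (row != null) {
                for (Map.Entry<Integer, String> cell : row.entrySet()) {
                    if (!isBlank(cell.getValue())) {
                        lines.add("Row:" + rowNum + " Col:" + cell.getKey()
                                + " Len:" + cell.getValue().length());
                        empty = false;
                    }
                }
            }
            if (empty) {
                lines.add("Empty Row at " + rowNum);
            }
        }
        return lines;
    }

    // Adds one cell, creating the row on first use (as typing into a cell would).
    static void put(Map<Integer, Map<Integer, String>> sheet, int rowNum, int col, String text) {
        sheet.computeIfAbsent(rowNum, k -> new TreeMap<>()).put(col, text);
    }

    // Assumed contents: one single-character cell per row, as the question's output suggests.
    static Map<Integer, Map<Integer, String>> sampleSheet() {
        Map<Integer, Map<Integer, String>> sheet = new TreeMap<>();
        put(sheet, 0, 0, "a");
        // row 1 is never touched, so it has no entry at all
        put(sheet, 2, 0, "b");
        put(sheet, 3, 0, "c");
        put(sheet, 4, 0, " "); // A5 holds a single space
        put(sheet, 5, 0, "d");
        put(sheet, 6, 1, "e");
        put(sheet, 7, 0, "f");
        put(sheet, 8, 2, "g");
        return sheet;
    }

    public static void main(String[] args) {
        report(sampleSheet(), 8).forEach(System.out::println);
    }
}
```

Treating a missing row and a whitespace-only row the same way is a design choice made here to match the output requested in the question. Keep the two cases separate if the distinction matters to you.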

Solution 1
0 2015-05-29 09:24:30
