簡體   English   中英

獲取智能Excel表與Apache個POI

[英]Get smart Excel table with Apache POI

我正在使用下面的代碼將 .csv 文件轉換為 .xlsx。 它工作正常,但客戶想要獲得“智能 Excel 表”(帶有過濾器等,格式為 Microsoft Excel 中的表)。

使用 Apache NiFI 和 Groovy:

@Grab("org.apache.poi:poi:3.16")
@Grab("org.apache.poi:poi-ooxml:3.16")
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.streaming.*;
import org.apache.commons.io.IOUtils
import java.nio.charset.StandardCharsets

def flowFile = session.get()
if(!flowFile)
   return

flowFile = session.write(flowFile, { inputStream, outputStream ->
try {
        SXSSFWorkbook workBook = new SXSSFWorkbook();
        workBook.setCompressTempFiles(true);

        SXSSFSheet sheet = workBook.createSheet("Sheet");
        sheet.setRandomAccessWindowSize(1000);

        String currentLine = null;
        int RowNum = 0;
        BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
        while ((currentLine = br.readLine()) != null) {
            String[] str = currentLine.split(",");            

            Row currentRow = sheet.createRow(RowNum);
            for(int i=0;i<str.length;i++){
                currentRow.createCell(i).setCellValue(str[i]);
            }
            RowNum++;

            if (RowNum % 1000 == 0) {
                println RowNum;
            }
        }
        workBook.write(outputStream);
        fileOutputStream.close();
    } catch (Exception ex) {
        ex.printStackTrace();
    }


} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)

我的版本看起來像: 在此處輸入圖像描述

客戶想要得到這樣的東西:

在此處輸入圖像描述

我怎樣才能做到這一點?

一開始:把CSV個文件當成純文本文件逐行讀取,用分隔符分割容易出錯。 此方法未考慮 CSV 條規則。 例如:值可能被引號括起來。 逗號和下一個值之間可能有空格,那么它不應該是值的一部分。 等等。 CSV 文件應該使用為此制作的庫來讀取。 例如有opencsv

在 Excel 中創建表是 Apache POI 的一部分。 XSSFSheet.createTable 不幸的是沒有SXSSFSheet.createTable 由於 CSV 的大小,您需要流媒體版本SXSSF ,對嗎?

為了克服這個問題,可以從SXSSFWorkbook中獲取底層XSSFWorkbook以在那里XSSFTable 這種方法的問題是,當流式傳輸到SXSSFSheet時,底層XSSFSheet不包含任何數據。 這就是為什么XSSFSheet.createTable(AreaReference)AreaReference的第一行中找不到任何列名稱並創建一個列名稱為“Column1”、“Column2”、“Column3”...的表。但是,這些與表的實際內容。 這就是為什么我們需要在創建表后更新表頭。

完整示例:

import java.io.*;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.xssf.streaming.*;
import org.apache.poi.ss.SpreadsheetVersion;
import org.apache.poi.ss.util.AreaReference;
import org.apache.poi.ss.util.CellReference;

import com.opencsv.CSVReader;

class CreateTableFromCSV {
    
 static XSSFTable createTable(SXSSFSheet sxssfSheet, AreaReference areaReference, String[] strHeaders) {
  XSSFWorkbook xssfWorkbook = sxssfSheet.getWorkbook().getXSSFWorkbook();
  XSSFSheet xssfSheet = xssfWorkbook.getSheet(sxssfSheet.getSheetName());
  XSSFTable xssfTable = xssfSheet.createTable(areaReference);
  System.out.println(xssfTable.getCTTable()); // wrong column names since xssfSheet does not contain any data until now
  //xssfTable.updateHeaders(); // this cannot work since xssfSheet does not contain any data until now
  for (int i = 0; i < strHeaders.length; i++) {
   String columnHeader = strHeaders[i];
   if (xssfTable.getCTTable().getTableColumns().getTableColumnList().size() > i) xssfTable.getCTTable().getTableColumns().getTableColumnList().get(i).setName(columnHeader); 
  }
  System.out.println(xssfTable.getCTTable()); // headers updated
  return xssfTable;
 }
    
 public static void main(String[] args) throws Exception {

  try (
   SXSSFWorkbook sxssfWorkbook = new SXSSFWorkbook(); FileOutputStream fileout = new FileOutputStream("./Excel.xlsx");
   CSVReader csvReader = new CSVReader(new FileReader("./cities.csv"));
   //CSVReader csvReader = new CSVReader(new FileReader("./annual-enterprise-survey-2021-financial-year-provisional-csv.csv"));
   //CSVReader csvReader = new CSVReader(new FileReader("./overseas-trade-indexes-September-2022-quarter-provisional-csv.csv"));
   ) {

   sxssfWorkbook.setCompressTempFiles(true);
   SXSSFSheet sxssfSheet = sxssfWorkbook.createSheet("Sheet");
   sxssfSheet.setRandomAccessWindowSize(100);
   
   String[] strHeaders = null;
   String[] dataRow = null;
   int rowNum = 0;
   while ((dataRow = csvReader.readNext()) != null) {
    if (rowNum == 0) strHeaders = dataRow;
    Row currentRow = sxssfSheet.createRow(rowNum);
    for (int i = 0; i < dataRow.length; i++) {
     String cellValue = dataRow[i];
     currentRow.createCell(i).setCellValue(cellValue);
    }
    rowNum++;
    if (rowNum % 1000 == 0) {
     System.out.println(rowNum);
    }
   }

   int lastRow = rowNum -1;
   int lastCol = strHeaders.length -1;
   AreaReference areaReference = new AreaReference(new CellReference(0, 0), new CellReference(lastRow, lastCol), SpreadsheetVersion.EXCEL2007);
   System.out.println(areaReference);
   XSSFTable xssfTable = createTable(sxssfSheet, areaReference, strHeaders);
   //this styles the table as Excel would do per default
   xssfTable.getCTTable().addNewTableStyleInfo();
   XSSFTableStyleInfo style = (XSSFTableStyleInfo)xssfTable.getStyle();
   style.setName("TableStyleLight13");
   style.setShowColumnStripes(false);
   style.setShowRowStripes(true);
   xssfTable.getCTTable().addNewAutoFilter().setRef(areaReference.formatAsString());
   
   sxssfWorkbook.write(fileout);
   sxssfWorkbook.dispose(); 
  }
 }
}

此代碼已使用當前 Apache POI 版本apache poi 5.2.3進行測試和工作。

cities.csv的內容在這里作為文本:

"LatD", "LatM", "LatS", "NS", "LonD", "LonM", "LonS", "EW", "City", "State"
   41,    5,   59, "N",     80,   39,    0, "W", "Youngstown", OH
   42,   52,   48, "N",     97,   23,   23, "W", "Yankton", SD
   46,   35,   59, "N",    120,   30,   36, "W", "Yakima", WA
   42,   16,   12, "N",     71,   48,    0, "W", "Worcester", MA
   43,   37,   48, "N",     89,   46,   11, "W", "Wisconsin Dells", WI
   36,    5,   59, "N",     80,   15,    0, "W", "Winston-Salem", NC
   49,   52,   48, "N",     97,    9,    0, "W", "Winnipeg", MB
   39,   11,   23, "N",     78,    9,   36, "W", "Winchester", VA
   34,   14,   24, "N",     77,   55,   11, "W", "Wilmington", NC
   39,   45,    0, "N",     75,   33,    0, "W", "Wilmington", DE
   48,    9,    0, "N",    103,   37,   12, "W", "Williston", ND
   41,   15,    0, "N",     77,    0,    0, "W", "Williamsport", PA
   37,   40,   48, "N",     82,   16,   47, "W", "Williamson", WV
   33,   54,    0, "N",     98,   29,   23, "W", "Wichita Falls", TX
   37,   41,   23, "N",     97,   20,   23, "W", "Wichita", KS
   40,    4,   11, "N",     80,   43,   12, "W", "Wheeling", WV
   26,   43,   11, "N",     80,    3,    0, "W", "West Palm Beach", FL
   47,   25,   11, "N",    120,   19,   11, "W", "Wenatchee", WA
   41,   25,   11, "N",    122,   23,   23, "W", "Weed", CA
   31,   13,   11, "N",     82,   20,   59, "W", "Waycross", GA
   44,   57,   35, "N",     89,   38,   23, "W", "Wausau", WI
   42,   21,   36, "N",     87,   49,   48, "W", "Waukegan", IL
   44,   54,    0, "N",     97,    6,   36, "W", "Watertown", SD
   43,   58,   47, "N",     75,   55,   11, "W", "Watertown", NY
   42,   30,    0, "N",     92,   20,   23, "W", "Waterloo", IA
   41,   32,   59, "N",     73,    3,    0, "W", "Waterbury", CT
   38,   53,   23, "N",     77,    1,   47, "W", "Washington", DC
   41,   50,   59, "N",     79,    8,   23, "W", "Warren", PA
   46,    4,   11, "N",    118,   19,   48, "W", "Walla Walla", WA
   31,   32,   59, "N",     97,    8,   23, "W", "Waco", TX
   38,   40,   48, "N",     87,   31,   47, "W", "Vincennes", IN
   28,   48,   35, "N",     97,    0,   36, "W", "Victoria", TX
   32,   20,   59, "N",     90,   52,   47, "W", "Vicksburg", MS
   49,   16,   12, "N",    123,    7,   12, "W", "Vancouver", BC
   46,   55,   11, "N",     98,    0,   36, "W", "Valley City", ND
   30,   49,   47, "N",     83,   16,   47, "W", "Valdosta", GA
   43,    6,   36, "N",     75,   13,   48, "W", "Utica", NY
   39,   54,    0, "N",     79,   43,   48, "W", "Uniontown", PA
   32,   20,   59, "N",     95,   18,    0, "W", "Tyler", TX
   42,   33,   36, "N",    114,   28,   12, "W", "Twin Falls", ID
   33,   12,   35, "N",     87,   34,   11, "W", "Tuscaloosa", AL
   34,   15,   35, "N",     88,   42,   35, "W", "Tupelo", MS
   36,    9,   35, "N",     95,   54,   36, "W", "Tulsa", OK
   32,   13,   12, "N",    110,   58,   12, "W", "Tucson", AZ
   37,   10,   11, "N",    104,   30,   36, "W", "Trinidad", CO
   40,   13,   47, "N",     74,   46,   11, "W", "Trenton", NJ
   44,   45,   35, "N",     85,   37,   47, "W", "Traverse City", MI
   43,   39,    0, "N",     79,   22,   47, "W", "Toronto", ON
   39,    2,   59, "N",     95,   40,   11, "W", "Topeka", KS
   41,   39,    0, "N",     83,   32,   24, "W", "Toledo", OH
   33,   25,   48, "N",     94,    3,    0, "W", "Texarkana", TX
   39,   28,   12, "N",     87,   24,   36, "W", "Terre Haute", IN
   27,   57,    0, "N",     82,   26,   59, "W", "Tampa", FL
   30,   27,    0, "N",     84,   16,   47, "W", "Tallahassee", FL
   47,   14,   24, "N",    122,   25,   48, "W", "Tacoma", WA
   43,    2,   59, "N",     76,    9,    0, "W", "Syracuse", NY
   32,   35,   59, "N",     82,   20,   23, "W", "Swainsboro", GA
   33,   55,   11, "N",     80,   20,   59, "W", "Sumter", SC
   40,   59,   24, "N",     75,   11,   24, "W", "Stroudsburg", PA
   37,   57,   35, "N",    121,   17,   24, "W", "Stockton", CA
   44,   31,   12, "N",     89,   34,   11, "W", "Stevens Point", WI
   40,   21,   36, "N",     80,   37,   12, "W", "Steubenville", OH
   40,   37,   11, "N",    103,   13,   12, "W", "Sterling", CO
   38,    9,    0, "N",     79,    4,   11, "W", "Staunton", VA
   39,   55,   11, "N",     83,   48,   35, "W", "Springfield", OH
   37,   13,   12, "N",     93,   17,   24, "W", "Springfield", MO
   42,    5,   59, "N",     72,   35,   23, "W", "Springfield", MA
   39,   47,   59, "N",     89,   39,    0, "W", "Springfield", IL
   47,   40,   11, "N",    117,   24,   36, "W", "Spokane", WA
   41,   40,   48, "N",     86,   15,    0, "W", "South Bend", IN
   43,   32,   24, "N",     96,   43,   48, "W", "Sioux Falls", SD
   42,   29,   24, "N",     96,   23,   23, "W", "Sioux City", IA
   32,   30,   35, "N",     93,   45,    0, "W", "Shreveport", LA
   33,   38,   23, "N",     96,   36,   36, "W", "Sherman", TX
   44,   47,   59, "N",    106,   57,   35, "W", "Sheridan", WY
   35,   13,   47, "N",     96,   40,   48, "W", "Seminole", OK
   32,   25,   11, "N",     87,    1,   11, "W", "Selma", AL
   38,   42,   35, "N",     93,   13,   48, "W", "Sedalia", MO
   47,   35,   59, "N",    122,   19,   48, "W", "Seattle", WA
   41,   24,   35, "N",     75,   40,   11, "W", "Scranton", PA
   41,   52,   11, "N",    103,   39,   36, "W", "Scottsbluff", NB
   42,   49,   11, "N",     73,   56,   59, "W", "Schenectady", NY
   32,    4,   48, "N",     81,    5,   23, "W", "Savannah", GA
   46,   29,   24, "N",     84,   20,   59, "W", "Sault Sainte Marie", MI
   27,   20,   24, "N",     82,   31,   47, "W", "Sarasota", FL
   38,   26,   23, "N",    122,   43,   12, "W", "Santa Rosa", CA
   35,   40,   48, "N",    105,   56,   59, "W", "Santa Fe", NM
   34,   25,   11, "N",    119,   41,   59, "W", "Santa Barbara", CA
   33,   45,   35, "N",    117,   52,   12, "W", "Santa Ana", CA
   37,   20,   24, "N",    121,   52,   47, "W", "San Jose", CA
   37,   46,   47, "N",    122,   25,   11, "W", "San Francisco", CA
   41,   27,    0, "N",     82,   42,   35, "W", "Sandusky", OH
   32,   42,   35, "N",    117,    9,    0, "W", "San Diego", CA
   34,    6,   36, "N",    117,   18,   35, "W", "San Bernardino", CA
   29,   25,   12, "N",     98,   30,    0, "W", "San Antonio", TX
   31,   27,   35, "N",    100,   26,   24, "W", "San Angelo", TX
   40,   45,   35, "N",    111,   52,   47, "W", "Salt Lake City", UT
   38,   22,   11, "N",     75,   35,   59, "W", "Salisbury", MD
   36,   40,   11, "N",    121,   39,    0, "W", "Salinas", CA
   38,   50,   24, "N",     97,   36,   36, "W", "Salina", KS
   38,   31,   47, "N",    106,    0,    0, "W", "Salida", CO
   44,   56,   23, "N",    123,    1,   47, "W", "Salem", OR
   44,   57,    0, "N",     93,    5,   59, "W", "Saint Paul", MN
   38,   37,   11, "N",     90,   11,   24, "W", "Saint Louis", MO
   39,   46,   12, "N",     94,   50,   23, "W", "Saint Joseph", MO
   42,    5,   59, "N",     86,   28,   48, "W", "Saint Joseph", MI
   44,   25,   11, "N",     72,    1,   11, "W", "Saint Johnsbury", VT
   45,   34,   11, "N",     94,   10,   11, "W", "Saint Cloud", MN
   29,   53,   23, "N",     81,   19,   11, "W", "Saint Augustine", FL
   43,   25,   48, "N",     83,   56,   24, "W", "Saginaw", MI
   38,   35,   24, "N",    121,   29,   23, "W", "Sacramento", CA
   43,   36,   36, "N",     72,   58,   12, "W", "Rutland", VT
   33,   24,    0, "N",    104,   31,   47, "W", "Roswell", NM
   35,   56,   23, "N",     77,   48,    0, "W", "Rocky Mount", NC
   41,   35,   24, "N",    109,   13,   48, "W", "Rock Springs", WY
   42,   16,   12, "N",     89,    5,   59, "W", "Rockford", IL
   43,    9,   35, "N",     77,   36,   36, "W", "Rochester", NY
   44,    1,   12, "N",     92,   27,   35, "W", "Rochester", MN
   37,   16,   12, "N",     79,   56,   24, "W", "Roanoke", VA
   37,   32,   24, "N",     77,   26,   59, "W", "Richmond", VA
   39,   49,   48, "N",     84,   53,   23, "W", "Richmond", IN
   38,   46,   12, "N",    112,    5,   23, "W", "Richfield", UT
   45,   38,   23, "N",     89,   25,   11, "W", "Rhinelander", WI
   39,   31,   12, "N",    119,   48,   35, "W", "Reno", NV
   50,   25,   11, "N",    104,   39,    0, "W", "Regina", SA
   40,   10,   48, "N",    122,   14,   23, "W", "Red Bluff", CA
   40,   19,   48, "N",     75,   55,   48, "W", "Reading", PA
   41,    9,   35, "N",     81,   14,   23, "W", "Ravenna", OH 

在文本編輯器中復制/粘貼。 然后另存為cities.csv

您可以從此處下載其他 CSV 文件進行測試: https://www.stats.govt.nz/large-datasets/csv-files-for-download/

另一個問題是使用Cell.setCellValue總是使用字符串值,而 Excel 在字符串和數字單元格值之間存在差異。 但這是使用 CSV 的一個眾所周知的問題。需要一個額外的定義文件來顯示哪個 CSV 列是哪種數據類型。

您可以看看這些方法以供參考:

sheet.setAutoFilter()

row.setHeightInPoints()

對於顏色,您可能必須使用單元格樣式。

CellStyle.setFillBackgroundColor()

所有這些都記錄在案,也許 go 以此作為開始。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM