Skip blank lines while reading .csv file using opencsv (java)

Good day everyone. My goal is to make the CSV reader skip blank lines while parsing a file, so that I only get the rows with at least one value. At the moment I have two methods: the first reads all rows as a List of String arrays and returns it; the second converts that result into a List of Lists of Strings. Both are below:

private List<String[]> readCSVFile(File filename) throws IOException {

    CSVReader reader = new CSVReader(new FileReader(filename));
    List<String[]> allRows = reader.readAll();

    return allRows;

}

public List<List<String>> readFile(File filename) throws IOException {

        List<String[]> allRows = readCSVFile(filename);     
        List<List<String>> allRowsAsLists = new ArrayList<List<String>>();      
        for (String[] rowItemsArray :  allRows) {
            List<String> rowItems = new ArrayList<String>();
            rowItems.addAll(Arrays.asList(rowItemsArray));
            allRowsAsLists.add(rowItems);

        }
    return allRowsAsLists;

}

My first thought was to check (in the second method) the length of each array and skip the row if it is 0, which would be something like this:

for (String[] rowItemsArray :  allRows) {
            if (rowItemsArray.length == 0) continue; // added check
            List<String> rowItems = new ArrayList<String>();
            rowItems.addAll(Arrays.asList(rowItemsArray));
            allRowsAsLists.add(rowItems);

}  

Unfortunately that didn't work, because even a blank row still comes back as an array of elements (empty Strings, in fact). Checking individual Strings by position is not an option, as there are 100+ columns and the number varies. Please suggest the best way to achieve this. Thanks.

Sorted it out this way:

    public List<List<String>> readFile(File filename) throws IOException {

            List<String[]> allRows = readCSVFile(filename, includeHeaders, trimWhitespacesInFieldValues);       
            List<List<String>> allRowsAsLists = new ArrayList<List<String>>();      
            for (String[] rowItemsArray :  allRows) {
                if (allValuesInRowAreEmpty(rowItemsArray)) continue; // skip blank rows
                List<String> rowItems = new ArrayList<String>();
                rowItems.addAll(Arrays.asList(rowItemsArray));
                allRowsAsLists.add(rowItems);

            }
            return allRowsAsLists;

        }

    private boolean allValuesInRowAreEmpty(String[] row) {
        for (String s : row) {
            // Stop at the first non-empty value; null fields count as empty.
            if (s != null && s.length() != 0) {
                return false;
            }
        }
        return true;
    }

You could check the length and the first element. If the line contains only a field separator, then the length is > 1. If the line contains a single space character, then the first element is not empty.

if (rowItemsArray.length == 1 && rowItemsArray[0].isEmpty()) {
    continue;
}
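To make the difference between this check and an "every field is empty" check concrete, here is a minimal sketch in plain Java. The arrays are hand-built stand-ins for what the parser is described as returning in the cases above (that mapping is an assumption, not verified against a particular opencsv version), and the class and method names are hypothetical.

```java
import java.util.Arrays;

public class BlankLineCheck {

    // Strict check from the answer above: exactly one field, and it is empty.
    public static boolean isBlankLine(String[] row) {
        return row.length == 1 && row[0].isEmpty();
    }

    // Looser check: every field is empty, however many fields there are.
    public static boolean allFieldsEmpty(String[] row) {
        return Arrays.stream(row).allMatch(String::isEmpty);
    }

    public static void main(String[] args) {
        String[] blank = {""};                  // assumed result for an empty line
        String[] separatorsOnly = {"", "", ""}; // assumed result for a line like ",,"
        String[] singleSpace = {" "};           // assumed result for a line holding one space

        for (String[] row : Arrays.asList(blank, separatorsOnly, singleSpace)) {
            System.out.println(Arrays.toString(row)
                    + " strict=" + isBlankLine(row)
                    + " allEmpty=" + allFieldsEmpty(row));
        }
    }
}
```

The strict check only drops truly empty lines, while the looser one also drops lines made up of separators alone. Neither treats a whitespace-only field as empty; that would need a trim first.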

For opencsv 5.0 there is an API option to read CSV lines directly into a Bean.

For people who prefer using the "CsvToBean" feature, the following solution uses the (sadly deprecated) #withFilter(..) method on CsvToBeanBuilder to skip blank lines in the InputStream:

InputStream inputStream; // provided
List<MyBean> data = new CsvToBeanBuilder<MyBean>(new BufferedReader(new InputStreamReader(inputStream)))
    .withType(MyBean.class)
    .withFilter(new CsvToBeanFilter() {
        /*
         * This filter ignores empty lines from the input
         */
        @Override
        public boolean allowLine(String[] strings) {
            for (String one : strings) {
                if (one != null && one.length() > 0) {
                    return true;
                }
            }
            return false;
        }
    }).build().parse();

Update: With opencsv release 5.1 (dated 2/2/2020), CsvToBeanFilter was undeprecated as per feature request #120.

You can use a filter with a lambda, like below:

CsvToBean<T> csvToBean = new CsvToBeanBuilder<T>(new StringReader(CSV_HEADER + "\n" + lines))
    .withType(clazz)
    .withFieldAsNull(CSVReaderNullFieldIndicator.EMPTY_SEPARATORS)
    .withSeparator(delimiter)
    .withSkipLines(skipLines)
    .withIgnoreLeadingWhiteSpace(true).withFilter(strings -> {
      for (String r : strings) {
        if (r != null && r.length() > 0) {
          return true;
        }
      }
      return false;
    }).build();

Your lambda filter:

.withFilter(strings -> {
      for (String r : strings) {
        if (r != null && r.length() > 0) {
          return true;
        }
      }
      return false;
    })

You could concatenate all string values of a row after trimming them. If the resulting string is empty, there are no values in any cell. In that case ignore the line.
Something like this:

private boolean onlyEmptyCells(ArrayList<String> check) {
    StringBuilder sb = new StringBuilder();
    for (String s : check) {
        sb.append(s.trim());
    }
    return sb.toString().isEmpty(); //<- ignore 'check' if this returns true
}
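The same idea can be applied to the List<List<String>> result from the question without building an intermediate string. The following is only a sketch: RowCleaner, hasValue and dropEmptyRows are hypothetical names, and treating null fields as empty is an assumption added here rather than part of the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RowCleaner {

    // True when at least one field is non-null and non-empty after trimming.
    public static boolean hasValue(List<String> row) {
        for (String field : row) {
            if (field != null && !field.trim().isEmpty()) {
                return true; // stop at the first real value
            }
        }
        return false;
    }

    // Returns a new list holding only the rows that contain a value.
    public static List<List<String>> dropEmptyRows(List<List<String>> rows) {
        return rows.stream()
                .filter(RowCleaner::hasValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("a", "", "c"));  // kept: has values
        rows.add(Arrays.asList("", "  ", ""));  // dropped: only blanks
        rows.add(Arrays.asList("", null, "x")); // kept: one value, null tolerated
        System.out.println(dropEmptyRows(rows).size() + " rows kept");
    }
}
```

Returning on the first non-blank field avoids touching the remaining columns, which may matter with 100+ columns per row.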

Here is an updated solution with lambdas, based on @Martin's solution:

InputStream inputStream; // provided
List<MyBean> data = new CsvToBeanBuilder<MyBean>(new BufferedReader(new InputStreamReader(inputStream)))
    .withType(MyBean.class)
    // This filter ignores empty lines from the input
    .withFilter(stringValues -> Arrays.stream(stringValues)
        .anyMatch(value -> value != null && value.length() > 0))
    .build()
    .parse();

If you do not parse into a Bean, you can use the Java Streams API to help filter out invalid CSV rows. My approach is like this (where the variable is is a java.io.InputStream instance with CSV data and YourBean map(String[] row) is your mapping method that maps a CSV row to your Java object):

CSVParser csvp = new CSVParserBuilder()
    .withSeparator(';')
    .withFieldAsNull(CSVReaderNullFieldIndicator.BOTH)
    .build();
CSVReader csvr = new CSVReaderBuilder(new InputStreamReader(is))
    .withCSVParser(csvp)
    .build();
List<YourBean> result = StreamSupport.stream(csvr.spliterator(), false)
    .filter(Objects::nonNull)
    .filter(row -> row.length > 0)
    .map(row -> map(row))
    .collect(Collectors.toList());
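If blank lines still reach the mapping step with a pipeline like this, a stricter row filter may help. The sketch below runs on hand-built arrays rather than a real CSVReader; the assumption is that empty fields arrive as null or as empty strings, and NonBlankRows, HAS_VALUE and keepNonBlank are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NonBlankRows {

    // Keeps a row only if some field is neither null nor empty.
    public static final Predicate<String[]> HAS_VALUE =
            row -> row != null
                && Arrays.stream(row).anyMatch(f -> f != null && !f.isEmpty());

    public static List<String[]> keepNonBlank(Stream<String[]> rows) {
        return rows.filter(HAS_VALUE).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-ins for parsed rows; the null-only rows model blank lines.
        Stream<String[]> rows = Stream.of(
                new String[] {"1", "Alice"},
                new String[] {null},
                new String[] {null, null},
                new String[] {"2", null});
        System.out.println(keepNonBlank(rows).size() + " rows kept");
    }
}
```

In the pipeline above, .filter(NonBlankRows.HAS_VALUE) could stand in for the two existing filter calls, since it also covers the null-row case.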

The JavaDoc for CsvToBeanFilter states "Here's an example showing how to use CsvToBean that removes empty lines. Since the parser returns an array with a single empty string for a blank line that is what it is checking." and lists an example of how to do this:

private class EmptyLineFilter implements CsvToBeanFilter {

    private final MappingStrategy strategy;

    public EmptyLineFilter(MappingStrategy strategy) {
        this.strategy = strategy;
    }

    public boolean allowLine(String[] line) {
        boolean blankLine = line.length == 1 && line[0].isEmpty();
        return !blankLine;
    }

 }

 public List<Feature> parseCsv(InputStreamReader streamReader) {
    HeaderColumnNameTranslateMappingStrategy<Feature> strategy = new HeaderColumnNameTranslateMappingStrategy();
    Map<String, String> columnMap = new HashMap();
    columnMap.put("FEATURE_NAME", "name");
    columnMap.put("STATE", "state");
    strategy.setColumnMapping(columnMap);
    strategy.setType(Feature.class);
    CSVReader reader = new CSVReader(streamReader);
    CsvToBeanFilter filter = new EmptyLineFilter(strategy);
    return new CsvToBean().parse(strategy, reader, filter);
 }
