
Ignore Specific Columns Parsing a CSV File with Jackson CSV


My problem is that I need to parse CSV files with arbitrary columns and column order into a known domain POJO (say, Person). I can identify which columns I need to process and want to ignore the rest.

The option CsvParser.Feature.IGNORE_TRAILING_UNMAPPABLE seemed to be exactly what I need, but the columns that I need to process are not necessarily grouped at the start of the CSV file, and I cannot force users to reorder the columns of their uploaded CSV files. Also, sometimes I do not get a header row, but the UI forces the user to identify the columns and passes this information over.

For example, I have the following CSV file:

First Name,Last Name,Nickname,DOB,Occupation,Postal Code
Freddy,Benson,Ruprecht,08/14/45,Con Artist,76701
Lawrence,Jamieson,Prince,03/14/33,Con Artist,5201
Janet,Colgate,Jackal,03/13/55,Con Artist,90401

I only need 4 of the 6 columns (First Name, Last Name, DOB, Postal Code), as my Person POJO only includes those fields:

public class Person {
    private String firstName;
    private String lastName;
    private LocalDate dob;
    private String postalCode;
}

I have defined a CsvSchema typed for Person and specified the columns I'm interested in, in order (First Name, Last Name, IGNORE, DOB, IGNORE2, Postal Code), as I would like to skip two columns (Nickname, Occupation). However, the "IGNORE" columns get dropped during mapping in the deserializer, and I end up getting "Nickname" values for "DOB", resulting in invalid values for the DOB field.

My mistake was defining the schema as follows, which apparently strongly couples the schema to the domain POJO:

CsvSchema schema = mapper
    .typedSchemaFor(Person.class)
    .withSkipFirstDataRow(hasHeader)
    .sortedBy(columnOrder.toArray(new String[columnOrder.size()]));
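One way to picture the failure, assuming that without a header the cells are matched to schema columns purely by position and that the typed schema ends up holding only the four Person properties: the third cell of each row (the nickname) lands in the third schema slot (dob). The plain-Java sketch below simulates only that positional pairing. PositionalMappingDemo and mapByPosition are hypothetical names, and this is not Jackson's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PositionalMappingDemo {

    // Pairs the i-th schema column with the i-th cell of the row;
    // cells beyond the last schema column are dropped.
    public static Map<String, String> mapByPosition(List<String> schemaColumns, String[] cells) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < schemaColumns.size() && i < cells.length; i++) {
            result.put(schemaColumns.get(i), cells[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] row = "Freddy,Benson,Ruprecht,08/14/45,Con Artist,76701".split(",");

        // Four-column schema: no placeholders, so later values shift left.
        List<String> typed = List.of("firstName", "lastName", "dob", "postalCode");
        System.out.println(mapByPosition(typed, row));
        // {firstName=Freddy, lastName=Benson, dob=Ruprecht, postalCode=08/14/45}

        // Six-column schema: placeholders occupy the skipped positions.
        List<String> explicit = List.of(
                "firstName", "lastName", "ignore1", "dob", "ignore2", "postalCode");
        System.out.println(mapByPosition(explicit, row));
    }
}
```

With the six-column list, dob pairs with 08/14/45 and postalCode with 76701, which is the behavior I was after.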

I resolved it by defining the schema columns explicitly as follows, which seems to couple the schema only loosely to the domain POJO:

CsvSchema schema = CsvSchema.builder()
    .addColumn("firstName")
    .addColumn("lastName")
    .addColumn("ignore1")
    .addColumn("dob")
    .addColumn("ignore2")
    .addColumn("postalCode")
    .build();

CsvMapper mapper = new CsvMapper();
MappingIterator<Person> personIter = mapper
    .readerFor(Person.class)
    .with(schema)
    .readValues(csvFile);
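Because the UI passes along which uploaded column corresponds to which field, the column list for the builder does not have to be hard-coded. A minimal sketch, assuming the UI supplies a map from zero-based column index to Person property name (the SchemaColumns class, the columnNames method, and the ignoreN placeholder naming are all hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SchemaColumns {

    // Returns one name per CSV column: the mapped property name where the user
    // picked one, otherwise a unique placeholder that matches no Person property.
    public static List<String> columnNames(int columnCount, Map<Integer, String> selected) {
        List<String> names = new ArrayList<>();
        int ignored = 0;
        for (int i = 0; i < columnCount; i++) {
            String property = selected.get(i);
            names.add(property != null ? property : "ignore" + (++ignored));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical selection coming from the UI for the sample file.
        Map<Integer, String> selected = Map.of(
                0, "firstName", 1, "lastName", 3, "dob", 5, "postalCode");
        System.out.println(columnNames(6, selected));
        // [firstName, lastName, ignore1, dob, ignore2, postalCode]
    }
}
```

Each resulting name can then be passed to CsvSchema.builder().addColumn(name) in a loop before calling build(). Depending on the mapper configuration, the placeholder names may still be reported as unknown properties unless unknown properties are ignored.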

Please refer to the following issue; it should help you solve this: https://github.com/FasterXML/jackson-dataformat-csv/issues/82

Ignoring unknown columns can be achieved as shown below (tested using Jackson 2.13):

  1. Annotate the POJO
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import java.time.LocalDate;

@JsonIgnoreProperties(ignoreUnknown = true) //this is the trick to ignore all undeclared columns
public class Person {
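    @JsonProperty("First Name") //declare what's needed: one annotation per CSV header to map
    private String firstName;
    @JsonProperty("Last Name")
    private String lastName;
    @JsonProperty("DOB") //reading MM/dd/yy values into LocalDate may also need the Java time module and a date pattern
    private LocalDate dob;
    @JsonProperty("Postal Code")
    private String postalCode;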
}
  2. Configure the CsvSchema
CsvMapper csvMapper = new CsvMapper();
CsvSchema csvSchema = csvMapper
    .schemaFor(Person.class)
    .withHeader() //the header row defines the column order
    .withColumnReordering(true); //allows columns in any order, as long as there is a header
  3. Finally, use it to read the CSV file
MappingIterator<Person> people = csvMapper.readerFor(Person.class).with(csvSchema).readValues(csvFile);
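To show what the header-driven approach amounts to, here is a plain-Java sketch with no Jackson involved: it looks up each wanted column by header name and ignores everything else, so the column order stops mattering. HeaderSelect and select are hypothetical names, and the naive split(",") assumes no quoted fields or embedded commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HeaderSelect {

    // Treats the first line as the header and keeps only the wanted columns
    // from every following line, regardless of where they appear.
    public static List<Map<String, String>> select(List<String> lines, List<String> wanted) {
        List<String> header = Arrays.asList(lines.get(0).split(","));
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split(",");
            Map<String, String> row = new LinkedHashMap<>();
            for (String name : wanted) {
                int idx = header.indexOf(name);
                if (idx >= 0 && idx < cells.length) {
                    row.put(name, cells[idx]);
                }
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Same sample record as in the question, with the columns shuffled.
        List<String> lines = List.of(
                "Nickname,Postal Code,First Name,DOB,Occupation,Last Name",
                "Ruprecht,76701,Freddy,08/14/45,Con Artist,Benson");
        List<String> wanted = List.of("First Name", "Last Name", "DOB", "Postal Code");
        System.out.println(select(lines, wanted));
        // [{First Name=Freddy, Last Name=Benson, DOB=08/14/45, Postal Code=76701}]
    }
}
```

Note that this only works when a header row is present; for uploads without a header, the column positions identified in the UI are still needed.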
