
Apache Commons CSV | How can I ignore/include a semicolon or comma in a field?

I am trying to parse a log file and store it in a CSV file. Here is a sample line:

218.1.111.50 - - [13/Mar/2005:10:36:11 -0500] "GET http://www.yahoo.com/ HTTP/1.1" 403 2898 "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"

For this, I am using the Apache Commons CSV library. The problem is that some fields contain the special character ; in their value, and it gets interpreted as a separator.

Take, for example, the field value Mozilla/4.0 (compatible; MSIE 4.01; Windows 95). This single field is split into 3 different values because of the ; characters.


I don't know the ideal way to get around this. Below is a snapshot of the code that uses the library:

CSVPrinter printer = new CSVPrinter(writer, CSVFormat.DEFAULT.withHeader(HEADERS));
//
//
Matcher m = p.matcher(line);
Date date = formatter.parse(m.group("Time"));

try {
    printer.printRecord(date.getMonth(), date.getDate(), date.getHours(), date.getMinutes(), date.getSeconds(),
            m.group("NetworkSrcIpv4"), m.group("ApplicationHttpStatus"), m.group("ApplicationLen"),
            m.group("ApplicationHttpUserAgent"), m.group("ApplicationHttpQueryString"));

    printer.flush();
} catch (IOException e) {
    e.printStackTrace();
}
//

Is there any way to automatically ignore the ; characters, or perhaps to replace them with a value that won't affect the desired result? Are there any options I could add to my CSVPrinter?

Thank you for your feedback.

You can use TAB as the delimiter instead of the comma that the DEFAULT format uses:

CSVPrinter printer = new CSVPrinter(writer, CSVFormat.TDF.withHeader(HEADERS));

https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVFormat.html#TDF
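
To see why a tab delimiter helps here, the idea can be sketched in plain Java without the library. This is only an illustration, not what CSVFormat.TDF does internally: the class name TabDelimitedSketch and the helper toTabDelimitedLine are hypothetical, and the sketch assumes that no field value contains a tab or a line break (which holds for the sample log line above).

```java
import java.util.Arrays;
import java.util.List;

public class TabDelimitedSketch {

    // Hypothetical helper: joins already-extracted field values with a TAB.
    // Assumption: no field contains a tab or a line break.
    public static String toTabDelimitedLine(List<String> fields) {
        return String.join("\t", fields);
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList(
                "218.1.111.50",
                "403",
                "2898",
                "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)");

        String line = toTabDelimitedLine(fields);

        // Splitting on TAB gives back the same four fields; the ';'
        // characters stay inside the user-agent value.
        String[] columns = line.split("\t", -1);
        System.out.println(columns.length);
        System.out.println(columns[3]);
    }
}
```

With a real CSVPrinter you would still pass CSVFormat.TDF as shown above. As far as I know, that format also keeps the library's quoting for values that do contain a tab or a quote character, which this sketch does not handle.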
