
OpenCSV not escaping the quotes (")

I have a CSV file that may contain delimiters or unclosed quotes inside a quoted field. How do I make CSVReader ignore the quotes and delimiters inside quotes? For example:

123|Bhajji|Maga|39|"I said Hey|" I am "5|'10."|"I a do "you"|get that"

This is the content of the file.

The program below reads the CSV file:

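import com.opencsv.CSVReader;
import org.junit.Test; // assuming JUnit 4; the question does not say which test framework is used

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

@Test
public void readFromCsv() throws IOException {
    // try-with-resources closes the reader and the wrapped streams, even if readNext() throws
    try (CSVReader reader = new CSVReader(
            new InputStreamReader(
                    new FileInputStream("/home/netspurt/awesomefile.csv"),
                    StandardCharsets.UTF_8),
            '|', '"')) { // '|' is the separator, '"' is the quote character
        for (String[] row; (row = reader.readNext()) != null;) {
            System.out.println(Arrays.toString(row));
        }
    }
}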

I get output like this:

[123, Bhajji, Maga, 39, I said Hey| I am "5|'10., I am an idiot do "you|get that]

What happened to the quote after "you"?

Edit: The opencsv dependency is com.opencsv:opencsv:3.4.

From the source code of com.opencsv:opencsv:

/**
 * Constructs CSVReader.
 *
 * @param reader    the reader to an underlying CSV source.
 * @param separator the delimiter to use for separating entries
 * @param quotechar the character to use for quoted elements
 * @param escape    the character to use for escaping a separator or quote
 */
public CSVReader(Reader reader, char separator,
                 char quotechar, char escape) {
    this(reader, separator, quotechar, escape, DEFAULT_SKIP_LINES, CSVParser.DEFAULT_STRICT_QUOTES);
}

See http://sourceforge.net/p/opencsv/source/ci/master/tree/src/main/java/com/opencsv/CSVReader.java

There is a constructor with an additional escape parameter that lets you escape separators and quotes (as per the Javadoc).

The CSV format specifies that a quote (") inside a field must be preceded by another quote ("). Doubling the embedded quotes solved my problem:

123|Bhajji|Maga|39|"I said Hey|"" I am ""5|'10."|"I a do ""you""|get that"

Reference: https://www.ietf.org/rfc/rfc4180.txt
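If the file is generated by your own code, the doubling rule can be applied mechanically when writing each row. The following is a minimal sketch using only the JDK; the class and method names (Rfc4180Writer, quoteField, joinRow) are made up for illustration and are not part of opencsv.

```java
final class Rfc4180Writer {

    // Wraps a field in quotes only when it needs them, doubling any embedded quote.
    static String quoteField(String field, char separator) {
        boolean needsQuoting = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        // RFC 4180 style: " inside a quoted field is written as ""
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins already-raw field values into one delimited line.
    static String joinRow(String[] fields, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(quoteField(fields[i], separator));
        }
        return sb.toString();
    }
}
```

Calling joinRow with the six raw field values and '|' as the separator should reproduce the corrected line shown above.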

Sorry, but I don't have enough rep to add a comment, so I will have to add an answer.

As for your original question about what happened to the quote after "you": the answer is the same as what happened to the quote before the "I".

In CSV data, the quote immediately after a separator and the quote immediately before the next separator mark the start and end of the field data and are therefore removed. That is why those two quotes are missing.
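To make the point about enclosing quotes concrete, here is a small hand-written parser, assuming well-formed input in which embedded quotes are doubled. It is a sketch for illustration only: DoubledQuoteParser is a hypothetical name, and this is not how opencsv is implemented, nor does it reproduce opencsv's handling of malformed lines.

```java
import java.util.ArrayList;
import java.util.List;

final class DoubledQuoteParser {

    static List<String> parseLine(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                boolean doubled = inQuotes
                        && i + 1 < line.length()
                        && line.charAt(i + 1) == quote;
                if (doubled) {
                    current.append(quote); // "" inside a quoted field is one literal quote
                    i++;                   // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes;  // enclosing quote: changes state, is not data
                }
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString()); // separator outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);         // separators inside quotes are kept as data
            }
        }
        fields.add(current.toString());    // last field has no trailing separator
        return fields;
    }
}
```

The enclosing quotes only toggle the inQuotes flag and are never appended, which is why they do not appear in the parsed field.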

You need to escape the quotes that are part of the field. The default escape character is the backslash (\).

Taking a guess as to which quotes you want to escape, the string should look like this:

123|Bhajji|Maga|39|"I said \"Hey I am \"5'10. Do \"you\" get that?\""
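The idea of an escape character can be sketched the same way. The parser below is a deliberately simplified illustration, assuming that the escape character makes whatever character follows it literal; BackslashEscapeParser is a hypothetical name, and opencsv's real parser has more rules than this.

```java
import java.util.ArrayList;
import java.util.List;

final class BackslashEscapeParser {

    static List<String> parseLine(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                current.append(line.charAt(++i)); // simplification: next char is taken literally
            } else if (c == quote) {
                inQuotes = !inQuotes;             // unescaped quote opens or closes the field
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

With '|' as the separator, '"' as the quote and '\\' as the escape, the escaped line above yields five fields, and the quotes preceded by a backslash survive in the last one.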
