How can I use the openCSV CSVReader without a DEFAULT_QUOTE_CHARACTER?

Question

I am using CSVReader to read from a tab-delimited text file which has a field called "user_comments". This column contains all kinds of free-form text that users have entered.

Here is the code where I declare my parser:

import au.com.bytecode.opencsv.CSVReader;

CSVReader csv = new CSVReader(new FileReader(opt.f),'\t' as char, '~' as char, '\0' as char);

The third argument to the constructor is the quote character, which defaults to DEFAULT_QUOTE_CHARACTER. The default value is:

 public static final char DEFAULT_QUOTE_CHARACTER = '\"';

I set it to '~' because the "user_comments" column has values with double quotes inside them (these should not be treated as actual quotes but should just be read as data from the column).

The problem is that the column also contains "~" and "|".

So can I create an instance of CSVReader without a default quote character? If not, can you suggest a character I can use that is very rare and unlikely to be found in this "user_comments" column?

Answer 1

Inspect Unicode's Basic Multilingual Plane (BMP) ( http://unicode.org/roadmaps/bmp/ ) back to front. You're bound to find a character that is unlikely to be used in your data. Then use \u.... to encode it in your program source.

Or better still, use a code point that doesn't even represent a Unicode character, e.g. \ud7c7.
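
If you would rather check than guess, the "back to front" search can be automated. Below is a minimal sketch in plain Java (no openCSV involved); the class name UnusedCharFinder and the method findUnusedChar are made up for illustration. It assumes the data fits in memory as a string, records every character that occurs, then walks the BMP downwards from U+FFFF and returns the first value that never appears, skipping surrogates.

```java
import java.util.BitSet;

public class UnusedCharFinder {

    /**
     * Hypothetical helper: returns the highest BMP char value that does not
     * occur anywhere in the given data. Surrogates are skipped because they
     * are only meaningful in pairs.
     */
    public static char findUnusedChar(CharSequence data) {
        // One bit per possible char value (0x0000..0xFFFF).
        BitSet seen = new BitSet(0x10000);
        for (int i = 0; i < data.length(); i++) {
            seen.set(data.charAt(i));
        }
        // Walk the BMP back to front; U+0000 is excluded on purpose.
        for (int c = 0xFFFF; c > 0; c--) {
            if (Character.isSurrogate((char) c)) {
                continue;
            }
            if (!seen.get(c)) {
                return (char) c;
            }
        }
        throw new IllegalStateException("every candidate character occurs in the data");
    }

    public static void main(String[] args) {
        // Sample comment text containing the characters from the question.
        String sample = "he said \"hi\" ~ or | maybe";
        char candidate = findUnusedChar(sample);
        System.out.printf("unused quote candidate: U+%04X%n", (int) candidate);
    }
}
```

The returned character could then be passed as the third constructor argument in place of '~'. For typical text the search stops immediately at U+FFFF, which is a Unicode noncharacter and so is in the same spirit as the suggestion above.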
