
Writing to a CSV file in Java

I am writing a Java program that fetches each key's count from a database and writes the count for each key to a CSV file. I have done this with FileWriter; the pseudocode looks like this:

while (keys.hasNext()) {
    writer.append(keys.next().getCount());
    writer.append(',');
}

// where keys is an iterator over the list of keys

The headers are appended in the same way. I have since come across open source libraries such as OpenCSV and Commons CSV for writing CSV files.

So now I am wondering whether it is better to use one of these libraries or to keep writing the CSV file as shown above. Can someone please tell me which way is better in terms of readability and efficiency?

There is an engineering principle: "If it works, don't touch it".

Of course, using a mature open source library will often benefit you in terms of code stability and flexibility. But you will spend time learning the library, and adopting it may require some refactoring of your code.

In your case, what you would gain is greater control over field separators and encodings.
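For comparison, the separator and the encoding can also be fixed explicitly with the JDK alone. The following is a minimal sketch, not the asker's code: the class name `CsvEncodingSketch`, the method `writeRow`, and the semicolon separator are all assumptions made for illustration. It uses `Files.newBufferedWriter` with an explicit charset, whereas `new FileWriter(String)` falls back to the platform default encoding.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: writes one row with a caller-chosen separator, always as UTF-8.
final class CsvEncodingSketch {

    static void writeRow(Path file, List<String> fields, char separator) throws IOException {
        // The charset is stated explicitly instead of relying on the platform default.
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(String.join(String.valueOf(separator), fields));
            writer.newLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("counts", ".csv");
        writeRow(file, Arrays.asList("3", "7", "12"), ';');
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

This sketch does no quoting or escaping, so it is only safe for values that cannot contain the separator.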

It's pretty much up to you. Here's the OpenCSV equivalent of your code:

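 // '\t' makes the output tab-separated; pass ',' to get commas as in your code
 CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"), '\t');
 ...
 // collect the fields in a list, since the number of keys is not known up front
 List<String> row = new ArrayList<>();
 while (keys.hasNext()) {
     row.add(String.valueOf(keys.next().getCount()));
 }
 // writeNext takes the whole row as a String[]
 writer.writeNext(row.toArray(new String[0]));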

Is that more or less readable than yours? That's subjective and up to you. I can tell you that yours is not inefficient.

It's worth noting that your code will write a trailing "," at the end of each line. The library will not. Your code could be changed like this:

boolean more = keys.hasNext();
while (more) {
   writer.append(keys.next().getCount());
   more = keys.hasNext();
   if(more) {
      writer.append(',');
   }
}
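Another way to avoid the trailing comma is to collect the values first and let `String.join` place the separators. This is a minimal sketch under assumptions: the class `CsvRowJoiner` and the method `joinRow` are hypothetical names, and the counts are assumed to be available as strings already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: builds one CSV line without a trailing separator.
final class CsvRowJoiner {

    static String joinRow(Iterator<String> values) {
        List<String> fields = new ArrayList<>();
        while (values.hasNext()) {
            fields.add(values.next());
        }
        // String.join puts the separator only between elements.
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        List<String> counts = Arrays.asList("3", "7", "12");
        System.out.println(joinRow(counts.iterator())); // prints 3,7,12
    }
}
```

The resulting line can then be passed to `writer.append(...)` in one call, followed by a line break.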

CSV seems simple, and usually is, until you start encountering more complex situations, such as quoted fields containing commas or escaped quotes:

 A field,"another field","a field, containing a comma","A \"field\""

If your program encounters data like this, it will break, and you'd need to enhance your CSV code to handle it. If you were using a library, you could reasonably expect it to handle quotes and quoted commas from the outset. It's up to you to judge how likely that situation is.
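To give a feel for what "enhancing your CSV code" would involve on the writing side, here is a minimal sketch of a field escaper. It is an illustration under assumptions, not what any particular library does: the names `CsvFieldEscaper` and `escape` are hypothetical, and it follows the doubled-quote convention of RFC 4180 rather than the backslash escaping shown in the sample line above.

```java
// Hypothetical helper: quotes a field only when it needs quoting.
final class CsvFieldEscaper {

    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("A field"));                     // prints A field
        System.out.println(escape("a field, containing a comma")); // prints "a field, containing a comma"
        System.out.println(escape("A \"field\""));                 // prints "A ""field"""
    }
}
```

Null values, other separators, and dialects that expect backslash escapes are all left out here, which is part of the case for a library once such requirements appear.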

Writing CSV code is usually simple, but there are pitfalls, and it's always good to have less code to maintain.

Using a library has its own overheads, such as managing the dependencies.

You probably don't need a library for the simple stuff you're doing now. You might consider using one if your own code evolves to get more complicated, or if you start to need features like exporting beans to CSV or handling CSV that contains quoted commas.

Using an open source library involves a few considerations:

Pros:

  • A popular open source library has typically been scrutinized by its community, so it is likely to be one of the more efficient options.
  • Saves a lot of boilerplate code and gives you a head start.
  • The library is packed with more features than you need. This helps when extending the application in the future.
  • Generally, open source libraries are optimized for performance. This saves effort on your side.

Cons:

  • Another dependency is added to your application.
  • A small learning curve is involved in using the library. I would personally discount this, as ready-made example code for the library will be available.
  • Slightly overkill if the use case is very trivial.

The CSV file format is not simply a matter of separating your column names or values with commas. If there is a comma (,) or a double quote (") in your data, it needs to be escaped properly.

For example, suppose you have two columns, name and address, and the values you need to write are name: aarish and address: "MyHome",Chicago, MI

If you then write that to CSV like this:

name,address
aarish,"MyHome",Chicago, MI

The values will be parsed as four different fields.
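The field count can be checked with a deliberately naive reader that splits on every comma. This is a sketch for illustration only; `NaiveSplitDemo` and `naiveSplit` are hypothetical names, and a real CSV parser does more than this.

```java
// Hypothetical demo: splitting on every comma, with no awareness of quoting.
final class NaiveSplitDemo {

    static String[] naiveSplit(String line) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String[] fields = naiveSplit("aarish,\"MyHome\",Chicago, MI");
        System.out.println(fields.length); // prints 4
        for (String field : fields) {
            System.out.println("[" + field + "]");
        }
    }
}
```

The address was meant to be a single value, but it comes back as three separate fields after the name.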

If you use one of the libraries, it will give you CSV output like this:

"name","address"
"aarish","""MyHome"",Chicago, MI"

This file will be read correctly by other parsers and will open properly in other editors.

So I would suggest using a library if your data contains characters such as a comma (,) or a double quote ("). If it doesn't contain any such characters, you can go with your simpler approach.
