What's the best way to read a tab-delimited text file in C#?
We have a text file with about 100,000 rows and about 50 columns per row. Most of the values are pretty small (5 to 10 characters or digits).
This is a pretty simple task, but I'm wondering what the best way would be to import this data into a C# data structure (for example, a DataTable).
I would read it in as a CSV with tab as the column delimiter:
Edit:
Here's a barebones example of what you'd need:
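DataTable dt = new DataTable();
// CsvReader comes from a third-party CSV library, not the .NET base class library.
// The '\t' argument sets tab as the field delimiter.
using (CsvReader csv = new CsvReader(new StreamReader(CSV_FULLNAME), false, '\t')) {
    dt.Load(csv);
}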
Where CSV_FULLNAME is the full path and filename of your tab-delimited file.
Use .NET's built-in text parser. It is free, has great error handling, and deals with a lot of oddball cases.
http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser(VS.80).aspx
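To illustrate the answer above, here is a minimal sketch of loading tab-delimited text into a DataTable with TextFieldParser. The class name TabFileLoader, the generated column names (Column1, Column2, ...), and the choice to treat every field as a string are assumptions made for this example, as is the assumption that the file has no header row. Outside VB projects you need a reference to the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TabFileLoader
{
    // Reads tab-delimited text into a DataTable of string columns.
    // Assumption: no header row, so columns are named Column1, Column2, ...
    public static DataTable Load(TextReader reader)
    {
        var table = new DataTable();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters("\t");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();

                // Add columns if this row has more fields than any row so far.
                while (table.Columns.Count < fields.Length)
                {
                    table.Columns.Add("Column" + (table.Columns.Count + 1), typeof(string));
                }

                // Shorter rows leave the trailing columns as DBNull.
                table.Rows.Add(fields);
            }
        }
        return table;
    }
}
```

You would call it as TabFileLoader.Load(new StreamReader(path)). The parser disposes the reader when it is done. A malformed line makes ReadFields throw MalformedLineException, which you may want to catch and log for a file of this size.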
What about FileHelpers? You can define the tab as a delimiter. Head over to the FileHelpers site and have a look.
Hope this helps. Best regards, Tom.
Two options:
Use the classes in the System.Data.OleDb namespace. This has the advantage of reading directly into a DataTable, as you asked, with very little code, but it can be tricky to get right because the file is tab-delimited rather than comma-delimited.
However you parse the lines, make sure you use something that supports moving forward and rewinding, since it will be the data source of your data grid. You don't want to load everything into memory first, do you? What if there is ten times as much data next time? Build on something that uses file.seek deep down, and don't read everything into memory first. That's my advice.
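As a rough sketch of the "don't read everything into memory" advice, the rows can be produced lazily, one line at a time. The names TabFileStream and ReadRows are made up for this example. It is forward-only: rewinding as described above would additionally need something like recorded stream offsets and a seekable stream, which is not shown here. It also assumes fields never contain tabs or line breaks.

```csharp
using System.Collections.Generic;
using System.IO;

public static class TabFileStream
{
    // Lazily yields one row (array of fields) per non-empty line.
    // Only the current line is held in memory at any time.
    public static IEnumerable<string[]> ReadRows(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0)
            {
                continue; // skip blank lines
            }
            yield return line.Split('\t');
        }
    }
}
```

A caller can loop with foreach over TabFileStream.ReadRows(new StreamReader(path)) and stop early, so a file ten times larger costs more time but not more memory.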
Simple, but not necessarily a great way:
Read the file into a string using a text reader.
Use String.Split to get the rows.
Use String.Split with a tab character to get the field values.
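The steps above can be sketched as follows. SplitParser and Parse are hypothetical names for this example. The sketch assumes no field contains a tab or a line break, and it drops blank lines.

```csharp
using System;
using System.Collections.Generic;

public static class SplitParser
{
    // Splits the whole text into rows, then each row into tab-separated fields.
    public static List<string[]> Parse(string text)
    {
        var rows = new List<string[]>();

        // Accept both Windows (\r\n) and Unix (\n) line endings.
        string[] lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            // Empty fields are kept, so column positions stay aligned.
            rows.Add(line.Split('\t'));
        }
        return rows;
    }
}
```

Called as SplitParser.Parse(File.ReadAllText(path)), this holds the entire file in memory at once, which is part of why it is simple but not necessarily great.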