
C# Read CSV to DataTable with comma inside CSV values

I'm using OleDbConnection to read a CSV file and turn it into a DataTable. This is my function:

// Requires: using System.Data; using System.Data.OleDb; using System.IO;
private DataTable csv2datatable(string caminho)
{
    // Write schema.ini next to the CSV so the text driver knows the delimiter and decimal symbol.
    criarCsvSchema(caminho);

    DataTable dt = new DataTable("data");

    // For the text driver, the data source is the folder and the file name acts as the table name.
    using (OleDbConnection conexao = new OleDbConnection(
        "Provider=Microsoft.Jet.OLEDB.4.0;" +
        "Data Source=\"" + Path.GetDirectoryName(caminho) + "\";" +
        "Extended Properties='text;HDR=yes;'"))
    using (OleDbCommand cmd = new OleDbCommand(
        string.Format("select * from [{0}]", new FileInfo(caminho).Name),
        conexao))
    using (OleDbDataAdapter adaptador = new OleDbDataAdapter(cmd))
    {
        conexao.Open();
        adaptador.Fill(dt);
    }
    return dt;
}

The method "criarCsvSchema" creates the schema.ini with this configuration:

[CAM jan.csv]
ColNameHeader=True
Format=Delimited(;)
DecimalSymbol=,
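The body of criarCsvSchema is not shown here. A minimal sketch of what such a helper might look like, assuming it only needs to write the four lines above into a schema.ini in the same folder as the CSV (the class and method names CsvSchema, Build and Write are hypothetical):

```csharp
using System;
using System.IO;

static class CsvSchema
{
    // Hypothetical stand-in for criarCsvSchema; the original implementation is not shown.
    // Builds the schema.ini section for one CSV file, keyed by its file name.
    public static string Build(string csvPath)
    {
        return string.Join(Environment.NewLine, new[]
        {
            "[" + Path.GetFileName(csvPath) + "]",
            "ColNameHeader=True",
            "Format=Delimited(;)",
            "DecimalSymbol=,",
        });
    }

    // Writes schema.ini into the folder that contains the CSV,
    // which is where the text driver looks for it.
    public static void Write(string csvPath)
    {
        string dir = Path.GetDirectoryName(Path.GetFullPath(csvPath));
        File.WriteAllText(Path.Combine(dir, "schema.ini"),
                          Build(csvPath) + Environment.NewLine);
    }
}
```

Note that this sketch overwrites any existing schema.ini in that folder, so it only suits a folder holding a single described CSV file.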

My CSV file has this type of structure (it doesn't have quotes):

510,54;0,00;0,00;0,00;15,31;

So, the decimal symbol is ',' and the delimiter is ';'. When I run this project, I get this DataTable:

[Image: output]

What I can't understand is: why is the first column "510,54" correct while the others are returned as dates?

Thank you!

@edit - the first 5 rows of the CSV file (including the header):
https://github.com/rponciano/just-show/blob/master/shared-copy.csv

Your whole result is weird. The cells "510,54", "0,00", "0,00", "0,00" and "15,31" are interpreted as 1 float and 4 datetimes?

My best guess is that the other values all evaluate to 30/12/1899 00:00, with the time of day being truncated during output for being "probably irrelevant". So whatever you are doing wrong here, you are probably doing it wrong with everything past the 1st value.

It is really hard to tell without knowing the actual values in the DataTable and how they are transformed into strings before output (which code, which culture). Even something as simple as the display technology used might help, as each of the 5+ has a different DataTable.

As @RajN commented, I just needed to force DateTimeFormat in the schema.ini.

"Try forcing the DateTimeFormat = YYYY-MM-DD in your schema.ini file."
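Applied to the schema.ini from the question, the suggestion would look roughly like the fragment below. The DateTimeFormat value is copied as written in the comment; which format tokens the text driver accepts is not confirmed here, so treat the exact pattern as an assumption. The idea is presumably that values such as "0,00" no longer match the expected date pattern and so are not parsed as dates.

```ini
[CAM jan.csv]
ColNameHeader=True
Format=Delimited(;)
DecimalSymbol=,
DateTimeFormat=YYYY-MM-DD
```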
