
Error trying to read CSV file

Good day,

I am having trouble reading CSV files in my ASP.NET project.

It always returns the error "index out of range, cannot find column 6".

Before I go on explaining what I did, here is the code:

                string savepath;  
                HttpPostedFile postedFile = context.Request.Files["Filedata"];
                savepath = context.Server.MapPath("files");
                string filename = postedFile.FileName;
                todelete = savepath + @"\" + filename;
                string forex = savepath + @"\" + filename;
                postedFile.SaveAs(savepath + @"\" + filename);
                DataTable tblcsv = new DataTable();
                tblcsv.Columns.Add("latitude");
                tblcsv.Columns.Add("longitude");
                tblcsv.Columns.Add("mps");
                tblcsv.Columns.Add("activity_type");
                tblcsv.Columns.Add("date_occured");
                tblcsv.Columns.Add("details");
                string ReadCSV = File.ReadAllText(forex);

                foreach (string csvRow in ReadCSV.Split('\n'))
                {
                    if (!string.IsNullOrEmpty(csvRow))
                    {
                        //Adding each row into datatable  
                        tblcsv.Rows.Add();
                        int count = 0;
                        foreach (string FileRec in csvRow.Split('-'))
                        {
                            tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
                            count++;
                        }
                    }


                }

I tried using comma-separated columns, but one of the text fields itself contains commas, so I switched to the - symbol as the delimiter to make sure there are no stray commas in the text file. The same error still pops up.

Am I doing something wrong?

Thank you in advance.

Your CSV file might have more than 6 columns in one or more rows. In that case the split in the inner foreach finds more fields than tblcsv has columns, so there is no column left to assign the extra value to.

Try something like this:

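foreach (string FileRec in csvRow.Split('-'))
{
    // Stop once all 6 columns are filled. Use break rather than return,
    // so that the remaining rows of the file are still processed.
    if (count > 5)
        break;
    tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
    count++;
}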

However, it would be better to check for additional columns before processing and handle the issue there.
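As a rough sketch of such a check, the field count can be compared with the column count before the row is added. The sample rows and the class name below are made up for illustration; only the column names come from the question.

```csharp
using System;
using System.Data;

class FieldCountCheck
{
    static void Main()
    {
        // Same six columns as in the question
        var tblcsv = new DataTable();
        foreach (string name in new[] { "latitude", "longitude", "mps",
                                        "activity_type", "date_occured", "details" })
        {
            tblcsv.Columns.Add(name);
        }

        // Hypothetical input: the second row has one field too many
        string[] rows = { "1-2-a-b-c-d", "1-2-a-b-c-d-extra" };

        foreach (string csvRow in rows)
        {
            string[] fields = csvRow.Split('-');
            if (fields.Length != tblcsv.Columns.Count)
            {
                // Report the row instead of letting the indexer throw
                Console.WriteLine("Skipping row with " + fields.Length + " fields: " + csvRow);
                continue;
            }
            tblcsv.Rows.Add((object[])fields);
        }

        Console.WriteLine(tblcsv.Rows.Count); // prints 1
    }
}
```

Whether a bad row should be skipped, logged, or rejected outright depends on the application; the sketch only shows where the check goes.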

// Collects the rows that contain more than 6 fields
StringBuilder errors = new StringBuilder();

foreach (string csvRow in ReadCSV.Split('\n'))
{
    if (!string.IsNullOrEmpty(csvRow))
    {
        DataRow dr = tblcsv.NewRow();
        int count = 0;
        bool valid = true;
        foreach (string FileRec in csvRow.Split('-'))
        {
            try
            {
                dr[count] = FileRec;
            }
            catch (IndexOutOfRangeException)
            {
                // More fields than columns: record the offending row and skip it
                errors.AppendLine(csvRow);
                valid = false;
                break;
            }
            count++;
        }
        // Add the row once, after all of its fields have been assigned
        if (valid)
            tblcsv.Rows.Add(dr);
    }
}

This way we know which CSV rows are causing the errors, and the rest are processed successfully. Check whether each row recorded in errors is valid input; if it is not, correct the values in the CSV file.

You can't treat the file as a CSV if the delimiter appears inside a field. In this case you can use a regular expression to extract the first five fields, each up to the next dash, then read the rest of the line as the sixth field. With a regex you can match the entire string and even avoid splitting it into lines.

Regular expressions can also be faster than repeated splits and use less memory, because they don't create a temporary string for every line and field. That's why they are used extensively to parse log files. The ability to capture fields by name doesn't hurt either.

The following sample parses the entire file and captures each field in a named group. The last group captures everything to the end of the line:

var pattern = "^(?<latitude>.*?)-(?<longitude>.*?)-(?<mps>.*?)-(?<activity_type>.*?)-" +
              "(?<date_occured>.*?)-(?<details>.*)$";
var regex = new Regex(pattern, RegexOptions.Multiline);
// Match against the file contents (ReadCSV), not the file path (forex)
var matches = regex.Matches(ReadCSV);

foreach (Match match in matches)
{
    DataRow dr = tblcsv.NewRow();
    dr["latitude"] = match.Groups["latitude"].Value;
    dr["longitude"] = match.Groups["longitude"].Value;
    ...
    tblcsv.Rows.Add(dr);
}

The (?<latitude>.*?)- pattern captures everything up to the first dash into a group named latitude. The .*? pattern means the matching isn't greedy, i.e. it won't try to capture everything to the end of the line but will stop when the first - is encountered.
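To see the non-greedy groups at work on a single line, here is a small self-contained sketch. The sample line and the class name are invented for illustration; the date uses slashes on purpose, because a dash inside any of the first five fields would still shift the groups.

```csharp
using System;
using System.Text.RegularExpressions;

class NonGreedyDemo
{
    static void Main()
    {
        // Hypothetical sample line: only the last field contains extra dashes
        string line = "14.59-120.98-QC-theft-2016/01/02-suspect fled on a red-white motorcycle";

        var regex = new Regex(
            "^(?<latitude>.*?)-(?<longitude>.*?)-(?<mps>.*?)-(?<activity_type>.*?)-" +
            "(?<date_occured>.*?)-(?<details>.*)$");

        Match match = regex.Match(line);
        if (match.Success)
        {
            Console.WriteLine(match.Groups["latitude"].Value);     // 14.59
            Console.WriteLine(match.Groups["date_occured"].Value); // 2016/01/02
            Console.WriteLine(match.Groups["details"].Value);      // suspect fled on a red-white motorcycle
        }
    }
}
```

The first five groups each stop at the next dash, while the greedy final group keeps the dash in "red-white" as part of the details text.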

The group names match the column names, which means you can add all fields with a loop:

foreach (Match match in matches)
{
    var row = tblcsv.NewRow();
    // Each column name doubles as the name of a capture group
    foreach (DataColumn col in tblcsv.Columns)
    {
        row[col.ColumnName] = match.Groups[col.ColumnName].Value;
    }
    tblcsv.Rows.Add(row);
}
