C# StreamReader ReadLine breaking lines at internal linebreaks within cells

I have a row of data in a CSV file where some of the cells contain line breaks.

I'm uploading this file using an Asp:FileUpload and trying to read through each line with a StreamReader:

var file = btnFileUpload.PostedFile;
using (StreamReader sr = new StreamReader(file.InputStream))
{
    string currentLine;
    var line = 1;
    // currentLine will be null when the StreamReader reaches the end of file
    while ((currentLine = sr.ReadLine()) != null)
    {
        // ...do stuff...
    }
}

However, in debugging I found that sr.ReadLine() is breaking the lines at the line breaks within the cells, such as in the Category cell. For example, when I read line 2 (the first line of data after the header), the value is:

"/Home/Blog/2018/november/power,English : English,Erica Stockwell-Alpert,/Home/Blog/Categories/Accounts Payable Automation;"

and then the next sr.ReadLine():

"/Home/Blog/Categories/Financial Services;"

and then

"/Home/Blog/Categories/Robotic Process Automoation,<p>[the rest of the line]"

How can I prevent sr.ReadLine() from breaking on the new line characters within cells? Or if I can't, how else can I read the file line by line?

Note: I cannot use a CSV reader ClassMap and csvReader.GetRecords, because the tool I am working on needs to handle arbitrary fields in the header; it is not tied to one specific class. So I need to read through the file line by line.

You are confusing lines with records. You say you want to read your file line by line, but what you really want is to read it record by record. Because your data can have line breaks in the middle of a record, ReadLine isn't going to give you what you want: that method doesn't know where a record ends. It only knows how to find the next line break.
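To make the difference concrete, here is a minimal sketch of reading records instead of lines. It assumes the file follows the usual CSV convention that a field containing a line break is wrapped in double quotes, with literal quotes doubled. CsvRecordSketch and ReadRecords are hypothetical names used only for this illustration, and the sketch does not split a record into fields.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of any library.
public static class CsvRecordSketch
{
    // Yields one logical CSV record at a time. Physical lines are joined
    // until every double quote opened so far has been closed again.
    public static IEnumerable<string> ReadRecords(TextReader reader)
    {
        var record = new StringBuilder();
        bool inQuotes = false;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            // ReadLine strips the line break, so restore one between the
            // physical lines that belong to the same record.
            if (record.Length > 0)
            {
                record.Append('\n');
            }
            record.Append(line);

            // Each quote toggles the state; an escaped quote ("") toggles
            // twice and therefore leaves the state unchanged.
            foreach (char c in line)
            {
                if (c == '"')
                {
                    inQuotes = !inQuotes;
                }
            }

            if (!inQuotes)
            {
                yield return record.ToString();
                record.Clear();
            }
        }

        // Malformed input with an unclosed quote: return what is left.
        if (record.Length > 0)
        {
            yield return record.ToString();
        }
    }
}
```

Note that the original line-break style (\r\n or \n) inside a cell is not preserved, since the lines are rejoined with \n. This only shows why a record boundary depends on quoting; it is not a substitute for a real CSV parser.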

You are going to need a proper CSV reader to solve this. But don't worry: there are CSV readers out there that do not require you to map the data to a fixed class. One I have used many times is LumenWorks CSV Reader. It is free (open source, MIT license), supports multi-line fields within a record, and is easy to use.

Here is an example of how you would use it to process a file record-by-record:

// needs the System.IO and LumenWorks.Framework.IO.Csv namespaces
using (StreamReader sr = new StreamReader(file.InputStream))
using (CsvReader csv = new CsvReader(sr, hasHeaders: true))
{
    csv.SupportsMultiline = true;

    // read the first record of the file as column headers and put them into an array
    string[] headers = csv.GetFieldHeaders();

    // read each data record one by one - this returns false when there is no more data
    while (csv.ReadNextRecord())
    {
        // 0-based index of the current CSV record (excluding the headers) if you need it
        var recordNumber = csv.CurrentRecordIndex;

        // loop over the columns in the row and process them
        for (int i = 0; i < csv.FieldCount; i++)
        {
            string fieldName = headers[i];
            string fieldValue = csv[i];      // may contain line breaks

            // ...do stuff...
        }
    }
}

Working demo: https://dotnetfiddle.net/ZYSA7r
