
Filter CSV file and create new files based on contents of the original

The contents of the original file, DOC.csv, are as follows:

AH08B/001319;F09351812;F09351812;F09351812;20131112;101009;10;3.30;15.00;0
AH08B/001319;F09351812;F09351812;F09351812;20131112;101009;10;3.30;15.00;0
AH08B/001320;F09351812;F09351812;F09351812;20131112;101271;400;1.30;5.00;10

Filtering that file by the first column, I need to obtain two new files:

File 1

AH08B/001319;F09351812;F09351812;F09351812;20131112;101009;10;3.30;15.00;0
AH08B/001319;F09351812;F09351812;F09351812;20131112;101009;10;3.30;15.00;0

File 2

AH08B/001320;F09351812;F09351812;F09351812;20131112;101271;400;1.30;5.00;10

What is the best approach to obtain these results?

As long as the file isn't too large to fit in memory, something like this should work:

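' Group the lines by their first field; ToList() ensures the grouping is computed only once.
Dim groups = IO.File.ReadAllLines("DOC.csv").GroupBy(Function(x) x.Substring(0, x.IndexOf(";"c))).ToList()
For i = 0 To groups.Count - 1
    ' Write each group to DOC01.csv, DOC02.csv, ...
    IO.File.WriteAllLines("DOC" & (i + 1).ToString.PadLeft(2, "0"c) & ".csv", groups(i).ToArray)
Next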

If memory is an issue, here's one way that reads the file line by line:

Dim keys As New List(Of String)
Using sr As New IO.StreamReader("DOC.csv")
    Do Until sr.EndOfStream
        Dim line = sr.ReadLine
        ' The key is everything before the first semicolon.
        Dim key As String = line.Substring(0, line.IndexOf(";"c))
        If keys.Contains(key) Then
            ' Known key: append to the file already created for it.
            IO.File.AppendAllText("DOC" & (keys.IndexOf(key) + 1).ToString.PadLeft(2, "0"c) & ".csv", line & vbNewLine)
        Else
            ' New key: start the next numbered file.
            keys.Add(key)
            IO.File.WriteAllText("DOC" & keys.Count.ToString.PadLeft(2, "0"c) & ".csv", line & vbNewLine)
        End If
    Loop
End Using

Either way creates one file per distinct value of the first field, with filenames in the format "DOCxx.csv" (DOC01.csv, DOC02.csv, and so on).

The code below takes the first field of each record in your CSV file, creates a new file for it if one does not exist, and then appends the record to it. It processes the file row by row (i.e. it is not optimized for speed), but it shouldn't run into memory constraints.

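// Requires: using System.IO;
string fileName = "C:\\Temp\\T1.csv";
if (File.Exists(fileName))
{
    using (StreamReader sr = new StreamReader(fileName))
    {
        while (!sr.EndOfStream)
        {
            string record = sr.ReadLine();
            // The first field can contain '/' (e.g. AH08B/001319), which is a path separator, so replace it.
            string key = record.Substring(0, record.IndexOf(';')).Replace('/', '_');
            string newFileName = "C:\\Temp\\" + key + ".csv";
            // Append mode creates the file if it does not exist. A separate File.Create call
            // would return an open FileStream and block the writer from opening the file.
            using (StreamWriter sw = new StreamWriter(newFileName, true))
            {
                sw.WriteLine(record);
            }
        }
    }
}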
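
If reopening the output file for every row turns out to be too slow, one possible variation is to keep a single writer open per distinct key while still reading the input one line at a time. The following is only a sketch under some assumptions: the class name CsvSplitter and the method SplitByFirstField are made up for illustration, the outputs are numbered DOC01.csv, DOC02.csv, ... in order of first appearance, and lines without a semicolon are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CsvSplitter
{
    // Hypothetical helper: writes one file per distinct first field of inputPath
    // into outputDir and returns a map from each key to the file written for it.
    public static Dictionary<string, string> SplitByFirstField(string inputPath, string outputDir)
    {
        var writers = new Dictionary<string, StreamWriter>();
        var paths = new Dictionary<string, string>();
        try
        {
            // File.ReadLines streams the file lazily instead of loading it all at once.
            foreach (string line in File.ReadLines(inputPath))
            {
                int sep = line.IndexOf(';');
                if (sep < 0) continue; // assumption: skip blank or malformed lines

                string key = line.Substring(0, sep);
                StreamWriter writer;
                if (!writers.TryGetValue(key, out writer))
                {
                    // First time this key is seen: open the next numbered file.
                    string path = Path.Combine(outputDir, "DOC" + (writers.Count + 1).ToString("00") + ".csv");
                    writer = new StreamWriter(path, false);
                    writers[key] = writer;
                    paths[key] = path;
                }
                writer.WriteLine(line);
            }
        }
        finally
        {
            // Flush and close every writer, even if reading failed part-way.
            foreach (StreamWriter w in writers.Values) w.Dispose();
        }
        return paths;
    }
}
```

Calling CsvSplitter.SplitByFirstField("DOC.csv", ".") on the sample data should produce DOC01.csv with the two AH08B/001319 rows and DOC02.csv with the AH08B/001320 row. The trade-off is one open file handle per distinct key, so this is probably only suitable when the number of distinct keys is modest.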
