Reading and writing specific lines to a text file in C#

I have a master file called FileName containing people's IDs, in sorted order. I want to divide the IDs into 27 chunks and copy each chunk into a different text file.

using (FileStream fs = File.Open(FileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    string line;
    int numOfLines = File.ReadAllLines(FileName).Length;  // I have 73467
    int eachSubSet = (numOfLines / 27);
    var lines = File.ReadAllLines(dataFileName).Take(eachSubSet);
    File.WriteAllLines(FileName1, lines);
}

I have 27 different text files, so I want the 73467 IDs divided equally and copied into those 27 files. The 1st file will have ID#1 to ID#2721, the 2nd file will have ID#2722 to ID#5442 (2 × 2721), and so on. I do not know how to automate this and make it run quickly.

Thanks, HR

The simplest way would be to call StreamReader.ReadLine and StreamWriter.WriteLine inside a loop and decide which file receives each line.

I wouldn't recommend parallelizing this routine, since it is an I/O-bound operation, but simply copying the lines should be pretty fast.
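A minimal sketch of the loop described above, assuming the number of lines per file is already known (2721 in the question). The class name LineSplitter, the method SplitByLineCount, and the output naming scheme <inputPath>_<index>.txt are hypothetical and only for illustration:

```csharp
using System;
using System.IO;

public static class LineSplitter
{
    // Streams inputPath line by line and writes linesPerFile lines to each
    // output file. The last file also receives any leftover lines.
    // Output names (<inputPath>_<index>.txt) are an assumption for illustration.
    public static void SplitByLineCount(string inputPath, int numOfFiles, int linesPerFile)
    {
        using (var reader = new StreamReader(inputPath))
        {
            string line = reader.ReadLine();
            for (int index = 0; index < numOfFiles; index++)
            {
                bool isLast = index == numOfFiles - 1;
                string outputPath = string.Format("{0}_{1}.txt", inputPath, index);
                using (var writer = new StreamWriter(outputPath))
                {
                    int written = 0;
                    // Keep copying until this file is full or the input runs out.
                    while (line != null && (isLast || written < linesPerFile))
                    {
                        writer.WriteLine(line);
                        written++;
                        line = reader.ReadLine();
                    }
                }
            }
        }
    }
}
```

For the file in the question this would be called as LineSplitter.SplitByLineCount(FileName, 27, 2721). Only one line is held in memory at a time, so the input is never loaded as a whole.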

Note that your sample code calls File.ReadAllLines twice, so it actually reads the entire input file twice. Avoiding that should speed up the process. Also, you didn't actually split the file: you only wrote the first of the 27 files. Untested, but something along these lines should work:

// Requires: using System.IO; using System.Linq;
const int numOfFiles = 27;
string[] lines = File.ReadAllLines(FileName);   // read the input only once
int numOfLines = lines.Length;
int eachSubSet = numOfLines / numOfFiles;
// The first file also takes the remainder when the count is not divisible by 27.
int firstSubset = numOfLines % numOfFiles + eachSubSet;
int offset = 0;   // index of the next line still to be written
for (int index = 0; index < numOfFiles; index++)
{
    int numToTake = index == 0 ? firstSubset : eachSubSet;
    // Output files are named <FileName>_0.txt through <FileName>_26.txt.
    File.WriteAllLines(string.Format("{0}_{1}.txt", FileName, index), lines.Skip(offset).Take(numToTake));
    offset += numToTake;
}
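The code above gives the whole remainder to the first file. As a variation that is not part of the answer, the remainder could instead be spread one line at a time over the first few files. The sketch below only computes the chunk boundaries. ChunkMath and Ranges are hypothetical names, and value tuples need C# 7 or later:

```csharp
using System;
using System.Collections.Generic;

public static class ChunkMath
{
    // Returns (Start, Count) pairs that cover indices 0..total-1 in `parts`
    // contiguous chunks. The first (total % parts) chunks get one extra item.
    public static List<(int Start, int Count)> Ranges(int total, int parts)
    {
        var ranges = new List<(int Start, int Count)>();
        int baseSize = total / parts;
        int remainder = total % parts;
        int start = 0;
        for (int i = 0; i < parts; i++)
        {
            int count = baseSize + (i < remainder ? 1 : 0);
            ranges.Add((start, count));
            start += count;
        }
        return ranges;
    }
}
```

Each pair can then be fed to lines.Skip(Start).Take(Count). For the numbers in the question the remainder is zero, because 27 × 2721 = 73467, so every chunk has exactly 2721 lines either way.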
