
What am I doing wrong here… My foreach loop is very slow

I am trying to concatenate the strings in two files and save the result in a third file. But when the first two files contain many records (say 100,000+), my output file takes a long time to generate. What am I doing wrong here? Can someone please help?

 var fileA = File.ReadAllLines("File1.txt");
 var fileB = File.ReadAllLines("File2.txt"); 

Then I take the Cartesian product of the rows in the two files (N x M, where N and M are the numbers of rows in File1 and File2). So if File1 and File2 contain 100 and 50 records respectively, the output has 100 * 50 = 5000 rows.
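The question does not show how `cartesian` is built. As a minimal sketch, assuming it is a LINQ cross join that concatenates each pair of lines (the `CartesianSketch.CrossJoin` helper and the plain string concatenation are assumptions for illustration, not the asker's code), it might look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CartesianSketch
{
    // Hypothetical helper: lazily yields every line of a concatenated
    // with every line of b, in row-major order (a[0]+b[0], a[0]+b[1], ...).
    public static IEnumerable<string> CrossJoin(string[] a, string[] b)
    {
        return from x in a
               from y in b
               select x + y;
    }
}
```

With this sketch, `var cartesian = CartesianSketch.CrossJoin(fileA, fileB);` produces N x M strings. The query is lazy, so the strings are only created as the foreach loop consumes them, unless something like `ToList()` forces them all into memory first.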

        FileStream fs = new FileStream("OutputFile.txt", FileMode.Create);
        // First, save the standard output.
        TextWriter tmp = Console.Out;
        StreamWriter sw = new StreamWriter(fs);


        foreach (var lst in cartesian)
        {
            Console.WriteLine(lst);
            Console.SetOut(sw);
            Console.WriteLine(lst);
            Console.SetOut(tmp);
            Console.WriteLine(lst);
        }

        sw.Close();

I don't think you're doing anything wrong. A Cartesian join of 100,000 x 100,000 records produces 10 billion output rows, so it legitimately takes a long time. You might improve performance a little by doing the join with nested for loops instead of LINQ, but your process is probably I/O bound.

Note that you don't need to use Console.SetOut; you can call WriteLine directly on sw:

foreach (var lst in cartesian)
{
  Console.WriteLine(lst);
  sw.WriteLine(lst);
  // and if you want to do it again: Console.WriteLine(lst);
}
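Combining the two suggestions above (nested loops instead of LINQ, and writing to the writer directly), a rough sketch could look like the following. The `CrossJoinWriter.WriteCrossJoin` helper is a hypothetical name, and it assumes each output row is simply the two lines concatenated; it also drops the console output entirely, which is an assumption about what the asker needs.

```csharp
using System.IO;

public static class CrossJoinWriter
{
    // Hypothetical helper: writes every line of a concatenated with every
    // line of b to the given writer, one pair per output line, and returns
    // the number of lines written.
    public static long WriteCrossJoin(string[] a, string[] b, TextWriter writer)
    {
        long count = 0;
        foreach (var x in a)
        {
            foreach (var y in b)
            {
                // Write the two halves separately to avoid allocating
                // a new concatenated string for every pair.
                writer.Write(x);
                writer.WriteLine(y);
                count++;
            }
        }
        return count;
    }
}
```

It could be called with a StreamWriter opened in a using block, for example `using (var sw = new StreamWriter("OutputFile.txt")) { CrossJoinWriter.WriteCrossJoin(fileA, fileB, sw); }`, so the file is flushed and closed even if an exception is thrown. Taking a TextWriter rather than a StreamWriter keeps the helper easy to exercise with a StringWriter.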

Console.WriteLine() is relatively heavy when it writes to stdout. In the following test, I first write 100000 lines to a text file with no other processing; in the second test I write to stdout twice and call SetOut once per iteration. This differs slightly from your code, which also writes to stdout twice but calls SetOut twice per iteration instead of once.

using System;
using System.Diagnostics;
using System.IO;

FileStream fs = new FileStream(@"c:\temp\OutputFile.txt", FileMode.Create);
StreamWriter sw = new StreamWriter(fs);
TextWriter tmp = Console.Out; // stdout since it hasn't been changed
Console.SetOut(sw); // point to file
var stopw = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{               
    Console.WriteLine(i); // writes to file
}
sw.Dispose();
fs.Dispose();
var toFileTotalMs = stopw.Elapsed.TotalMilliseconds;

// Reset console to write to stdout
Console.SetOut(tmp);
stopw.Restart();
for (int i = 0; i < 100000; i++)
{
    Console.WriteLine(i); // writes to stdout
    Console.SetOut(tmp); // point to stdout (every iteration)
    Console.WriteLine(i); // writes to stdout
}
var toConsoleTotalMs = stopw.Elapsed.TotalMilliseconds;

Console.WriteLine($"toFileTotalMs={toFileTotalMs}; toConsoleTotalMs={toConsoleTotalMs};");

Console.Read(); // leaves console window open

Output:

toFileTotalMs=17.7198; toConsoleTotalMs=15964.9133;

So doing two Console.WriteLine() calls to stdout and one SetOut call per iteration takes about 900 times longer than just writing to the file (15964.9 ms versus 17.7 ms). I also tried changing the first for loop to call SetOut every iteration in addition to writing to the file, and it went from 17.7 ms to 43.8 ms.
