
Reading the millionth line in C#

I have a very long text file in which every line has the same length. I want to read the millionth line in C# without first reading the previous 999,999 lines, because reading them makes the program too slow. How can I do this?

Try this:

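// Bytes per line, including the line terminator.
const int BYTES_PER_LINE = 120;
static void Main(string[] args)
{
    using (StreamReader reader = new StreamReader("FileName", Encoding.UTF8))
    {
        long skipLines = 999999;

        // Seek before the first read, while the reader's buffer is still empty.
        reader.BaseStream.Position = skipLines * BYTES_PER_LINE;
        string line = reader.ReadLine(); // the millionth line
    }
}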

Do you know the number of bytes in each line?

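Note: knowing the number of characters is not sufficient. In a variable-width encoding such as UTF-8 a single character can occupy more than one byte, and the line terminator (one byte for "\n", two for "\r\n") counts toward the line's length as well.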
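As a rough illustration of the difference between characters and bytes, the sketch below compares a string's character count with its encoded size. The class name `ByteCountDemo`, the helper `ByteLength`, and the sample strings are made up for this example.

```csharp
using System;
using System.Text;

static class ByteCountDemo
{
    // Returns the number of bytes the text occupies in the given encoding.
    public static int ByteLength(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    static void Main()
    {
        string ascii = "abc";          // 3 characters
        string accented = "caf\u00E9"; // 4 characters; U+00E9 takes 2 bytes in UTF-8

        Console.WriteLine(ByteLength(ascii, Encoding.UTF8));       // 3
        Console.WriteLine(ByteLength(accented, Encoding.UTF8));    // 5
        Console.WriteLine(ByteLength(accented, Encoding.Unicode)); // 8 (UTF-16)
    }
}
```

If every line has the same number of characters but some contain non-ASCII text, the byte lengths can differ, and seeking by a fixed offset would land in the wrong place.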

If you know it is a fixed number of bytes, use:

using( Stream stream = File.Open(fileName, FileMode.Open) )
{
    // Cast to long so the offset cannot overflow int on large files.
    stream.Seek((long)bytesPerLine * (myLine - 1), SeekOrigin.Begin);
    using( StreamReader reader = new StreamReader(stream) )
    {
        string line = reader.ReadLine();
    }
}

If not, then:

string line = File.ReadLines(FileName).Skip(999999).Take(1).First();

This second option still has to enumerate every preceding line, but it avoids loading the whole file into memory at once to do so.

Of course, if by "the millionth line" you really mean the end of the file, a different approach makes sense: find the size of the file and use it to read lines from the end.
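Reading from the end could look roughly like the sketch below. It assumes fixed-length lines, a byte count that includes the terminator, and a file that ends with a terminator after its last line. `LastLineDemo` and `ReadLastLine` are hypothetical names, and the temporary file exists only to make the example runnable.

```csharp
using System;
using System.IO;
using System.Text;

static class LastLineDemo
{
    // Reads the final fixed-length line by seeking back from the end of the file.
    // bytesPerLine must include the line terminator.
    public static string ReadLastLine(string path, int bytesPerLine)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            if (stream.Length < bytesPerLine)
                throw new InvalidDataException("File is shorter than one line.");

            // A negative offset relative to SeekOrigin.End moves backwards from the end.
            stream.Seek(-bytesPerLine, SeekOrigin.End);
            using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadLine();
            }
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Three 6-byte lines: 5 ASCII characters plus '\n', written without a BOM.
            File.WriteAllText(path, "aaaaa\nbbbbb\nccccc\n", new UTF8Encoding(false));
            Console.WriteLine(ReadLastLine(path, 6)); // ccccc
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```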

streamReader.BaseStream.Seek(skip_lines_offset, SeekOrigin.Begin);

string line = streamReader.ReadLine();

The Seek method avoids reading the whole file. skip_lines_offset is the byte offset of the line, that is, number_of_skipped_lines * bytes_In_Line.
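The offset arithmetic can be checked end to end with a small self-contained program such as the sketch below. It is only an illustration: `SeekDemo`, `LineOffset`, and `ReadLineAt` are hypothetical names, and the test file (1000 lines of 8 bytes each, UTF-8 without a BOM) is generated just for the demonstration. If a real file starts with a byte-order mark, its length would have to be added to the offset.

```csharp
using System;
using System.IO;
using System.Text;

static class SeekDemo
{
    // Byte offset of a 1-based line number in a file of fixed-length lines.
    public static long LineOffset(long lineNumber, int bytesPerLine)
    {
        return (lineNumber - 1) * bytesPerLine;
    }

    // Reads one line by seeking to it directly. bytesPerLine includes the terminator.
    public static string ReadLineAt(string path, long lineNumber, int bytesPerLine)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            long offset = LineOffset(lineNumber, bytesPerLine);
            if (offset < 0 || offset >= stream.Length)
                throw new ArgumentOutOfRangeException(nameof(lineNumber));

            stream.Seek(offset, SeekOrigin.Begin);
            using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadLine();
            }
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Write 1000 lines of 8 bytes each: 7 ASCII digits plus '\n'.
            using (StreamWriter writer = new StreamWriter(path, false, new UTF8Encoding(false)))
            {
                writer.NewLine = "\n";
                for (int i = 1; i <= 1000; i++)
                    writer.WriteLine(i.ToString("D7"));
            }

            Console.WriteLine(LineOffset(500, 8));       // 3992
            Console.WriteLine(ReadLineAt(path, 500, 8)); // 0000500
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

For the question's numbers, `LineOffset(1000000, 120)` gives 119999880, the same value as 999999 * 120.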


 