Reading a text file without using getline() in C++

From what I've learned, the go-to way to analyze a text file is to process it line by line, which is easy and efficient.

However, when the input is a huge file with all of its text on a single line, getline() would not be efficient at all. Is there another efficient way to analyze such a file?

The only approach I can think of is to store the whole line in a string variable and then split it into individual words, but that does not sound efficient either.

Please help. Thank you!

You can use std::istream::get(char *, std::streamsize) to read large chunks of the file into a suitably large buffer, and then process the file piecemeal, one chunk at a time. Note that this overload of get() stops when it reaches a newline; std::istream::read(char *, std::streamsize) reads a fixed number of bytes regardless of their content.

Alternatively, there are operating-system-specific approaches. On Linux, a read-only mmap() of the file lets you plow through it with a minimum of fuss.

getline is basically just a shortcut for splitting on line breaks, or on any other character. So if your file has some delimiter (a semicolon, for example), you can use:

std::getline(fileStream, stringToSave, ';');

As for performance, you just have to try the options and measure what works in your case.
