
What is the most elegant way to read a text file with C++?

I'd like to read the whole content of a text file into a std::string object with C++.

With Python, I can write:

text = open("text.txt", "rt").read()

It is very simple and elegant. I hate ugly stuff, so I'd like to know: what is the most elegant way to read a text file with C++? Thanks.

There are many ways; pick whichever is the most elegant for you.

Reading into char*:

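// Assumes #include <fstream> and using namespace std;
ifstream file("file.txt", ios::in | ios::binary | ios::ate);  // ios::ate opens positioned at the end
if (file.is_open())
{
    streamsize size = file.tellg();      // position at the end == file size in bytes
    char *contents = new char[size];
    file.seekg(0, ios::beg);             // rewind before reading
    file.read(contents, size);
    file.close();
    //... do something with it (note: contents is not null-terminated)
    delete [] contents;
}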

Into std::string:

std::ifstream in("file.txt");
std::string contents((std::istreambuf_iterator<char>(in)), 
    std::istreambuf_iterator<char>());

Into vector<char>:

std::ifstream in("file.txt");
std::vector<char> contents((std::istreambuf_iterator<char>(in)),
    std::istreambuf_iterator<char>());

Into string, using stringstream:

std::ifstream in("file.txt");
std::stringstream buffer;
buffer << in.rdbuf();
std::string contents(buffer.str());

file.txt is just an example. Everything works for binary files as well; just make sure you pass ios::binary to the ifstream constructor.

There's another thread on this subject.

My solutions from this thread (both one-liners):

The nice (see Milan's second solution):

string str((istreambuf_iterator<char>(ifs)), istreambuf_iterator<char>());

and the fast:

string str(static_cast<stringstream const&>(stringstream() << ifs.rdbuf()).str());

You seem to speak of elegance as a definite property of "little code". This is of course subjective to some extent. Some would say that omitting all error handling isn't very elegant. Some would say that clear and compact code you understand right away is elegant.

Write your own function/method that reads the file contents in a single call, but make it rigorous and safe underneath the surface, and you will have covered both aspects of elegance.
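As a minimal sketch of that suggestion: the helper below is hypothetical (the name read_file and the choice to throw std::runtime_error are assumptions, not something from this thread). It wraps the stringstream approach shown earlier and reports an unopenable file or a read error instead of silently returning an empty string.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the whole file as a string, or throws
// std::runtime_error if the file cannot be opened or a read error occurs.
std::string read_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the entire stream buffer
    // An empty file sets failbit on buffer, so only badbit is treated as an error.
    if (in.bad() || buffer.bad()) {
        throw std::runtime_error("error while reading " + path);
    }
    return buffer.str();
}
```

At the call site this reads much like the Python version: std::string text = read_file("text.txt");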

All the best

/Robert

But beware that a C++ string (or, more concretely, an STL string) is no more capable of holding a string of arbitrary length than a C string is. Of course not!

Take a look at the member max_size(), which gives you the maximum number of characters a string might contain. This is an implementation-defined number and may not be portable across platforms. Visual Studio gives a value of about 4 GB for strings, others might give you only 64 KB, and on 64-bit platforms it might give you something really huge! It depends, and of course you will normally run into a bad_alloc exception due to memory exhaustion long before reaching the 4 GB limit...

BTW: max_size() is a member of other STL containers as well! It gives you the maximum number of elements of a certain type (the one for which you instantiated the container) that the container will (theoretically) be able to hold.

So, if you're reading from a file of unknown origin you should:
- Check its size and make sure it's smaller than max_size()
- Catch and process bad_alloc exceptions
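The two checks above could be sketched roughly as follows. The function name read_file_checked and the bool return convention are assumptions made for illustration; the sketch uses the seek-to-end technique from earlier in the thread to learn the size before allocating.

```cpp
#include <fstream>
#include <new>
#include <string>

// Hypothetical helper: reads `path` into `out`. Returns false if the file
// cannot be opened, is larger than out.max_size(), or memory runs out.
// On failure the contents of `out` are unspecified.
bool read_file_checked(const std::string& path, std::string& out) {
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    if (!in) return false;

    const std::streamoff size = in.tellg();  // opened at end, so this is the size
    if (size < 0) return false;
    if (static_cast<unsigned long long>(size) > out.max_size()) return false;

    try {
        out.resize(static_cast<std::string::size_type>(size));
    } catch (const std::bad_alloc&) {
        return false;  // not enough memory for a file of this size
    }

    if (size == 0) return true;  // nothing to read
    in.seekg(0, std::ios::beg);
    in.read(&out[0], size);
    return in.gcount() == size;  // true only if every byte arrived
}
```

Whether to return false or rethrow is a design choice; a caller that wants to distinguish "too large" from "missing" would need a richer result type than bool.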

And another point: why are you keen on reading the file into a string? I would expect you to process it further by incrementally parsing it or something, right? So instead of reading it into a string you might as well read it into a stringstream (which basically is just some syntactic sugar for a string) and do the processing there. But then you could do the processing directly from the file as well, because if properly programmed the stringstream can seamlessly be replaced by a file stream, i.e. by the file itself. Or by any other input stream: they all share the same members and operators and can thus be interchanged!
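To illustrate the interchangeability point, here is a small sketch in which the processing routine only sees a std::istream&. The line-counting task and all three function names are made up for the example; any incremental parsing step would fit the same shape.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical processing step, written against std::istream so that it
// does not care where the characters come from.
std::size_t count_nonempty_lines(std::istream& in) {
    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) ++count;
    }
    return count;
}

// The same routine fed from an in-memory string...
std::size_t count_in_text(const std::string& text) {
    std::istringstream in(text);
    return count_nonempty_lines(in);
}

// ...and directly from a file, with no intermediate std::string.
// An unopenable file yields 0 here; real code would report the error.
std::size_t count_in_file(const std::string& path) {
    std::ifstream in(path.c_str());
    return count_nonempty_lines(in);
}
```

Only the two thin wrappers know which kind of stream is involved; count_nonempty_lines would work unchanged with std::cin as well.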

And for the processing itself: there's also a lot you can have automated by the compiler! E.g. let's say you want to tokenize the string. When defining a proper template, the following actions:
- reading from a file (or a string or any other input stream)
- tokenizing the content
- pushing all found tokens into an STL container
- sorting the tokens alphabetically
- eliminating any duplicate values
can all(!!) be achieved in one single(!) line of C++ code (leaving aside the template itself and the error handling)! It's just a single call of the function std::copy()! Just google for "token iterator" and you'll get an idea of what I mean. So this appears to me to be even more "elegant" than just reading from a file...
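A rough sketch of the steps listed above, under some assumptions: instead of a custom token iterator it uses the standard std::istream_iterator<std::string>, which splits on whitespace only, and it relies on std::set to do the sorting and duplicate removal. The ordering is the default byte-wise std::string comparison, not a locale-aware alphabetical sort. The function name unique_sorted_tokens is hypothetical.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <set>
#include <sstream>
#include <string>

// Reads whitespace-separated tokens from any input stream. std::set keeps
// them sorted and drops duplicates as they are inserted.
std::set<std::string> unique_sorted_tokens(std::istream& in) {
    std::set<std::string> tokens;
    // The single std::copy call: read, tokenize, insert, sort, deduplicate.
    std::copy(std::istream_iterator<std::string>(in),
              std::istream_iterator<std::string>(),
              std::inserter(tokens, tokens.end()));
    return tokens;
}

// Convenience overload for text that is already in memory.
std::set<std::string> unique_sorted_tokens(const std::string& text) {
    std::istringstream in(text);
    return unique_sorted_tokens(in);
}
```

Passing an open std::ifstream to the first overload processes a file directly; a different delimiter would need a custom iterator or token type, which is where the "proper template" mentioned above comes in.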

I like Milan's char* way, but with std::string.


#include <iostream>
#include <string>
#include <fstream>
#include <cstdlib>
using namespace std;

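// Reads the whole file into buffer and returns a reference to it.
// Throws ios_base::failure if the file cannot be opened or read.
string& getfile(const string& filename, string& buffer) {
    ifstream in(filename.c_str(), ios_base::binary | ios_base::ate);  // open at end
    in.exceptions(ios_base::badbit | ios_base::failbit | ios_base::eofbit);
    buffer.resize(in.tellg());           // position at end == file size
    in.seekg(0, ios_base::beg);          // rewind before reading
    in.read(&buffer[0], buffer.size());
    return buffer;
}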

int main(int argc, char* argv[]) {
    if (argc != 2) {
        cerr << "Usage: this_executable file_to_read\n";
        return EXIT_FAILURE;
    }
    string buffer;
    cout << getfile(argv[1], buffer).size() << "\n";
}

(With or without ios_base::binary, depending on whether you want newlines translated or not. Note that in text mode, on platforms that translate newlines, tellg() may report more bytes than read() delivers, in which case read() fails and, with the exception mask above, throws. You could also change getfile to just return a string so that you don't have to pass a buffer string in. Then, test to see whether the compiler optimizes the copy out when returning.)

However, this might look a little better (and be a lot slower):


#include <iostream>
#include <string>
#include <fstream>
#include <iterator>   // istreambuf_iterator
#include <cstdlib>
using namespace std;

string getfile(const string& filename) {
    ifstream in(filename.c_str(), ios_base::binary);
    in.exceptions(ios_base::badbit | ios_base::failbit | ios_base::eofbit);
    return string(istreambuf_iterator<char>(in), istreambuf_iterator<char>());
}

int main(int argc, char* argv[]) {
    if (argc != 2) {
        cerr << "Usage: this_executable file_to_read\n";
        return EXIT_FAILURE;
    }
    cout << getfile(argv[1]).size() << "\n";
}
