How can I switch between fstream files without closing them (Simultaneous output files) - C++

I have a little C++ issue that I couldn't solve by browsing online. Here is my code (extracted):

if(File.is_open()) {
    while(!File.eof())  {
        i++;
        getline(File,Line);
        if(i>=2) {                //Skip Headers
            int CharCount=0;
            for(int CharPosition=0; CharPosition<Line.size(); CharPosition++)                       {
                if(Line[CharPosition]==',') {
                    Length=CharPosition;
                    break;
                }
            }
            NameText=Line.substr(0,Length);
            Path= Path_Folder + "\\" + NameText + ".csv";
            if(!CheckExistance(Path.c_str())) {
                fstream Text_File;
            }
            Text_File.open(Path, fstream::in | fstream::out | fstream::app);
            Text_File<<Line<<"\n";
            Text_File.close();
        }
    }
}

This code is working fine, but I would like to stop it from closing Text_File on every pass through the while loop.

Basically, this program splits a big input file into a lot of smaller files. As the smaller files get bigger and bigger, the execution gets slower and slower (normal). My goal is to leave all the smaller files (Text_File) open during this while loop and just switch the fstream pointer (pointer?) from one to another.

I tried to change as:

...

NameText=Line.substr(0,Length);
Path= Path_Folder + "\\" + NameText + ".csv";

if(!CheckExistance(Path.c_str())) {
    fstream Text_File;
}

if(!Text_File.open()) {
    Text_File.open(Path, fstream::in |fstream::out | fstream::app);
}

Text_File<<Line<<"\n";
//Text_File.close();

...

But it keeps working on the same Text_File no matter what NameText is. So I am guessing that the pointer of the fstream Text_File doesn't change. What do I need to do then? Reset the pointer? How?

Thank you, all!

Not sure it is relevant, but I am working with Microsoft Visual C++ 2010 Express. In addition, I am not a programmer by education or by profession, so I'd appreciate an explanation without too many advanced terms.

It looks like you would like to juggle the filebufs on an ostream object.

Now, the only obstacle is that ostream or basic_filebuf<char> aren't copyable types, so you can't put them into a map (by filename) directly. This is easily worked around by creating a little Holder type:

struct Holder {
    Holder(std::string const& path) 
        : buf(std::make_shared<std::filebuf>())
    { 
        buf->open(path.c_str(), std::ios::out | std::ios::app);
    }
    std::shared_ptr<std::filebuf> buf;
};

std::map<std::string, Holder> buffers;

Now the complete program (tested) would look like this:

#include <fstream>
#include <sstream>
#include <iostream>
#include <map>
#include <memory>
#include <string>

const std::string Path_Folder = ".";

int main()
{
    std::istream& File     = std::cin; // just for example
    std::filebuf  dummy;
    std::ostream  TextFile(&dummy);

    struct Holder {
        Holder(std::string const& path) 
            : buf(std::make_shared<std::filebuf>())
        { 
            buf->open(path.c_str(), std::ios::out | std::ios::app);
        }
        std::shared_ptr<std::filebuf> buf;
    };

    std::map<std::string, Holder> buffers;
    int i = 0;

    std::string   Line;
    while(getline(File, Line))
    {
        if (i++<2)
            continue; //Skip Headers

        auto NameText = Line.substr(0, Line.find(','));
        auto Path = Path_Folder + '/' + NameText + ".csv";

        // open, only if not already opened
        auto found = buffers.find(NameText);
        if (end(buffers) == found)
            found = buffers.insert({ NameText, Path }).first;

        TextFile.rdbuf(found->second.buf.get());

        TextFile << Line << std::endl; // notice implicit std::flush in std::endl
    }

    // all files are automatically closed here
}

Three more notes:

  • files get automatically closed when the buffers map goes out of scope.
  • you might need to add explicit flushes when switching rdbuf() like this, if you don't end your lines with an implicit std::flush (like with std::endl).
  • dummy only exists to have an ostream object that we can switch the buffer of

I tested this with the following input:

Header Row #1
Header Row #2
Jack,1,some data
Jill,2,some more data
Jack,3,not reopening :)
Jill,4,jill still receiving output
Romeo,5,someone else reporting

Now, I got the following output (see it live at Coliru):

/tmp$ rm *.csv
/tmp$ make && ./test < input.txt && tail *.csv

g++ -std=c++11 -Wall -g test.cpp -o test
==> Jack.csv <==
Jack,1,some data
Jack,3,not reopening :)

==> Jill.csv <==
Jill,2,some more data
Jill,4,jill still receiving output

==> Romeo.csv <==
Romeo,5,someone else reporting

Note: it looks like your Text_File is out of scope. I guess you declared it somewhere else in the code. So, this line is useless:

if(!CheckExistance(Path.c_str())){fstream Text_File;}

To access multiple file streams you can use this simple class which utilizes the std::map data structure:

#include <iostream>
#include <map>
#include <string>
#include <fstream>

class StreamWriter
{
    typedef std::map<std::string, std::fstream> StreamMap;
    static StreamMap Files;

public:
    static std::fstream& GetFile(const std::string& filename)
    {
        std::fstream& stream = Files[filename];
        if (!stream.is_open())
        {
            stream.open(filename, std::fstream::in
                   | std::fstream::out | std::fstream::app);
        }
        return stream;
    }
};

StreamWriter::StreamMap StreamWriter::Files = StreamWriter::StreamMap();

Then, access to files is as simple as:

StreamWriter::GetFile("C:/sample1.txt") << "test";

That's it.

What I would do is use std::map or std::unordered_map to map names to fstream objects.

map<string, fstream> files;

...

while(getline(File,Line)) // don't use while(!File.eof())
{
    ...

    if( files.count(NameText) == 0 ) // checks for the existence of the fstream object
    {            
        // in | out alone fails if the file does not exist yet; append mode creates it
        files[NameText].open(Path, fstream::out | fstream::app);
    }

    files[NameText] << Line << "\n";
}

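I changed the condition of the while loop because eof() only becomes true after a read has already failed, so a loop controlled by !File.eof() can run its body one extra time after the last line.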
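To make the difference concrete, here is a minimal sketch that counts loop iterations on an in-memory stream instead of a real file. The helper names count_with_eof and count_with_getline are made up for this illustration and do not appear in the question.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Counts iterations of an eof()-controlled loop like the one in the question.
// eof() only turns true after a read has hit the end of the input, so when the
// text ends in '\n' the body runs one more time without reading any new data.
std::size_t count_with_eof(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::size_t n = 0;
    while (!in.eof()) {
        std::getline(in, line);
        ++n;
    }
    return n;
}

// Counts iterations when the read itself is the loop condition:
// the body only runs after a line was actually extracted.
std::size_t count_with_getline(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::size_t n = 0;
    while (std::getline(in, line))
        ++n;
    return n;
}
```

For the two-line input "a\nb\n", count_with_eof reports 3 iterations while count_with_getline reports 2.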


Your OS may have trouble having that many open files at once. Perhaps you could try something like this.

Alongside your map, keep a list of the names of the files that are open. Each time you need to write to a file, first search for it in your list, remove it and add it to the front of the list. If it's not there, just add it to the front of the list. Check to make sure the file is open. If it's not, then try to open it. If opening it fails, then one by one, remove items from the back of the list, close the corresponding file to that item, and try to open the current file again. Repeat until opening the file succeeds.

Doing this will ensure that the most frequently written to files stay at the front of the list and remain open. The less frequently written files will move to the back, and eventually be closed. The search for the file in the list is not optimal (O(n)), but since we're dealing with writing to files here, which is a much more expensive operation, you shouldn't notice any kind of perf hit.
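The bookkeeping described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the class name LruFiles and the max_open cap are invented here (the cap makes the eviction visible without actually exhausting the OS limit, and should be at least 1), and the map only holds files that are currently open. Files are opened in append mode so that a file reopened after eviction keeps its earlier lines.

```cpp
#include <cstddef>
#include <fstream>
#include <list>
#include <map>
#include <string>

// Hypothetical helper: keeps a bounded set of output files open and closes
// the least recently written one when room is needed.
class LruFiles
{
public:
    explicit LruFiles(std::size_t max_open) : max_open_(max_open) {}

    // Appends `line` to `path`. Returns false if the file cannot be opened
    // even after every other file has been closed, or if the write fails.
    bool write(const std::string& path, const std::string& line)
    {
        if (files_.find(path) == files_.end()) {
            // Stay under the cap by closing files from the back of the list.
            while (files_.size() >= max_open_ && close_oldest()) {}

            std::ofstream& fresh = files_[path];
            fresh.open(path.c_str(), std::ios::out | std::ios::app);
            // If the OS still refuses, free more handles and try again.
            while (!fresh.is_open() && close_oldest()) {
                fresh.clear();
                fresh.open(path.c_str(), std::ios::out | std::ios::app);
            }
            if (!fresh.is_open()) {
                files_.erase(path);
                return false;
            }
        }
        // Move this name to the front of the list (O(n) search, as noted above).
        recent_.remove(path);
        recent_.push_front(path);

        std::ofstream& out = files_[path];
        out << line << '\n';
        return out.good();
    }

    std::size_t open_count() const { return files_.size(); }

private:
    // Closes the least recently used file; returns false if none is left.
    bool close_oldest()
    {
        if (recent_.empty())
            return false;
        files_.erase(recent_.back()); // destroying the ofstream flushes and closes it
        recent_.pop_back();
        return true;
    }

    std::size_t max_open_;
    std::list<std::string> recent_;               // front = most recently written
    std::map<std::string, std::ofstream> files_;  // only files that are open
};
```

In the splitting loop this would be used as something like `LruFiles files(100); files.write(Path, Line);`, where 100 is an arbitrary value chosen for the example.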

You are trying to reuse the Text_File fstream. To do this, you have to do a close() to flush the stream, after you are done writing to a csv file. Please see this question: C++ can I reuse fstream to open and write multiple files?
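As a rough sketch of that reuse pattern (switch_file is a name invented for this example, not something from the question or the linked one):

```cpp
#include <fstream>
#include <string>

// Points an already declared fstream at another file.
// close() flushes pending output and releases the previous file;
// clear() resets error flags that may be left over from it.
bool switch_file(std::fstream& file, const std::string& path)
{
    if (file.is_open())
        file.close();
    file.clear();
    file.open(path.c_str(), std::fstream::out | std::fstream::app);
    return file.is_open();
}
```

A call such as `if (switch_file(Text_File, Path)) Text_File << Line << "\n";` would then replace the open/close pair in the loop. Note that this keeps a single stream object but still opens and closes a file on every switch, which is the cost the question is trying to avoid.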

Also: Here's my Google search for this question: http://goo.gl/Oy5KKM

Note that Text_File is a variable and like all variables you can have more than one with the same type. If you need to manage several different files, you can even use std::fstream in any of the standard containers such as std::vector or std::map. Also, you should consider breaking your code down into smaller, more manageable parts. For example, you can create a function which takes a std::fstream& as a parameter. This allows the rest of the program to control which std::fstream& is used at any given time. I strongly suggest that you look at different design options to help organize your code.

The existence check statement has no effect - as mentioned already. Perhaps your intention was to do something like this:

if(!CheckExistance(Path.c_str())) {
    fstream Text_File;

    Text_File.open(Path, fstream::in | fstream::out | fstream::app);
    Text_File<<Line<<"\n";
    Text_File.close();
}

The fstream declared inside the if statement hides the one you must have in the outer scope. Also, close() is optional: the stream is closed when it goes out of scope.
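The hiding effect can be seen in a small sketch that uses string streams as stand-ins for the files, so nothing touches the disk. The function name shadow_demo is made up for this illustration.

```cpp
#include <sstream>
#include <string>

// Shows that a stream declared inside a block is a separate object
// that hides the outer one until the block ends.
std::string shadow_demo()
{
    std::ostringstream Text_File;        // outer stream
    Text_File << "outer;";
    if (true) {
        std::ostringstream Text_File;    // new object, hides the outer one
        Text_File << "inner;";           // goes to the inner stream only
    }                                    // inner stream is destroyed here
    Text_File << "outer again;";         // the outer stream is visible again
    return Text_File.str();
}
```

The returned string is "outer;outer again;": nothing written inside the block reaches the outer stream.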
