
Custom Stringstream - Convert std::wstring & std::string

I have a class template derived from std::basic_stringstream<typename TString::value_type...>, as shown in the code below. The problem occurs when I try to convert between string types. It is probably something obvious, but I cannot figure out the solution.

As an example, main constructs a simple std::wstring initialized with L"123".
After the std::wstring has been constructed, the operator<< of the custom basic_stringstream class is called (which overload is chosen depends on whether the argument is a std::wstring or a std::string).

Inspecting the WCStringStream object in the debugger shows that, instead of the string L"123", it contains the address of the first element of the input string. The functions to_bytes and from_bytes do return the correctly converted string, so the only remaining problem is the operator<< call made in both operator functions:

*this << std::wstring_convert<...>().xx_bytes(s);

Example:
Template class is std::wstring .
Input is a std::string .
&operator<<(const std::string &s) is being called.
String is converted.
&operator<<(const std::wstring &s) is being called.
String-type matches with template type.
Operator of base-class ( basic_stringstream ) is called. (Or std::operator... )

Result:
Inspecting: {_Stringbuffer={_Seekhigh=0x007f6808 L"003BF76C췍췍췍췍췍췍췍췍췍...}...}
WCStringStream<std::wstring>::str() -> "003BF76C"

Expected result:
"123"

What's going wrong here?


#define WIN32_LEAN_AND_MEAN
#define NOMINMAX
#include <Windows.h>
#include <iostream>
#include <sstream>
#include <codecvt>

template<class TString>
class WCStringStream : public std::basic_stringstream<typename TString::value_type,
    std::char_traits<typename TString::value_type>,
    std::allocator<typename TString::value_type> >
{
    typedef typename TString::value_type CharTraits;
    typedef std::basic_stringstream<CharTraits, std::char_traits<CharTraits>, std::allocator<CharTraits> > MyStream;
    //more typedefs...

public:
    //Constructor...
    inline WCStringStream(void) { }
    inline WCStringStream(const TString &s) : MyStream(s) { }
    //and more...
    //operator>> overloads...
    //defines for VS2010/2015 (C++11) included

    inline WCStringStream &operator<<(const std::wstring &s)
    {
        if (typeid(TString) == typeid(s))
            MyStream::operator<<(s.c_str());
        else
            *this << std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t>().to_bytes(s);
        return *this;
    }

    inline WCStringStream &operator<<(const std::string &s)
    {
        if (typeid(TString) == typeid(s))
            MyStream::operator<<(s.c_str());
        else
            *this << std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t>().from_bytes(s);
        return *this;
    }
};

//Example main
int main(int argc, char *argv[])
{
    typedef std::wstring fstring;

    WCStringStream<std::wstring> ws;
    WCStringStream<std::string> ss;

    ws << fstring(L"123");
    int a = 0;
    ws >> a;
    std::cout << a << std::endl;

    ss << fstring(L"123");
    int b = 0;
    ss >> b;
    std::cout << b << std::endl;

    return 0;
}

I'm currently compiling with VS2015, but I need it to work with VS2010 too.

First off: I think the approach of overloading formatting functions in a class derived from the stream is ill-advised, and I strongly recommend not doing it! I do realize that any alternative will require a bit more work.

In fact, I think your primary problem is that you do not reach your overloaded functions in the first place, which shows how fragile the approach is. The strings below describe which overload I think ends up being called, but I haven't verified that they are accurate, partly because the code provided in the question lacks necessary context:

WCStringStream<std::string> stream;
stream << "calls std::operator<< (std::ostream&, char const*)\n";
stream << L"calls std::ostream::operator<< (void const*)\n";
stream << std::string("calls std::operator<< (std::ostream&, T&&)\n");
std::string const s("calls your operator\n");
stream << s;

Since the overloaded output operators for strings and string literals can't be changed, and they do the wrong thing with respect to code conversions, I recommend an entirely different approach, although it still won't be without peril(*): convert the strings explicitly, using a more nicely packaged version of the code than the standard provides.

Assuming always using char as character type for all uses I would use a function wcvt() which is called for all strings and string-literals when inserting them into a stream. Since at the point the function is being called it wouldn't know the type of the stream it is going to be used with, it would return essentially a reference to the character sequence which is then converted appropriately for the character type used for the stream. That would be something along these lines:

template <typename cT>
class wconvert {
    cT const* begin_;
    cT const* end_;
public:
    wconvert(std::basic_string<cT> const& s)
        : begin_(s.data())
        , end_(s.data() + s.size()) {
    }
    wconvert(cT const* s)
    : begin_(s)
    , end_(s + std::char_traits<cT>::length(s)) {
    }
    cT const* begin() const { return this->begin_; }
    cT const* end() const { return this->end_; }
    std::streamsize size() const { return this->end_ - this->begin_; }
};

template <typename cT>
wconvert<cT> wcvt(cT const* s) {
    return wconvert<cT>(s);
}
template <typename cT>
wconvert<cT> wcvt(std::basic_string<cT> const& s) {
    return wconvert<cT>(s);
}

template <typename cT>
std::basic_ostream<cT>& operator<< (std::basic_ostream<cT>& out,
                                    wconvert<cT> const& cvt) {
    return out.write(cvt.begin(), cvt.size());
}

std::ostream& operator<< (std::ostream& out, wconvert<wchar_t> const& cvt) {
    auto tmp = std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t>().to_bytes(cvt.begin(), cvt.end());
    return out.write(tmp.data(), tmp.size());
}

std::wostream& operator<< (std::wostream& out, wconvert<char> const& cvt) {
    auto tmp = std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t>().from_bytes(cvt.begin(), cvt.end());
    return out.write(tmp.data(), tmp.size());
}
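For reference, here is a self-contained sketch of the same idea with the required headers and a small usage example. The helper names narrow_demo and wide_demo are hypothetical and exist only for illustration; the two non-template inserters are marked inline on the assumption that the code lives in a header. Note that std::wstring_convert and std::codecvt_utf8 are deprecated since C++17, so newer compilers may emit deprecation warnings.

```cpp
#include <codecvt>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Non-owning view of a character sequence, as in the answer above.
template <typename cT>
class wconvert {
    cT const* begin_;
    cT const* end_;
public:
    wconvert(std::basic_string<cT> const& s)
        : begin_(s.data()), end_(s.data() + s.size()) {}
    wconvert(cT const* s)
        : begin_(s), end_(s + std::char_traits<cT>::length(s)) {}
    cT const* begin() const { return begin_; }
    cT const* end() const { return end_; }
    std::streamsize size() const { return end_ - begin_; }
};

template <typename cT>
wconvert<cT> wcvt(cT const* s) { return wconvert<cT>(s); }
template <typename cT>
wconvert<cT> wcvt(std::basic_string<cT> const& s) { return wconvert<cT>(s); }

// Stream and string use the same character type: write through unchanged.
template <typename cT>
std::basic_ostream<cT>& operator<<(std::basic_ostream<cT>& out,
                                   wconvert<cT> const& cvt) {
    return out.write(cvt.begin(), cvt.size());
}

// Wide string into a narrow stream: encode as UTF-8 first.
inline std::ostream& operator<<(std::ostream& out, wconvert<wchar_t> const& cvt) {
    std::string tmp = std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t>()
                          .to_bytes(cvt.begin(), cvt.end());
    return out.write(tmp.data(), static_cast<std::streamsize>(tmp.size()));
}

// Narrow (UTF-8) string into a wide stream: decode first.
inline std::wostream& operator<<(std::wostream& out, wconvert<char> const& cvt) {
    std::wstring tmp = std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t>()
                           .from_bytes(cvt.begin(), cvt.end());
    return out.write(tmp.data(), static_cast<std::streamsize>(tmp.size()));
}

// Hypothetical helper: mixes wide and narrow input on a narrow stream.
inline std::string narrow_demo() {
    std::ostringstream out;
    out << wcvt(L"123") << wcvt("456") << wcvt(std::wstring(L"789"));
    return out.str();
}

// Hypothetical helper: mixes narrow and wide input on a wide stream.
inline std::wstring wide_demo() {
    std::wostringstream out;
    out << wcvt("123") << wcvt(L"456") << wcvt(std::string("789"));
    return out.str();
}
```

Each wcvt() call only records pointers into its argument, so a temporary string passed to it must stay alive until the end of the full insertion expression, which is the case in the helpers above.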

Of course, this approach requires using wcvt(s) whenever s may be a string that needs to be converted. It is easy to forget to do so, and it seems the original objective was to not have to remember such a conversion. However, I don't see any alternative which is less fragile within the system of existing streams. Abandoning streams entirely and using a separate system of formatted I/O may yield a less fragile approach.

(*) The approach easiest to get right is to stick with just one character type in a program and always use it. I do believe it was an error to introduce a second character type, wchar_t, and it was an even bigger error to further complicate the existing mess by also introducing char16_t and char32_t. We'd be much better off if there were just one character type, char, although it wouldn't actually represent characters but the bytes of an encoding.

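The problem was the explicit call to the base-class member operator. The inserters for character pointers are non-member functions, so a qualified member call can only select a member overload, here the one taking const void *_Val, which prints the address: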

MyStream::operator<<(s.c_str());

The solution to the problem:

if (typeid(TString) == typeid(s))
{
    MyStream &os = *this;
    os << s.c_str();
}

Of course, calling *this << s.c_str() results in recursion, but going through a reference to the base class calls the non-member overloaded operator for the correct character type (wchar_t or char).
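The difference between the two calls can be reproduced on a plain std::wostringstream, without the derived class. The following is a minimal sketch; the function names via_member and via_non_member are hypothetical and only serve the illustration.

```cpp
#include <sstream>
#include <string>

// Explicit member call: only member overloads of operator<< are considered,
// so the pointer converts to const void* and its address is formatted.
inline std::wstring via_member(const std::wstring& s) {
    std::wostringstream os;
    os.operator<<(s.c_str());
    return os.str();
}

// Ordinary operator syntax on the base stream type: the non-member inserter
// for const wchar_t* is also considered and the characters are written.
inline std::wstring via_non_member(const std::wstring& s) {
    std::wostringstream os;
    os << s.c_str();
    return os.str();
}
```

Calling via_non_member(L"123") yields the text itself, while via_member(L"123") yields an implementation-defined textual representation of the pointer, matching the "003BF76C" seen in the question.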

Another working solution is to use the member function write instead of the operator.
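A minimal sketch of the write-based variant, under the assumption that only the matching character type needs to be handled (the conversion branch from the question is left out). The class name WriteStream and the helper read_back_int are hypothetical, not part of the original code.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Simplified stand-in for WCStringStream: inserts a string of the stream's
// own character type via the unformatted write() member function, which has
// no pointer-to-void overload to fall into.
template <class CharT>
class WriteStream : public std::basic_stringstream<CharT> {
public:
    WriteStream& operator<<(const std::basic_string<CharT>& s) {
        this->write(s.data(), static_cast<std::streamsize>(s.size()));
        return *this;
    }
};

// Hypothetical helper: insert a wide string, then extract it as an int.
inline int read_back_int(const std::wstring& text) {
    WriteStream<wchar_t> ws;
    ws << text;
    int value = 0;
    ws >> value;
    return value;
}
```

Unlike operator<<, write() ignores formatting state such as the field width, which is usually acceptable when the goal is simply to append the characters.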
