
Scary warnings in ancient code converting wstring to string

An old method contains code like the following (anonymised):

        std::wstring wstr = ...;
        std::string str(wstr.begin(), wstr.end());

Previously this all compiled without warnings but as we update to C++17 and VS2019 (v142) and tidy project settings, it now gives these big scary warnings:

C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.28.29333\include\xstring(2468,23): warning C4244: 'argument': conversion from 'wchar_t' to 'const _Elem', possible loss of data
        with
        [
            _Elem=char
        ]
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.28.29333\include\xstring(2479): message : see reference to function template instantiation 'void std::basic_string<char,std::char_traits<char>,std::allocator<char>>::_Construct<wchar_t*>(_Iter,const _Iter,std::input_iterator_tag)' being compiled
        with
        [
            _Iter=wchar_t *
        ]
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.28.29333\include\xstring(2479): message : see reference to function template instantiation 'void std::basic_string<char,std::char_traits<char>,std::allocator<char>>::_Construct<wchar_t*>(_Iter,const _Iter,std::input_iterator_tag)' being compiled
        with
        [
            _Iter=wchar_t *
        ]
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.28.29333\include\xstring(2459): message : see reference to function template instantiation 'void std::basic_string<char,std::char_traits<char>,std::allocator<char>>::_Construct<wchar_t*>(const _Iter,const _Iter,std::forward_iterator_tag)' being compiled
        with
        [
            _Iter=wchar_t *
        ]
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.28.29333\include\xstring(2459): message : see reference to function template instantiation 'void std::basic_string<char,std::char_traits<char>,std::allocator<char>>::_Construct<wchar_t*>(const _Iter,const _Iter,std::forward_iterator_tag)' being compiled
        with
        [
            _Iter=wchar_t *
        ]

message : see reference to function template instantiation 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>::basic_string<std::_String_iterator<std::_String_val<std::_Simple_types<_Elem>>>,0>(_Iter,_Iter,const _Alloc &)' being compiled
        with
        [
            _Elem=wchar_t,
            _Iter=std::_String_iterator<std::_String_val<std::_Simple_types<wchar_t>>>,
            _Alloc=std::allocator<char>
        ]
message : see reference to function template instantiation 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>::basic_string<std::_String_iterator<std::_String_val<std::_Simple_types<_Elem>>>,0>(_Iter,_Iter,const _Alloc &)' being compiled
        with
        [
            _Elem=wchar_t,
            _Iter=std::_String_iterator<std::_String_val<std::_Simple_types<wchar_t>>>,
            _Alloc=std::allocator<char>
        ]

I am pretty sure this code pre-dates use of UNICODE in our codebase - it seems to work but I don't really understand the warnings or what I should do about it.

I found this question: UTF8 to/from wide char conversion in STL, but the nice neat solution has comments saying it's deprecated in C++17! It's something of a mystery why this code mixes string and wstring in the first place. Is there an easy solution, or is this a case of "just leave it if it works"?

The issue is that you are converting a string of 16-bit characters (wchar_t is 16 bits with MSVC) into a string of 8-bit characters. A char can hold only the low 8 bits of a wchar_t, so any character that needs more than 8 bits loses data. If you are converting between UTF-16 and UTF-8, you need to do it properly with a conversion library.

C++ does provide a conversion library in the form of codecvt (deprecated in C++17, but still available for now).

If you are sure the string only contains ASCII, you can suppress the warning.
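If you would rather verify the ASCII assumption than rely on it, one option is a small checked helper. The following is only a sketch: narrow_ascii is a hypothetical name, not something from the original code or the standard library, and it assumes C++17 for std::optional. Because the cast is explicit, it should not trigger C4244.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not part of the original code base): narrows a wide
// string only if every character is 7-bit ASCII, otherwise returns nullopt.
std::optional<std::string> narrow_ascii(const std::wstring& wide) {
    std::string narrow;
    narrow.reserve(wide.size());
    for (wchar_t wc : wide) {
        // Compare as unsigned so the check works whether wchar_t is signed
        // or unsigned on the target platform.
        if (static_cast<unsigned long>(wc) > 0x7F) {
            return std::nullopt;  // narrowing would lose data, so refuse
        }
        narrow.push_back(static_cast<char>(wc));  // value-preserving here
    }
    return narrow;
}
```

Calling narrow_ascii(wstr) in place of the iterator-range constructor makes a non-ASCII input visible as an empty optional instead of silently producing a mangled string.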

See https://en.cppreference.com/w/cpp/locale/codecvt_utf8_utf16 for details on codecvt.
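For the UTF-16 to UTF-8 case, a minimal sketch using the codecvt facilities mentioned above might look like this. The function name wide_to_utf8 is made up for illustration. Both std::wstring_convert and std::codecvt_utf8_utf16 are deprecated since C++17, so expect a deprecation warning; with MSVC that warning can reportedly be silenced by defining _SILENCE_CXX17_CODECVT_HEADER_DEPRECATION_WARNING.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper name. Treats the wide string as UTF-16 and returns
// the same text encoded as UTF-8.
// Note: relies on facilities deprecated since C++17.
std::string wide_to_utf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    // to_bytes throws std::range_error if the input cannot be converted.
    return converter.to_bytes(wide);
}
```

Unlike the iterator-range constructor, this keeps non-ASCII characters intact: a single wide character such as U+00E9 becomes the two UTF-8 bytes 0xC3 0xA9 rather than being cut down to one byte.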

The warning is quite clear on its own.

warning C4244: 'argument': conversion from 'wchar_t' to 'const _Elem', possible loss of data

It means that the line std::string str(wstr.begin(), wstr.end()) implicitly converts each wchar_t to a narrower type, const _Elem, which here is char. Any narrowing conversion can lose data, hence the warning.

Consider the following example:

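#include <iostream>
#include <string>

int main() {
    std::wstring ws{};
    // Bytes 'A','B','C','D' in ASCII. Where wchar_t is 16 bits (e.g. MSVC),
    // this cast already truncates the value to 0x4344.
    auto c = static_cast<wchar_t>(0x41'42'43'44);

    for (int i = 0; i < 3; ++i)
        ws.push_back(c);

    // Each wchar_t is narrowed to char, keeping only the low byte 0x44 ('D').
    std::string str{ws.begin(), ws.end()};
    std::cout << str << std::endl;
}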

The code above runs and prints DDD.

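The size of wchar_t depends on the platform, not on whether the machine is 64-bit. Where wchar_t is 4 bytes (for example, GCC on Linux), the constructor reads one 4-byte wchar_t at a time. Because std::string can only store char elements, the constructor must narrow each wchar_t to char, which discards the upper 3 bytes (A, B and C) of every element. With MSVC, wchar_t is 2 bytes, so the value is already truncated to 0x4344 ('C', 'D') when c is initialized, and the narrowing then drops the 'C'.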
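The same effect explains why code like this often "seems to work" with real text. The sketch below does the narrowing by hand with an explicit cast. The name narrow_truncating is hypothetical, and the sketch assumes the usual behaviour of keeping only the low-order 8 bits (guaranteed since C++20, implementation-defined before that when char is signed).

```cpp
#include <string>

// Hypothetical helper: does explicitly what the iterator-range constructor
// does implicitly, converting each wchar_t to char one element at a time.
std::string narrow_truncating(const std::wstring& wide) {
    std::string narrow;
    narrow.reserve(wide.size());
    for (wchar_t wc : wide) {
        // Only the low byte survives: U+0041 -> 0x41, U+00EB -> 0xEB,
        // U+20AC -> 0xAC.
        narrow.push_back(static_cast<char>(wc));
    }
    return narrow;
}
```

ASCII characters pass through unchanged, and characters up to U+00FF happen to land on their Latin-1 byte value, so the result can look plausible. Anything above U+00FF, such as the euro sign U+20AC, is silently reduced to an unrelated byte.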
