
Is it always safe to assume that values of ..stream::int_type are >= 0, except for eof()?

I want to parse a file and use a std::stringstream to parse its contents. I use get() to read it character by character, which yields a std::stringstream::int_type . In certain cases I want to use a lookup table to convert ASCII characters into other values (for example, to determine whether a certain character is allowed in an identifier or not).

Can I assume that the values I get from get() are non-negative unless they equal std::stringstream::traits_type::eof() , and hence use them as indices into the lookup tables?

I couldn't find anything in the standard regarding that, which might be due to a lack of understanding on my part of how this whole bytes-to-characters thing works in C++.

First, let's look at the more general case of basic_stringstream.

You can't assume that eof() is negative. I see that constraint nowhere, and the C standard states: "The value of the macro WEOF may differ from that of EOF and need not be negative."

In general, int_type comes from the traits template parameter, and the description of int_type for character traits doesn't mandate that to_int_type returns something non-negative.

Now, stringstream is basic_stringstream<char> and thus uses char_traits<char> . Its eof() is negative, but I haven't found a mandate that to_int_type has to return non-negative values (it isn't in 21.2.3.1, and I see no way to deduce it from other constraints). I wonder if I'm missing something, though, as my expectation was that to_int_type(c) had to be equivalent to (int)(unsigned char)c . That is the case for the GNU standard C++ library, and I somewhat expected the same behavior as in C, where functions taking or returning characters as int use non-negative values for characters.
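Since the non-negativity of to_int_type isn't clearly guaranteed, one way to sidestep the question is to never index with the int_type value directly. The following is a minimal sketch under that assumption: it compares against eof() first, converts back with to_char_type, and only then casts to unsigned char to get a table index. The names make_identifier_table and read_identifier, and the set of allowed characters, are hypothetical and only serve the illustration.

```cpp
#include <array>
#include <climits>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical lookup table: true for characters allowed in an identifier.
std::array<bool, UCHAR_MAX + 1> make_identifier_table() {
    std::array<bool, UCHAR_MAX + 1> table{};  // value-initialized: all false
    const std::string allowed =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789_";
    for (char c : allowed) {
        table[static_cast<unsigned char>(c)] = true;
    }
    return table;
}

// Hypothetical helper: consumes the longest run of identifier characters.
std::string read_identifier(std::istream& in) {
    using traits = std::istream::traits_type;
    static const auto table = make_identifier_table();
    std::string result;
    for (;;) {
        const traits::int_type i = in.peek();
        // Test for end of input before interpreting the value at all.
        if (traits::eq_int_type(i, traits::eof())) break;
        // Go back to char, then to unsigned char, so the index is always
        // in [0, UCHAR_MAX] regardless of what to_int_type returned.
        const char c = traits::to_char_type(i);
        const auto index = static_cast<unsigned char>(c);
        if (!table[index]) break;
        result.push_back(c);
        in.get();  // consume the character we just accepted
    }
    return result;
}
```

The design choice here is that the index range depends only on unsigned char, not on any property of int_type, so the code stays valid even on an implementation where to_int_type could yield a negative value.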

For information, the other standard specializations of char_traits :

  • char_traits<char16_t> and char_traits<char32_t> have an unsigned int_type , so even eof() is non-negative;

  • char_traits<wchar_t>::to_int_type isn't mandated to return a non-negative value for non- eof() input either (but in contrast with char_traits<char> , I didn't expect such a mandate to be there).
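The points about the sign of int_type and eof() can be checked at compile time. This is only a sketch: the helper eof_is_non_negative is a hypothetical name, and the wchar_t case is deliberately left out because its result is implementation-dependent.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: true if char_traits<CharT>::eof() is not below zero.
template <class CharT>
constexpr bool eof_is_non_negative() {
    return !(std::char_traits<CharT>::eof() <
             typename std::char_traits<CharT>::int_type(0));
}

// char_traits<char> uses int, and its eof() is EOF, which is negative.
static_assert(std::is_same<std::char_traits<char>::int_type, int>::value,
              "char_traits<char>::int_type is int");
static_assert(!eof_is_non_negative<char>(),
              "eof() is negative for char");

// char16_t and char32_t use unsigned int_types, so eof() cannot be negative.
static_assert(std::is_unsigned<std::char_traits<char16_t>::int_type>::value,
              "int_type is unsigned for char16_t");
static_assert(std::is_unsigned<std::char_traits<char32_t>::int_type>::value,
              "int_type is unsigned for char32_t");
static_assert(eof_is_non_negative<char16_t>(),
              "eof() is non-negative for char16_t");
static_assert(eof_is_non_negative<char32_t>(),
              "eof() is non-negative for char32_t");
```

Note that none of these checks says anything about the sign of to_int_type(c) for ordinary characters, which is the part the answer could not find a guarantee for.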
