Given data in the format "int,int,...,int,string,int", is it possible to use stringstream (only) to properly decode the fields?
[Code]
int main(int c, char** v)
{
std::string line = "0,1,2,3,4,5,CT_O,6";
char delimiter[7];
int id, ag, lid, cid, fid, did, j = -12345;
char dcontact[4]; // <- The size of <string-field> is known and fixed
std::stringstream ssline(line);
ssline >> id >> delimiter[0]
>> ag >> delimiter[1]
>> lid >> delimiter[2]
>> cid >> delimiter[3]
>> fid >> delimiter[4]
>> did >> delimiter[5] // <- should I do something here?
>> dcontact >> delimiter[6]
>> j;
std::cout << id << ":" << ag << ":" << lid << ":" << cid << ":" << fid << ":" << did << ":";
std::cout << dcontact << "\n";
}
[Output] 0:1:2:3:4:5: :-45689
The trailing part of the output (shown in bold in the original post) shows that the stringstream failed to read only 4 chars into dcontact. dcontact actually holds more than 4 chars, leaving j with garbage data.
Yes. There is no specific overload of operator>>(istream&, char[N]) for each N (prior to C++20), but there is one for char*, so the compiler sees that as the best match. The overload for char* reads up to the next whitespace character, so it doesn't stop at the comma.
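The whitespace rule is easy to see in isolation. The following is only a small sketch; first_token and check_first_token are hypothetical names, not part of the question's code, and std::string is used instead of a char array so that no buffer can overflow.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: extracts a single token with operator>>.
std::string first_token(const std::string& text)
{
    std::istringstream in(text);
    std::string token;
    in >> token;  // skips leading whitespace, then reads until the next whitespace
    return token;
}

// The comma is not a separator for >>, so it ends up inside the token.
void check_first_token()
{
    assert(first_token("CT_O,6 tail") == "CT_O,6");
}
```

With the question's input there is no whitespace at all after the comma, so everything up to the end of the line would go into the string field.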
You could wrap your dcontact in a struct and provide a specific overload to read into that struct. Otherwise you could use read, although it breaks your lovely chain of >> operators.
ssline.read( dcontact, 4 );
will work at that point.
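As a rough sketch of the read approach on a shortened record ("int,XXXX,int"), something like the following could be used. The function name parse_fixed_field and its parameters are assumptions made for illustration; the point is only that read takes exactly four characters and does not append a terminator.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: parses "int,XXXX,int" where the middle field is
// exactly 4 characters long. Returns true on success.
bool parse_fixed_field(const std::string& line, int& before,
                       std::string& field, int& after)
{
    std::istringstream in(line);
    char comma = 0;
    char buffer[4];  // no room for '\0', and read() does not add one

    if (!(in >> before >> comma) || comma != ',') return false;
    if (!in.read(buffer, sizeof buffer)) return false;  // unformatted: exactly 4 chars
    field.assign(buffer, sizeof buffer);  // copy by length, buffer is not null-terminated
    if (!(in >> comma >> after) || comma != ',') return false;
    return true;
}

void check_parse_fixed_field()
{
    int before = 0, after = 0;
    std::string field;
    assert(parse_fixed_field("5,CT_O,6", before, field, after));
    assert(before == 5 && field == "CT_O" && after == 6);
}
```

Because the buffer is not null-terminated, it should not be streamed to std::cout directly; copying it by length into a std::string, as above, is one way around that.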
To read up to a delimiter, incidentally, you can use getline. (get will also work, but the getline free function that writes to a std::string means you don't have to guess the length.)
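A minimal sketch of the getline idea, assuming it is acceptable to collect every field as a std::string first and convert the numeric ones afterwards. The helper name split_fields is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line on a delimiter using the free function
// std::getline, which grows the std::string as needed.
std::vector<std::string> split_fields(const std::string& line, char delimiter = ',')
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, delimiter)) {
        fields.push_back(field);
    }
    return fields;
}

void check_split_fields()
{
    const std::vector<std::string> fields = split_fields("0,1,2,3,4,5,CT_O,6");
    assert(fields.size() == 8);
    assert(fields[6] == "CT_O");
    assert(std::stoi(fields[7]) == 6);
}
```

This trades the single >> chain for a two-step parse, but the string field can then be of any length.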
(Note that other people have suggested using get rather than read, but this will fail in your case, as you do not have an extra byte at the end of your dcontact array for a null terminator. If you want dcontact to be null-terminated, make it 5 characters and use get, and the null will be appended for you.)
Slightly more robust (handles the ',' delimiter correctly):
#include <iostream>
#include <sstream>
#include <string>

// Stream manipulator: consumes one character and sets failbit unless it is D.
template <char D>
std::istream& delim(std::istream& in)
{
    char c;
    if (in >> c && c != D) in.setstate(std::ios_base::failbit);
    return in;
}
int main()
{
    std::string line = "0,1,2,3,4,5,CT_O,6";
    int id, ag, lid, cid, fid, did, j = -12345;
    char dcontact[5]; // <- The size of <string-field> is known and fixed (4 chars + '\0')
    std::stringstream ssline(line);
    (ssline >> id >> delim<','>
            >> ag >> delim<','>
            >> lid >> delim<','>
            >> cid >> delim<','>
            >> fid >> delim<','>
            >> did >> delim<','> >> std::ws
    ).get(dcontact, 5, ',') >> delim<','>
            >> j;
    std::cout << id << ":" << ag << ":" << lid << ":"
              << cid << ":" << fid << ":" << did << ":"
              << dcontact << "\n";
}
Try this:
#include <iostream>
#include <sstream>
#include <string>

int main(int c, char** v) {
    std::string line = "0,1,2,3,4,5,CT_O,6";
    char delimiter[7];
    int id, ag, lid, cid, fid, did, j = -12345;
    char dcontact[5]; // <- The size of <string-field> is known and fixed (4 chars + '\0')
    std::stringstream ssline(line);
    ssline >> id >> delimiter[0]
           >> ag >> delimiter[1]
           >> lid >> delimiter[2]
           >> cid >> delimiter[3]
           >> fid >> delimiter[4]
           >> did >> delimiter[5];
    ssline.get(dcontact, 5); // reads at most 4 characters and appends '\0'
    ssline >> delimiter[6]
           >> j;
    std::cout << id << ":" << ag << ":" << lid << ":" << cid << ":" << fid << ":" << did << ":";
    std::cout << dcontact << "\n" << j;
}
The problem is that the >> operator for a string (std::string or a C-style string) actually implements the semantics for a word, with a particular definition of word. The decision is arbitrary (I would have made it a line), but since a string can represent many different things, they had to choose something.
The solution, in general, is not to use >> on a string, ever. Define the class you want (here, probably something like Symbol), and define an operator >> for it which respects its semantics. Your code will be a lot clearer for it, and you can add various invariant checks as appropriate. If you know that the field is always exactly four characters, you can do something simple like:
#include <cctype>
#include <istream>
#include <string>

class DContactSymbol
{
    char myName[ 4 ];
public:
    //  ...
    friend std::istream&
        operator>>( std::istream& source, DContactSymbol& dest );
    //  ...
};
std::istream&
operator>>( std::istream& source, DContactSymbol& dest )
{
    std::istream::sentry guard( source );   // skips leading white space if skipws is set
    if ( guard ) {
        typedef std::istream::traits_type Traits;
        std::string tmp;
        std::streambuf* sb = source.rdbuf();
        int ch = sb->sgetc();
        while ( source
                && ch != Traits::eof()
                && (std::isalnum( ch ) || ch == '_') ) {
            tmp += static_cast< char >( ch );
            if ( tmp.size() > sizeof( dest.myName ) ) {
                source.setstate( std::ios_base::failbit );
            }
            ch = sb->snextc();   // consume this character and look at the next one
        }
        if ( ch == Traits::eof() ) {
            source.setstate( std::ios_base::eofbit );
        }
        if ( tmp.size() != sizeof( dest.myName ) ) {
            source.setstate( std::ios_base::failbit );
        }
        if ( source ) {
            tmp.copy( dest.myName, sizeof( dest.myName ) );
        }
    }
    return source;
}
(Note that unlike some of the other suggestions, for example using std::istream::read, this one maintains all of the usual conventions, like skipping leading white space depending on the skipws flag.)
Of course, if you can't guarantee 100% that the symbol will always be 4 characters, you should use std::string for it, and modify the >> operator accordingly.
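One possible shape for that variable-length variant is sketched below. It is not the answerer's code: the Symbol struct, its name member and the rule that a symbol consists of alphanumeric characters and underscores are assumptions carried over from the fixed-size version.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variable-length counterpart of DContactSymbol.
struct Symbol
{
    std::string name;
};

std::istream& operator>>(std::istream& source, Symbol& dest)
{
    std::istream::sentry guard(source);  // honours the skipws flag
    if (guard) {
        std::string tmp;
        // peek() returns eof() at end of input, which also ends the loop.
        for (int ch = source.peek();
             ch != std::istream::traits_type::eof()
                 && (std::isalnum(ch) || ch == '_');
             ch = source.peek()) {
            tmp += static_cast<char>(source.get());
        }
        if (tmp.empty()) {
            source.setstate(std::ios_base::failbit);  // no symbol characters found
        } else {
            dest.name = tmp;
        }
    }
    return source;
}

void check_symbol()
{
    std::istringstream in("  CT_O,6");
    Symbol symbol;
    char comma = 0;
    int last = 0;
    in >> symbol >> comma >> last;
    assert(symbol.name == "CT_O" && comma == ',' && last == 6);
}
```

Since extraction stops at the first character that cannot belong to a symbol, the comma is left in the stream for the next >> in the chain.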
And BTW, you seem to want to read four characters into dcontact, although it's only large enough for three (since >> will insert a terminating '\0'). If you read any more than three into it, you have undefined behavior.
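Since the length of the string is known, you can limit the read with std::setw. For a char array, >> stores at most width - 1 characters followed by a terminating '\0', so with dcontact declared as char[5] this becomes:
ssline >> std::setw(5) >> dcontact >> delimiter[6];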
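A small sketch of the setw approach follows, assuming the field is stored in a five-element array so that four characters and the terminating '\0' both fit. The helper name read_field4 is hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <iomanip>
#include <istream>
#include <sstream>

// Hypothetical helper: reads at most 4 characters into a 5-element array.
std::istream& read_field4(std::istream& in, char (&field)[5])
{
    // With a width of 5, >> stores at most 5 - 1 characters, then '\0'.
    return in >> std::setw(5) >> field;
}

void check_read_field4()
{
    std::istringstream in("CT_O,6");
    char field[5];
    char comma = 0;
    int last = 0;
    read_field4(in, field) >> comma >> last;
    assert(std::strcmp(field, "CT_O") == 0 && comma == ',' && last == 6);
}
```

The width applies only to the next extraction and is reset afterwards, so the following >> operations are unaffected.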