How to use the boost lexical_cast library just for checking input

I use the boost lexical_cast library for parsing text data into numeric values quite often. In several situations however, I only need to check if values are numeric; I don't actually need or use the conversion.

So, I was thinking about writing a simple function to test if a string is a double:

template<typename T> 
bool is_double(const T& s)
{
  try 
  {
    boost::lexical_cast<double>(s); 
    return true;
  }
  catch (...) 
  {
    return false;
  }
}

My question is, are there any optimizing compilers that would drop out the lexical_cast here since I never actually use the value?

Is there a better technique to use the lexical_cast library to perform input checking?

You can now use boost::conversion::try_lexical_convert, which is defined in the header boost/lexical_cast/try_lexical_convert.hpp (include just that header if you only want try_lexical_convert). Like so:

double temp;
std::string convert{"abcdef"};
// Returns false instead of throwing when the input is not a valid double.
assert(boost::conversion::try_lexical_convert<double>(convert, temp) == false);

Since the cast might throw an exception, a compiler that simply dropped the cast would be seriously broken. You can assume that all major compilers will handle this correctly.

Doing the lexical_cast just to check the input might not be optimal from a performance point of view, but unless you check millions of values this way it won't be anything to worry about.

I think you want to re-write that function slightly:

template<typename T>  
bool tryConvert(std::string const& s) 
{ 
    try         { boost::lexical_cast<T>(s);} 
    catch (...) { return false; }

    return true; 
} 

You could try something like this.

#include <sstream>

//Try to convert arg to result in a similar way to boost::lexical_cast
//but return true/false rather than throwing an exception.
template<typename T1, typename T2>
bool convert( const T1 & arg, T2 & result )
{
    std::stringstream interpreter;
    return interpreter<<arg && 
           interpreter>>result && 
           interpreter.get() == std::stringstream::traits_type::eof();
}

template<typename T>
double to_double( const T & t )
{
   double retval=0;
   if( ! convert(t,retval) ) { /* Do something about failure */ }
   return retval;
}

template<typename T>
bool is_double( const T & t )
{
   double retval=0;
   return convert(t,retval);
}

The convert function does basically the same thing as boost::lexical_cast, except that lexical_cast is more careful to avoid allocating dynamic storage, for example by using fixed buffers.

It would be possible to refactor the boost::lexical_cast code into this form, but that code is pretty dense and tough going. IMHO it's a pity that lexical_cast wasn't implemented using something like this under the hood. Then it could look like this:

template<typename T1, typename T2>
T1 lexical_cast( const T2 & t )
{
  T1 retval;
  if( ! try_cast<T1,T2>(t,retval) ) throw bad_lexical_cast();
  return retval;
}

The compiler is pretty unlikely to manage to throw out the conversion no matter what. Exceptions are just the icing on the cake. If you want to optimize this, you'll have to write your own parser to recognize the format for a float. Use regexps or manually parse, since the pattern is simple:

if ( s.empty() ) return false;
string::const_iterator si = s.begin();
if ( *si == '+' || *si == '-' ) ++ si; // optional sign
string::const_iterator start = si; // first character of the mantissa
while ( si != s.end() && '0' <= *si && *si <= '9' ) ++ si; // integer digits
bool digits = si != start;
if ( si != s.end() && *si == '.' ) {
    start = ++ si;
    while ( si != s.end() && '0' <= *si && *si <= '9' ) ++ si; // fraction digits
    digits = digits || si != start;
}
if ( ! digits ) return false; // need a digit before or after the point
if ( si != s.end() && ( *si == 'e' || *si == 'E' ) ) {
    ++ si;
    if ( si != s.end() && ( *si == '-' || *si == '+' ) ) ++ si;
    if ( si == s.end() ) return false; // exponent needs at least one digit
    while ( si != s.end() && '0' <= *si && *si <= '9' ) ++ si;
}
return si == s.end();

Not tested… I'll let you run through all the possible format combinations ;v)

Edit: Also, note that this is totally incompatible with localization. You have absolutely no hope of checking locale-specific number formats without actually converting.

Edit 2: Oops, I thought someone else already suggested this. boost::lexical_cast is actually deceptively simple. To at least avoid throwing+catching the exception, you can reimplement it somewhat:

istringstream ss( s );
double d;
ss >> d >> ws; // ws discards whitespace
return ss && ss.eof(); // omit ws and eof if you know no trailing spaces

This code, on the other hand, has been tested ;v)
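
For reference, the fragment above could be wrapped into a complete function along these lines. This is only a sketch: the name is_double_stream is made up here, and it calls eof() on the result of >> std::ws instead of testing the overall stream state, on the assumption that some standard libraries set failbit when ws is applied to a stream that has already reached the end of its input.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (the name is not from the thread): true if s holds
// exactly one double, optionally surrounded by whitespace.
bool is_double_stream(const std::string& s)
{
    std::istringstream ss(s);
    double d;
    if (!(ss >> d)) return false;   // no number could be extracted
    // Skip trailing whitespace, then require that nothing else is left.
    return (ss >> std::ws).eof();
}
```

With this definition, is_double_stream("1.5") and is_double_stream("1.5  ") are true, while "1.5x", "abc" and the empty string are rejected.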

It is best to validate with a regular expression first, and then use lexical_cast to convert to the actual type.
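
A minimal sketch of that validate-then-convert idea, under a few assumptions: the pattern is one plausible definition of a decimal floating-point literal, not an official grammar; the helper names looks_like_double and parse_double are invented for illustration; and std::stod stands in for boost::lexical_cast so that the example does not depend on Boost.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: optional sign, digits with an optional fractional part
// (or a bare fraction such as ".5"), and an optional exponent.
// Locale-specific formats are deliberately not handled.
bool looks_like_double(const std::string& s)
{
    static const std::regex pattern(R"([+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?)");
    return std::regex_match(s, pattern);   // must match the whole string
}

// Validate first, convert second. out is left untouched on failure.
bool parse_double(const std::string& s, double& out)
{
    if (!looks_like_double(s)) return false;
    out = std::stod(s);   // can still throw std::out_of_range for huge exponents
    return true;
}
```

When only the check is needed, looks_like_double can be called on its own and no conversion happens at all.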

As the type T is a templated typename, I believe your answer is the right one, as it will be able to handle all cases already handled by boost::lexical_cast.

Still, don't forget to specialize the function for known types, like char * , wchar_t * , std::string , wstring , etc.

For example, you could add the following code:

template<>
bool is_double<int>(const int & s)
{
   return true ;
}

template<>
bool is_double<double>(const double & s)
{
   return true ;
}

template<>
bool is_double<std::string>(const std::string & s)
{
   char * p ;
   strtod(s.c_str(), &p) ; // include <cstdlib> for strtod
   // p == s.c_str() means nothing was converted (e.g. an empty string)
   return (p != s.c_str() && *p == 0) ;
}

This way, you can "optimize" the processing for the types you know, and delegate the remaining cases to boost::lexical_cast.
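
As a rough sketch of what the char * case mentioned above might look like, here is a strtod-based check written as plain overloads rather than template specializations. The name is_double_cstr is hypothetical, and the overload approach is a substitution made for this example, not something taken from the answer.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true if the whole C string parses as a double.
bool is_double_cstr(const char* s)
{
    if (s == nullptr || *s == '\0') return false;   // nothing to parse
    char* end = nullptr;
    std::strtod(s, &end);
    // strtod leaves end == s when no conversion happened; otherwise end must
    // sit on the terminating null for the whole input to count as numeric.
    return end != s && *end == '\0';
}

// std::string simply forwards to the C-string version.
bool is_double_cstr(const std::string& s)
{
    return is_double_cstr(s.c_str());
}
```

Note that strtod skips leading whitespace and, in C99 and later, also accepts forms such as inf, nan and hexadecimal floats, so this check is more permissive than a hand-written parser for plain decimal numbers.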
