
How can I separate numbers and letters in a C++ String?

I have several input strings containing numbers and letters. Sometimes a space is missing. I would like to insert a space wherever the string changes from digits to letters or from letters to digits.

Example inputs:

"30EinsteinStreet"
"548 Roe Drive5500 TestCity"
"44B SaarlouisDrive1234Testtown"

They should become:

"30 EinsteinStreet"
"548 Roe Drive 5500 TestCity"
"44 B SaarlouisDrive 1234 Testtown"

My existing function does not work and is, I think, far too complex. Can anyone provide a simple solution? Preferably using modern C++11 classes, but no Boost. Also, I'm using GCC, so the regex facilities don't work for me.

Thanks

Here is my existing method:

        inline string separateAlphanumChunks(const std::string& s) 
        {
            string ret = "";
            const int sl = s.length();
            //int inserts = 0;

            if (sl<=4)
                return s;

            for (int i=0 ; i< sl ; i++)
            {
//              cerr << "separateAlphanumChunks: '" << ret << "'" <<endl;
                // check if index would go out of range
                if (i+4 > sl)
                {
                    ret += s.substr (i,sl-i);
                    //TODO add the remain to ret
                    break;
                }

                // seperate chars
                const char c0 = s[i+0];
                const char c1 = s[i+1];

                // check if 0 and 1 are the same class
                const bool c0c = isCharAnInt (c0);
                const bool c1c = isCharAnInt (c1);
                bool class0 = false;
                if (c0c == c1c)
                {
                    class0 = c0c;
                }
                else
                {
                    ret += c0;
//                  cerr << "cont1: '" << c0 << "'" <<endl;
                    continue;
                }

                // seperate chars
                const char c2 = s[i+2];
                const char c3 = s[i+3];

                // check if 2 and 3 are the same class
                const bool c2c = isCharAnInt (c2);
                const bool c3c = isCharAnInt (c3);
                bool class2 = false;
                if (c2c == c3c)
                {
                    class2 = c2c;
                }
                else
                {
                    ret += c0;
//                  cerr << "cont2: '" << c0 << "'" <<endl;
                    continue;
                }

                // check if the 2 classes are different
                if (class0 != class2)
                {
                    // split!
                    ret += c0+(c1+string(" ")+c2)+c3;
                    //inserts++;
                    i+=3;
                }
                else
                {
                    ret += c0;
//                  cerr << "cont3: '" << c0 << "'" <<endl;
                    continue;
                }

            }

            // remove double spaces
            //replaceStringInPlace(ret, "  "," "); 
            //cerr << "separateAlphanumChunks: '" << ret << "'" <<endl;
            return ret;
        }

    inline bool isCharAnInt (char c)
    {
        //TODO might be able to use isdigit() here
        int i = c - '0';
        return ((i>=0) && (i<=9));
    }

I saw various complex answers, which is why I am adding another one. The answer to your problem is right there in the problem statement: "add an additional Space each time the string changes from numbers to letters or from letters to numbers."

So here is exactly what you want (I reused some code from a previous answer). Compile with the flag -std=c++11.

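#include <cctype>
#include <iostream>
#include <string>
using namespace std;

enum charTypeT { other, alpha, digit };

// Classify a character; the cast avoids undefined behaviour for negative char values.
charTypeT charType(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (isdigit(uc)) return digit;
    if (isalpha(uc)) return alpha;
    return other;
}

string separateThem(string inString) {
    string oString = "";
    charTypeT st = other;  // type of the previously seen character
    for (auto c : inString) {
        // Insert a space whenever the type switches between alpha and digit.
        if ((st == alpha && charType(c) == digit) || (st == digit && charType(c) == alpha))
            oString.push_back(' ');
        oString.push_back(c);
        st = charType(c);
    }
    return oString;
}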

int main(){
  string str1 = "30EinsteinStreet";
  string str2 = "548 Roe Drive5500 TestCity";
  string str3 = "44B SaarlouisDrive1234Testtown";

  cout << separateThem(str1) << endl;
  cout << separateThem(str2) << endl;
  cout << separateThem(str3) << endl;
}

I think what you are looking for, and what Ajay is hinting at, is a finite-state machine to parse the string. Although this is not a C++11 solution, and you might find more elegant solutions using regex, here is a code sample.

#include <iostream>
#include <sstream>
#include <string>

bool isDigit(const char c)
{
    bool res = true;
    switch (c)
    {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            break;

        default:
            res = false;
            break;
    }
    return res;
}

std::string separateNumbers(const std::string& inputString)
{
    const size_t N = inputString.length();
    std::ostringstream os;
    bool readDigit = false;
    for (size_t i = 0; i < N; ++i)
    {
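        if (isDigit(inputString[i]))
        {
            // non-digit -> digit transition: add a space unless one is already there
            if ((i > 0) && (! readDigit) && (inputString[i - 1] != ' '))
                os << ' ';
            readDigit = true;
        }
        else
        {
            // digit -> non-digit transition: add a space unless this character is one
            if ((i > 0) && (readDigit) && (inputString[i] != ' '))
                os << ' ';
            readDigit = false;
        }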
        os << inputString[i];
    }
    return os.str();
}

int main(int argc, char** argv)
{
    std::string strings[3] = {
        "30EinsteinStreet",
        "548 Roe Drive5500 TestCity",
        "44B SaarlouisDrive1234Testtown"
    };

    for (int i = 0; i < 3; ++i)
    {
        std::cout << "input #" << i << ": " << strings[i] << std::endl;
        std::cout << "output #" << i << ": " << separateNumbers(strings[i]) << std::endl;
        std::cout << std::endl;
    }

    return 0;
}

Here is my five cents.

#include <iostream>
#include <string>
#include <cctype>

std::string SeparateAlphanumChunks( const std::string &s )
{
    std::string::size_type n = 0;
    bool ctype = std::isdigit( s[0] );

    for ( char c : s )
    {
        if ( !ctype != !std::isdigit( c ) )
        {
            ctype = std::isdigit( c );
            if ( !isblank( c ) ) ++n;
        }
    }

    std::string t;
    t.reserve( s.size() + n );

    ctype = std::isdigit( s[0] );

    for ( char c : s )
    {
        if ( !ctype != !std::isdigit( c ) )
        {
            ctype = std::isdigit( c );
            if ( !isblank( c ) ) t.push_back( ' ');
        }
        t.push_back( c );
    }

    return t;
}

int main() 
{
    for ( const std::string &s : { "30EinsteinStreet",
                                   "548 Roe Drive5500 TestCity",
                                   "44B SaarlouisDrive1234Testtown" 
                                 } )
    {
        std::cout << SeparateAlphanumChunks( s ) << std::endl;
    }

    return 0;
}

The output is

30 EinsteinStreet
548 Roe Drive 5500 TestCity
44 B SaarlouisDrive 1234 Testtown

You also may change the string "in place". For example

#include <iostream>
#include <string>
#include <cctype>

std::string & SeparateAlphanumChunks( std::string &s )
{
    std::string::size_type n = 0;
    bool ctype = std::isdigit( s[0] );

    for ( char c : s )
    {
        if ( !ctype != !std::isdigit( c ) )
        {
            ctype = std::isdigit( c );
            if ( !isblank( c ) ) ++n;
        }
    }

    s.reserve( s.size() + n );

    ctype = std::isdigit( s[0] );

    for ( std::string::size_type i = 0; i < s.size(); i++ )
    {
        if ( !ctype != !std::isdigit( s[i] ) )
        {
            ctype = std::isdigit( s[i] );
            if ( !isblank( s[i] ) ) 
            {
                s.insert( i, 1, ' ' );
            }
        }
    }

    return s;
}


int main() 
{
    for ( std::string s : { "30EinsteinStreet",
                            "548 Roe Drive5500 TestCity",
                            "44B SaarlouisDrive1234Testtown"
                          } )
    {
        std::cout << SeparateAlphanumChunks( s ) << std::endl;
    }

    return 0;
}

What I would suggest is to iterate through the characters of the string. Something like this will help:

#include <cctype>
#include <string>
#include <iostream>
using namespace std;

string separateThem(string inString){
  // Guard: size()-1 below would wrap around for an empty string.
  if (inString.empty()) return inString;
  string numbers = "1234567890";
  string oString = "";
  string::size_type i;
  for(i=0; i<inString.size()-1; i++){
    // digit followed by a non-digit that is not already a space
    if ((numbers.find(inString[i]) != string::npos) && (numbers.find(inString[i+1]) == string::npos) && !isspace(static_cast<unsigned char>(inString[i+1]))){
      oString += inString.substr(i,1) + " ";
    }
    // non-digit (other than a space) followed by a digit
    else if ((numbers.find(inString[i]) == string::npos) && (numbers.find(inString[i+1]) != string::npos) && !isspace(static_cast<unsigned char>(inString[i]))){
      oString += inString.substr(i,1) + " ";
    }
    else oString += inString.substr(i,1);
  }
  oString += inString.substr(i,1);
  return oString;
}

int main(){
  string str1 = "30EinsteinStreet";
  string str2 = "548 Roe Drive5500 TestCity";
  string str3 = "44B SaarlouisDrive1234Testtown";

  cout << separateThem(str1) << endl;
  cout << separateThem(str2) << endl;
  cout << separateThem(str3) << endl;
}

If you execute this the output will be:

30 EinsteinStreet
548 Roe Drive 5500 TestCity
44 B SaarlouisDrive 1234 Testtown

Hope this helps :)

Upgrade to GCC 4.9 (first released in April 2014) and use a simple regex:

#include <regex>
#include <iostream>

std::string fix(const std::string& in)
{
    return std::regex_replace(
        in,
        std::regex("(?:([a-zA-Z])([0-9]))|(?:([0-9])([a-zA-Z]))"),
        "\\1\\3 \\2\\4",
        std::regex_constants::format_sed
    );
}

int main()
{
    const std::string in[] = {
        "30EinsteinStreet",
        "548 Roe Drive5500 TestCity",
        "44B SaarlouisDrive1234Testtown"
    };

    for (auto el : in)
        std::cout << fix(el) << '\n';
}

/*
30 EinsteinStreet
548 Roe Drive 5500 TestCity
44 B SaarlouisDrive 1234 Testtown
*/


I would suggest iterating over the string as a raw string (i.e. string::c_str()) and generating a new string altogether. This would be my algorithm (not fully worked out):

  • For each character, check whether it is a digit. If not, just append it to the new string.
  • If it is a digit, check whether it is the first character; if so, just append it to the new string.
  • If the digit is the last character, append it to the new string.
  • If the digit falls in between, check whether the last appended character was a space. If it was not, append a space and then the digit.
  • If the last appended character was a digit and this one is also a digit, append it.
  • However, if the last one was a digit but this one is not a digit (and not a space), insert a space.

You may need to tweak it further.
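The steps above could be sketched roughly as follows. This is only one reading of the list, not a definitive implementation: the function name separateDigits is made up for illustration, it iterates over the std::string directly rather than over c_str(), and it treats every non-digit, non-space character (including punctuation) as "text".

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: builds a new string and inserts a space wherever
// a digit directly follows text or text directly follows a digit.
std::string separateDigits(const std::string& in)
{
    std::string out;
    out.reserve(in.size() + 8);  // small head room for inserted spaces

    for (char ch : in)
    {
        // <cctype> functions need a value representable as unsigned char
        const unsigned char c = static_cast<unsigned char>(ch);

        if (!out.empty())  // the first character is always appended as-is
        {
            const unsigned char last = static_cast<unsigned char>(out.back());

            // digit after something that is neither a digit nor a space
            const bool digitAfterText =
                std::isdigit(c) && !std::isdigit(last) && !std::isspace(last);

            // non-digit, non-space character right after a digit
            const bool textAfterDigit =
                !std::isdigit(c) && !std::isspace(c) && std::isdigit(last);

            if (digitAfterText || textAfterDigit)
                out.push_back(' ');
        }
        out.push_back(ch);
    }
    return out;
}
```

Calling separateDigits("44B SaarlouisDrive1234Testtown") should give "44 B SaarlouisDrive 1234 Testtown", and spaces that are already present are left alone. Note that with this reading "12-34" becomes "12 - 34"; if only letters should count as text, swap the non-digit test for std::isalpha.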

What if the string is like this:

"enter 144    code   here    123    "

?
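An input like that has no digit directly next to a letter, so an approach that only inserts spaces at digit/letter boundaries would leave it unchanged. If the runs of spaces should also be squeezed, which the question's commented-out "remove double spaces" step seems to hint at, a separate pass could do it. The sketch below is an assumption about what is wanted, and collapseSpaces is a made-up name; it only touches the plain space character ' ', not tabs or other whitespace.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: replaces every run of consecutive spaces with one space.
std::string collapseSpaces(std::string s)
{
    // std::unique keeps the first element of each run of "equal" neighbours;
    // here two neighbours count as equal only when both are spaces.
    auto newEnd = std::unique(s.begin(), s.end(),
                              [](char a, char b) { return a == ' ' && b == ' '; });
    s.erase(newEnd, s.end());
    return s;
}
```

Leading or trailing runs are reduced to a single space as well, not removed, so trimming would be yet another step if it is needed.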
