
Extracting Numbers from Mixed String using stringstream

I am trying to extract numbers from a string like Hello1234 using stringstream. My code works when the numbers are separated from the words by whitespace. For example:

Hello 1234 World 9876 Hello1234

This input gives 1234 9876 as output, but the last token, Hello1234, which mixes letters and digits, is not read. How can I extract the number from it? For example, Hello1234 should give 1234.

Here is my code so far:

cout << "Welcome to the string stream program. " << endl;
    string string1;
    cout << "Enter a string with numbers and words: ";
    getline(cin, string1);

    stringstream ss; //initializing string stream

    ss << string1;  //stores the string in stringstream 

    string temp;  //string for reading words
    int number;   //int for reading integers

    while(!ss.eof()) {
        ss >> temp;
        if (stringstream(temp) >> number) {
            cout << "A number found is: " << number << endl;
        }
    }

If you're not limited to a solution that uses std::stringstream, I suggest you take a look at regular expressions. Example:

#include <iostream>
#include <regex>
#include <string>

int main() {
    std::string s = "Hello 123 World 456 Hello789";    
    std::regex regex(R"(\d+)");   // matches a sequence of digits

    std::smatch match;
    while (std::regex_search(s, match, regex)) {
        std::cout << std::stoi(match.str()) << std::endl;
        s = match.suffix();
    }
}

The output:

123
456
789
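
As a variation on the loop above, here is a minimal sketch that uses std::sregex_iterator to walk over the matches without reassigning the string on every iteration. The helper name extract_with_regex_iterator is made up for illustration, and the sketch assumes every digit run fits in an int (std::stoi throws std::out_of_range otherwise).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of digits in `text` as an int.
std::vector<int> extract_with_regex_iterator(const std::string& text) {
    static const std::regex digits(R"(\d+)");  // one or more digits
    std::vector<int> numbers;
    // std::sregex_iterator visits successive non-overlapping matches;
    // a default-constructed iterator marks the end of the sequence.
    for (std::sregex_iterator it(text.begin(), text.end(), digits), end;
         it != end; ++it) {
        numbers.push_back(std::stoi(it->str()));  // may throw on overflow
    }
    return numbers;
}
```

Calling extract_with_regex_iterator("Hello 123 World 456 Hello789") yields 123, 456 and 789.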

Simply replace any alphabetic characters in the string with whitespace before you do the stream extraction.

std::string str = "Hello 1234 World 9876 Hello1234";

for (char& c : str)
{
    if (isalpha(static_cast<unsigned char>(c)))  // cast avoids undefined behaviour for negative char values
        c = ' ';
}

std::stringstream ss(str);

int val;
while (ss >> val)
    std::cout << val << "\n";

Output:

1234
9876
1234
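
One caveat with replacing only alphabetic characters: punctuation such as commas or parentheses stays in the string, and the extraction loop stops at the first such character. A possible variation, sketched below, blanks out everything that is not a digit. The function name extract_after_blanking is hypothetical, and the sketch assumes non-negative numbers, since a minus sign is treated as a separator too.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turns every non-digit into a space, then lets the
// stream split the remaining digit runs into integers.
std::vector<int> extract_after_blanking(std::string text) {
    for (char& c : text) {
        if (!std::isdigit(static_cast<unsigned char>(c)))
            c = ' ';
    }

    std::istringstream ss(text);
    std::vector<int> numbers;
    int val;
    while (ss >> val)  // stops at end of input or on a value that overflows int
        numbers.push_back(val);
    return numbers;
}
```

With the input "Hello 1234, World (9876) Hello1234" this returns 1234, 9876 and 1234.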

You can use the code below with any type of stream, stringstream included. It reads from the stream up to the first digit, puts that digit back into the stream, and then reads the number as usual.

#include <cctype>
#include <iostream>

using namespace std;

istream& get_number( istream& is, int& n )
{
  while ( is && !isdigit( static_cast<unsigned char>( is.get() ) ) )
    ;
  is.unget();
  return is >> n;
}

int main()
{
  int n;
  while ( get_number( cin, n ) )
    cout << n << ' ';
}
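
To illustrate the "any type of stream" point, here is a small sketch that feeds the same skip-and-unget logic from a std::istringstream instead of cin. The get_number function mirrors the one above; the wrapper numbers_in is a made-up name for this example.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Same approach as above: consume characters up to the first digit,
// put that digit back, then let operator>> parse the number.
std::istream& get_number(std::istream& is, int& n) {
    while (is && !std::isdigit(static_cast<unsigned char>(is.get())))
        ;
    is.unget();      // on a failed stream this is a no-op that keeps failbit set
    return is >> n;
}

// Hypothetical wrapper: runs get_number over a string via std::istringstream.
std::vector<int> numbers_in(const std::string& text) {
    std::istringstream ss(text);
    std::vector<int> out;
    int n;
    while (get_number(ss, n))
        out.push_back(n);
    return out;
}
```

For "Hello 1234 World 9876 Hello1234", numbers_in returns 1234, 9876 and 1234.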

Notes

Regarding regex - It seems people are forgetting or ignoring the basics and, for some reason (C++ purism?), prefer the sledgehammer for even the most basic problems.

Regarding speed - If you take the stream out of the picture, it is hard to beat plain C. The code below is tens of times faster than the regex solution and at least a couple of times faster than any answer so far.

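#include <cctype>
#include <cstdlib>

const char* get_number( const char*& s, int& n )
{
  // skip to first digit, but never run past the terminating null
  while ( *s && !isdigit( static_cast<unsigned char>( *s ) ) )
    ++s;

  // end of string: no number left
  if ( !*s )
    return 0;

  // convert
  char* e;
  n = static_cast<int>( strtol( s, &e, 10 ) );
  return s = e;
}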
//...
while ( get_number( s, n ) )
//...
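
If C++17 is available, std::from_chars is another option in the same spirit: it parses directly from a character range with no stream or locale involved. The following is only a sketch; the function name extract_with_from_chars is made up, it was not part of the timings discussed here, and digit runs that overflow an int are skipped rather than reported.

```cpp
#include <cctype>
#include <charconv>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper (requires C++17): scans `text` and parses each run of
// digits with std::from_chars.
std::vector<int> extract_with_from_chars(std::string_view text) {
    std::vector<int> numbers;
    const char* p = text.data();
    const char* const end = p + text.size();

    while (p != end) {
        if (!std::isdigit(static_cast<unsigned char>(*p))) {
            ++p;  // not a digit: keep scanning
            continue;
        }
        int value = 0;
        const auto result = std::from_chars(p, end, value);
        if (result.ec == std::errc())
            numbers.push_back(value);
        // result.ptr points just past the digit run, even on overflow,
        // so the loop always makes progress.
        p = result.ptr;
    }
    return numbers;
}
```

Unlike the strtol version, this works on a std::string_view, so the input does not need to be null-terminated.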

Adding my version:

#include <cctype>
#include <iostream>
#include <string>
#include <sstream>

int main(){
    std::string s;
    std::getline(std::cin, s);
    std::stringstream ss;
    int number;

    for(const char c: s){
        if( std::isdigit(static_cast<unsigned char>(c)) ){ //Thanks to Aconcagua
            ss << c;
        } else if ( ss >> number ) {
            std::cout << number << " found\n";
        }
        ss.clear();
    }
    if(ss >> number)
    {
        std::cout << number << " found\n";
    }

    return 0;
}

The question itself is fairly trivial; as programmers, most of us solve this kind of problem every day. Any given problem has many solutions, but we still try to find the best one.

When I came across this question there were already many useful and correct answers, but to satisfy my curiosity I benchmarked the other solutions to find the fastest.

I found the fastest of the answers above, and felt there was still some room for improvement.

So I am posting my solution here, along with the benchmark code.

#include <cctype>
#include <chrono>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

using namespace std;
#define REQUIER_EQUAL(x, y)                                                    \
  if ((x) != (y)) {                                                            \
    std::cout << __PRETTY_FUNCTION__ << " failed at :" << __LINE__             \
              << std::endl                                                     \
              << "\tx:" << (x) << "\ty:" << (y) << std::endl;                  \
    ;                                                                          \
  }
#define RUN_FUNCTION(func, in, out)                                            \
  auto start = std::chrono::steady_clock::now();                               \
  func(in, out);                                                               \
  auto stop = std::chrono::steady_clock::now();                                \
  std::cout << "Time in " << __PRETTY_FUNCTION__ << ":"                        \
            << std::chrono::duration_cast<std::chrono::microseconds>(stop -    \
                                                                     start)    \
                   .count()                                                    \
            << " usec" << std::endl;

//Solution by @Evg 
void getNumbers1(std::string input, std::vector<int> &output) {
  std::regex regex(R"(\d+)"); // matches a sequence of digits
  std::smatch match;
  while (std::regex_search(input, match, regex)) {
    output.push_back(std::stoi(match.str()));
    input = match.suffix();
  }
}
//Solution by @n314159 
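void getNumbers2(std::string input, std::vector<int> &output) {
  std::stringstream ss;
  int number;
  for (const char c : input) {
    if (std::isdigit(static_cast<unsigned char>(c))) { // Thanks to Aconcagua
      ss << c;
    } else if (ss >> number) {
      output.push_back(number);
    }
    ss.clear(); // reset eof/fail flags so the stream can be reused
  }
  if (ss >> number) // a number at the very end of the input
    output.push_back(number);
}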

//Solution by @The Failure by Design 
void getNumbers3(std::string input, std::vector<int> &output) {
  istringstream is{input};
  char c;
  int n;
  while (is.get(c)) {
    if (!isdigit(static_cast<unsigned char>(c)))
      continue;
    is.putback(c);
    is >> n;
    output.push_back(n);
  }
}
//Solution by @acraig5075 
void getNumbers4(std::string input, std::vector<int> &output) {
  for (char &c : input) {
    if (isalpha(static_cast<unsigned char>(c)))
      c = ' ';
  }
  std::stringstream ss(input);
  int val;
  while (ss >> val)
    output.push_back(val);
}
//Solution by me 
void getNumbers5(std::string input, std::vector<int> &output) {
  std::size_t start = std::string::npos, stop = std::string::npos;
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (isdigit(static_cast<unsigned char>(input.at(i)))) {
      if (start == std::string::npos) {
        start = i;
      }
    } else {
      if (start != std::string::npos) {
        output.push_back(std::stoi(input.substr(start, i - start)));
        start = std::string::npos;
      }
    }
  }
  if (start != std::string::npos)
    output.push_back(std::stoi(input.substr(start, input.size() - start)));
}

void test1_getNumbers1() {
  std::string input = "Hello 123 World 456 Hello789 ";
  std::vector<int> output;
  RUN_FUNCTION(getNumbers1, input, output);
  REQUIER_EQUAL(output.size(), 3);
  REQUIER_EQUAL(output[0], 123);
  REQUIER_EQUAL(output[1], 456);
  REQUIER_EQUAL(output[2], 789);
}
void test1_getNumbers2() {
  std::string input = "Hello 123 World 456 Hello789";
  std::vector<int> output;
  RUN_FUNCTION(getNumbers2, input, output);
  REQUIER_EQUAL(output.size(), 3);
  REQUIER_EQUAL(output[0], 123);
  REQUIER_EQUAL(output[1], 456);
  REQUIER_EQUAL(output[2], 789);
}
void test1_getNumbers3() {
  std::string input = "Hello 123 World 456 Hello789";
  std::vector<int> output;
  RUN_FUNCTION(getNumbers3, input, output);
  REQUIER_EQUAL(output.size(), 3);
  REQUIER_EQUAL(output[0], 123);
  REQUIER_EQUAL(output[1], 456);
  REQUIER_EQUAL(output[2], 789);
}

void test1_getNumbers4() {
  std::string input = "Hello 123 World 456 Hello789";
  std::vector<int> output;
  RUN_FUNCTION(getNumbers4, input, output);
  REQUIER_EQUAL(output.size(), 3);
  REQUIER_EQUAL(output[0], 123);
  REQUIER_EQUAL(output[1], 456);
  REQUIER_EQUAL(output[2], 789);
}
void test1_getNumbers5() {
  std::string input = "Hello 123 World 456 Hello789";
  std::vector<int> output;
  RUN_FUNCTION(getNumbers5, input, output);
  REQUIER_EQUAL(output.size(), 3);
  REQUIER_EQUAL(output[0], 123);
  REQUIER_EQUAL(output[1], 456);
  REQUIER_EQUAL(output[2], 789);
}

int main() {
  test1_getNumbers1();
  // test1_getNumbers2();
  test1_getNumbers3();
  test1_getNumbers4();
  test1_getNumbers5();
  return 0;
}

Sample output on my platform:

Time in void test1_getNumbers1():703 usec
Time in void test1_getNumbers3():17 usec
Time in void test1_getNumbers4():10 usec
Time in void test1_getNumbers5():6 usec
