How to obtain an unknown number of regex matches?

I'm trying to find a run of digits inside a string, but I can only capture the last digit or a previously specified number of digits:

#include <algorithm> // std::for_each
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("aaabbbccd123456eeffgg");
    std::smatch match;
    std::regex braced_regex("(\\w+)(\\d{2,})(\\w+)");
    std::regex plus_regex("(\\w+)(\\d+)(\\w+)");

    auto printer = [](auto& match) {
            std::ssub_match sub(match);
            std::string match_substring(sub.str());
            std::cout <<  match_substring << '\n';
    };

    std::regex_match(s, match, braced_regex);
    std::cout << "Number of braced matches: " << match.size() << '\n';  
    std::for_each(match.begin(), match.end(), printer);

    std::regex_match(s, match, plus_regex);
    std::cout << "Number of plus matches: " << match.size() << '\n';  
    std::for_each(match.begin(), match.end(), printer);
    return 0;
}

Result:

Number of braced matches: 4
aaabbbccd123456eeffgg
aaabbbccd1234
56
eeffgg
Number of plus matches: 4
aaabbbccd123456eeffgg
aaabbbccd12345
6
eeffgg

How can I obtain the whole number sequence, i.e. 123456, from the provided string?

I think the problem is that digits count as word characters, so they get matched by \\w. I would be tempted to use \\D, meaning "not a digit":

#include <algorithm> // std::for_each
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("aaabbbccd123456eeffgg");
    std::smatch match;
    std::regex plus_regex("(\\D+)(\\d+)(\\D+)");

    auto printer = [](auto& match) {
            std::ssub_match sub(match);
            std::string match_substring(sub.str());
            std::cout <<  match_substring << '\n';
    };

    std::regex_match(s, match, plus_regex);
    std::cout << "Number of plus matches: " << match.size() << '\n';
    std::for_each(match.begin(), match.end(), printer);
    return 0;
}

Output:

Number of plus matches: 4
aaabbbccd123456eeffgg
aaabbbccd
123456
eeffgg

Another possibility (depending on what you want) is to use std::regex_search(), which does not try to match the whole string but lets you match elements in the middle:

#include <algorithm> // std::for_each
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("aaabbbccd123456eeffgg");
    std::smatch match;
    std::regex braced_regex("\\d{2,}"); // just the numbers

    auto printer = [](auto& match) {
            std::ssub_match sub(match);
            std::string match_substring(sub.str());
            std::cout <<  match_substring << '\n';
    };

    std::regex_search(s, match, braced_regex); // NOTE: regex_search()!
    std::cout << "Number of braced matches: " << match.size() << '\n';
    std::for_each(match.begin(), match.end(), printer);
}

Output:

Number of braced matches: 1
123456

([a-zA-Z]+)(\\d{2,})([a-zA-Z]+)

You can try this. \\w is equivalent to [a-zA-Z0-9_], so the greedy \\w+ matches as much as it can and leaves \\d{2,} only the minimum of two digits.

or

(\\w+?)(\\d{2,})(\\w+)

Make the first \\w+ non-greedy.

In:

(\\w+)(\\d{2,})(\\w+)

\\w+ matches any word character ([a-zA-Z0-9_]), so it also matches 1234.

To match the whole number, change the first \\w to [a-zA-Z_], so you will have:

std::regex braced_regex("([a-zA-Z_]+)(\\d{2,})(\\w+)");
