
Searching for a subsequence using Regular Expressions | C++

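I want to search my string for runs of 0s that begin and end with a 1. For example:

For 100001 the function should print: 100001
For 1000101 the function should print: 10001 and 101

I tried to do this with regular expressions, but my code does not produce that result.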

#include <iostream>
#include <regex>



int main(int argc, char * argv[]){

     std::string number(argv[1]);
     std::regex searchedPattern("1?[0]+1");

     std::smatch sMatch;

     std::regex_search(number,sMatch,searchedPattern);

     for(auto& x : sMatch){
         std::cout << x << std::endl;
     }

     return 0;
}

The commands I used to compile and run the code on Linux (Ubuntu 18.04):

g++ Cpp_Version.cpp -std=c++14 -o exec
./exec 1000101

g++ version:

g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The output is:

10001

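I suspect my pattern is wrong. Any ideas on how to improve it?

std::regex_search does not search for all matches; it stops at the first one. Use std::sregex_iterator instead. Its documentation states (emphasis mine):

On construction, and on every increment, it calls std::regex_search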
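
One consequence of this is worth spelling out: each increment resumes the search where the previous match ended, so two matches can never share a character. The following is a minimal sketch, assuming the plain pattern `1[0]+1` and a hypothetical helper named `find_non_overlapping` (not part of the original answer), that illustrates the limitation:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every match that std::sregex_iterator
// reports for the plain (non-lookahead) pattern "1[0]+1".
std::vector<std::string> find_non_overlapping(const std::string& text) {
    static const std::regex pattern{"1[0]+1"};
    std::vector<std::string> found;
    for (std::sregex_iterator it{text.cbegin(), text.cend(), pattern}, end{};
         it != end; ++it) {
        // it->str() is the text of the whole match (sub-match 0).
        found.push_back(it->str());
    }
    return found;
}
```

For the input `1000101` this sketch should return only `10001`: the `1` that ends that match is also the `1` that would start `101`, and it has already been consumed. Wrapping the pattern in a lookahead, `(?=(...))`, makes the match itself zero-width, so the shared `1` stays available for the next search.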

#include <iostream> // std::cout, std::cerr
#include <iterator> // std::distance
#include <regex> // std::regex, std::smatch, std::regex_search, std::sregex_iterator
#include <string> // std::string
#include <cstdlib> // EXIT_FAILURE, EXIT_SUCCESS

int main(int argc, char **argv) {
    if (argc < 2) {
        std::cerr << "./a.out 1000101" << std::endl;
        return EXIT_FAILURE;
    }
    std::string n{argv[1]};
    std::regex p{"(?=(1[0]+1))"};
    std::smatch m;
    if (false == std::regex_search(n, m, p)) {
        std::cerr << "regex_search has no match!" << std::endl;
        return EXIT_FAILURE;
    }
    std::cout << "regex_search found " << m.size() << " matches! But this is misleading...\n";
    for (const auto & field : m) {
        const auto begin = std::distance(n.cbegin(), field.first);
        const auto end = begin + std::distance(field.first, field.second);
        std::cout
            << "[" << begin << "," << end << "]\t"
            << field << "\n";
    }
    std::cout << "Unfortunately `sregex_iterator` can't tell you how many matches.\n";
    for (std::sregex_iterator it{n.cbegin(), n.cend(), p}, end{}; it != end; ++it) {
        m = *it;
        // m[0] is the whole match. It is always empty because the lookahead is zero-width,
        // which is what allows overlapping matches to be found.
        // m[1] is the capture group holding the sequence you are looking for.
        for (const auto & field : m) {
            const auto begin = std::distance(n.cbegin(), field.first);
            const auto end = begin + std::distance(field.first, field.second);
            std::cout
                << "[" << begin << "," << end << "]\t"
                << field << "\n";
        }
    }
    return EXIT_SUCCESS;
}

Here is the output:

$ g++ --version
g++ (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ -std=c++20 -O2 -Wall -pedantic example.cpp && ./a.out 1000100101
regex_search found 2 matches! But this is misleading...
[0,0]
[0,5]   10001
Unfortunately `sregex_iterator` can't tell you how many matches.
[0,0]
[0,5]   10001
[4,4]
[4,8]   1001
[7,7]
[7,10]  101
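
If only the matched sequences are needed, and not their positions, the same lookahead technique can be wrapped in a small function. This is a sketch under assumptions rather than part of the original answer: the name `find_overlapping` and the choice to return a `std::vector<std::string>` are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every "1, one or more 0s, 1" run,
// including runs that share a boundary '1' with a neighbouring run.
std::vector<std::string> find_overlapping(const std::string& text) {
    // The lookahead matches an empty string; capture group 1 holds the run.
    static const std::regex pattern{"(?=(1[0]+1))"};
    std::vector<std::string> runs;
    for (std::sregex_iterator it{text.cbegin(), text.cend(), pattern}, end{};
         it != end; ++it) {
        // Skip the empty whole match (index 0) and keep only group 1.
        runs.push_back((*it)[1].str());
    }
    return runs;
}
```

Called with `1000100101`, this should yield `10001`, `1001` and `101`, the same sequences as the non-empty lines in the output above. The size of the returned vector also gives the match count directly.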

