C++: Regex: returns full string and not matched group

For those asking: the {0} selects any one block within the sResult string, where blocks are separated by |. 0 is the first block.

It needs to be dynamic for future expansion, as that number will be configurable by users.

I am working on a regex to extract one portion of a string. The pattern matches, but the results returned are not what I expected.

    std::string sResult = "MATCH_ME|BUT|NOT|ANYTHNG|ELSE";
    std::regex pattern("^(?:[^|]+[|]){0}([^|;]+)");
    std::smatch regMatch;

    std::regex_search(sResult, regMatch, pattern);
    if(regMatch[1].matched)
    {
      for( int i = 0; i < regMatch.size(); i++)
      {
           //SUBMATCH 0 = "MATCH_ME|BUT|NOT|ANYTHNG|ELSE"
           //SUBMATCH 1 = "BUT|NOT|ANYTHNG|ELSE"
        std::ssub_match sm = regMatch[i];
        bValid = strcmp(regMatch[i].str().c_str(), pzPoint->_ptrTarget->_pzTag->szOPCItem);
      }
    }

For some reason I cannot figure out the code that gets me just MATCH_ME back, so that I can compare it to the expected results list on the C++ side.

Does anyone have any ideas on where I went wrong here?

The following code example shows how to do what you are after. Compile it, then call it with a single numerical argument to extract that element of the input:

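#include <cstdio>
#include <iostream>
#include <regex>
#include <string>

int main(int argc, char *argv[]) {

    char pat[100];
    if (argc > 1) {
      // Use the command-line argument as the repeat count, bounded to the buffer size.
      std::snprintf(pat, sizeof pat, "^(?:[^|]+[|]){%s}([^|;]+)", argv[1]);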
      std::string sResult = "MATCH_ME|BUT|NOT|ANYTHNG|ELSE";
      std::regex pattern(pat);
      std::smatch regMatch;

      std::regex_search(sResult, regMatch, pattern);
      if(regMatch[1].matched)
      {
        std::ssub_match sm = regMatch[1];
        std::cout << "The match is " << sm << std::endl;
//bValid = strcmp(regMatch[i].str().c_str(), pzPoint->_ptrTarget->_pzTag->szOPCItem);
      }
    }
    return 0;
}

After creating an executable called match, you can then do

>> match 2
The match is NOT

which is what you wanted.

The regex, it turns out, works just fine, although as a matter of preference I would use \\| instead of [|] for the first part.
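Since the block index is meant to be user-configurable, the pattern can also be assembled from an integer instead of raw command-line text, which avoids the fixed-size buffer. This is a minimal sketch, not the original poster's code; the helper name extract_block is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): returns the index-th
// '|'-separated block of input, or an empty string if there is no such block.
std::string extract_block(const std::string& input, std::size_t index) {
    // Same pattern as above, with the repeat count filled in from an integer.
    const std::regex pattern(
        "^(?:[^|]+[|]){" + std::to_string(index) + "}([^|;]+)");
    std::smatch match;
    if (std::regex_search(input, match, pattern) && match[1].matched) {
        return match[1].str();  // capture group 1 only, not the whole match
    }
    return std::string();
}
```

With the sample string from the question, extract_block(sResult, 0) would give MATCH_ME and extract_block(sResult, 2) would give NOT.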

It seems you're using regular expressions for something they weren't designed for. You should first split your string at the delimiter | and then apply regular expressions to the resulting tokens if you want to check them for validity.
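The splitting step could look roughly like this. It is only a sketch using std::getline with a custom delimiter; the helper name split_blocks is made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits input at every
// occurrence of delim and returns the pieces in order.
std::vector<std::string> split_blocks(const std::string& input, char delim = '|') {
    std::vector<std::string> blocks;
    std::istringstream stream(input);
    std::string block;
    while (std::getline(stream, block, delim)) {
        blocks.push_back(block);
    }
    return blocks;
}
```

The block at a user-configured index is then split_blocks(sResult)[index], after checking that index is smaller than the vector's size. Note that with this approach an empty block after a trailing delimiter is not reported.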

By the way: the std::regex implementation in libstdc++ seems to be buggy. I just did some tests and found that even simple patterns containing escaped pipe characters like \\\\| failed to compile, throwing a std::regex_error with no further information in the error message (GCC 4.8.1).
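To check whether a particular standard library accepts a given pattern, one option is to construct the std::regex inside a try block. A minimal sketch; the helper name compiles is hypothetical, and the result depends on the library version in use.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): reports whether the
// pattern can be constructed as a std::regex with the default ECMAScript grammar.
bool compiles(const std::string& pattern) {
    try {
        std::regex re(pattern);
        (void)re;  // only construction matters here
        return true;
    } catch (const std::regex_error&) {
        // The exception's code() member identifies the error category.
        return false;
    }
}
```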

It turns out the problem was on the C++ side, in how I extracted the match; it had to be done more directly. Below is the code that gets me exactly what I wanted out of the string so I can use it later.

std::string sResult = "MATCH_ME|BUT|NOT|ANYTHNG|ELSE";
std::regex pattern("^(?:[^|]+[|]){0}([^|;]+)");
std::smatch regMatch;

std::regex_search(sResult, regMatch, pattern);
if(regMatch[1].matched)
{
   std::string theMatchedPortion = regMatch[1];
   // The issue was not with the regex but with how I was retrieving the results.
   // theMatchedPortion now equals "MATCH_ME", and by changing the repeat count
   // in the pattern I can navigate through the string.
}
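The sub-match index is what matters here: index 0 of a std::smatch holds the entire matched text, while index 1 holds the first capture group. The sketch below uses a repeat count of 2, where the two visibly differ; the helper name whole_and_group is hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not from the original post): returns
// {whole match, capture group 1} for the block at index 2.
std::pair<std::string, std::string> whole_and_group(const std::string& input) {
    const std::regex pattern("^(?:[^|]+[|]){2}([^|;]+)");
    std::smatch match;
    if (!std::regex_search(input, match, pattern)) {
        return {};  // two empty strings when nothing matches
    }
    return {match[0].str(), match[1].str()};
}
```

For the sample string, the first element would be MATCH_ME|BUT|NOT (the skipped blocks plus the wanted one) and the second would be NOT.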
