C++ Boost: regex_search expression - Issue combining expressions to catch all sequences
I'm trying to write a template parser and need to pick up three distinct sets of sequences for string replacement.
// Each of these expressions works perfectly on its own.
// All sequences start with a | (pipe), followed by:
boost::regex expr {"(\\|[0-9]{2})"}; // 2 Digits only.
boost::regex expr {"(\\|[A-Z]{1,2}+[0-9]{1,2})"}; // 1 or 2 Uppercase Chars and 1 or 2 Digits.
boost::regex expr {"(\\|[A-Z]{2})(?!\\d)"}; // 2 Uppercase Chars with no following digits.
However, once I try to combine them into a single expression, I can't get them to work properly and catch all sequences. I must be missing something. Can anyone shed some light on what I'm missing?
Here is what I have so far:
// The three expressions are joined with | (alternation), one parenthesized group each.
boost::regex expr {"(\\|[0-9]{2})|(\\|[A-Z]{1,2}+[0-9]{1,2})|(\\|[A-Z]{2})(?!\\d)"};
I'm using the following string for testing. It is probably a little more than needed, but here is the code as well.
#include <boost/regex.hpp>
#include <string>
#include <iostream>
int main()
{
    std::string str = "|MC01 |U1 |s |A22 |12 |04 |2 |EW |SSAADASD |15";
    boost::regex expr {"(\\|[0-9]{2})|(\\|[A-Z]{1,2}+[0-9]{1,2})|(\\|[A-Z]{2})(?!\\d)"};
    boost::smatch matches;
    std::string::const_iterator start = str.begin(), end = str.end();
    while (boost::regex_search(start, end, matches, expr))
    {
        // Whole match, plus the text before it (prefix) and after it (suffix).
        std::cout << "Matched Sub '" << matches.str()
                  << "' following '" << matches.prefix().str()
                  << "' preceding '" << matches.suffix().str() << "'"
                  << std::endl;
        start = matches[0].second; // resume searching after this match
        for (size_t s = 1; s < matches.size(); ++s)
        {
            std::cout << "+ Matched Sub " << matches[s].str()
                      << " at offset " << matches[s].first - str.begin()
                      << " of length " << matches[s].length()
                      << std::endl;
        }
    }
    return 0;
}
I believe this is what you want:
const boost::regex expr {"(\\|[0-9]{2})|(\\|[A-Z]{1,2}+[0-9]{1,2})|(\\|[A-Z]{2})"}; // basically, remove the constraint on the last sub
I also suggest being explicit about the flags you use to construct expr and the flags you pass to regex_search.
I also found that adding an extra check on matches[s].matched removes the half-matched patterns that were throwing me off.
for (size_t s = 1; s < matches.size(); ++s)
{
    if (matches[s].matched) // Check for bool True/False
    {
        std::cout << "+ Matched Sub " << matches[s].str()
                  << " at offset " << matches[s].first - str.begin()
                  << " of length " << matches[s].length()
                  << std::endl;
    }
}
Without it, the sub-expressions that did not take part in a match were still printed, showing an offset at the end of the string and a length of 0. I hope this helps anyone else who runs into this.
Another tip: inside the loop, the index s (1, 2, or 3) tells you which of the expressions matched. Since I have three expressions, s is 1 when the first one matched (that is, when matches[1].matched is true), and 2 or 3 otherwise. Pretty nice!