Using a regex capture directly in expression in C++

I'm trying to use a captured group directly in the regex, as a back-reference. However, when I try to do this the program hangs indefinitely.

For example:

string input = "<Tag>blahblah</Tag>";
regex r1("<([a-zA-Z]+)>[a-z]+</\1>");
string result = regex_replace(input, r1, "");

If I add another slash to the capture, "<([a-zA-Z]+)>[a-z]+</\\1>", the program compiles but throws a "regex_error(regex_constants::error_backref)" exception.

Notes:
Compiler: Apple LLVM 5.1
I am using this as part of the process to clean junk from blocks of text. The document is not necessarily HTML/XML, and the desired text is not always within tags. So if possible, I would like to be able to do this with regular expressions, not a parser.

The backslash character in string literals is an escape character, so "\1" in the source is a single character with value 1, not a back-reference.

Either escape it, "<([a-zA-Z]+)>[a-z]+</\\1>", or use a raw literal, R"(<([a-zA-Z]+)>[a-z]+</\1>)"

With that, your program works as you would expect:

#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string input = "Hello<Tag>blahblah</Tag> World";
    // "\\1" in the source becomes \1 in the pattern: a back-reference to group 1
    std::regex r1("<([a-zA-Z]+)>[a-z]+</\\1>");
    std::string result = std::regex_replace(input, r1, "");

    std::cout << "The result is '" << result << "'\n";
}

demo: http://coliru.stacked-crooked.com/a/ae20b09d46f975e9

The exception you're getting with \\1 suggests that your compiler is configured to use GNU libstdc++, where regex was not implemented at the time. Look up how to set it up to use LLVM libc++ (with Clang this is typically the -stdlib=libc++ flag), or use boost.regex.
