Regex C++: extract substring between tags

I would like to extract a substring between two tags.
Example: <column r="1"><t b="red"><v>1</v></t></column>
I would like to get: <t b="red"><v>1</v></t>

I don't want to use Boost or other libraries, just standard C++. The one exception is CERN's ROOT library, which has TRegexp, but I don't know how to use it.

You shouldn't be using regexes to try to match HTML, but for this special case you could do:

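#include <iostream>
#include <regex>
#include <string>

int main() {
  // Your string (the inner double quotes must be escaped)
  std::string str = "<column r=\"1\"><t b=\"red\"><v>1</v></t></column>";

  // Your regex, in this specific scenario
  // Will NOT work for nested <column> tags!
  std::regex rgx("<column.*?>(.*?)</column>");
  std::smatch match;

  // Try to match it; passing the string itself keeps the iterator type
  // consistent with std::smatch (str.begin()/str.end() would not compile here)
  if (std::regex_search(str, match, rgx)) {
    // match[1] is the first capture group: the text between the tags
    std::cout << match[1].str() << '\n';
  }
}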

As Anton said above: don't.
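If the advice is read as "skip the regex for markup", one standard-library-only alternative is plain std::string::find. The sketch below is not taken from either answer: extract_between is a hypothetical helper name. It assumes the tag is not nested inside itself, that no attribute value contains '>', and that no other tag name starts with the same prefix (a search for "column" would also stop at "<columns>").

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original answers).
// Copies the text between the first "<tag ...>" and the next "</tag>" into `out`.
// Returns false, leaving `out` untouched, if either tag is missing.
// Assumes no nested tags of the same name and no '>' inside attribute values.
bool extract_between(const std::string& text, const std::string& tag, std::string& out) {
    const std::string open = "<" + tag;
    const std::string close = "</" + tag + ">";

    // Locate the start of the opening tag.
    const std::size_t open_pos = text.find(open);
    if (open_pos == std::string::npos) return false;

    // Skip past the attributes to the end of the opening tag.
    std::size_t content_begin = text.find('>', open_pos);
    if (content_begin == std::string::npos) return false;
    ++content_begin;

    // Locate the closing tag after the content starts.
    const std::size_t content_end = text.find(close, content_begin);
    if (content_end == std::string::npos) return false;

    out = text.substr(content_begin, content_end - content_begin);
    return true;
}
```

Called as extract_between(str, "column", inner) on the example string from the question, this should leave <t b="red"><v>1</v></t> in inner. For anything beyond a fixed, flat format like this, a real XML parser is the safer choice.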
