C++ Get the substring between custom delimiters without the use of regex
I have a string in a simple format:
"lorem ipsum <span id='1'>extract_me-1</span> dolor
sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum
sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem"
Now I need to extract the substring between specified custom delimiters.
For example,
Substring("<span id='1'>","</span>") = extract_me-1
Substring("<span id='2'>","</span>") = extract_me-2
Substring("lorem","<span id='1'>") = ipsum
Substring("extract_me-1","dolor") = </span>
I have already done this using regex:
std::string str="lorem ipsum <span id='1'>extract_me-1</span> dolor sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem";
std::smatch match;
std::regex rgx ("<span id='1'>(.*?)</span>");
if (std::regex_search(str, match, rgx)) {
    // First substring
    std::cout << match.str(1);
}
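Is there any way to do this without using regex? I tried using substr a few times, but to no avail. Any help is much appreciated, thanks.
Edit: the input str is not complete HTML, just some random tags. I only need the substring from the start delimiter to the next closest end delimiter (for span or any other repeated tag).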
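You need to check the return value of every str.find() call, as I did for the first one, but this is the gist of it. You may want to search for just the tag and then for the id, but then you also need to handle a tag that has no id: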
#include <iostream>
#include <string>

int main() {
    const std::string str="lorem ipsum <span id='1'>extract_me-1</span> dolor sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem";
    const std::string tag = "<span id='";
    std::string r;
    for (size_t pos = 0;;) {
        size_t tag_pos = str.find(tag, pos);
        if (tag_pos == str.npos) {
            break;  // no more tags
        }
        // The find() calls below are unchecked: on malformed input they can
        // return npos, so real code should test each result.
        size_t id_pos = tag_pos + tag.size();         // first character of the id
        size_t id_pos2 = str.find("'", id_pos);       // closing quote of the id
        size_t txt_pos = str.find(">", id_pos2) + 1;  // first character of the text
        size_t txt_pos2 = str.find("<", txt_pos);     // start of the closing tag
        r += "txt";
        r += str.substr(id_pos, id_pos2 - id_pos);
        r += " = ";
        r += str.substr(txt_pos, txt_pos2 - txt_pos);
        r += "\n";
        pos = txt_pos2;
    }
    std::cout << r;  // one "txt<id> = <text>" line per span
}
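The advice to check every str.find() result can be sketched as follows. This is only one possible arrangement, not the answerer's code: the function name extract_spans and the choice to stop at the first malformed tag are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original answer): collects
// (id, text) pairs for every <span id='...'>text< found in `str`.
// Every find() result is checked; scanning stops at the first malformed tag.
std::vector<std::pair<std::string, std::string>>
extract_spans(const std::string& str) {
    const std::string tag = "<span id='";
    std::vector<std::pair<std::string, std::string>> out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t tag_pos = str.find(tag, pos);
        if (tag_pos == std::string::npos) break;   // no more tags
        const std::size_t id_pos = tag_pos + tag.size();
        const std::size_t id_end = str.find('\'', id_pos);
        if (id_end == std::string::npos) break;    // id is never closed
        const std::size_t gt = str.find('>', id_end);
        if (gt == std::string::npos) break;        // opening tag is never closed
        const std::size_t txt_pos = gt + 1;
        const std::size_t txt_end = str.find('<', txt_pos);
        if (txt_end == std::string::npos) break;   // no following tag
        out.emplace_back(str.substr(id_pos, id_end - id_pos),
                         str.substr(txt_pos, txt_end - txt_pos));
        pos = txt_end;                             // continue after this text
    }
    return out;
}
```

Returning the pairs instead of building one output string leaves the formatting to the caller; whether to skip a malformed tag or stop at it is a design choice that depends on the input.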
I was able to solve this using .find and .substr. It turned out to be easier than I thought.
#include <string>
#include <iostream>
using namespace std;

string str="lorem ipsum <span id='1'>extract_me-1</span> dolor sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem";
string subStrng(const string& start, const string& end);

int main() {
    string txt1 = subStrng("<span id='1'>", "</span>");
    string txt2 = subStrng("<span id='2'>", "</span>");
    string txt3 = subStrng("<span id='3'>", "</span>");
    cout << txt1 << "\n" << txt2 << "\n" << txt3;
    return 0;
}

// Substring func.
// Returns the text between the first 'start' and the next closest 'end',
// or an empty string if either delimiter is not found.
string subStrng(const string& start, const string& end) {
    // find() returns string::npos (an unsigned value) on failure, so keep the
    // result in size_t and compare against npos instead of testing >= 0.
    size_t t1 = str.find(start);
    if (t1 == string::npos) {
        return "";
    }
    // String 'start' exists in str. Now find the next closest string 'end'.
    t1 += start.length();
    size_t t2 = str.find(end, t1);
    if (t2 == string::npos) {
        return "";
    }
    // The next closest 'end' exists. Extract the substring in between.
    return str.substr(t1, t2 - t1);
}
Cheers
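A possible variation on the .find/.substr approach takes the input string as a parameter rather than reading a global, and reports failure separately from an empty match. The name substring_between and the bool-plus-output-parameter interface are assumptions for this sketch, not something from the thread.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: writes to `out` the text between the first `start`
// in `text` and the nearest `end` after it. Returns false, leaving `out`
// unchanged, if either delimiter is missing.
bool substring_between(const std::string& text,
                       const std::string& start,
                       const std::string& end,
                       std::string& out) {
    const std::size_t s = text.find(start);
    if (s == std::string::npos) return false;   // start delimiter not found
    const std::size_t from = s + start.size();  // first character after start
    const std::size_t e = text.find(end, from);
    if (e == std::string::npos) return false;   // no end delimiter after start
    out = text.substr(from, e - from);
    return true;
}
```

Called as substring_between(str, "<span id='1'>", "</span>", out), this should give extract_me-1 for the question's input. Surrounding whitespace is kept as it appears, so the "lorem" to "<span id='1'>" example yields " ipsum " with its spaces; trimming, if wanted, would be a separate step.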