C++: Get the substring between custom delimiters without using regex
I have a simple string of the format:
"lorem ipsum <span id='1'>extract_me-1</span> dolor
sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum
sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem"
Now I need to extract the strings between custom delimiters that I specify, for example:
Substring("<span id='1'>","</span>") = extract_me-1
Substring("<span id='2'>","</span>") = extract_me-2
Substring("lorem","<span id='1'>") = ipsum
Substring("extract_me-1","dolor") = </span>
I've accomplished this task using regex:
std::string str="lorem ipsum <span id='1'>extract_me-1</span> dolor sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem";
std::smatch match;
std::regex rgx ("<span id='1'>(.*?)</span>");
if (std::regex_search(str, match, rgx)) {
    // First substring
    std::cout << match.str(1);
}
Is there any way to do this without regex? I've tried using substr a couple of times, but to no avail. Any help is highly appreciated, thanks.
EDIT: the input str is not complete HTML, just text with a few arbitrary tags. I only need the substring from the start delimiter to the next closest end delimiter (yes, even when there are nested or repeated tags of the same span).
You need to check the return value of each of the str.find() calls, like I do for the first one, but this is the gist of it. You might want to search for just the tag and then the id, but then you also need to handle a tag that has no id:
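#include <iostream>
#include <string>
int main() {
    const std::string str="lorem ipsum <span id='1'>extract_me-1</span> dolor sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem";
    const std::string tag = "<span id='";
    std::string r = "";
    for(size_t pos = 0;;) {
        // Find the next opening tag; stop when there are no more.
        size_t tag_pos = str.find(tag, pos);
        if(tag_pos == str.npos) {
            break;
        }
        // NOTE: the find() calls below are not checked against npos;
        // real code should check each of them the same way as tag_pos.
        size_t id_pos = tag_pos + tag.size();          // first character of the id
        size_t id_pos2 = str.find("'", id_pos);        // closing quote of the id
        size_t txt_pos = str.find(">", id_pos2) + 1;   // first character of the text
        size_t txt_pos2 = str.find("<", txt_pos);      // start of the closing tag
        r += "txt";
        r += str.substr(id_pos, id_pos2 - id_pos);
        r += " = ";
        r += str.substr(txt_pos, txt_pos2 - txt_pos);
        r += "\n";
        pos = txt_pos2;
    }
    // Print the collected "txt<id> = <text>" lines.
    std::cout << r;
}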
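The remark above about searching for the tag first and then the id could be sketched as a small function. This is only a sketch, not part of the original answer: the name span_text_by_id and its parameters are made up for illustration, and it assumes C++17 (std::string_view, std::optional), single-quoted id attributes, and no '>' inside attribute values. Every find() result is checked before it is used.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (assumption, not from the answer): returns the text that
// follows the first <span ...> tag whose id attribute equals wanted_id, or
// std::nullopt if there is no such tag or the input is malformed.
std::optional<std::string> span_text_by_id(std::string_view text,
                                           std::string_view wanted_id) {
    constexpr std::string_view tag = "<span";
    constexpr std::string_view attr = "id='";
    constexpr auto npos = std::string_view::npos;

    std::size_t pos = 0;
    while ((pos = text.find(tag, pos)) != npos) {
        const std::size_t tag_end = text.find('>', pos);
        if (tag_end == npos) return std::nullopt;  // unterminated tag

        // Only look at the attributes of this one tag.
        const std::string_view attrs =
            text.substr(pos + tag.size(), tag_end - pos - tag.size());
        const std::size_t id_begin = attrs.find(attr);
        if (id_begin != npos) {  // tags without an id are simply skipped
            const std::size_t val_begin = id_begin + attr.size();
            const std::size_t val_end = attrs.find('\'', val_begin);
            if (val_end != npos &&
                attrs.substr(val_begin, val_end - val_begin) == wanted_id) {
                const std::size_t txt_begin = tag_end + 1;
                const std::size_t txt_end = text.find('<', txt_begin);
                if (txt_end == npos) return std::nullopt;  // no closing tag
                return std::string(text.substr(txt_begin, txt_end - txt_begin));
            }
        }
        pos = tag_end + 1;  // continue after this tag
    }
    return std::nullopt;
}
```

With the str from the question, span_text_by_id(str, "2") would give extract_me-2, and an id that does not occur gives an empty optional instead of garbage from an unchecked npos.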
I was able to solve this using .find and .substr. It turned out to be easier than I thought:
#include <string>
#include <iostream>
using namespace std;
string str="lorem ipsum <span id='1'>extract_me-1</span> dolor sit amet <span id='2'>extract_me-2</span> adispicing consequit lorem ipsum sit amet <span id='3'>extract_me-3</span> adispicing dolor lorem";
string subStrng(const string& start, const string& end);
int main() {
    string txt1 = subStrng("<span id='1'>","</span>");
    string txt2 = subStrng("<span id='2'>","</span>");
    string txt3 = subStrng("<span id='3'>","</span>");
    cout<<txt1<<"\n"<<txt2<<"\n"<<txt3;
    return 0;
}
// Substring func.
// Returns the text between the first 'start' and the next closest 'end' in str,
// or an empty string if either delimiter is missing.
string subStrng(const string& start, const string& end){
    // find() returns string::npos (an unsigned value) when nothing is found,
    // so compare against npos rather than storing the result in an int.
    size_t t1 = str.find(start);
    if(t1 == string::npos){
        return "";
    }
    // string 'start' exists in str.
    // Now, let's find the next closest string 'end'
    t1 += start.length();
    size_t t2 = str.find(end, t1);
    if(t2 == string::npos){
        return "";
    }
    // next closest 'end' exists in str.
    // Now, let's extract the substring in between
    return str.substr(t1, t2 - t1);
}
Cheers
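For reuse on other inputs, the same find/substr idea could be written without the global string. The following is a sketch under assumptions rather than the code from the answer: the name substring_between is hypothetical, it needs C++17, and it returns std::optional so that "delimiter not found" can be told apart from "found, but nothing in between".

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (assumption, not from the answer): returns the text
// between the first occurrence of 'start' and the next closest 'end' after it,
// or std::nullopt if either delimiter is missing.
std::optional<std::string> substring_between(std::string_view text,
                                             std::string_view start,
                                             std::string_view end) {
    constexpr auto npos = std::string_view::npos;

    const std::size_t start_pos = text.find(start);
    if (start_pos == npos) return std::nullopt;

    // Search for 'end' only after the end of 'start'.
    const std::size_t begin = start_pos + start.size();
    const std::size_t end_pos = text.find(end, begin);
    if (end_pos == npos) return std::nullopt;

    return std::string(text.substr(begin, end_pos - begin));
}
```

With the str from the question, substring_between(str, "<span id='1'>", "</span>") would give extract_me-1. Note that surrounding whitespace is kept, so substring_between(str, "lorem", "<span id='1'>") gives " ipsum " with a space on each side.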