Splitting a string into multiple strings with multiple delimiters without removing them?
I use the Boost framework, so it could be helpful, but I haven't found a suitable function.
For ordinary fast splitting I can use:
string str = ...;
vector<string> strs;
boost::split(strs, str, boost::is_any_of("mM"));
but it removes the m and M characters.
I also can't simply use a regexp because it searches the string for the longest value which meets a defined pattern.
PS There are a lot of similar questions, but they describe this implementation in other programming languages only.
Untested, but rather than using vector<string>, you could try a vector<boost::iterator_range<std::string::iterator>>, so that you get a pair of iterators into the main string for each token. Then iterate from (start of range - 1), as long as the start of the range is not begin() of the main string, to the end of the range.
EDIT: Here is an example:
#include <iostream>
#include <iterator>   // std::prev
#include <string>
#include <vector>
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
#include <boost/range/iterator_range.hpp>

int main()
{
    std::string str = "FooMBarMSFM";
    // Each token is a pair of iterators into str, not a copy.
    std::vector<boost::iterator_range<std::string::iterator>> tokens;
    boost::split(tokens, str, boost::is_any_of("mM"));
    for (auto r : tokens)
    {
        std::string b(r.begin(), r.end());
        std::cout << b << std::endl;
        // Every token except the first is preceded by its delimiter in str.
        if (r.begin() != str.begin())
        {
            std::string bm(std::prev(r.begin()), r.end());
            std::cout << "With token: [" << bm << "]" << std::endl;
        }
    }
}
Your need is beyond the concept of split. If you want to keep 'm' or 'M', you could write a special split using strstr, strchr, strtok or find. You could change some code to produce a flexible split function. Here is an example:
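#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Splits src in place; dest receives pointers into src, num the token count.
   Note: strtok overwrites each separator with '\0', so the tokens returned
   here do not keep the delimiter. This is the starting point to adapt. */
void split(char *src, const char *separator, char **dest, int *num)
{
    char *pNext;
    int count = 0;
    *num = 0; /* defined result even on early return */
    if (src == NULL || strlen(src) == 0) return;
    if (separator == NULL || strlen(separator) == 0) return;
    pNext = strtok(src, separator);
    while (pNext != NULL)
    {
        *dest++ = pNext;
        ++count;
        pNext = strtok(NULL, separator);
    }
    *num = count;
}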
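For the "special split" built on find, a minimal C++ sketch could look like the following. It uses std::string::find_first_of from the standard library; the helper name split_keep_delims is made up for illustration and is not part of Boost or the standard library. It assumes each delimiter should stay at the front of the piece it starts.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name invented for this sketch): cuts `text` just
// before every character listed in `delims`, so each delimiter is kept
// as the first character of the piece that follows the cut.
std::vector<std::string> split_keep_delims(const std::string& text,
                                           const std::string& delims)
{
    std::vector<std::string> pieces;
    std::size_t start = 0;
    while (start < text.size()) {
        // Search from start + 1 so a delimiter sitting at `start`
        // remains part of the current piece.
        std::size_t next = text.find_first_of(delims, start + 1);
        if (next == std::string::npos) {
            next = text.size();
        }
        pieces.push_back(text.substr(start, next - start));
        start = next;
    }
    return pieces;
}
```

Calling split_keep_delims("FooMBarMSFM", "mM") should yield the pieces Foo, MBar, MSF and M. An empty input produces an empty vector.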
Besides, you could try boost::regex.
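As a rough sketch of the regex route, the snippet below uses std::regex from the standard library instead of boost::regex (a substitution made here so the example needs no extra dependency; the same pattern should be usable with either). The function name split_before_m and the pattern are assumptions for illustration: each match is an optional m/M followed by a run of non-delimiter characters, or a lone m/M.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name invented for this sketch): returns the pieces
// of `text`, each starting with the 'm' or 'M' that preceded it, if any.
std::vector<std::string> split_before_m(const std::string& text)
{
    // Alternative 1: optional delimiter, then one or more non-delimiters.
    // Alternative 2: a delimiter with nothing after it before the next one.
    static const std::regex piece("[mM]?[^mM]+|[mM]");

    std::vector<std::string> pieces;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), piece); it != end; ++it) {
        pieces.push_back(it->str());
    }
    return pieces;
}
```

Because every match stops before the next delimiter, iterating over all matches avoids the "longest value" problem of a single search. For other delimiter sets the character classes would have to be rebuilt, with any regex metacharacters escaped.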
My current solution is the following (but it is not universal and looks too complex).
I choose one character which cannot appear in this string. In my case it is '|'.
string str = ...;
vector<string> strs;
boost::split(strs, str, boost::is_any_of("m"));
str = boost::join(strs, "|m");
boost::split(strs, str, boost::is_any_of("M"));
str = boost::join(strs, "|M");
if (boost::iequals(str.substr(0, 1), "|")) {
str = str.substr(1);
}
boost::split(strs, str, boost::is_any_of("|"));
I add "|" before each of the symbols m/M, except at the very first position in the string. Then I split the string into substrings on this extra character, which removes it again.