How to remove repetitive characters from std::string
I have a std::string like this:
std::string fileName;
where fileName is something like /tmp/fs////js//config.js. It comes from somewhere else and I need to store it.
But when I store it, I need to remove the extra '/' characters from the path; basically I need only one separator between directory names and file names.
I can remove these by iterating over the string one char at a time and comparing each char with the next one, but it's not very efficient.
Can anyone suggest an efficient way to do it?
Removing duplicate adjacent elements is a job for std::unique. You need to provide your own predicate in this case, but it's O(n) and dead simple.
#include <algorithm>
#include <string>

// True only when both adjacent characters are slashes.
struct both_slashes {
    bool operator()(char a, char b) const {
        return a == '/' && b == '/';
    }
};

std::string path("/tmp/fs////js//config.js");
path.erase(std::unique(path.begin(), path.end(), both_slashes()), path.end());
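You won't find anything more efficient than this. Think about it: you need to remove consecutive duplicate characters, which means that even in the best case you have to look at every character at least once.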
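The same idea can be wrapped in a small function with a lambda instead of a named functor. This is only a sketch: the name collapse_separators and the sep parameter are made up for illustration and do not come from the answer above, and it assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not from the original answer): collapses every run of
// `sep` in `path` down to a single `sep`, in place.
void collapse_separators(std::string& path, char sep = '/') {
    // std::unique keeps the first element of each run for which the predicate
    // returns true and yields the new logical end of the range.
    std::string::iterator new_end = std::unique(
        path.begin(), path.end(),
        [sep](char a, char b) { return a == sep && b == sep; });
    // Drop the leftover tail between the logical end and the real end.
    path.erase(new_end, path.end());
}
```

Calling collapse_separators(fileName) on the path from the question should leave /tmp/fs/js/config.js, and doubled letters inside names are untouched because the predicate only matches two separators.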
I think std::unique will work even though your string is not sorted, because all it removes is consecutive duplicates.
Of course, on its own it won't know that / is a special character here, so you may find that file names containing double letters also get unexpectedly modified to a single letter, possibly annoyingly.
It is also O(N), but you can't avoid that.
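To make the double-letter problem concrete, here is a small sketch comparing std::unique with its default equality against a slash-only predicate. The function names naive_unique and slash_only_unique, and the sample path in the comments, are invented for this illustration.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: std::unique with the default equality comparison.
// It collapses ANY run of equal characters, not just slashes.
std::string naive_unique(std::string s) {
    s.erase(std::unique(s.begin(), s.end()), s.end());
    return s;
}

// Hypothetical helper: std::unique with a predicate that only matches
// two adjacent slashes, so other repeated characters survive.
std::string slash_only_unique(std::string s) {
    s.erase(std::unique(s.begin(), s.end(),
                        [](char a, char b) { return a == '/' && b == '/'; }),
            s.end());
    return s;
}

// naive_unique("/tmp//accounts/book.js")      yields "/tmp/acounts/bok.js"
// slash_only_unique("/tmp//accounts/book.js") yields "/tmp/accounts/book.js"
```

The first result shows why the default comparison is unsuitable for paths: "accounts" and "book" lose a letter each.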
One algorithm that will work well is std::remove_if, because you can pass in your own "functor" that keeps state, so it knows what the last character was.
struct slash_pred
{
    char last_char;

    slash_pred()
        : last_char('\0') // or whatever, as long as it's not '/'
    {
    }

    bool operator()(char ch)
    {
        // Remove a slash only when the previous character was also a slash.
        bool remove = (ch == '/') && (last_char == '/');
        last_char = ch;
        return remove;
    }
};

path.erase(std::remove_if(path.begin(), path.end(), slash_pred()),
           path.end());
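This is O(N) and should work, with one caveat: the standard allows algorithms to copy the predicate they are given, so a functor that keeps state like this is not guaranteed to behave as intended on every implementation.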
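Because standard algorithms are allowed to copy their predicates, a functor that remembers the previous character may not see every character in order when passed to std::remove_if. A minimal sketch that avoids the question entirely keeps that state in a local variable of a hand-written read/write loop. The name squeeze_slashes is hypothetical, and this is a swapped-in technique rather than the answer's own code.

```cpp
#include <string>

// Hypothetical helper: collapses consecutive '/' characters in place.
// The "previous character" state lives in a local variable, so nothing
// depends on how an algorithm copies a predicate.
void squeeze_slashes(std::string& path) {
    std::string::size_type write = 0; // next position to write a kept char
    char last_kept = '\0';            // any initial value other than '/'
    for (std::string::size_type read = 0; read < path.size(); ++read) {
        const char ch = path[read];
        if (ch == '/' && last_kept == '/') {
            continue; // skip a slash that directly follows a kept slash
        }
        path[write++] = ch;
        last_kept = ch;
    }
    path.resize(write); // cut off the leftover tail
}
```

It still visits each character once and uses no extra memory beyond two locals.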
For the dissenters who think remove_if might be O(N^2): it might be implemented like this, which is a single pass:
template< typename ForwardIterator, typename Pred >
ForwardIterator remove_if( ForwardIterator read, ForwardIterator end, Pred pred )
{
    ForwardIterator write = read; // outside the loop as we return it
    for( ; read != end; ++read )
    {
        if( !pred( *read ) )
        {
            if( write != read ) // avoid self-assign
            {
                *write = *read;
            }
            ++write;
        }
    }
    return write;
}
O(n) in time + O(n) in memory:
#include <string>

void clean_path(std::string& path) {
    std::string new_path;
    const char sep = '/';
    for (char c : path) {
        // Skip a separator that directly follows one already kept.
        if (c == sep && !new_path.empty() && new_path.back() == sep)
            continue;
        new_path.push_back(c);
    }
    path = new_path;
}
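If a copy is acceptable anyway, the same result can probably be had from std::unique_copy, which writes the de-duplicated characters straight into a new string. The following is a sketch under that assumption; the name cleaned_path is made up here, and the reserve call is an optional tweak to avoid repeated reallocations.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: returns a copy of `path` with runs of '/' collapsed.
// The input string is left unchanged.
std::string cleaned_path(const std::string& path) {
    std::string result;
    result.reserve(path.size()); // the result is never longer than the input
    std::unique_copy(path.begin(), path.end(), std::back_inserter(result),
                     [](char a, char b) { return a == '/' && b == '/'; });
    return result;
}
```

This keeps the O(n) time and O(n) memory cost of the loop above but leaves the original string available to the caller.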