简体   繁体   中英

Is there a better way than String.Replace to remove backspaces from a string?

I have a string read from another source such as "\\b\\bfoo\\bx". In this case, it would translate to the word "fox" as the first 2 \\b's are ignored, and the last 'o' is erased, and then replaced with 'x'. Also another case would be "patt\\b\\b\\b\\b\\b\\b\\b\\b\\b\\bfoo" should be translated to "foo"

I have come up with something using String.Replace, but it is complex and I am worried it is not working correctly, also it is creating a lot of new string objects which I would like to avoid.

Any ideas?

Probably the easiest is to just iterate over the entire string. Given your inputs, the following code does the trick in 1-pass

public string ReplaceBackspace(string hasBackspace)
{
    if( string.IsNullOrEmpty(hasBackspace) )
        return hasBackspace;

    StringBuilder result = new StringBuilder(hasBackspace.Length);
    foreach (char c in hasBackspace)
    {
        if (c == '\b')
        {
            if (result.Length > 0)
                result.Length--;
        }
        else
        {
            result.Append(c);
        }
    }
    return result.ToString();
}

The way I would do it is low-tech, but easy to understand.

Create a stack of characters. Then iterate through the string from beginning to end. If the character is a normal character (non-slash), push it onto the stack. If it is a slash, and the next character is a 'b', pop the top of the stack. If the stack is empty, ignore it.

At the end, pop each character in turn, add it to a StringBuilder, and reverse the result.

Regular expressions version:

var data = @"patt\b\b\b\b\b\b\b\b\b\bfoo";
var regex = new Regex(@"(^|[^\\b])\\b");

while (regex.IsMatch(data))
{
    data = regex.Replace(data, "");
}

Optimized version (and this one works with backspace '\\b' and not with string "\\b"):

var data = "patt\b\b\b\b\b\b\b\b\b\bfoo";
var regex = new Regex(@"[^\x08]\x08", RegexOptions.Compiled);

while (data.Contains('\b'))
{
    data = regex.Replace(data.TrimStart('\b'), "");
}
public static string ProcessBackspaces(string source)
{
    char[] buffer = new char[source.Length];
    int idx = 0;

    foreach (char c in source)
    {
        if (c != '\b')
        {
            buffer[idx] = c;
            idx++;
        }
        else if (idx > 0)
        {
            idx--;
        }
    }

    return new string(buffer, 0, idx);
}

EDIT

I've done a quick, rough benchmark of the code posted in answers so far (processing the two example strings from the question, one million times each):

 ANSWER                 | TIME (ms)
------------------------|-----------
 Luke (this one)        |       318
 Alexander Taran        |       567
 Robert Paulson         |       683
 Markus Nigbur          |      2100
 Kamarey (new version)  |      7075
 Kamarey (old version)  |     30902

You could iterate through the string backward, making a character array as you go. Every time you hit a backspace, increment a counter, and every time you hit a normal character, skip it if your counter is non-zero and decrement the counter.

I'm not sure what the best C# data structure is to manage this and then be able to get the string in the right order afterward quickly. StringBuilder has an Insert method but I don't know if it will be performant to keep inserting characters at the start or not. You could put the characters in a stack and hit ToArray() at the end -- that might or might not be faster.

String myString = "patt\b\b\b\b\b\b\b\b\b\bfoo";
      List<char> chars = myString.ToCharArray().ToList();
      int delCount = 0;

      for (int i = chars.Count -1; i >= 0; i--)
      {
        if (chars[i] == '\b')
        {
          delCount++;
          chars.RemoveAt(i);
        } else {
          if (delCount > 0 && chars[i] != null) {
            chars.RemoveAt(i);
            delCount--;
          }
        }
      }

i'd go like this: code is not tested

char[] result = new char[input.Length()];
int r =0;
for (i=0; i<input.Length(); i++){
if (input[i] == '\b'  && r>0) r--;
 else result[r]=input[i];

}

string resultsring = result.take(r);

创建一个StringBuilder并复制除退格字符之外的所有内容。

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM