
Memory Efficiency and Performance of String.Replace .NET Framework

 string str1 = "12345ABC...\\...ABC100000"; 
 // Hypothetically huge string of 100000 + Unicode Chars
 str1 = str1.Replace("1", string.Empty);
 str1 = str1.Replace("22", string.Empty);
 str1 = str1.Replace("656", string.Empty);
 str1 = str1.Replace("77ABC", string.Empty);

 // ...  this replace anti-pattern might happen with up to 50 consecutive lines of code.

 str1 = str1.Replace("ABCDEFGHIJD", string.Empty);

I have inherited some code that does the same as the snippet above. It takes a huge string and replaces (removes) constant smaller strings from the large string.

I believe this is a very memory intensive process given that new large immutable strings are being allocated in memory for each replace, awaiting death via the GC.

1. What is the fastest way of replacing these values, ignoring memory concerns?

2. What is the most memory efficient way of achieving the same result?

I am hoping that these are the same answer!

Practical solutions that fit somewhere in between these goals are also appreciated.

Assumptions:

  • All replacements are constant and known in advance
  • Underlying characters do contain some unicode [non-ascii] chars

All characters in a .NET string are "unicode chars". Do you mean they're non-ASCII? That shouldn't make any difference, unless you run into composition issues, e.g. an "e + acute accent" not being replaced when you try to replace an "e acute".
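To make the composition issue concrete, here is a minimal sketch. The class and method names (CompositionDemo, RemoveNormalized) are made up for illustration; the only library calls are String.Normalize and String.Replace. It assumes that normalizing both strings to form C (NFC) is acceptable for your data.

```csharp
using System;
using System.Text;

public static class CompositionDemo
{
    // Hypothetical helper: normalizes both strings to NFC so that a
    // decomposed "e" + combining acute (U+0065 U+0301) and a precomposed
    // "e acute" (U+00E9) compare equal, then removes the target ordinally.
    public static string RemoveNormalized(string input, string target)
    {
        string normalizedInput = input.Normalize(NormalizationForm.FormC);
        string normalizedTarget = target.Normalize(NormalizationForm.FormC);
        return normalizedInput.Replace(normalizedTarget, string.Empty);
    }
}
```

With a decomposed input such as "cafe\u0301", a plain "cafe\u0301".Replace("\u00e9", "") leaves the string unchanged, because String.Replace(string, string) does an ordinal search. CompositionDemo.RemoveNormalized("cafe\u0301", "\u00e9") returns "caf".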

You could try using a regular expression with Regex.Replace, or StringBuilder.Replace. Here's sample code doing the same thing with both:

using System;
using System.Text;
using System.Text.RegularExpressions;

class Test
{
    static void Main(string[] args)
    {
        string original = "abcdefghijkl";

        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);

        string removedByRegex = regex.Replace(original, "");
        string removedByStringBuilder = new StringBuilder(original)
            .Replace("a", "")
            .Replace("c", "")
            .Replace("e", "")
            .Replace("g", "")
            .Replace("i", "")
            .Replace("k", "")
            .ToString();

        Console.WriteLine(removedByRegex);
        Console.WriteLine(removedByStringBuilder);
    }
}

I wouldn't like to guess which is more efficient - you'd have to benchmark with your specific application. The regex way may be able to do it all in one pass, but that pass will be relatively CPU-intensive compared with each of the many replaces in StringBuilder.
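For the multi-character tokens in the question, the one-pass regex idea could be sketched as follows. RegexRemover and Build are hypothetical names, not part of the BCL. The sketch assumes the tokens are literal strings, so each one is escaped before being joined into an alternation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexRemover
{
    // Hypothetical helper: builds one alternation pattern from literal tokens.
    // Longer tokens are listed first so that, at a given position, a long
    // token is tried before a shorter one.
    public static Regex Build(IEnumerable<string> tokens)
    {
        string pattern = string.Join("|",
            tokens.OrderByDescending(t => t.Length)
                  .Select(t => Regex.Escape(t)));
        return new Regex(pattern, RegexOptions.Compiled);
    }
}
```

Usage would look like RegexRemover.Build(new[] { "1", "22", "656", "77ABC" }).Replace("12345ABC", string.Empty), which yields "2345ABC". Note that a single pass is not always equivalent to the chained calls: for the input "212", removing "1" and then "22" in sequence leaves an empty string, while the single-pass regex leaves "22", because it never rescans text created by an earlier removal.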

If you want to be really fast, and I mean really fast, you'll have to look beyond the StringBuilder and just write well-optimized code.

One thing your computer doesn't like to do is branching. If you can write a replace method which operates on a fixed array (char *) and doesn't branch, you get great performance.

The replace operation searches for a sequence of characters, and if it finds any such substring it replaces it. In effect you'll copy the string and, while doing so, perform the find and replace.

You'll rely on these functions for picking the index of some buffer to read/write. The goal is to perform the replace method such that when nothing has to change you write junk instead of branching.

You should be able to complete this without a single if statement, and remember to use unsafe code. Otherwise you'll be paying for index checking on every element access.

unsafe
{
    fixed( char * p = myStringBuffer )
    {
        // Do fancy string manipulation here
    }
}

I've written code like this in C# for fun and seen significant performance improvements, almost a 300% speed-up for find and replace. While the .NET BCL (base class library) performs quite well, it is riddled with branching constructs and exception handling, which will slow down your code if you use the built-in stuff. Also, these optimizations, while perfectly sound, are not performed by the JIT compiler, and you'll have to run the code as a release build without any debugger attached to observe the massive performance gain.

I could provide you with more complete code but it is a substantial amount of work. However, I can guarantee you that it will be faster than anything else suggested so far.

1. What is the fastest way of replacing these values, ignoring memory concerns?

The fastest way is to build a custom component that's specific to your use case. As of .NET 4.6, there's no class in the BCL designed for multiple string replacements.

If you NEED something fast out of the BCL, StringBuilder is the fastest BCL component for simple string replacement. The source code can be found here : It's pretty efficient for replacing a single string. Only use Regex if you really need the pattern-matching power of regular expressions. It's slower and a little more cumbersome, even when compiled.

2. What is the most memory efficient way of achieving the same result?

The most memory-efficient way is to perform a filtered stream copy from the source to the destination (explained below). Memory consumption will be limited to your buffer, however this will be more CPU intensive; as a rule of thumb, you're going to trade CPU performance for memory consumption.

Technical Details

String replacements are tricky. Even when performing a string replacement in a mutable memory space (such as with StringBuilder), it's expensive. If the replacement string is a different length than the original string, you're going to be relocating every character following the replacement string to keep the whole string contiguous. This results in a LOT of memory writes and, even in the case of StringBuilder, causes you to rewrite most of the string in memory on every call to Replace.

So what is the fastest way to do string replacements? Write the new string in a single pass: don't let your code go back and rewrite anything. Writes are more expensive than reads. You're going to have to code this yourself for best results.
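As a rough illustration of the single-pass idea, here is a sketch for the removal case from the question. SinglePassRemover is a hypothetical name, and the first-match-wins rule is an assumption of this sketch, not something the answer specifies. The input is read once and every character is written at most once.

```csharp
using System;
using System.Text;

public static class SinglePassRemover
{
    // Copies input into a new buffer exactly once, skipping any token that
    // starts at the current position. Tokens are tried in array order and
    // the first one that matches wins.
    public static string RemoveAll(string input, string[] tokens)
    {
        var output = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            int matched = 0;
            foreach (string token in tokens)
            {
                // CompareOrdinal returns non-zero when fewer than
                // token.Length characters remain, so no bounds check is needed.
                if (token.Length > 0 &&
                    string.CompareOrdinal(input, i, token, 0, token.Length) == 0)
                {
                    matched = token.Length;
                    break;
                }
            }

            if (matched > 0)
                i += matched;               // skip the token, write nothing
            else
                output.Append(input[i++]);  // copy one character
        }
        return output.ToString();
    }
}
```

This checks every token at every position, so the cost grows with the number of tokens. For around 50 tokens, a lookup keyed on the first character (or a trie) would likely reduce that, but that is beyond this sketch.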

High-Memory Solution

The class I've written generates strings based on templates. I place tokens ($ReplaceMe$) in a template which marks places where I want to insert a string later. I use it in cases where XmlWriter is too onerous for XML that's largely static and repetitive, and I need to produce large XML (or JSON) data streams.

The class works by slicing the template up into parts and placing each part into a numbered collection. Parameters are also enumerated. The order in which the parts and parameters are inserted into a new string is stored in an integer array. When a new string is generated, the parts and parameters are picked from the collection in that order and written to the output.

It's neither fully-optimized nor is it bulletproof, but it works great for generating very large data streams from templates.

Low-Memory Solution

You'll need to read small chunks from the source string into a buffer, search the buffer using an optimized search algorithm, and then write the new string to the destination stream / string. There are a lot of potential caveats here, but it would be memory efficient and a better solution for source data that's dynamic and can't be cached, such as whole-page translations or source data that's too large to reasonably cache. I don't have a sample solution for this handy.
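A minimal sketch of such a chunked filter, for the removal case only, might look like this. StreamingRemover, Filter and MatchAt are hypothetical names, and the naive per-position token check stands in for the "optimized search algorithm" mentioned above. The important detail is the carried-over tail: the last few characters of each chunk are held back until the next chunk arrives, so a token that straddles a chunk boundary is still found.

```csharp
using System;
using System.IO;

public static class StreamingRemover
{
    // Reads source in chunks and writes it to destination without the tokens.
    // Memory use is bounded by chunkSize plus the longest token length.
    public static void Filter(TextReader source, TextWriter destination,
                              string[] tokens, int chunkSize = 4096)
    {
        int maxLen = 1;
        foreach (string t in tokens) maxLen = Math.Max(maxLen, t.Length);

        // Window = undecided tail from the previous round + one fresh chunk.
        char[] window = new char[chunkSize + maxLen];
        int filled = 0;
        bool endOfInput = false;

        while (!endOfInput || filled > 0)
        {
            if (!endOfInput)
            {
                int read = source.Read(window, filled, chunkSize);
                if (read == 0) endOfInput = true;
                filled += read;
            }

            // Decide a position only if the longest token would fit entirely,
            // unless there is no more input to wait for.
            int limit = endOfInput ? filled : filled - maxLen + 1;
            int i = 0;
            while (i < limit)
            {
                int matched = MatchAt(window, i, filled, tokens);
                if (matched > 0) i += matched;
                else destination.Write(window[i++]);
            }

            // Move the undecided tail to the front of the window.
            int remaining = filled - i;
            Array.Copy(window, i, window, 0, remaining);
            filled = remaining;
        }
    }

    // Returns the length of the first token that matches at pos, or 0.
    private static int MatchAt(char[] window, int pos, int filled, string[] tokens)
    {
        foreach (string token in tokens)
        {
            if (token.Length == 0 || pos + token.Length > filled) continue;
            int k = 0;
            while (k < token.Length && window[pos + k] == token[k]) k++;
            if (k == token.Length) return token.Length;
        }
        return 0;
    }
}
```

Called with a StringReader over "12345ABC77ABCxyz656", the tokens from the question and a deliberately tiny chunkSize of 4, the StringWriter ends up holding "2345ABCxyz". In real use the reader and writer would typically wrap file or network streams.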

Sample Code

Desired Results

<DataTable source='Users'>
  <Rows>
    <Row id='25' name='Administrator' />
    <Row id='29' name='Robert' />
    <Row id='55' name='Amanda' />
  </Rows>
</DataTable>

Template

<DataTable source='$TableName$'>
  <Rows>
    <Row id='$0$' name='$1$'/>
  </Rows>
</DataTable>

Test Case

using System;
using System.IO;

class Program
{
  static string[,] _users =
  {
    { "25", "Administrator" },
    { "29", "Robert" },
    { "55", "Amanda" },
  };

  static StringTemplate _documentTemplate = new StringTemplate(@"<DataTable source='$TableName$'><Rows>$Rows$</Rows></DataTable>");
  static StringTemplate _rowTemplate = new StringTemplate(@"<Row id='$0$' name='$1$' />");
  static void Main(string[] args)
  {
    _documentTemplate.SetParameter("TableName", "Users");
    _documentTemplate.SetParameter("Rows", GenerateRows);

    Console.WriteLine(_documentTemplate.GenerateString(4096));
    Console.ReadLine();
  }

  private static void GenerateRows(StreamWriter writer)
  {
    for (int i = 0; i <= _users.GetUpperBound(0); i++)
      _rowTemplate.GenerateString(writer, _users[i, 0], _users[i, 1]);
  }
}

StringTemplate Source

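using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public class StringTemplate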
{
  private string _template;
  private string[] _parts;
  private int[] _tokens;
  private string[] _parameters;
  private Dictionary<string, int> _parameterIndices;
  private string[] _replaceGraph;
  private Action<StreamWriter>[] _callbackGraph;
  private bool[] _graphTypeIsReplace;

  public string[] Parameters
  {
    get { return _parameters; }
  }

  public StringTemplate(string template)
  {
    _template = template;
    Prepare();
  }

  public void SetParameter(string name, string replacement)
  {
    int index = _parameterIndices[name] + _parts.Length;
    _replaceGraph[index] = replacement;
    _graphTypeIsReplace[index] = true;
  }

  public void SetParameter(string name, Action<StreamWriter> callback)
  {
    int index = _parameterIndices[name] + _parts.Length;
    _callbackGraph[index] = callback;
    _graphTypeIsReplace[index] = false;
  }

  private static Regex _parser = new Regex(@"\$(\w{1,64})\$", RegexOptions.Compiled);
  private void Prepare()
  {
    _parameterIndices = new Dictionary<string, int>(64);
    List<string> parts = new List<string>(64);
    List<object> tokens = new List<object>(64);
    int param_index = 0;
    int part_start = 0;

    foreach (Match match in _parser.Matches(_template))
    {
      if (match.Index > part_start)
      {
        //Add Part
        tokens.Add(parts.Count);
        parts.Add(_template.Substring(part_start, match.Index - part_start));
      }


      //Add Parameter
      var param = _template.Substring(match.Index + 1, match.Length - 2);
      if (!_parameterIndices.TryGetValue(param, out param_index))
        _parameterIndices[param] = param_index = _parameterIndices.Count;
      tokens.Add(param);

      part_start = match.Index + match.Length;
    }

    //Add last part, if it exists.
    if (part_start < _template.Length)
    {
      tokens.Add(parts.Count);
      parts.Add(_template.Substring(part_start, _template.Length - part_start));
    }

    //Set State
    _parts = parts.ToArray();
    _tokens = new int[tokens.Count];

    int index = 0;
    foreach (var token in tokens)
    {
      var parameter = token as string;
      if (parameter == null)
        _tokens[index++] = (int)token;
      else
        _tokens[index++] = _parameterIndices[parameter] + _parts.Length;
    }

    _parameters = _parameterIndices.Keys.ToArray();
    int graphlen = _parts.Length + _parameters.Length;
    _callbackGraph = new Action<StreamWriter>[graphlen];
    _replaceGraph = new string[graphlen];
    _graphTypeIsReplace = new bool[graphlen];

    for (int i = 0; i < _parts.Length; i++)
    {
      _graphTypeIsReplace[i] = true;
      _replaceGraph[i] = _parts[i];
    }
  }

  public void GenerateString(Stream output)
  {
    var writer = new StreamWriter(output);
    GenerateString(writer);
    writer.Flush();
  }

  public void GenerateString(StreamWriter writer)
  {
    //Resolve graph
    foreach(var token in _tokens)
    {
      if (_graphTypeIsReplace[token])
        writer.Write(_replaceGraph[token]);
      else
        _callbackGraph[token](writer);
    }
  }

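  public void SetReplacements(params string[] parameters)
  {
    // Numeric parameter names ($0$, $1$, ...) select the positional argument
    // with that index; named parameters are left untouched.
    int index;
    for (int i = 0; i < _parameters.Length; i++)
    {
      if (!Int32.TryParse(_parameters[i], out index))
        continue;
      else
        SetParameter(_parameters[i], parameters[index]);
    }
  }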

  public string GenerateString(int bufferSize = 1024)
  {
    using (var ms = new MemoryStream(bufferSize))
    {
      GenerateString(ms);
      ms.Position = 0;
      using (var reader = new StreamReader(ms))
        return reader.ReadToEnd();
    }
  }

  public string GenerateString(params string[] parameters)
  {
    SetReplacements(parameters);
    return GenerateString();
  }

  public void GenerateString(StreamWriter writer, params string[] parameters)
  {
    SetReplacements(parameters);
    GenerateString(writer);
  }
}

StringBuilder: http://msdn.microsoft.com/en-us/library/2839d5h5.aspx

The performance of the Replace operation itself should be roughly the same as string.Replace, and according to Microsoft no garbage should be produced.

Here's a quick benchmark...

        Stopwatch s = new Stopwatch();
        s.Start();
        string replace = source;
        replace = replace.Replace("$TS$", tsValue);
        replace = replace.Replace("$DOC$", docValue);
        s.Stop();

        Console.WriteLine("String.Replace:\t\t" + s.ElapsedMilliseconds);

        s.Reset();

        s.Start();
        StringBuilder sb = new StringBuilder(source);
        sb = sb.Replace("$TS$", tsValue);
        sb = sb.Replace("$DOC$", docValue);
        string output = sb.ToString();
        s.Stop();

        Console.WriteLine("StringBuilder.Replace:\t\t" + s.ElapsedMilliseconds);

I didn't see much difference on my machine (String.Replace took 85 ms and StringBuilder.Replace took 80 ms), and that was against about 8 MB of text in "source".

StringBuilder sb = new StringBuilder("Hello string");
sb.Replace("string", String.Empty);
Console.WriteLine(sb);  

Use StringBuilder, a mutable string.

Here is my benchmark:

using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

internal static class MeasureTime
{
    internal static TimeSpan Run(Action func, uint count = 1)
    {
        if (count <= 0)
        {
            throw new ArgumentOutOfRangeException("count", "Must be greater than zero");
        }

        long[] arr_time = new long[count];
        Stopwatch sw = new Stopwatch();
        for (uint i = 0; i < count; i++)
        {
            sw.Start();
            func();
            sw.Stop();
            arr_time[i] = sw.ElapsedTicks;
            sw.Reset();
        }

        return new TimeSpan(count == 1 ? arr_time.Sum() : Convert.ToInt64(Math.Round(arr_time.Sum() / (double)count)));
    }
}

public class Program
{
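    public static string RandomString(int length)
    {
        Random random = new Random();
        // Note: this alphabet holds only uppercase letters and digits, so the
        // lowercase patterns used in Main never match anything in rnd_str.
        const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        return new String(Enumerable.Range(1, length).Select(_ => chars[random.Next(chars.Length)]).ToArray());
    }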

    public static void Main()
    {
        string rnd_str = RandomString(500000);
        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);
        TimeSpan ts1 = MeasureTime.Run(() => regex.Replace(rnd_str, "!!!"), 10);
        Console.WriteLine("Regex time: {0:hh\\:mm\\:ss\\:fff}", ts1);
        
        StringBuilder sb_str = new StringBuilder(rnd_str);
        TimeSpan ts2 = MeasureTime.Run(() => sb_str.Replace("a", "").Replace("c", "").Replace("e", "").Replace("g", "").Replace("i", "").Replace("k", ""), 10);
        Console.WriteLine("StringBuilder time: {0:hh\\:mm\\:ss\\:fff}", ts2);
        
        TimeSpan ts3 = MeasureTime.Run(() => rnd_str.Replace("a", "").Replace("c", "").Replace("e", "").Replace("g", "").Replace("i", "").Replace("k", ""), 10);
        Console.WriteLine("String time: {0:hh\\:mm\\:ss\\:fff}", ts3);

        char[] ch_arr = {'a', 'c', 'e', 'g', 'i', 'k'};
        TimeSpan ts4 = MeasureTime.Run(() => new String((from c in rnd_str where !ch_arr.Contains(c) select c).ToArray()), 10);
        Console.WriteLine("LINQ time: {0:hh\\:mm\\:ss\\:fff}", ts4);
    }

}

Regex time: 00:00:00:008

StringBuilder time: 00:00:00:015

String time: 00:00:00:005

LINQ can't process rnd_str (Fatal Error: Memory usage limit was exceeded)

String.Replace is the fastest in this test.

I've ended up in this thread a couple of times now, and I haven't been totally convinced by the previous answers, since some of the benchmarks were done with a Stopwatch, which might give some kind of indication but does not feel great.

My use case is that I have a string that might be quite big, e.g. the HTML output from a website. I need to replace a number of placeholders inside this string (about 10, max 20) with values.

I created a BenchmarkDotNet test to get some robust data around this, and here is what I found:

TLDR:

  • Don't use String.Replace if performance/memory is a concern.
  • Regex.Replace is the fastest but uses slightly more memory than StringBuilder.Replace. A compiled Regex is fastest if you intend to reuse the same pattern; for a small number of uses, a non-compiled Regex instance is cheaper to create.
  • If you only care about memory consumption and are fine with slower execution, use StringBuilder.Replace

Results from the test:

|                Method | ItemsToReplace |       Mean |     Error |    StdDev |   Gen 0 |  Gen 1 | Gen 2 | Allocated |
|---------------------- |--------------- |-----------:|----------:|----------:|--------:|-------:|------:|----------:|
|         StringReplace |              3 |  21.493 us | 0.1182 us | 0.1105 us |  3.6926 | 0.0305 |     - |  18.96 KB |
|  StringBuilderReplace |              3 |  35.383 us | 0.1341 us | 0.1119 us |  2.5024 |      - |     - |  13.03 KB |
|          RegExReplace |              3 |  19.620 us | 0.1252 us | 0.1045 us |  3.4485 | 0.0305 |     - |  17.75 KB |
| RegExReplace_Compiled |              3 |   4.573 us | 0.0318 us | 0.0282 us |  2.7084 | 0.0610 |     - |  13.91 KB |
|         StringReplace |             10 |  74.273 us | 0.7900 us | 0.7390 us | 12.2070 | 0.1221 |     - |  62.75 KB |
|  StringBuilderReplace |             10 | 115.322 us | 0.5820 us | 0.5444 us |  2.6855 |      - |     - |  13.84 KB |
|          RegExReplace |             10 |  24.121 us | 0.1130 us | 0.1002 us |  4.4250 | 0.0916 |     - |  22.75 KB |
| RegExReplace_Compiled |             10 |   8.601 us | 0.0298 us | 0.0279 us |  3.6774 | 0.1221 |     - |  18.92 KB |
|         StringReplace |             20 | 150.193 us | 1.4508 us | 1.3571 us | 24.6582 | 0.2441 |     - | 126.89 KB |
|  StringBuilderReplace |             20 | 233.984 us | 1.1707 us | 1.0951 us |  2.9297 |      - |     - |   15.3 KB |
|          RegExReplace |             20 |  28.699 us | 0.1179 us | 0.1045 us |  4.8218 | 0.0916 |     - |  24.79 KB |
| RegExReplace_Compiled |             20 |  12.672 us | 0.0599 us | 0.0560 us |  4.0894 | 0.1221 |     - |  20.95 KB |

So my conclusions are:

  • Regex.Replace is the way to go for fast execution and reasonable memory usage. Use a shared compiled instance to speed it up.
  • StringBuilder has the lowest memory footprint but is a lot slower than Regex.Replace. I would only use it if memory is the only thing that matters.

The code for the benchmark looks like this:

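using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]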
[HtmlExporter]
[PlainExporter]
[RPlotExporter]
public class String_Replace
{
    private Dictionary<string, string> _valuesToReplace = new Dictionary<string, string>()
    {
        {"foo","fooRep" },
        {"bar","barRep" },
        {"lorem","loremRep" },
        {"ipsum","ipsumRep" },
        {"x","xRep" },
        {"y","yRep" },
        {"z","zRep" },
        {"yada","yadaRep" },
        {"old","oldRep" },
        {"new","newRep" },

        {"enim","enimRep" },
        {"amet","ametRep" },
        {"sit","sitRep" },
        {"convallis","convallisRep" },
        {"vehicula","vehiculaRep" },
        {"suspendisse","suspendisseRep" },
        {"accumsan","accumsanRep" },
        {"suscipit","suscipitRep" },
        {"ligula","ligulaRep" },
        {"posuere","posuereRep" }
    };

    private Regex _regexCompiled;

    private string GetText_With_3_Tags()
    {
        return @"<html>
        <body>
        Lorem ipsum dolor sit [foo], consectetur [bar] elit. Proin nulla quam, faucibus a ligula quis, posuere commodo elit. Nunc at tincidunt elit. Sed ipsum ex, accumsan sed viverra sit amet, tincidunt id nibh. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam interdum ex eget blandit lacinia. Nullam a tortor id sapien fringilla pellentesque vel ac purus. Fusce placerat dapibus tortor id luctus. Aenean in lacinia neque. Fusce quis ultrices odio. Nam id leo neque.

Etiam erat lorem, tincidunt volutpat odio at, finibus pharetra felis. Sed magna enim, accumsan at convallis a, aliquet eu quam. Vestibulum faucibus tincidunt ipsum et lacinia. Sed cursus ut arcu a commodo. Integer euismod eros at efficitur sollicitudin. In quis magna non orci sollicitudin condimentum. Fusce sed lacinia lorem, nec varius erat. In quis odio viverra, pharetra ex ac, hendrerit ante. Mauris congue enim et tellus sollicitudin pulvinar non sit amet tortor. Suspendisse at ex pharetra, semper diam ut, molestie velit. Cras lacinia urna neque, sit amet laoreet ex venenatis nec. Mauris at leo massa.

Aliquam mollis ultrices mi, sit amet venenatis enim rhoncus nec. Integer sit amet lectus tempor, finibus nisl quis, sodales ante. Curabitur suscipit dolor a dignissim consequat. Nulla eget vestibulum massa. Nam fermentum congue velit a placerat. Vivamus bibendum ex velit, id auctor ipsum bibendum eu. Praesent id gravida dui. Curabitur sollicitudin lobortis purus ac tempor. Sed felis enim, ornare et est egestas, blandit tincidunt lacus. Ut commodo dignissim augue, eget bibendum augue facilisis non.

Ut tortor neque, dignissim sit amet [lorem] ut, facilisis sit amet quam. Nullam et leo ut est congue vehicula et accumsan dolor. Aliquam erat dolor, eleifend a ipsum et, maximus suscipit ipsum. Nunc nec diam ex. Praesent suscipit aliquet condimentum. Nulla sodales lobortis fermentum. Maecenas ut laoreet sem. Ut id pulvinar urna, vel gravida lacus. Integer nunc urna, euismod eget vulputate sit amet, pharetra nec velit. Donec vel elit ac dolor varius faucibus tempus sed tortor. Donec metus diam, condimentum sit amet odio at, cursus cursus risus. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nullam maximus tellus id quam consequat vestibulum. Curabitur rutrum eros tellus, eget commodo mauris sollicitudin a. In dignissim non est at pretium. Nunc bibendum pharetra dui ac ullamcorper.

Sed rutrum vehicula pretium. Morbi eu felis ante. Aliquam vel mauris at felis tempus dictum ac a justo. Suspendisse ultricies nisi turpis, non sagittis magna porttitor venenatis. Aliquam ac risus quis leo semper viverra non ac nunc. Phasellus lacinia augue sed libero elementum, at interdum nunc posuere. Duis lacinia rhoncus urna eget scelerisque. Morbi ullamcorper tempus bibendum. Proin at est eget nibh dignissim bibendum. Fusce imperdiet ut urna nec mattis. Aliquam massa mauris, consequat tristique magna non, sodales tempus massa. Ut lobortis risus rhoncus, molestie mi vitae, accumsan enim. Quisque dapibus libero elementum lectus dignissim, non finibus lacus lacinia.
        </p><p>
        </body>
        </html>";
    }

    
    private string GetText_With_10_Tags()
    {
          return @"<html>
        <body>
        Lorem ipsum dolor sit [foo], consectetur [bar] elit. Proin nulla quam, faucibus a ligula quis, posuere commodo elit. Nunc at tincidunt elit. Sed ipsum ex, accumsan sed viverra sit amet, tincidunt id nibh. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam interdum ex eget blandit lacinia. Nullam a tortor id sapien fringilla pellentesque vel ac purus. Fusce placerat dapibus tortor id luctus. Aenean in lacinia neque. Fusce quis ultrices odio. Nam id leo neque.

Etiam erat lorem, tincidunt volutpat odio at, finibus pharetra felis. Sed magna enim, [z] at convallis a, aliquet eu quam. Vestibulum faucibus tincidunt ipsum et lacinia. Sed cursus ut arcu a commodo. Integer euismod eros at efficitur sollicitudin. In quis magna non orci sollicitudin condimentum. Fusce sed lacinia lorem, nec varius erat. In quis odio viverra, pharetra ex ac, hendrerit ante. Mauris congue enim et tellus sollicitudin pulvinar non sit amet tortor. Suspendisse at ex pharetra, semper diam ut, molestie velit. Cras lacinia urna neque, sit amet laoreet ex venenatis nec. Mauris at leo massa.

Aliquam mollis ultrices mi, sit amet venenatis enim rhoncus nec. Integer sit amet [y] tempor, finibus nisl quis, sodales ante. Curabitur suscipit dolor a dignissim consequat. Nulla eget vestibulum massa. Nam fermentum congue velit a placerat. Vivamus bibendum ex velit, id auctor ipsum bibendum eu. Praesent id gravida dui. Curabitur sollicitudin lobortis purus ac tempor. Sed felis enim, ornare et est egestas, blandit tincidunt lacus. Ut commodo dignissim augue, eget bibendum augue facilisis non.

Ut tortor neque, dignissim sit amet [lorem] ut, [ipsum] sit amet quam. [x] et leo ut est congue [new] et accumsan dolor. Aliquam erat dolor, eleifend a ipsum et, maximus suscipit ipsum. Nunc nec diam ex. Praesent suscipit aliquet condimentum. Nulla sodales lobortis fermentum. Maecenas ut laoreet sem. Ut id pulvinar urna, vel gravida lacus. Integer nunc urna, euismod eget vulputate sit amet, pharetra nec velit. Donec vel elit ac dolor varius faucibus tempus sed tortor. Donec metus diam, condimentum sit amet odio at, cursus cursus risus. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nullam maximus tellus id quam consequat vestibulum. Curabitur rutrum eros tellus, eget commodo mauris sollicitudin a. In dignissim non est at pretium. Nunc bibendum pharetra dui ac ullamcorper.

Sed rutrum vehicula pretium. Morbi eu felis ante. Aliquam vel [old] at felis [yada] dictum ac a justo. Suspendisse ultricies nisi turpis, non sagittis magna porttitor venenatis. Aliquam ac risus quis leo semper viverra non ac nunc. Phasellus lacinia augue sed libero elementum, at interdum nunc posuere. Duis lacinia rhoncus urna eget scelerisque. Morbi ullamcorper tempus bibendum. Proin at est eget nibh dignissim bibendum. Fusce imperdiet ut urna nec mattis. Aliquam massa mauris, consequat tristique magna non, sodales tempus massa. Ut lobortis risus rhoncus, molestie mi vitae, accumsan enim. Quisque dapibus libero elementum lectus dignissim, non finibus lacus lacinia.
        </p><p>
        </body>
        </html>";
    }

    private string GetText_With_20_Tags()
    {
           return @"<html>
        <body>
        Lorem ipsum dolor sit [foo], consectetur [bar] elit. Proin nulla [convallis], faucibus a [vehicula] quis, posuere commodo elit. Nunc at tincidunt elit. Sed ipsum ex, accumsan sed viverra sit amet, tincidunt id nibh. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam interdum ex eget blandit lacinia. Nullam a tortor id sapien fringilla pellentesque vel ac purus. Fusce placerat dapibus tortor id luctus. Aenean in lacinia neque. Fusce quis ultrices odio. Nam id leo neque.

Etiam erat lorem, tincidunt [posuere] odio at, finibus pharetra felis. Sed magna enim, [z] at convallis a, [enim] eu quam. Vestibulum faucibus tincidunt ipsum et lacinia. Sed cursus ut arcu a commodo. Integer euismod eros at efficitur sollicitudin. In quis magna non orci sollicitudin condimentum. Fusce sed lacinia lorem, nec varius erat. In quis odio viverra, pharetra ex ac, hendrerit ante. Mauris congue enim et tellus sollicitudin pulvinar non sit amet tortor. Suspendisse at ex pharetra, semper diam ut, molestie velit. Cras lacinia urna neque, sit amet laoreet ex venenatis nec. Mauris at leo massa.

[suspendisse] mollis [amet] mi, sit amet venenatis enim rhoncus nec. Integer sit amet [y] tempor, finibus nisl quis, sodales ante. Curabitur suscipit dolor a dignissim consequat. Nulla eget vestibulum massa. Nam fermentum congue velit a placerat. Vivamus bibendum ex velit, id auctor ipsum bibendum eu. Praesent id gravida dui. Curabitur sollicitudin lobortis purus ac tempor. Sed felis enim, ornare et est egestas, blandit tincidunt lacus. Ut commodo dignissim augue, eget bibendum augue facilisis non.

Ut tortor neque, dignissim sit amet [lorem] ut, [ipsum] sit amet quam. [x] et leo ut est congue [new] et accumsan [ligula]. Aliquam erat dolor, eleifend a ipsum et, maximus suscipit ipsum. Nunc nec diam ex. Praesent suscipit aliquet condimentum. Nulla sodales lobortis fermentum. Maecenas ut laoreet sem. Ut id pulvinar urna, vel gravida lacus. Integer nunc urna, euismod eget vulputate sit amet, pharetra nec velit. Donec vel elit ac dolor varius faucibus tempus sed tortor. Donec metus diam, condimentum sit amet odio at, cursus cursus risus. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nullam maximus tellus id quam consequat vestibulum. Curabitur rutrum eros tellus, eget commodo mauris sollicitudin a. In dignissim non est at pretium. Nunc bibendum pharetra dui ac ullamcorper.

Sed rutrum vehicula [accumsan]. Morbi eu [suscipit] [sit]. Aliquam vel [old] at felis [yada] dictum ac a justo. Suspendisse ultricies nisi turpis, non sagittis magna porttitor venenatis. Aliquam ac risus quis leo semper viverra non ac nunc. Phasellus lacinia augue sed libero elementum, at interdum nunc posuere. Duis lacinia rhoncus urna eget scelerisque. Morbi ullamcorper tempus bibendum. Proin at est eget nibh dignissim bibendum. Fusce imperdiet ut urna nec mattis. Aliquam massa mauris, consequat tristique magna non, sodales tempus massa. Ut lobortis risus rhoncus, molestie mi vitae, accumsan enim. Quisque dapibus libero elementum lectus dignissim, non finibus lacus lacinia.
        </p><p>
        </body>
        </html>";
    }

    private string GetText(int numberOfReplace)
    {
        if (numberOfReplace == 3)
            return GetText_With_3_Tags();
        if (numberOfReplace == 10)
            return GetText_With_10_Tags();
        if (numberOfReplace == 20)
            return GetText_With_20_Tags();

        return "";
    }

    public String_Replace()
    {
        _regexCompiled = new Regex(@"\[([^]]*)\]",RegexOptions.Compiled);
    }

    [Params(3,10,20)]
    public int ItemsToReplace { get; set; }

    [Benchmark]
    public void StringReplace()
    {
        var str = GetText(ItemsToReplace);
        foreach (var rep  in _valuesToReplace.Take(ItemsToReplace))
        {
            str = str.Replace("[" + rep.Key + "]", rep.Value);
        }
    }

    [Benchmark]
    public void StringBuilderReplace()
    {
        var sb = new StringBuilder(GetText(ItemsToReplace));
        foreach (var rep  in _valuesToReplace.Take(ItemsToReplace))
        {
            sb.Replace("[" + rep.Key + "]", rep.Value);
        }
        var res = sb.ToString();
    }

    [Benchmark]
    public void RegExReplace()
    {
        var str = GetText(ItemsToReplace);
        Regex regex = new Regex(@"\[([^]]*)\]");
        
        str = regex.Replace(str, Replace);
        var res = str;
    }

    

    [Benchmark]
    public void RegExReplace_Compiled()
    {
        var str = GetText(ItemsToReplace);

        str = _regexCompiled.Replace(str, Replace);
        var res = str;
    }

    private string Replace(Match match)
    {
        if(match.Groups.Count > 0)
        { 
            string collectionKey = match.Groups[1].Value;

            return _valuesToReplace[collectionKey];
        }
        return string.Empty;
    }

}

If you want a built-in class in .NET, I think StringBuilder is the best. To do it manually, you can use unsafe code with char* and iterate through your string, replacing based on your criteria.

Since you are performing multiple replacements on one string, I would recommend using RegEx over StringBuilder.

