简体   繁体   English

两个字符串之间

[英]Between of two strings

I have simple method in C# : 我在C#中有简单的方法:

public static string BetweenOf(string ActualStr, string StrFirst, string StrLast)
        {
            return ActualStr.Substring(ActualStr.IndexOf(StrFirst) + StrFirst.Length, (ActualStr.Substring(ActualStr.IndexOf(StrFirst))).IndexOf(StrLast) + StrLast.Length);
        }

How can i optimise this ? 我该如何优化呢?

If I have understood what you want to do, I think your implementation is possibly not correct. 如果我了解您想要做什么,那么我认为您的实现可能不正确。

Here is an implementation that I believe will perform better at least in terms of the GC because it does not use multiple calls to SubString which create new strings on the heap which are only used temporarily. 我认为这是一种实现,至少在GC方面会更好,因为它不使用对SubString多次调用,这些调用在SubString创建仅临时使用的新字符串。

public static string BetweenOfFixed(string ActualStr, string StrFirst, string StrLast)
{
  int startIndex = ActualStr.IndexOf(StrFirst) + StrFirst.Length;
  int endIndex = ActualStr.IndexOf(StrLast, startIndex);
  return ActualStr.Substring(startIndex, endIndex - startIndex);
}

It would be interesting to compare the performance of this vs. the regex solution. 比较此与正则表达式解决方案的性能将很有趣。

You could construct a regex: 您可以构造一个正则表达式:

var regex = strFirst + "(.*)" + strLast;

Your between text will be the first (and only) capture for the match. 您之间的文本将是该匹配的第一个(也是唯一一个)捕获。

Here's how the code from @Chris here stacks up against a regular expression test: 这是@Chris的代码与正则表达式测试叠加的方式:

void Main()
{
    string input = "abcdefghijklmnopq";
    string first = "de";
    string last = "op";
    Regex re1 = new Regex("de(.*)op", RegexOptions.None);
    Regex re2 = new Regex("de(.*)op", RegexOptions.Compiled);

    // pass 1 is JIT preheat
    for (int pass = 1; pass <= 2; pass++)
    {
        int iterations = 1000000;
        if (pass == 1)
            iterations = 1;

        Stopwatch sw = Stopwatch.StartNew();
        for (int index = 0; index < iterations; index++)
            BetweenOfFixed(input, first, last);
        sw.Stop();
        if (pass == 2)
            Debug.WriteLine("IndexOf: " + 
                sw.ElapsedMilliseconds + "ms");

        sw = Stopwatch.StartNew();
        for (int index = 0; index < iterations; index++)
            BetweenOfRegexAdhoc(input, first, last);
        sw.Stop();
        if (pass == 2)
            Debug.WriteLine("Regex adhoc: " + 
                sw.ElapsedMilliseconds + "ms");

        sw = Stopwatch.StartNew();
        for (int index = 0; index < iterations; index++)
            BetweenOfRegexCached(input, first, last);
        sw.Stop();
        if (pass == 2)
            Debug.WriteLine("Regex uncompiled: " +
                sw.ElapsedMilliseconds + "ms");

        sw = Stopwatch.StartNew();
        for (int index = 0; index < iterations; index++)
            BetweenOfRegexCompiled(input, first, last);
        sw.Stop();
        if (pass == 2)
            Debug.WriteLine("Regex compiled: " +
                sw.ElapsedMilliseconds + "ms");
    }
}

public static string BetweenOfFixed(string ActualStr, string StrFirst,
    string StrLast)
{
    int startIndex = ActualStr.IndexOf(StrFirst) + StrFirst.Length;
    int endIndex = ActualStr.IndexOf(StrLast, startIndex);
    return ActualStr.Substring(startIndex, endIndex - startIndex);
}

public static string BetweenOfRegexAdhoc(string ActualStr, string StrFirst,
    string StrLast)
{
    // I'm assuming you don't replace the delimiters on every call
    Regex re = new Regex("de(.*)op", RegexOptions.None);
    return re.Match(ActualStr).Groups[1].Value;
}

private static Regex _BetweenOfRegexCached =
    new Regex("de(.*)op", RegexOptions.None);
public static string BetweenOfRegexCached(string ActualStr, string StrFirst,
    string StrLast)
{
    return _BetweenOfRegexCached.Match(ActualStr).Groups[1].Value;
}

private static Regex _BetweenOfRegexCompiled =
    new Regex("de(.*)op", RegexOptions.Compiled);
public static string BetweenOfRegexCompiled(string ActualStr, string StrFirst,
    string StrLast)
{
    return _BetweenOfRegexCompiled.Match(ActualStr).Groups[1].Value;
}

Output: 输出:

IndexOf: 1419ms
Regex adhoc: 7788ms
Regex uncompiled: 1074ms
Regex compiled: 682ms

What about using a regular expression? 使用正则表达式呢? This would probably be faster than building of temporary strings. 这可能比构建临时字符串更快。 Also this would enable to easily and gently handle the case where no such string can be found. 同样,这将使得能够容易且轻柔地处理找不到这样的弦的情况。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM