C# ReadOnlySpan<char> vs Substring for string dissection

I have a fairly simple string extension method that gets called very frequently in a system that does a lot of string manipulation. I read this post (String.Substring() seems to bottleneck this code) and thought I would try the same approach to see if I could gain some performance by changing how I read the string. My results are not quite what I was expecting (I was expecting ReadOnlySpan to provide a significant perf boost), and I'm wondering why that is. In my production code, on a real run, I found a very slight loss of performance.

I generated a file with ~1.15 million rows of strings containing the character I care about, called the method on each, and dumped the results to the console.

My results (runtime in milliseconds) are:

ReadOnlySpan.IndexOf Framework 4.7.1: 68538
ReadOnlySpan.IndexOf Core 2.1: 64486

ReadOnlySpan.SequenceEqual Framework 4.7.1: 63650
ReadOnlySpan.SequenceEqual Core 2.1: 65071

substring Framework 4.7.1: 63508
substring Core 2.1: 64125


The code (all identical from Full Framework to Core 2.1):

The calling code:

static void Main(string[] args)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    var f = File.ReadAllLines("periods.CSV");

    foreach (string s in f)
    { Console.WriteLine(s.CountOccurrences(".")); }

    sw.Stop();
    Console.WriteLine("Done in " + sw.ElapsedMilliseconds + " ms");
    Console.ReadKey();
}


The original substring form of my method:

public static int CountOccurrencesSub(this string val, string searchFor)
{
    if (string.IsNullOrEmpty(val) || string.IsNullOrEmpty(searchFor))
    { return 0; }

    int count = 0;

    for (int x = 0; x <= val.Length - searchFor.Length; x++)
    {
        if (val.Substring(x, searchFor.Length) == searchFor)
        { count++; }
    }

    return count;
}


The ReadOnlySpan version (which I've tested with both IndexOf and SequenceEqual for equality checks):

public static int CountOccurrences(this string val, string searchFor)
{
    if (string.IsNullOrEmpty(val) || string.IsNullOrEmpty(searchFor))
    { return 0; }

    int count = 0;

    ReadOnlySpan<char> vSpan = val.AsSpan();
    ReadOnlySpan<char> searchSpan = searchFor.AsSpan();

    for (int x = 0; x <= vSpan.Length - searchSpan.Length; x++)
    {
        if (vSpan.Slice(x, searchSpan.Length).SequenceEqual(searchSpan))
        { count++; }
    }

    return count;
}
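
The IndexOf variant mentioned above is not shown in the post. As a rough sketch of what such a variant could look like (the method name CountOccurrencesIndexOf is hypothetical, and this is an assumption about the approach, not the code that was actually measured), the loop can jump straight to each match with the span IndexOf extension instead of testing every position. It advances one character past each match start so that overlapping matches are still counted, as in the two versions above:

```csharp
using System;

public static class SpanCounting
{
    // Hypothetical variant: counts (possibly overlapping) occurrences using span IndexOf.
    public static int CountOccurrencesIndexOf(this string val, string searchFor)
    {
        if (string.IsNullOrEmpty(val) || string.IsNullOrEmpty(searchFor))
        { return 0; }

        int count = 0;

        ReadOnlySpan<char> rest = val.AsSpan();
        ReadOnlySpan<char> searchSpan = searchFor.AsSpan();

        while (true)
        {
            // Ordinal search for the next match in the remaining slice
            int idx = rest.IndexOf(searchSpan);
            if (idx < 0)
            { break; }

            count++;

            // Step one character past the match start so overlapping matches still count
            rest = rest.Slice(idx + 1);
        }

        return count;
    }
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine("a.b.c".CountOccurrencesIndexOf("."));
        Console.WriteLine("aaaa".CountOccurrencesIndexOf("aa"));
    }
}
```

On .NET Framework this assumes the System.Memory package is referenced, since AsSpan and the span IndexOf extension are not part of the framework itself.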


Does the equality comparison do an allocation in the methods I'm calling, and therefore give no boost? Is this just not a good application for ReadOnlySpan? Am I just plain old missing something?

I'm a bit late to the party, but I think I can still add relevant information to this topic.

First of all, some words about the other posters' measurements.

OP's results are clearly incorrect. As was pointed out in the comments, the I/O operations completely distort the stats.
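
To illustrate the point about I/O, here is a minimal sketch of keeping file reading and console output outside the timed region. The names TimingSketch, CountDots and TimeCounting are hypothetical stand-ins, and a hand-rolled Stopwatch loop like this still lacks proper warm-up and statistics, so treat it only as a way to see how much of the original timing was I/O:

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class TimingSketch
{
    // Stand-in for the method under test (hypothetical helper, not from the question)
    public static int CountDots(string s)
    {
        int count = 0;
        foreach (char c in s)
        {
            if (c == '.') { count++; }
        }
        return count;
    }

    // Times only the counting loop and returns the elapsed milliseconds
    public static long TimeCounting(string[] lines, Func<string, int> counter, out int total)
    {
        total = 0;
        Stopwatch sw = Stopwatch.StartNew();
        foreach (string s in lines)
        {
            total += counter(s); // accumulate instead of writing each result to the console
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main(string[] args)
    {
        // Load the data before the stopwatch starts; fall back to sample lines if the file is absent
        string[] lines = File.Exists("periods.CSV")
            ? File.ReadAllLines("periods.CSV")
            : new[] { "a.b.c", "no dots here", "..." };

        long ms = TimeCounting(lines, CountDots, out int total);

        // A single write after the measurement, so console I/O is not part of the timing
        Console.WriteLine("Counted " + total + " in " + ms + " ms");
    }
}
```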

The poster of the accepted answer is on the right track. His method eliminates the slow I/O operations and focuses clearly on the subject of the benchmark. However, he doesn't mention the environment used (especially the .NET runtime), and his "warm-up method" is rather debatable.

Performance measurement is a really tricky business; it's very hard to get right. I wouldn't even try to code it myself if I wanted valid results. So I decided to check out this issue using the widely adopted BenchmarkDotNet library. To make this more interesting, I added a third candidate to the mix. This implementation uses String.CompareOrdinal for occurrence counting, and I expected pretty good results from it.

The benchmark

Before the measurement starts (at the global setup stage), I generate 1,000,000 lines of lorem ipsum text. This data is used throughout the measurement.

Each method is exercised with 1,000 and 1,000,000 lines, and with a shorter (5 characters long) and a longer (39 characters long) search text.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace MyBenchmarks
{
#if NETCOREAPP2_1
    [CoreJob]
#else
    [ClrJob]
#endif
    [RankColumn, MarkdownExporterAttribute.StackOverflow]
    public class Benchmark
    {
        static readonly string[] words = new[]
        {
            "lorem", "ipsum", "dolor", "sit", "amet", "consectetuer",
            "adipiscing", "elit", "sed", "diam", "nonummy", "nibh", "euismod",
            "tincidunt", "ut", "laoreet", "dolore", "magna", "aliquam", "erat"
        };

        // borrowed from greg (https://stackoverflow.com/questions/4286487/is-there-any-lorem-ipsum-generator-in-c)
        static IEnumerable<string> LoremIpsum(Random random, int minWords, int maxWords, int minSentences, int maxSentences, int numLines)
        {
            var line = new StringBuilder();
            for (int l = 0; l < numLines; l++)
            {
                line.Clear();
                var numSentences = random.Next(maxSentences - minSentences) + minSentences + 1;
                for (int s = 0; s < numSentences; s++)
                {
                    var numWords = random.Next(maxWords - minWords) + minWords + 1;
                    line.Append(words[random.Next(words.Length)]);
                    for (int w = 1; w < numWords; w++)
                    {
                        line.Append(" ");
                        line.Append(words[random.Next(words.Length)]);
                    }
                    line.Append(". ");
                }
                yield return line.ToString();
            }
        }

        string[] lines;

        [Params(1000, 1_000_000)]
        public int N;

        [Params("lorem", "lorem ipsum dolor sit amet consectetuer")]
        public string SearchValue;

        [GlobalSetup]
        public void GlobalSetup()
        {
            lines = LoremIpsum(new Random(), 6, 8, 2, 3, 1_000_000).ToArray();
        }

        public static int CountOccurrencesSub(string val, string searchFor)
        {
            if (string.IsNullOrEmpty(val) || string.IsNullOrEmpty(searchFor))
            { return 0; }

            int count = 0;

            for (int x = 0; x <= val.Length - searchFor.Length; x++)
            {
                if (val.Substring(x, searchFor.Length) == searchFor)
                { count++; }
            }

            return count;
        }

        public static int CountOccurrences(string val, string searchFor)
        {
            if (string.IsNullOrEmpty(val) || string.IsNullOrEmpty(searchFor))
            { return 0; }

            int count = 0;

            ReadOnlySpan<char> vSpan = val.AsSpan();
            ReadOnlySpan<char> searchSpan = searchFor.AsSpan();

            for (int x = 0; x <= vSpan.Length - searchSpan.Length; x++)
            {
                if (vSpan.Slice(x, searchSpan.Length).SequenceEqual(searchSpan))
                { count++; }
            }

            return count;
        }

        public static int CountOccurrencesCmp(string val, string searchFor)
        {
            if (string.IsNullOrEmpty(val) || string.IsNullOrEmpty(searchFor))
            { return 0; }

            int count = 0;

            for (int x = 0; x <= val.Length - searchFor.Length; x++)
            {
                if (string.CompareOrdinal(val, x, searchFor, 0, searchFor.Length) == 0)
                { count++; }
            }

            return count;
        }


        [Benchmark(Baseline = true)]
        public int Substring()
        {
            int occurences = 0;
            for (var i = 0; i < N; i++)
                occurences += CountOccurrencesSub(lines[i], SearchValue);
            return occurences;
        }

        [Benchmark]
        public int Span()
        {
            int occurences = 0;
            for (var i = 0; i < N; i++)
                occurences += CountOccurrences(lines[i], SearchValue);
            return occurences;
        }

        [Benchmark]
        public int Compare()
        {
            int occurences = 0;
            for (var i = 0; i < N; i++)
                occurences += CountOccurrencesCmp(lines[i], SearchValue);
            return occurences;
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            BenchmarkRunner.Run<Benchmark>();
        }
    }
}

The results

.NET Core 2.1

BenchmarkDotNet=v0.11.0, OS=Windows 7 SP1 (6.1.7601.0)
Intel Core i3-4360 CPU 3.70GHz (Haswell), 1 CPU, 4 logical and 2 physical cores
Frequency=3604970 Hz, Resolution=277.3948 ns, Timer=TSC
.NET Core SDK=2.1.400
  [Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
  Core   : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT

Job=Core  Runtime=Core  

    Method |       N |          SearchValue |           Mean |           Error |          StdDev |         Median | Scaled | ScaledSD | Rank |
---------- |-------- |--------------------- |---------------:|----------------:|----------------:|---------------:|-------:|---------:|-----:|
 Substring |    1000 |                lorem |     2,149.4 us |       2.2763 us |       2.1293 us |     2,149.4 us |   1.00 |     0.00 |    3 |
      Span |    1000 |                lorem |       555.5 us |       0.2786 us |       0.2470 us |       555.5 us |   0.26 |     0.00 |    1 |
   Compare |    1000 |                lorem |     1,471.8 us |       0.2133 us |       0.1891 us |     1,471.8 us |   0.68 |     0.00 |    2 |
           |         |                      |                |                 |                 |                |        |          |      |
 Substring |    1000 | lorem(...)etuer [39] |     2,128.7 us |       1.0414 us |       0.9741 us |     2,128.6 us |   1.00 |     0.00 |    3 |
      Span |    1000 | lorem(...)etuer [39] |       388.9 us |       0.0440 us |       0.0412 us |       388.9 us |   0.18 |     0.00 |    1 |
   Compare |    1000 | lorem(...)etuer [39] |     1,215.6 us |       0.7016 us |       0.6220 us |     1,215.5 us |   0.57 |     0.00 |    2 |
           |         |                      |                |                 |                 |                |        |          |      |
 Substring | 1000000 |                lorem | 2,239,510.8 us | 241,887.0796 us | 214,426.5747 us | 2,176,083.7 us |   1.00 |     0.00 |    3 |
      Span | 1000000 |                lorem |   558,317.4 us |     447.3105 us |     418.4144 us |   558,338.9 us |   0.25 |     0.02 |    1 |
   Compare | 1000000 |                lorem | 1,471,941.2 us |     190.7533 us |     148.9276 us | 1,471,955.8 us |   0.66 |     0.05 |    2 |
           |         |                      |                |                 |                 |                |        |          |      |
 Substring | 1000000 | lorem(...)etuer [39] | 2,350,820.3 us |  46,974.4500 us | 115,229.1264 us | 2,327,187.2 us |   1.00 |     0.00 |    3 |
      Span | 1000000 | lorem(...)etuer [39] |   433,567.7 us |  14,445.7191 us |  42,593.5286 us |   417,333.4 us |   0.18 |     0.02 |    1 |
   Compare | 1000000 | lorem(...)etuer [39] | 1,299,065.2 us |  25,474.8504 us |  46,582.2045 us | 1,296,892.8 us |   0.55 |     0.03 |    2 |  

.NET Framework 4.7.2

BenchmarkDotNet=v0.11.0, OS=Windows 7 SP1 (6.1.7601.0)
Intel Core i3-4360 CPU 3.70GHz (Haswell), 1 CPU, 4 logical and 2 physical cores
Frequency=3604960 Hz, Resolution=277.3956 ns, Timer=TSC
  [Host] : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3062.0
  Clr    : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3062.0

Job=Clr  Runtime=Clr  

    Method |       N |          SearchValue |           Mean |          Error |          StdDev |         Median | Scaled | ScaledSD | Rank |
---------- |-------- |--------------------- |---------------:|---------------:|----------------:|---------------:|-------:|---------:|-----:|
 Substring |    1000 |                lorem |     2,025.8 us |      2.4639 us |       1.9237 us |     2,025.4 us |   1.00 |     0.00 |    3 |
      Span |    1000 |                lorem |     1,216.6 us |      4.2994 us |       4.0217 us |     1,217.8 us |   0.60 |     0.00 |    1 |
   Compare |    1000 |                lorem |     1,295.5 us |      5.2427 us |       4.6475 us |     1,293.1 us |   0.64 |     0.00 |    2 |
           |         |                      |                |                |                 |                |        |          |      |
 Substring |    1000 | lorem(...)etuer [39] |     1,939.5 us |      0.4428 us |       0.4142 us |     1,939.3 us |   1.00 |     0.00 |    3 |
      Span |    1000 | lorem(...)etuer [39] |       944.9 us |      2.6648 us |       2.3622 us |       944.7 us |   0.49 |     0.00 |    1 |
   Compare |    1000 | lorem(...)etuer [39] |     1,002.0 us |      0.2475 us |       0.2067 us |     1,002.1 us |   0.52 |     0.00 |    2 |
           |         |                      |                |                |                 |                |        |          |      |
 Substring | 1000000 |                lorem | 2,065,805.7 us |  2,009.2139 us |   1,568.6619 us | 2,065,555.1 us |   1.00 |     0.00 |    3 |
      Span | 1000000 |                lorem | 1,209,976.4 us |  6,238.6091 us |   5,835.5982 us | 1,206,554.3 us |   0.59 |     0.00 |    1 |
   Compare | 1000000 |                lorem | 1,303,321.8 us |  1,257.7418 us |   1,114.9552 us | 1,303,330.1 us |   0.63 |     0.00 |    2 |
           |         |                      |                |                |                 |                |        |          |      |
 Substring | 1000000 | lorem(...)etuer [39] | 2,085,652.9 us | 62,651.7471 us | 168,309.8501 us | 1,973,522.2 us |   1.00 |     0.00 |    3 |
      Span | 1000000 | lorem(...)etuer [39] |   958,421.2 us |  3,703.5508 us |   3,464.3034 us |   958,324.9 us |   0.46 |     0.03 |    1 |
   Compare | 1000000 | lorem(...)etuer [39] | 1,007,936.8 us |    802.1730 us |     750.3531 us | 1,007,680.3 us |   0.49 |     0.04 |    2 |

Conclusion

It's clear that there is a solid performance gain using Span<T>. What's somewhat surprising is that it's 4-5x on .NET Core and only about 2x on .NET Framework. What reasons could be behind that? Does anyone have a clue?

String.CompareOrdinal performs pretty well too. I expected somewhat better results because theoretically it's just the same byte-by-byte comparison, but it's not bad at all. On .NET Framework it's a viable option by all means.

The length of the search string (except for extremes, of course) doesn't seem to have much impact on the results.

I was intrigued and tried to repeat your test. Depending on how large the data set is, the code using ReadOnlySpan performs almost twice as fast:

CountOccurences Done in 1080 ms
CountOccurencesSub Done in 1789 ms

With larger data sets, the difference seems to increase (which seems logical, as Substring allocates a new string on every call, which increases GC pressure).
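
One way to check the allocation claim directly, as a sketch and not part of the measurements above, is to compare the bytes allocated by each approach. This assumes a runtime that exposes GC.GetAllocatedBytesForCurrentThread (to my knowledge .NET Core 2.0+ and .NET Framework 4.8+); the class and helper names are hypothetical, and the two counting methods simply mirror the ones from the question:

```csharp
using System;

public static class AllocationSketch
{
    // Returns the bytes allocated on the current thread while running the action
    public static long AllocatedBy(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Mirrors the Substring version from the question
    public static int CountWithSubstring(string val, string searchFor)
    {
        int count = 0;
        for (int x = 0; x <= val.Length - searchFor.Length; x++)
        {
            if (val.Substring(x, searchFor.Length) == searchFor)
            { count++; }
        }
        return count;
    }

    // Mirrors the ReadOnlySpan version from the question
    public static int CountWithSpan(string val, string searchFor)
    {
        int count = 0;
        ReadOnlySpan<char> vSpan = val.AsSpan();
        ReadOnlySpan<char> searchSpan = searchFor.AsSpan();
        for (int x = 0; x <= vSpan.Length - searchSpan.Length; x++)
        {
            if (vSpan.Slice(x, searchSpan.Length).SequenceEqual(searchSpan))
            { count++; }
        }
        return count;
    }

    public static void Main()
    {
        string line = new string('a', 500) + "ab" + new string('b', 498);
        int sink = 0;

        // Warm up so one-time JIT work is not attributed to the measured calls
        sink += CountWithSubstring(line, "ab");
        sink += CountWithSpan(line, "ab");

        long subBytes = AllocatedBy(() => sink += CountWithSubstring(line, "ab"));
        long spanBytes = AllocatedBy(() => sink += CountWithSpan(line, "ab"));

        Console.WriteLine("Substring allocated " + subBytes + " bytes");
        Console.WriteLine("Span allocated " + spanBytes + " bytes");
        Console.WriteLine("(matches found: " + sink + ")");
    }
}
```

The Substring version should report an allocation for every position it tests, while the span version only slices the existing string. With BenchmarkDotNet, adding the [MemoryDiagnoser] attribute to a benchmark class reports similar per-operation figures.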

I used this code to test:

static void Main(string[] args)
{
    var r = new Random();

    // generate 100000 lines of 1000 random characters
    var text = Enumerable.Range(0, 100000).Select(x => new string(Enumerable.Range(0, 1000).Select(i => (char)r.Next(255)).ToArray())).ToArray();

    // warm up
    "".CountOccurrencesSub("");
    "".CountOccurrences("");

    Measure(text, "CountOccurencesSub", s => s.CountOccurrencesSub("."));

    Measure(text, "CountOccurences", s => s.CountOccurrences("."));

    Console.ReadKey();
}

private static void Measure(string[] text, string test, Action<string> action)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    foreach (string s in text)
    {
        action(s);
    }

    sw.Stop();

    Console.WriteLine($"{test} Done in {sw.ElapsedMilliseconds} ms");
}

The performance benchmark that hurt me the most with ReadOnlySpan is that it put me in a world of hurt when I tried to use it in an app that goes into the Windows app store. The ReadOnlySpan<> stuff isn't compatible (yet, it seems) with compiling with the .NET Native toolchain. That, in turn, may (or may not) trigger a cascade of subsequent WACK test errors that will block submission to the store.

So far, I've spent two days just trying to get an app that uses ReadOnlySpan<> into the store. That far, far outweighs any performance improvement I might potentially have gotten.
