Practical use of `stackalloc` keyword

Has anyone ever actually used stackalloc while programming in C#? I am aware of what it does, but the only time it shows up in my code is by accident, because IntelliSense suggests it when I start typing static, for example.

Although it is not related to the usage scenarios of stackalloc, I actually do a considerable amount of legacy interop in my apps, so every now and then I could resort to using unsafe code. Nevertheless, I usually find ways to avoid unsafe completely.

And since the stack size for a single thread in .NET is ~1 MB (correct me if I'm wrong), I am even more reluctant to use stackalloc.

Are there practical cases where one could say: "this is exactly the right amount of data and processing for me to go unsafe and use stackalloc"?

The sole reason to use stackalloc is performance (either for computations or interop). By using stackalloc instead of a heap-allocated array, you create less GC pressure (the GC needs to run less often), you don't need to pin the array, allocation is faster than for a heap array, and the memory is automatically freed on method exit (heap-allocated arrays are only reclaimed when the GC runs). Compared with a native allocator (like malloc or the .NET equivalent), stackalloc is also faster, and the memory is released automatically when the method returns.

Performance-wise, if you use stackalloc you greatly increase the chance of cache hits on the CPU due to the locality of data.

I have used stackalloc to allocate buffers for [near] realtime DSP work. It was a very specific case where performance needed to be as consistent as possible. Note that there is a difference between consistency and overall throughput: in this case I wasn't concerned with heap allocations being too slow, just with the non-determinism of garbage collection at that point in the program. I wouldn't use it in 99% of cases.

Before C# 7.2, stackalloc is only relevant for unsafe code. For managed code you can't decide where to allocate data. Value types are allocated on the stack by default (unless they are part of a reference type, in which case they are allocated on the heap). Reference types are allocated on the heap.

The default stack size for a plain vanilla .NET application is 1 MB, but you can change this in the PE header. If you're starting threads explicitly, you may also set a different size via the Thread constructor overload. For ASP.NET applications the default stack size is only 256 KB, which is something to keep in mind if you're switching between the two environments.
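
As a rough sketch of the constructor overload mentioned above: `Thread(ThreadStart, int maxStackSize)` lets you request a larger stack for work that uses big stack buffers. The class name, the helper names, and the 4 MB / 512 KB figures are assumptions chosen for illustration, and the `Span<byte>` form of stackalloc needs C# 7.2 or later.

```csharp
using System;
using System.Threading;

public static class StackSizeDemo
{
    // Runs 'work' on a dedicated thread that was created with the requested
    // maximum stack size (in bytes) and returns its result.
    public static int RunWithStackSize(Func<int> work, int maxStackSizeBytes)
    {
        int result = 0;
        var thread = new Thread(() => { result = work(); }, maxStackSizeBytes);
        thread.Start();
        thread.Join(); // wait for the worker so 'result' is fully written
        return result;
    }

    // Uses a 512 KB stack buffer, which would not fit on a 256 KB stack.
    public static int FillLargeStackBuffer()
    {
        Span<byte> buffer = stackalloc byte[512 * 1024];
        buffer.Fill(1);
        return buffer.Length;
    }

    public static void Main()
    {
        // Request a 4 MB stack for the worker thread (illustrative value).
        int length = RunWithStackSize(FillLargeStackBuffer, 4 * 1024 * 1024);
        Console.WriteLine(length);
    }
}
```

The runtime or operating system may round or clamp the requested size, so treat the value as a request rather than a guarantee.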

Stackalloc initialization of spans. In previous versions of C#, the result of stackalloc could only be stored in a pointer local variable. As of C# 7.2, stackalloc can be used as part of an expression and can target a span, and that can be done without using the unsafe keyword. Thus, instead of writing:

Span<byte> bytes;
unsafe
{
  byte* tmp = stackalloc byte[length];
  bytes = new Span<byte>(tmp, length);
}

You can write simply:

Span<byte> bytes = stackalloc byte[length];

This is also extremely useful in situations where you need some scratch space to perform an operation, but want to avoid allocating heap memory for relatively small sizes:

Span<byte> bytes = length <= 128 ? stackalloc byte[length] : new byte[length];
... // Code that operates on the Span<byte>

Source: C# - All About Span: Exploring a New .NET Mainstay
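
To make the scratch-space pattern quoted above concrete, here is a minimal, self-contained sketch that reverses a string. The class name, the `Reverse` helper, and the `StackLimit` constant are hypothetical; the 128-element threshold simply mirrors the snippet above and is not a tuned value.

```csharp
using System;

public static class ScratchBufferDemo
{
    // Largest buffer (in elements) we are willing to place on the stack.
    private const int StackLimit = 128;

    // Reverses 'text' using a scratch buffer that lives on the stack for
    // short inputs and falls back to a heap array for long ones.
    public static string Reverse(string text)
    {
        Span<char> scratch = text.Length <= StackLimit
            ? stackalloc char[text.Length]
            : new char[text.Length];

        for (int i = 0; i < text.Length; i++)
            scratch[i] = text[text.Length - 1 - i];

        // Span<char>.ToString() copies the characters into a new string.
        return scratch.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Reverse("stackalloc")); // prints "collakcats"
    }
}
```

The size check matters because the stack is small and a failed stack allocation terminates the process rather than throwing a catchable exception, so any length that comes from outside input should be capped like this.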

Late answer, but I believe it is still helpful.

I came to this question and was still curious to see the performance difference, so I created the following benchmark (using the BenchmarkDotNet NuGet package):

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;

[MemoryDiagnoser]
[Orderer(SummaryOrderPolicy.FastestToSlowest)]
[RankColumn]
public class Benchmark1
{
    //private MemoryStream ms = new MemoryStream();

    // Simulates reading 'length' bytes into 'buffer', beginning at 'start'.
    static void FakeRead(byte[] buffer, int start, int length)
    {
        for (int i = start; i < start + length; i++)
            buffer[i] = (byte) (i % 250);
    }

    static void FakeRead(Span<byte> buffer)
    {
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = (byte) (i % 250);
    }

    [Benchmark]
    public void AllocatingOnHeap()
    {
        var buffer = new byte[1024];
        FakeRead(buffer, 0, buffer.Length);
    }

    [Benchmark]
    public void ConvertingToSpan()
    {
        var buffer = new Span<byte>(new byte[1024]);
        FakeRead(buffer);
    }

    [Benchmark]
    public void UsingStackAlloc()
    {
        Span<byte> buffer = stackalloc byte[1024];
        FakeRead(buffer);
    }
}

And these were the results:

|           Method |     Mean |    Error |   StdDev | Rank |  Gen 0 | Allocated |
|----------------- |---------:|---------:|---------:|-----:|-------:|----------:|
|  UsingStackAlloc | 704.9 ns | 13.81 ns | 12.91 ns |    1 |      - |         - |
| ConvertingToSpan | 755.8 ns |  5.77 ns |  5.40 ns |    2 | 0.0124 |   1,048 B |
| AllocatingOnHeap | 839.3 ns |  4.52 ns |  4.23 ns |    3 | 0.0124 |   1,048 B |

In this benchmark, stackalloc is the fastest option and also makes no heap allocations! If you are curious how to use the BenchmarkDotNet NuGet package, then watch this video.

There are some great answers to this question, but I just want to point out that

Stackalloc can also be used to call native APIs

Many native functions require the caller to allocate a buffer to receive the result. For example, the CfGetPlaceholderInfo function in cfapi.h has the following signature.

HRESULT CfGetPlaceholderInfo(
HANDLE                    FileHandle,
CF_PLACEHOLDER_INFO_CLASS InfoClass,
PVOID                     InfoBuffer,
DWORD                     InfoBufferLength,
PDWORD                    ReturnedLength);

In order to call it in C# through interop, declare it as follows:

[DllImport("Cfapi.dll")]
public static unsafe extern HResult CfGetPlaceholderInfo(IntPtr fileHandle, uint infoClass, void* infoBuffer, uint infoBufferLength, out uint returnedLength);

You can then use stackalloc for the buffer (inside an unsafe context):

byte* buffer = stackalloc byte[1024];
CfGetPlaceholderInfo(fileHandle, 0, buffer, 1024, out var returnedLength);
