What is the equivalent of memset in C#?

I need to fill a byte[] with a single non-zero value. How can I do this in C# without looping through each byte in the array?

Update: The comments seem to have split this into two questions:

  1. Is there a Framework method to fill a byte[] that is akin to memset?
  2. What is the most efficient way to do it when we are dealing with a very large array?

I totally agree that a simple loop works just fine, as Eric and others have pointed out. The point of the question was to see if I could learn something new about C# :) I think Juliet's method for a Parallel operation should be even faster than a simple loop.

Benchmarks: Thanks to Mikael Svenson: http://techmikael.blogspot.com/2009/12/filling-array-with-default-value.html

It turns out the simple for loop is the way to go unless you want to use unsafe code.

Apologies for not being clearer in my original post. Eric and Mark are both correct in their comments; I need to ask more focused questions for sure. Thanks for everyone's suggestions and responses.

You could use Enumerable.Repeat:

byte[] a = Enumerable.Repeat((byte)10, 100).ToArray();

The first parameter is the element you want repeated, and the second parameter is the number of times to repeat it.

This is OK for small arrays, but you should use the looping method if you are dealing with very large arrays and performance is a concern.
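For reference, the looping method mentioned above can be sketched as follows. The class and method names (LoopFill, Fill) are made up for this illustration and are not part of any framework.

```csharp
using System;

public static class LoopFill
{
    // Sets every element of the array to the given value with a plain for loop.
    public static void Fill(byte[] array, byte value)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = value;
        }
    }
}
```

Usage would look like `var buffer = new byte[1000000]; LoopFill.Fill(buffer, 10);`. Unlike Enumerable.Repeat(...).ToArray(), this writes into an existing array and allocates nothing.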

Actually, there is a little-known IL operation called Initblk that does exactly that. So, let's use it as a method that doesn't require "unsafe". Here's the helper class:

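using System;
using System.Reflection;
using System.Reflection.Emit;
using System.Runtime.InteropServices;

public static class Util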
{
    static Util()
    {
        var dynamicMethod = new DynamicMethod("Memset", MethodAttributes.Public | MethodAttributes.Static, CallingConventions.Standard,
            null, new [] { typeof(IntPtr), typeof(byte), typeof(int) }, typeof(Util), true);

        var generator = dynamicMethod.GetILGenerator();
        generator.Emit(OpCodes.Ldarg_0);
        generator.Emit(OpCodes.Ldarg_1);
        generator.Emit(OpCodes.Ldarg_2);
        generator.Emit(OpCodes.Initblk);
        generator.Emit(OpCodes.Ret);

        MemsetDelegate = (Action<IntPtr, byte, int>)dynamicMethod.CreateDelegate(typeof(Action<IntPtr, byte, int>));
    }

    public static void Memset(byte[] array, byte what, int length)
    {
        var gcHandle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            MemsetDelegate(gcHandle.AddrOfPinnedObject(), what, length);
        }
        finally
        {
            gcHandle.Free(); // always unpin, even if the delegate throws
        }
    }

    public static void ForMemset(byte[] array, byte what, int length)
    {
        for(var i = 0; i < length; i++)
        {
            array[i] = what;
        }
    }

    private static Action<IntPtr, byte, int> MemsetDelegate;

}

And what is the performance? Here are my results for Windows/.NET and Linux/Mono (different PCs).

Mono/for:     00:00:01.1356610
Mono/initblk: 00:00:00.2385835 

.NET/for:     00:00:01.7463579
.NET/initblk: 00:00:00.5953503

So it's worth considering. Note that the resulting IL will not be verifiable.

Building on Lucero's answer, here is a faster version. It doubles the number of bytes copied with Buffer.BlockCopy on every iteration. Interestingly, it outperforms that version by a factor of 10 for relatively small arrays (1000); the difference is smaller for larger arrays (1000000), but it is always faster. The good thing about it is that it performs well even down to small arrays: it becomes faster than the naive approach at around length = 100. For a one-million-element byte array, it was 43 times faster. (Tested on Intel i7, .NET 2.0.)

public static void MemSet(byte[] array, byte value) {
    if (array == null) {
        throw new ArgumentNullException("array");
    }

    int block = 32, index = 0;
    int length = Math.Min(block, array.Length);

    //Fill the initial array
    while (index < length) {
        array[index++] = value;
    }

    length = array.Length;
    while (index < length) {
        Buffer.BlockCopy(array, 0, array, index, Math.Min(block, length-index));
        index += block;
        block *= 2;
    }
}

A little bit late, but the following approach might be a good compromise without resorting to unsafe code. Basically, it initializes the beginning of the array using a conventional loop and then switches to Buffer.BlockCopy(), which should be as fast as you can get using a managed call.

public static void MemSet(byte[] array, byte value) {
  if (array == null) {
    throw new ArgumentNullException("array");
  }
  const int blockSize = 4096; // bigger may be better to a certain extent
  int index = 0;
  int length = Math.Min(blockSize, array.Length);
  while (index < length) {
    array[index++] = value;
  }
  length = array.Length;
  while (index < length) {
    Buffer.BlockCopy(array, 0, array, index, Math.Min(blockSize, length-index));
    index += blockSize;
  }
}

This simple implementation uses successive doubling and performs quite well (about 3-4 times faster than the naive version according to my benchmarks):

public static void Memset<T>(T[] array, T elem) 
{
    int length = array.Length;
    if (length == 0) return;
    array[0] = elem;
    int count;
    for (count = 1; count <= length/2; count*=2)
        Array.Copy(array, 0, array, count, count);
    Array.Copy(array, 0, array, count, length - count);
}

Edit: upon reading the other answers, it seems I'm not the only one with this idea. Still, I'm leaving this here, since it's a bit cleaner and it performs on par with the others.

If performance is critical, you could consider using unsafe code and working directly with a pointer to the array.

Another option could be importing memset from msvcrt.dll and using that. However, the overhead of invoking it might easily be larger than the gain in speed.

It looks like System.Runtime.CompilerServices.Unsafe.InitBlock now does the same thing as the OpCodes.Initblk instruction that Konrad's answer mentions (he also mentioned a source link).

The code to fill the array is as follows:

byte[] a = new byte[N];
byte valueToFill = 255;

System.Runtime.CompilerServices.Unsafe.InitBlock(ref a[0], valueToFill, (uint) a.Length);
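One caveat with the snippet above is that `ref a[0]` throws on an empty array. A minimal sketch of a guarded wrapper follows; the names InitBlockFill and Fill are made up for illustration, and it assumes the Unsafe class is available (it ships with modern .NET, while on .NET Framework it comes from the System.Runtime.CompilerServices.Unsafe NuGet package).

```csharp
using System;
using System.Runtime.CompilerServices;

public static class InitBlockFill
{
    // Fills the whole array with 'value' using Unsafe.InitBlock.
    public static void Fill(byte[] array, byte value)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        if (array.Length == 0) return; // ref array[0] would throw on an empty array
        Unsafe.InitBlock(ref array[0], value, (uint)array.Length);
    }
}
```

The byte count passed to InitBlock is not bounds-checked, so the wrapper always passes exactly array.Length.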

Or use the P/Invoke way:

[DllImport("msvcrt.dll",
    EntryPoint = "memset",
    CallingConvention = CallingConvention.Cdecl,
    SetLastError = false)]
public static extern IntPtr MemSet(IntPtr dest, int c, UIntPtr count); // count is size_t in C

static void Main(string[] args)
{
    byte[] arr = new byte[3];
    GCHandle gch = GCHandle.Alloc(arr, GCHandleType.Pinned);
    try
    {
        MemSet(gch.AddrOfPinnedObject(), 0x7, (UIntPtr)(uint)arr.Length);
    }
    finally
    {
        gch.Free(); // release the pinned handle so the GC can move the array again
    }
}

If performance is absolutely critical, then Enumerable.Repeat(n, m).ToArray() will be too slow for your needs. You might be able to crank out faster performance using PLINQ or the Task Parallel Library:

using System.Threading.Tasks;

// ...

byte initialValue = 20;
byte[] data = new byte[size];
Parallel.For(0, size, index => data[index] = initialValue);
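As a hedged variation on the Parallel.For call above: invoking a delegate once per byte carries overhead, so one option is to let Partitioner.Create split the index range into chunks and run a plain loop inside each chunk. The names ParallelFill and Fill are made up for this sketch, and whether it beats a single-threaded loop depends on array size and hardware, so measure before relying on it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ParallelFill
{
    // Fills the array in parallel, one contiguous index range per work item.
    public static void Fill(byte[] data, byte value)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (data.Length == 0) return; // Partitioner.Create rejects an empty range

        var ranges = Partitioner.Create(0, data.Length);
        Parallel.ForEach(ranges, (range, loopState) =>
        {
            // range.Item1 is inclusive, range.Item2 is exclusive
            for (int i = range.Item1; i < range.Item2; i++)
            {
                data[i] = value;
            }
        });
    }
}
```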

All answers write single bytes only. What if you want to fill a byte array with words? Or floats? I find a use for that now and then. So, after having written code similar to 'memset' in a non-generic way a few times, and arriving at this page to find good code for single bytes, I went about writing the method below.

I think P/Invoke and C++/CLI each have their drawbacks. And why not have the runtime 'P/Invoke' for you into mscorxxx? Array.Copy and Buffer.BlockCopy are certainly native code. BlockCopy isn't even 'safe': you can copy a long halfway over another, or over a DateTime, as long as they're in arrays.

At least I wouldn't create a new C++ project for things like this; it's almost certainly a waste of time.

So here's basically an extended version of the solutions presented by Lucero and TowerOfBricks that can be used to memset longs, ints, etc. as well as single bytes.

public static class MemsetExtensions
{
    static void MemsetPrivate(this byte[] buffer, byte[] value, int offset, int length) {
        var shift = 0;
        for (; shift < 32; shift++)
            if (value.Length == 1 << shift)
                break;
        if (shift == 32 || value.Length != 1 << shift)
            throw new ArgumentException(
                "The source array must have a length that is a power of two and be shorter than 4GB.", "value");

        int remainder;
        int count = Math.DivRem(length, value.Length, out remainder);

        var si = 0;
        var di = offset;
        int cx;
        if (count < 1) 
            cx = remainder;
        else 
            cx = value.Length;
        Buffer.BlockCopy(value, si, buffer, di, cx);
        if (cx == remainder)
            return;

        var cachetrash = Math.Max(12, shift); // 1 << 12 == 4096
        si = di;
        di += cx;
        var dx = offset + length;
        // doubling up to 1 << cachetrash bytes i.e. 2^12 or value.Length whichever is larger
        for (var al = shift; al <= cachetrash && di + (cx = 1 << al) < dx; al++) {
            Buffer.BlockCopy(buffer, si, buffer, di, cx);
            di += cx;
        }
        // cx bytes as long as it fits
        for (; di + cx <= dx; di += cx)
            Buffer.BlockCopy(buffer, si, buffer, di, cx);
        // tail part if less than cx bytes
        if (di < dx)
            Buffer.BlockCopy(buffer, si, buffer, di, dx - di);
    }
}

With this in place, you can simply add short methods that take the value type you need to memset with and call the private method, e.g. just find and replace ulong in this method:

    public static void Memset(this byte[] buffer, ulong value, int offset, int count) {
        var sourceArray = BitConverter.GetBytes(value);
        MemsetPrivate(buffer, sourceArray, offset, sizeof(ulong) * count);
    }

Or go silly and do it with any type of struct (although the MemsetPrivate above only works for structs that marshal to a size that is a power of two):

    public static void Memset<T>(this byte[] buffer, T value, int offset, int count) where T : struct {
        var size = Marshal.SizeOf<T>();
        var ptr = Marshal.AllocHGlobal(size);
        var sourceArray = new byte[size];
        try {
            Marshal.StructureToPtr<T>(value, ptr, false);
            Marshal.Copy(ptr, sourceArray, 0, size);
        } finally {
            Marshal.FreeHGlobal(ptr);
        }
        MemsetPrivate(buffer, sourceArray, offset, count * size);
    }

I changed the initblk version mentioned before to take ulongs, to compare its performance with my code, and it silently fails: the code runs, but the resulting buffer contains only the least significant byte of the ulong.

Nevertheless, I compared the performance of writing an equally big buffer with for, initblk and my memset method. The times are total milliseconds over 100 repetitions, writing 8-byte ulongs as many times as fit the buffer length. The for version is manually loop-unrolled for the 8 bytes of a single ulong.

Buffer Len  #repeat  For millisec  Initblk millisec   Memset millisec
0x00000008  100      For   0,0032  Initblk   0,0107   Memset   0,0052
0x00000010  100      For   0,0037  Initblk   0,0102   Memset   0,0039
0x00000020  100      For   0,0032  Initblk   0,0106   Memset   0,0050
0x00000040  100      For   0,0053  Initblk   0,0121   Memset   0,0106
0x00000080  100      For   0,0097  Initblk   0,0121   Memset   0,0091
0x00000100  100      For   0,0179  Initblk   0,0122   Memset   0,0102
0x00000200  100      For   0,0384  Initblk   0,0123   Memset   0,0126
0x00000400  100      For   0,0789  Initblk   0,0130   Memset   0,0189
0x00000800  100      For   0,1357  Initblk   0,0153   Memset   0,0170
0x00001000  100      For   0,2811  Initblk   0,0167   Memset   0,0221
0x00002000  100      For   0,5519  Initblk   0,0278   Memset   0,0274
0x00004000  100      For   1,1100  Initblk   0,0329   Memset   0,0383
0x00008000  100      For   2,2332  Initblk   0,0827   Memset   0,0864
0x00010000  100      For   4,4407  Initblk   0,1551   Memset   0,1602
0x00020000  100      For   9,1331  Initblk   0,2768   Memset   0,3044
0x00040000  100      For  18,2497  Initblk   0,5500   Memset   0,5901
0x00080000  100      For  35,8650  Initblk   1,1236   Memset   1,5762
0x00100000  100      For  71,6806  Initblk   2,2836   Memset   3,2323
0x00200000  100      For  77,8086  Initblk   2,1991   Memset   3,0144
0x00400000  100      For 131,2923  Initblk   4,7837   Memset   6,8505
0x00800000  100      For 263,2917  Initblk  16,1354   Memset  33,3719

I excluded the first call every time, since both initblk and memset take a hit on the first call; I believe it was about 0.22 ms. It is slightly surprising that my code is faster than initblk for filling short buffers, seeing that it has half a page of setup code.

If anybody feels like optimizing this, go ahead. It's possible.

Tested several ways, described in different answers. See the sources of the test in the C# test class.

Benchmark report

You can do this when you initialize the array, but I don't think that is what you are asking for:

byte[] myBytes = new byte[5] { 1, 1, 1, 1, 1};

With the advent of Span<T> (which is dotnet core only, but it is the future of dotnet) you have yet another way of solving this problem:

var array = new byte[100];
var span = new Span<byte>(array);

span.Fill(255);
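The same Span<T> approach also covers the memset-style case of filling only part of a buffer. Below is a minimal sketch using the AsSpan extension method; the names SpanFill and FillRange are made up for illustration.

```csharp
using System;

public static class SpanFill
{
    // Fills 'count' bytes starting at 'offset'; the rest of the array is untouched.
    public static void FillRange(byte[] array, int offset, int count, byte value)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        // AsSpan validates offset and count against the array bounds.
        array.AsSpan(offset, count).Fill(value);
    }
}
```

For example, `SpanFill.FillRange(buffer, 16, 32, 0xFF)` would set bytes 16 through 47 of buffer and leave everything else as it was.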

.NET Core has a built-in Array.Fill() function, but sadly .NET Framework is missing it. .NET Core has two variations: fill the entire array, and fill a portion of the array starting at an index.
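The two Array.Fill variations can be sketched like this, assuming a runtime that has Array.Fill (.NET Core 2.0 or later). The wrapper names ArrayFillExample, CreateFilled and FillRange are made up for illustration.

```csharp
using System;

public static class ArrayFillExample
{
    // Variation 1: fill the entire array.
    public static byte[] CreateFilled(int length, byte value)
    {
        var array = new byte[length];
        Array.Fill(array, value);
        return array;
    }

    // Variation 2: fill 'count' elements starting at 'startIndex'.
    public static void FillRange(byte[] array, byte value, int startIndex, int count)
    {
        Array.Fill(array, value, startIndex, count);
    }
}
```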

Building on the ideas above, here is a more generic Fill function that fills the entire array for several data types. It was the fastest function when benchmarked against the other methods discussed in this post.

This function, along with the version that fills a portion of an array, is available in an open-source and free NuGet package (HPCsharp on nuget.org). Also included is a slightly faster version of Fill using SIMD/SSE instructions that performs only memory writes, whereas BlockCopy-based methods perform memory reads and writes.

    public static void FillUsingBlockCopy<T>(this T[] array, T value) where T : struct
    {
        int numBytesInItem = 0;
        if (typeof(T) == typeof(byte) || typeof(T) == typeof(sbyte))
            numBytesInItem = 1;
        else if (typeof(T) == typeof(ushort) || typeof(T) == typeof(short))
            numBytesInItem = 2;
        else if (typeof(T) == typeof(uint) || typeof(T) == typeof(int))
            numBytesInItem = 4;
        else if (typeof(T) == typeof(ulong) || typeof(T) == typeof(long))
            numBytesInItem = 8;
        else
            throw new ArgumentException(string.Format("Type '{0}' is unsupported.", typeof(T).ToString()));

        int block = 32, index = 0;
        int endIndex = Math.Min(block, array.Length);

        while (index < endIndex)          // Fill the initial block
            array[index++] = value;

        endIndex = array.Length;
        for (; index < endIndex; index += block, block *= 2)
        {
            int actualBlockSize = Math.Min(block, endIndex - index);
            Buffer.BlockCopy(array, 0, array, index * numBytesInItem, actualBlockSize * numBytesInItem);
        }
    }

Most of the answers are for a byte memset, but if you want to use it for float or any other struct, you should multiply the index by the size of your data, because Buffer.BlockCopy copies based on bytes. This code will work for float values:

public static void MemSet(float[] array, float value) {
    if (array == null) {
        throw new ArgumentNullException("array");
    }

    int block = 32, index = 0;
    int length = Math.Min(block, array.Length);

    //Fill the initial array
    while (index < length) {
        array[index++] = value;
    }

    length = array.Length;
    while (index < length) {
        Buffer.BlockCopy(array, 0, array, index * sizeof(float), Math.Min(block, length-index)* sizeof(float));
        index += block;
        block *= 2;
    }
}

Here's 'unsafe/unchecked' code that performs memset qword-aligned, a qword at a time. It should be pretty fast.

// Copyright (c) 2022 Hafthor Stefansson
// Distributed under the MIT/X11 software license
// Ref: http://www.opensource.org/licenses/mit-license.php.
static unsafe void UnsafeSet(byte[] a, uint d, uint c, byte v) {
    unchecked {
        ushort v2 = (ushort)(v << 8 | v);
        uint v4 = (uint)v2 << 16 | v2;
        ulong v8 = (ulong)v4 << 32 | v4;
        fixed (byte* p = a) {
            byte* di = p + d;
            if (c >= 1 && (d & 1) != 0) { // word align
                *((byte*)di) = v; di++; c--; d++;
            }
            if (c >= 2 && (d & 2) != 0) { // dword align
                *((ushort*)di) = v2; di += 2; c -= 2; d += 2;
            }
            if (c >= 4 && (d & 4) != 0) { // qword align
                *((uint*)di) = v4; di += 4; c -= 4;
            }
            while (c >= 8) { // qword set
                *((ulong*)di) = v8; di += 8; c -= 8;
            }
            if (c >= 4) { // dword remainder
                *((uint*)di) = v4; di += 4; c -= 4;
            }
            if (c >= 2) { // word remainder
                *((ushort*)di) = v2; di += 2; c -= 2;
            }
            if (c >= 1) { // byte remainder
                *((byte*)di) = v; // di++; c--;
            }
        }
    }
}

The Array object has a method called Clear. I'm willing to bet that the Clear method is faster than any code you can write in C#.
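A minimal sketch of calling Clear follows. Note that Array.Clear resets elements to their default value (zero for byte), so it corresponds to memset with a zero value only and does not cover the non-zero fill asked about in the question. The names ClearExample and Zero are made up for illustration.

```csharp
using System;

public static class ClearExample
{
    // Resets every element to zero; Array.Clear cannot write any other value.
    public static void Zero(byte[] array)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        Array.Clear(array, 0, array.Length);
    }
}
```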
