What is the equivalent of memset in C#?
I need to fill a byte[] with a single non-zero value. How can I do this in C# without looping through each byte in the array?
Update: The comments seem to have split this into two questions, one of which is whether there is a framework method to fill a byte[] that might be akin to memset.
I totally agree that using a simple loop works just fine, as Eric and others have pointed out. The point of the question was to see if I could learn something new about C# :) I think Juliet's method for a Parallel operation should be even faster than a simple loop.
Benchmarks: Thanks to Mikael Svenson: http://techmikael.blogspot.com/2009/12/filling-array-with-default-value.html
It turns out the simple for loop is the way to go unless you want to use unsafe code.
Apologies for not being clearer in my original post. Eric and Mark are both correct in their comments; I need to ask more focused questions, for sure. Thanks for everyone's suggestions and responses.
You could use Enumerable.Repeat:
byte[] a = Enumerable.Repeat((byte)10, 100).ToArray();
The first parameter is the element you want repeated, and the second parameter is the number of times to repeat it.
This is OK for small arrays, but you should use the looping method if you are dealing with very large arrays and performance is a concern.
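For reference, the looping method mentioned above can be sketched as a small helper. The names LoopFillSketch and Fill are hypothetical and do not come from any answer in this thread; this is only a minimal baseline, not a tuned implementation.

```csharp
using System;

public static class LoopFillSketch
{
    // Hypothetical helper: sets every element of 'array' to 'value'
    // with a plain for loop, without allocating a new array.
    public static void Fill(byte[] array, byte value)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = value;
        }
    }
}
```

Unlike Enumerable.Repeat(...).ToArray(), this fills an existing array in place, which is closer to what memset does.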
Actually, there is a little-known IL operation called Initblk that does exactly that. So, let's wrap it in a method that doesn't require "unsafe". Here's the helper class:
// Requires: using System; using System.Reflection; using System.Reflection.Emit;
//           using System.Runtime.InteropServices;
public static class Util
{
    static Util()
    {
        // Build a dynamic method whose body is: ldarg.0, ldarg.1, ldarg.2, initblk, ret
        var dynamicMethod = new DynamicMethod("Memset", MethodAttributes.Public | MethodAttributes.Static, CallingConventions.Standard,
            null, new [] { typeof(IntPtr), typeof(byte), typeof(int) }, typeof(Util), true);
        var generator = dynamicMethod.GetILGenerator();
        generator.Emit(OpCodes.Ldarg_0);
        generator.Emit(OpCodes.Ldarg_1);
        generator.Emit(OpCodes.Ldarg_2);
        generator.Emit(OpCodes.Initblk);
        generator.Emit(OpCodes.Ret);
        MemsetDelegate = (Action<IntPtr, byte, int>)dynamicMethod.CreateDelegate(typeof(Action<IntPtr, byte, int>));
    }

    public static void Memset(byte[] array, byte what, int length)
    {
        // Pin the array so the GC cannot move it while initblk writes to it
        var gcHandle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            MemsetDelegate(gcHandle.AddrOfPinnedObject(), what, length);
        }
        finally
        {
            gcHandle.Free(); // always release the pin, even if the call throws
        }
    }

    // Plain loop, used as the baseline in the comparison below
    public static void ForMemset(byte[] array, byte what, int length)
    {
        for (var i = 0; i < length; i++)
        {
            array[i] = what;
        }
    }

    private static Action<IntPtr, byte, int> MemsetDelegate;
}
And how does it perform? Here are my results for Windows/.NET and Linux/Mono (different PCs).
Mono/for: 00:00:01.1356610
Mono/initblk: 00:00:00.2385835
.NET/for: 00:00:01.7463579
.NET/initblk: 00:00:00.5953503
So it's worth considering. Note that the resulting IL will not be verifiable.
Building on Lucero's answer, here is a faster version. It doubles the number of bytes copied using Buffer.BlockCopy every iteration. Interestingly enough, it outperforms Lucero's version by a factor of 10 when using relatively small arrays (1000), but the difference is not that large for larger arrays (1000000); it is always faster, though. The good thing about it is that it performs well even down to small arrays. It becomes faster than the naive approach at around length = 100. For a one million element byte array, it was 43 times faster. (Tested on Intel i7, .NET 2.0.)
public static void MemSet(byte[] array, byte value) {
    if (array == null) {
        throw new ArgumentNullException("array");
    }
    int block = 32, index = 0;
    int length = Math.Min(block, array.Length);
    //Fill the initial array
    while (index < length) {
        array[index++] = value;
    }
    length = array.Length;
    while (index < length) {
        Buffer.BlockCopy(array, 0, array, index, Math.Min(block, length-index));
        index += block;
        block *= 2;
    }
}
A little bit late, but the following approach might be a good compromise without resorting to unsafe code. Basically it initializes the beginning of the array using a conventional loop and then switches to Buffer.BlockCopy(), which should be as fast as you can get using a managed call.
public static void MemSet(byte[] array, byte value) {
    if (array == null) {
        throw new ArgumentNullException("array");
    }
    const int blockSize = 4096; // bigger may be better to a certain extent
    int index = 0;
    int length = Math.Min(blockSize, array.Length);
    while (index < length) {
        array[index++] = value;
    }
    length = array.Length;
    while (index < length) {
        Buffer.BlockCopy(array, 0, array, index, Math.Min(blockSize, length-index));
        index += blockSize;
    }
}
This simple implementation uses successive doubling, and performs quite well (about 3-4 times faster than the naive version according to my benchmarks):
public static void Memset<T>(T[] array, T elem)
{
    int length = array.Length;
    if (length == 0) return;
    array[0] = elem;
    int count;
    // Double the filled prefix until it covers at least half of the array
    for (count = 1; count <= length/2; count*=2)
        Array.Copy(array, 0, array, count, count);
    // Copy whatever is left in one final step
    Array.Copy(array, 0, array, count, length - count);
}
Edit: upon reading the other answers, it seems I'm not the only one with this idea. Still, I'm leaving this here, since it's a bit cleaner and it performs on par with the others.
If performance is critical, you could consider using unsafe code and working directly with a pointer to the array.
Another option could be importing memset from msvcrt.dll and using that. However, the overhead from invoking that might easily be larger than the gain in speed.
It looks like System.Runtime.CompilerServices.Unsafe.InitBlock now does the same thing as the OpCodes.Initblk instruction that Konrad's answer mentions (he also mentioned a source link).
The code to fill in the array is as follows:
byte[] a = new byte[N];
byte valueToFill = 255;
System.Runtime.CompilerServices.Unsafe.InitBlock(ref a[0], valueToFill, (uint) a.Length);
Or use P/Invoke:
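// Requires: using System; using System.Runtime.InteropServices;
[DllImport("msvcrt.dll",
    EntryPoint = "memset",
    CallingConvention = CallingConvention.Cdecl,
    SetLastError = false)]
public static extern IntPtr MemSet(IntPtr dest, int c, UIntPtr count); // count is size_t in C

static void Main(string[] args)
{
    byte[] arr = new byte[3];
    GCHandle gch = GCHandle.Alloc(arr, GCHandleType.Pinned);
    try
    {
        MemSet(gch.AddrOfPinnedObject(), 0x7, (UIntPtr)(uint)arr.Length);
    }
    finally
    {
        gch.Free(); // release the pin so the GC can move the array again
    }
}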
If performance is absolutely critical, then Enumerable.Repeat(n, m).ToArray() will be too slow for your needs. You might be able to crank out faster performance using PLINQ or the Task Parallel Library:
using System.Threading.Tasks;
// ...
byte initialValue = 20;
byte[] data = new byte[size];
Parallel.For(0, size, index => data[index] = initialValue);
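A possible variation on the parallel idea above, sketched under the assumption that the per-element delegate call is a noticeable cost: split the index range into chunks with Partitioner.Create and fill each chunk with an ordinary loop. ParallelFillSketch and Fill are hypothetical names, and whether this actually beats a single-threaded loop would need to be measured.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ParallelFillSketch
{
    // Hypothetical helper: fills 'data' in parallel, one index range per work item.
    public static void Fill(byte[] data, byte value)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (data.Length == 0) return; // Partitioner.Create rejects an empty range

        // Each range is a Tuple<int, int> of (fromInclusive, toExclusive).
        Parallel.ForEach(Partitioner.Create(0, data.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                data[i] = value;
            }
        });
    }
}
```

The delegate is invoked once per chunk instead of once per byte, so the inner loop stays a plain array write.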
All answers are writing single bytes only - what if you want to fill a byte array with words? Or floats? I find use for that now and then. So after having written 'memset'-like code in a non-generic way a few times, and arriving at this page to find good code for single bytes, I went about writing the method below.
I think PInvoke and C++/CLI each have their drawbacks. And why not have the runtime 'PInvoke' into mscorxxx for you? Array.Copy and Buffer.BlockCopy are certainly native code. BlockCopy isn't even 'safe' - you can copy a long halfway over another, or over a DateTime, as long as they're in arrays.
At least I wouldn't go and create a new C++ project for things like this - it's almost certainly a waste of time.
So here's basically an extended version of the solutions presented by Lucero and TowerOfBricks that can be used to memset longs, ints, etc. as well as single bytes.
public static class MemsetExtensions
{
    static void MemsetPrivate(this byte[] buffer, byte[] value, int offset, int length) {
        var shift = 0;
        for (; shift < 32; shift++)
            if (value.Length == 1 << shift)
                break;
        if (shift == 32 || value.Length != 1 << shift)
            throw new ArgumentException(
                "The source array must have a length that is a power of two and be shorter than 4GB.", "value");
        int remainder;
        int count = Math.DivRem(length, value.Length, out remainder);
        var si = 0;
        var di = offset;
        int cx;
        if (count < 1)
            cx = remainder;
        else
            cx = value.Length;
        Buffer.BlockCopy(value, si, buffer, di, cx);
        if (cx == remainder)
            return;
        var cachetrash = Math.Max(12, shift); // 1 << 12 == 4096
        si = di;
        di += cx;
        var dx = offset + length;
        // doubling up to 1 << cachetrash bytes i.e. 2^12 or value.Length whichever is larger
        for (var al = shift; al <= cachetrash && di + (cx = 1 << al) < dx; al++) {
            Buffer.BlockCopy(buffer, si, buffer, di, cx);
            di += cx;
        }
        // cx bytes as long as it fits
        for (; di + cx <= dx; di += cx)
            Buffer.BlockCopy(buffer, si, buffer, di, cx);
        // tail part if less than cx bytes
        if (di < dx)
            Buffer.BlockCopy(buffer, si, buffer, di, dx - di);
    }
}
With this in place, you can simply add short methods that take the value type you need to memset with and call the private method; e.g. just find-and-replace ulong in this method:
public static void Memset(this byte[] buffer, ulong value, int offset, int count) {
    var sourceArray = BitConverter.GetBytes(value);
    MemsetPrivate(buffer, sourceArray, offset, sizeof(ulong) * count);
}
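For comparison, on runtimes where Span<T> is available, a similar ulong fill over a byte[] could be sketched by reinterpreting the byte range as a span of ulong. This is not the answer author's method; SpanPatternFillSketch and FillUInt64 are hypothetical names, and the sketch assumes the platform tolerates the alignment of the chosen offset.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpanPatternFillSketch
{
    // Hypothetical helper: writes 'count' copies of 'value' into 'buffer',
    // starting at byte 'offset', using the machine's native byte order
    // (the same order BitConverter.GetBytes produces).
    public static void FillUInt64(byte[] buffer, ulong value, int offset, int count)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));

        // AsSpan throws if the requested range does not fit inside the buffer.
        Span<byte> bytes = buffer.AsSpan(offset, checked(count * sizeof(ulong)));

        // Assumption: unaligned ulong writes are acceptable on the target platform.
        MemoryMarshal.Cast<byte, ulong>(bytes).Fill(value);
    }
}
```

The count is given in ulong elements, matching the Memset overload above.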
Or go silly and do it with any type of struct (although the MemsetPrivate above only works for structs that marshal to a size that is a power of two):
public static void Memset<T>(this byte[] buffer, T value, int offset, int count) where T : struct {
    var size = Marshal.SizeOf<T>();
    var ptr = Marshal.AllocHGlobal(size);
    var sourceArray = new byte[size];
    try {
        Marshal.StructureToPtr<T>(value, ptr, false);
        Marshal.Copy(ptr, sourceArray, 0, size);
    } finally {
        Marshal.FreeHGlobal(ptr);
    }
    MemsetPrivate(buffer, sourceArray, offset, count * size);
}
I changed the initblk version mentioned before to take ulongs, to compare performance with my code, and that silently fails - the code runs but the resulting buffer contains only the least significant byte of the ulong.
Nevertheless, I compared the performance of writing equally large buffers with for, initblk and my memset method. The times are total milliseconds over 100 repetitions, writing as many 8-byte ulongs as fit the buffer length. The for version is manually loop-unrolled for the 8 bytes of a single ulong.
Buffer Len   #repeat   For (ms)    Initblk (ms)   Memset (ms)
0x00000008   100       0.0032      0.0107         0.0052
0x00000010   100       0.0037      0.0102         0.0039
0x00000020   100       0.0032      0.0106         0.0050
0x00000040   100       0.0053      0.0121         0.0106
0x00000080   100       0.0097      0.0121         0.0091
0x00000100   100       0.0179      0.0122         0.0102
0x00000200   100       0.0384      0.0123         0.0126
0x00000400   100       0.0789      0.0130         0.0189
0x00000800   100       0.1357      0.0153         0.0170
0x00001000   100       0.2811      0.0167         0.0221
0x00002000   100       0.5519      0.0278         0.0274
0x00004000   100       1.1100      0.0329         0.0383
0x00008000   100       2.2332      0.0827         0.0864
0x00010000   100       4.4407      0.1551         0.1602
0x00020000   100       9.1331      0.2768         0.3044
0x00040000   100       18.2497     0.5500         0.5901
0x00080000   100       35.8650     1.1236         1.5762
0x00100000   100       71.6806     2.2836         3.2323
0x00200000   100       77.8086     2.1991         3.0144
0x00400000   100       131.2923    4.7837         6.8505
0x00800000   100       263.2917    16.1354        33.3719
I excluded the first call every time, since both initblk and memset take a hit of, I believe, about 0.22 ms on the first call. Slightly surprisingly, my code is faster at filling short buffers than initblk, given that it has half a page of setup code.
If anybody feels like optimizing this, really, go ahead. It's possible.
I tested several of the approaches described in the other answers. See the sources of the test in the C# test class.
You could do this when you initialize the array, but I don't think that's what you are asking for:
byte[] myBytes = new byte[5] { 1, 1, 1, 1, 1};
With the advent of Span<T> (which is dotnet core only, but it is the future of dotnet) you have yet another way of solving this problem:
var array = new byte[100];
var span = new Span<byte>(array);
span.Fill(255);
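The same call also works on a slice, which covers the memset-with-offset case. The following is a minimal sketch; SpanFillSketch and FillRange are hypothetical names, while AsSpan(start, length) and Span<T>.Fill are the standard library calls.

```csharp
using System;

public static class SpanFillSketch
{
    // Hypothetical helper: fills 'count' bytes of 'array' starting at 'start'.
    // AsSpan throws ArgumentOutOfRangeException if the range does not fit.
    public static void FillRange(byte[] array, int start, int count, byte value)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        array.AsSpan(start, count).Fill(value);
    }
}
```

Elements outside the requested range are left untouched.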
.NET Core has a built-in Array.Fill() function, but sadly .NET Framework is missing it. .NET Core has two variations: fill the entire array, and fill a portion of the array starting at an index.
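As a short illustration of those two variations, here is a sketch that wraps each Array.Fill overload; ArrayFillSketch, FilledWhole and FilledPart are hypothetical wrapper names introduced only for this example.

```csharp
using System;

public static class ArrayFillSketch
{
    // Hypothetical wrapper: allocates an array and fills all of it.
    public static byte[] FilledWhole(int length, byte value)
    {
        var array = new byte[length];
        Array.Fill(array, value);
        return array;
    }

    // Hypothetical wrapper: fills only 'count' elements starting at 'startIndex';
    // the remaining elements keep their default value of 0.
    public static byte[] FilledPart(int length, byte value, int startIndex, int count)
    {
        var array = new byte[length];
        Array.Fill(array, value, startIndex, count);
        return array;
    }
}
```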
Building on the ideas above, here is a more generic Fill function that fills an entire array of any of several data types. In my benchmarks against the other methods discussed in this post, it was the fastest.
This function, along with the version that fills a portion of an array, is available in a free, open-source NuGet package (HPCsharp on nuget.org). Also included is a slightly faster version of Fill using SIMD/SSE instructions that performs only memory writes, whereas BlockCopy-based methods perform memory reads and writes.
public static void FillUsingBlockCopy<T>(this T[] array, T value) where T : struct
{
    // Buffer.BlockCopy counts in bytes, so the element size is needed
    int numBytesInItem = 0;
    if (typeof(T) == typeof(byte) || typeof(T) == typeof(sbyte))
        numBytesInItem = 1;
    else if (typeof(T) == typeof(ushort) || typeof(T) == typeof(short))
        numBytesInItem = 2;
    else if (typeof(T) == typeof(uint) || typeof(T) == typeof(int))
        numBytesInItem = 4;
    else if (typeof(T) == typeof(ulong) || typeof(T) == typeof(long))
        numBytesInItem = 8;
    else
        throw new ArgumentException(string.Format("Type '{0}' is unsupported.", typeof(T).ToString()));
    int block = 32, index = 0;
    int endIndex = Math.Min(block, array.Length);
    while (index < endIndex) // Fill the initial block
        array[index++] = value;
    endIndex = array.Length;
    // Copy the filled prefix forward, doubling the block size each time
    for (; index < endIndex; index += block, block *= 2)
    {
        int actualBlockSize = Math.Min(block, endIndex - index);
        Buffer.BlockCopy(array, 0, array, index * numBytesInItem, actualBlockSize * numBytesInItem);
    }
}
Most of the answers are for a byte memset, but if you want to use the approach for float or any other struct, you have to multiply the index by the size of your data type, because Buffer.BlockCopy works in bytes. This code works for float values:
public static void MemSet(float[] array, float value) {
    if (array == null) {
        throw new ArgumentNullException("array");
    }
    int block = 32, index = 0;
    int length = Math.Min(block, array.Length);
    //Fill the initial array
    while (index < length) {
        array[index++] = value;
    }
    length = array.Length;
    while (index < length) {
        Buffer.BlockCopy(array, 0, array, index * sizeof(float), Math.Min(block, length-index)* sizeof(float));
        index += block;
        block *= 2;
    }
}
Here's 'unsafe/unchecked' code that performs a qword-aligned memset, one qword at a time. It should be pretty fast.
// Copyright (c) 2022 Hafthor Stefansson
// Distributed under the MIT/X11 software license
// Ref: http://www.opensource.org/licenses/mit-license.php.
static unsafe void UnsafeSet(byte[] a, uint d, uint c, byte v) {
    unchecked {
        ushort v2 = (ushort)(v << 8 | v);
        uint v4 = (uint)v2 << 16 | v2;
        ulong v8 = (ulong)v4 << 32 | v4;
        fixed (byte* p = a) {
            byte* di = p + d;
            if (c >= 1 && (d & 1) != 0) { // word align
                *((byte*)di) = v; di++; c--; d++;
            }
            if (c >= 2 && (d & 2) != 0) { // dword align
                *((ushort*)di) = v2; di += 2; c -= 2; d += 2;
            }
            if (c >= 4 && (d & 4) != 0) { // qword align
                *((uint*)di) = v4; di += 4; c -= 4;
            }
            while (c >= 8) { // qword set
                *((ulong*)di) = v8; di += 8; c -= 8;
            }
            if (c >= 4) { // dword remainder
                *((uint*)di) = v4; di += 4; c -= 4;
            }
            if (c >= 2) { // word remainder
                *((ushort*)di) = v2; di += 2; c -= 2;
            }
            if (c >= 1) { // byte remainder
                *((byte*)di) = v; // di++; c--;
            }
        }
    }
}
The Array object has a method called Clear. I'm willing to bet that the Clear method is faster than any code you can write in C#.
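A minimal sketch of that call follows; ArrayClearSketch and Zero are hypothetical names. Note that Array.Clear resets elements to their default value, which for byte is 0, so on its own it does not produce the non-zero fill the question asks about.

```csharp
using System;

public static class ArrayClearSketch
{
    // Hypothetical helper: resets every element of 'array' to 0
    // using the three-argument Array.Clear overload.
    public static void Zero(byte[] array)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        Array.Clear(array, 0, array.Length);
    }
}
```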