
Memset an int (16 bit) array to short's max value

I can't seem to find the answer to this anywhere: how do I memset an array to the maximum value of the array's type? I would have thought memset(ZBUFFER, 0xFFFF, size) would work where ZBUFFER is a 16-bit integer array. Instead I get -1s throughout.

Also, the idea is to have this work as fast as possible (it's a zbuffer that needs to be initialized every frame), so if there is a better way that is just as fast or faster, let me know.

Edit: as clarification, I do need a signed int array.

In C++, you would use std::fill and std::numeric_limits.

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <limits>

// Fill the range [first, last) with the largest value of its element type.
template <typename IT>
void FillWithMax( IT first, IT last )
{
    typedef typename std::iterator_traits<IT>::value_type T;
    T const maxval = std::numeric_limits<T>::max();
    std::fill( first, last, maxval );
}

int main()
{
    std::size_t const size = 32;
    short ZBUFFER[size];
    FillWithMax( ZBUFFER, ZBUFFER + size );
}

This will work with any element type for which std::numeric_limits is specialized.

In C, you had better stay away from memset, which sets the value of individual bytes. To initialize an array of any type other than char (or unsigned char), you have to resort to a manual for loop.
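A minimal sketch of such a manual loop follows. The helper name fill_short_max is made up for illustration, and it assumes the buffer elements are of type short.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: set each of the `count` elements of `buf` to SHRT_MAX. */
void fill_short_max(short *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        buf[i] = SHRT_MAX;   /* assigns a whole element, not a single byte */
    }
}
```

It would be called as fill_short_max(ZBUFFER, number_of_elements), where the second argument is an element count rather than a size in bytes.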

-1 and 0xFFFF are the same thing in a 16-bit integer using a two's complement representation. You are only getting -1 either because you have declared your array as short instead of unsigned short, or because you are converting the values to signed when you output them.
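To see the same 16 bits read both ways, here is a small sketch. The helper name bits_as_signed is hypothetical; it uses the fixed-width types from <stdint.h> because int16_t is required to be two's complement, which plain short is not guaranteed to be on every implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the 16 bits of `bits` as a signed
   16-bit integer. memcpy copies the object representation unchanged,
   so no value conversion takes place. */
int16_t bits_as_signed(uint16_t bits)
{
    int16_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

With this, the pattern 0xFFFF reads back as -1, while 0x7FFF reads back as INT16_MAX.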

BTW, your assumption that you can set anything other than bytes using memset is wrong. memset(ZBUFFER, 0xFF, size) would have done the same thing.
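A small check of that claim, as a sketch: the helper name first_after_memset is made up for illustration, and the code assumes 8-bit bytes, so that an int16_t occupies exactly two of them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: memset a small int16_t array with `fill` and
   return the first element. Only the low 8 bits of `fill` are used,
   and they are written into every byte of every element. */
int16_t first_after_memset(int fill)
{
    int16_t buf[4];
    memset(buf, fill, sizeof buf);
    return buf[0];
}
```

Passing 0xFFFF and passing 0xFF both give -1, and passing 0x7F gives 0x7F7F rather than 0x007F, since each byte of the element receives the same value.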

In C++ you can fill an array with a given value using the std::fill algorithm.

std::fill(ZBUFFER, ZBUFFER+size, std::numeric_limits<short>::max());

This is neither faster nor slower than your current approach. It does have the benefit of working, though.

Don't attribute speed to a language; speed is a property of implementations. There are C compilers that produce fast, optimal machine code and C compilers that produce slow, suboptimal machine code. Likewise for C++. A "fast, optimal" implementation might be able to optimise code that seems slow. Hence, it doesn't make sense to call one solution faster than another. I'll talk about correctness first, and then about performance, however insignificant it is. It'd be a better idea to profile your code, to be sure that this is in fact the bottleneck, but let's continue.

Let us consider the most sensible option first: a loop that copies int values. It is clear just by reading the code that the loop will correctly assign SHRT_MAX to each int item. You can see a test case of this loop below, which attempts to use the largest array that malloc can allocate at the time.

#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    size_t size = SIZE_MAX;
    volatile int *array = malloc(size);

    /* Allocate largest array */
    while (array == NULL && size > 0) {
        size >>= 1;
        array = malloc(size);
    }

    printf("Copying into %zu bytes\n", size);

    for (size_t n = 0; n < size / sizeof *array; n++) {
        array[n] = SHRT_MAX;
    }

    puts("Done!");
    return 0;
}

I ran this on my system, compiled with various optimisations enabled (-O3 -march=core2 -funroll-loops). Here's the output:

Copying into 1073741823 bytes
Done!

Process returned 0 (0x0)   execution time : 1.094 s
Press any key to continue.

Note the "execution time"... That's pretty fast! If anything, the bottleneck here is the cache locality of such a large array, which is why a good programmer will try to design systems that don't use so much memory... Well, then, let us consider the memset option. Here's a quote from the memset manual:

The memset() function copies c (converted to an unsigned char) into each of the first n bytes of the object pointed to by s.

Hence, it'll convert 0xFFFF to an unsigned char (and potentially truncate that value), then assign the converted value to the first size bytes. This results in incorrect behaviour. I don't like relying upon the value SHRT_MAX to be represented as a sequence of bytes storing the value (unsigned char) 0xFFFF, because that's relying upon coincidence. In other words, the main problem here is that memset isn't suitable for your task. Don't use it. Having said that, here's a test, derived from the test above, which will be used to test the speed of memset:

#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    size_t size = SIZE_MAX;
    volatile int *array = malloc(size);

    /* Allocate largest array */
    while (array == NULL && size > 0) {
        size >>= 1;
        array = malloc(size);
    }

    printf("Copying into %zu bytes\n", size);

    /* memset takes a plain void *, so the volatile qualifier is cast away */
    memset((void *)array, 0xFFFF, size);

    puts("Done!");
    return 0;
}

A trivial byte-copying memset loop will iterate sizeof (int) times more than the loop in my first example. Considering that my implementation uses a fairly optimal memset, here's the output:

Copying into 1073741823 bytes
Done!

Process returned 0 (0x0)   execution time : 1.060 s
Press any key to continue.

These timings are likely to vary, possibly significantly. I only ran each test once to get a rough idea. Hopefully you've come to the same conclusion that I have: common compilers are pretty good at optimising simple loops, and it's not worth speculating about micro-optimisations here.
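For anyone who wants more than a single run, a rough timing harness might look like the sketch below. The helper name time_fill and its parameters are assumptions made for illustration. It measures CPU time with the standard clock() function, and the volatile qualifier mirrors the tests above: it keeps the stores from being optimised away, but it may also prevent vectorisation, so the absolute numbers can be pessimistic.

```c
#include <limits.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: fill `buf` with SHRT_MAX `repeats` times and
   return the average CPU time per fill in seconds, as reported by
   clock(). `repeats` must be greater than zero. */
double time_fill(volatile short *buf, size_t count, int repeats)
{
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        for (size_t i = 0; i < count; i++) {
            buf[i] = SHRT_MAX;
        }
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}
```

Averaging over several repetitions smooths out some noise, but a real profiler run on the actual program remains the more reliable measurement.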

In summary: 综上所述:

  1. Don't use memset to fill ints with values (with an exception for the value 0), because it's not suitable.
  2. Don't speculate about optimisations prior to running tests. Don't run tests until you have a working solution. By working solution I mean "a program that solves an actual problem". Once you have that, use your profiler to identify more significant opportunities to optimise!
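The first point can be made concrete: memset can only produce element values whose bytes are all identical. The sketch below checks that property for a 16-bit value. The helper name memset_can_produce is hypothetical, and int16_t stands in for the 16-bit element type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 1 if every byte of `value` is identical,
   i.e. if an int16_t array could be filled with `value` by one memset
   call; return 0 otherwise. */
int memset_can_produce(int16_t value)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);
    for (size_t i = 1; i < sizeof value; i++) {
        if (bytes[i] != bytes[0]) {
            return 0;
        }
    }
    return 1;
}
```

On a system with 8-bit bytes, 0 and -1 pass this check, while INT16_MAX (bytes 0x7F and 0xFF) does not, which is why the loop is needed.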

This is because of two's complement. To get 0xFFFF as the maximum value you have to change your array type to unsigned short; for a signed short, use 0x7FFF.

for (size_t i = 0; i < SIZE / sizeof(short); ++i) {
    ZBUFFER[i] = SHRT_MAX;
}

Note that this does not initialize the trailing byte(s) if SIZE % sizeof(short) is nonzero.
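The remark about leftover bytes can be sketched as follows. The helper name fill_whole_shorts is made up for illustration; it takes the buffer size in bytes, as the loop above does, and reports how many trailing bytes it could not cover.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: fill every whole short within `size_bytes` bytes
   of `buf` with SHRT_MAX and return the number of trailing bytes that
   were left untouched. */
size_t fill_whole_shorts(short *buf, size_t size_bytes)
{
    size_t count = size_bytes / sizeof(short);   /* whole elements only */
    for (size_t i = 0; i < count; i++) {
        buf[i] = SHRT_MAX;
    }
    return size_bytes % sizeof(short);
}
```

A return value of zero means the byte size was an exact multiple of sizeof(short) and the whole buffer was initialized.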

In C, you can do it like Adrian Panasiuk said, and you can also unroll the copy loop. Unrolling means copying larger chunks at a time. The extreme end of loop unrolling is overwriting the whole frame with a pre-filled "empty" frame, like this:

void init(void)
{
    /* fill the template frame once, at start-up */
    for (size_t i = 0; i < sizeof(ZBUFFER) / sizeof(ZBUFFER[0]); ++i) {
        empty_ZBUFFER[i] = SHRT_MAX;
    }
}

Actual clearing:

memcpy(ZBUFFER, empty_ZBUFFER, SIZE);

(You can experiment with different sizes of the empty ZBUFFER, from four bytes and up, and then have a loop around the memcpy.)
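One way the "loop around the memcpy" variant could look is sketched below. The name fill_by_chunks and the chunk length of 64 elements are arbitrary assumptions chosen for illustration, and the one-time template setup is not thread-safe as written.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_LEN 64   /* elements per template chunk; a tunable assumption */

/* Hypothetical helper: fill `count` shorts at `dst` with SHRT_MAX by
   repeatedly copying a small pre-filled template chunk. */
void fill_by_chunks(short *dst, size_t count)
{
    static short chunk[CHUNK_LEN];
    static int chunk_ready = 0;

    if (!chunk_ready) {                 /* build the template once */
        for (size_t i = 0; i < CHUNK_LEN; i++) {
            chunk[i] = SHRT_MAX;
        }
        chunk_ready = 1;
    }
    while (count >= CHUNK_LEN) {        /* copy whole chunks */
        memcpy(dst, chunk, sizeof chunk);
        dst += CHUNK_LEN;
        count -= CHUNK_LEN;
    }
    memcpy(dst, chunk, count * sizeof chunk[0]);   /* tail, possibly empty */
}
```

Whether this beats a plain loop depends on the compiler and the memcpy implementation, so it is worth measuring with several values of CHUNK_LEN.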

As always, test your findings: a) whether it's worth optimizing this part of the program and b) what difference the different initializing techniques make. It will depend on a lot of factors. For the last few percent of performance, you may have to resort to assembler code.

#include <algorithm>
#include <limits>

std::fill_n(ZBUFFER, size, std::numeric_limits<FOO>::max());

where FOO is the type of ZBUFFER's elements.

When you say "memset", do you actually have to use that function? It only assigns byte by byte, so it cannot store a multi-byte value such as the maximum of a signed 16-bit type.

If you want to set each value to the maximum you would use something like:

std::fill( ZBUFFER, ZBUFFER+len, std::numeric_limits<short>::max() );

where len is the number of elements (not the size in bytes of your array).
