memset on float array does not fully zero out array

I have some array,

float *large_float_array;

that I calloc and attempt to memset after a loop

large_float_array = (float*)calloc(NUM, sizeof(float));

// .. do stuff with float array

memset(large_float_array, 0, sizeof(large_float_array);

// .. do other stuff

but it seems large_float_array is not actually zero'd out. Why is this? Is this due to the special properties of floating point numbers? If so, I'm wondering how I could fix this.

PS does calloc actually work here too?

Since large_float_array is a pointer, you cannot use sizeof(large_float_array) to specify the number of bytes to clear: sizeof(large_float_array) is just the size of the pointer, typically 4 or 8 bytes. To clear the whole array, you must write:

memset(large_float_array, 0, NUM * sizeof(float));

or

memset(large_float_array, 0, NUM * sizeof(*large_float_array));
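As a rough illustration of the difference, the sketch below clears an array once with a byte count equal to the pointer size (which is what the question's call amounts to) and once with the full byte count. The helper names clear_pointer_sized_prefix, clear_whole_array and count_nonzero are made up for this example, and the zero comparison assumes the usual case where all bits zero represents 0.0f.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the question's call: the byte count is
   the size of a pointer, so only the first few bytes are cleared. */
static void clear_pointer_sized_prefix(float *array) {
    memset(array, 0, sizeof(float *));
}

/* Hypothetical helper with the corrected byte count: element count
   multiplied by the size of one element. */
static void clear_whole_array(float *array, size_t count) {
    memset(array, 0, count * sizeof *array);
}

/* Counts the elements that still compare unequal to zero. */
static size_t count_nonzero(const float *array, size_t count) {
    size_t nonzero = 0;
    for (size_t i = 0; i < count; i++) {
        if (array[i] != 0.0f) {
            nonzero++;
        }
    }
    return nonzero;
}
```

On a system with 8-byte pointers and 4-byte floats, the first helper clears only two elements, whatever the size of the allocation.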

Note that memset sets the memory to all bits zero. The C standard does not guarantee that all bits zero is a representation of 0.0, but this is the case on all current architectures (it holds for every IEEE 754 binary format), so it will work as expected.
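If you want to check this assumption on a given target instead of relying on it, a small test along these lines may help. It is only a sketch: both function names are invented for this example, and the result describes the platform it runs on, not the C standard.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: does a float whose bytes are all zero
   compare equal to 0.0f on this platform? */
static bool all_bits_zero_is_float_zero(void) {
    float f;
    memset(&f, 0, sizeof f);
    return f == 0.0f;
}

/* Hypothetical check in the other direction: is the object
   representation of the constant 0.0f made of zero bytes only? */
static bool float_zero_is_all_bits_zero(void) {
    const float zero = 0.0f;
    const unsigned char zero_bytes[sizeof zero] = {0};
    return memcmp(&zero, zero_bytes, sizeof zero) == 0;
}
```

Both functions should return true on IEEE 754 hardware. If either returned false, memset and calloc could not be used to produce floating-point zeros on that target.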

The strictly conforming way to clear your array is this:

#include <stdlib.h>

...
for (size_t i = 0; i < NUM; i++) {
    large_float_array[i] = 0;
}

If the target system represents 0.0 as all bits zero, modern compilers will optimize the above loop into a call to memset, as can be verified with Godbolt's Compiler Explorer.

PS: under the same assumption, calloc() would be a better choice: the block returned is initialized to all bits zero and for large sparse matrices, allocating via calloc() may actually be more cache efficient on many systems.

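You can't rely on memset here: it takes its fill value as an int, converts it to unsigned char and writes that byte into every byte of the block, whereas a float may be stored in a different way. You should use a regular loop instead: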

for (size_t i = 0; i < NUM; ++i) {
    large_float_array[i] = 0.;
}
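To see why the byte-wise behaviour matters, consider a non-zero fill value. The sketch below uses two hypothetical helpers, fill_bytes_with_one and fill_floats_with_one, to contrast memset with element-wise assignment.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: memset writes the byte value 1 into every byte,
   so each float ends up with the bit pattern 0x01010101, not 1.0f. */
static void fill_bytes_with_one(float *array, size_t count) {
    memset(array, 1, count * sizeof *array);
}

/* Hypothetical helper: element-wise assignment stores the float value
   1.0f whatever its representation is. */
static void fill_floats_with_one(float *array, size_t count) {
    for (size_t i = 0; i < count; i++) {
        array[i] = 1.0f;
    }
}
```

With 4-byte IEEE 754 floats, the elements written by fill_bytes_with_one hold a tiny value of roughly 2.4e-38. Zero is the one fill value where the byte-wise and element-wise results usually coincide.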

The effect of initializing a memory area with calloc is actually the same as using memset with a value of 0. In other words, calloc is no more "type-wise" than memset. They both just set the memory to all bits zero*.

C11 (N1570) 7.22.3.2/2 The calloc function :

The calloc function allocates space for an array of nmemb objects, each of whose size is size. The space is initialized to all bits zero. 296)

Footnote 296 (informative only):

Note that this need not be the same as the representation of floating-point zero or a null pointer constant.


*) It's conceivable that calloc could return the address of memory that is already pre-initialized with zeros, so it may be faster than the malloc + memset combo.
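The equivalence can be checked byte for byte. The following is a minimal sketch; calloc_matches_memset is a name invented for this example, and it reports false if either allocation fails.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: compare a calloc'ed block with a malloc'ed block
   that was cleared by memset. Returns true if they are byte-identical. */
static bool calloc_matches_memset(size_t count) {
    float *from_calloc = calloc(count, sizeof *from_calloc);
    float *from_malloc = malloc(count * sizeof *from_malloc);
    bool same = false;

    if (from_calloc != NULL && from_malloc != NULL) {
        memset(from_malloc, 0, count * sizeof *from_malloc);
        same = memcmp(from_calloc, from_malloc,
                      count * sizeof *from_calloc) == 0;
    }

    free(from_calloc);  /* free(NULL) is a no-op, so this is safe */
    free(from_malloc);
    return same;
}
```

Whether those zero bytes also mean 0.0f is a separate question, as footnote 296 points out.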

The post has multiple issues.

The following line does not have balanced parentheses, so it is unclear what OP's true code may be.

memset(large_float_array, 0, sizeof(large_float_array);

Assuming the above simply omitted a closing ), the code zero-filled only a few bytes, because sizeof(large_float_array) is the size of a pointer.

memset(large_float_array, 0, sizeof(large_float_array));

OP likely wanted something like the following, which fills the allocated space with zero bits. This brings the allocated memory into the same state as after calloc().

memset(large_float_array, 0, NUM *sizeof *large_float_array);

To store a true floating-point zero in every element regardless of its bit representation, code should use the following. A good compiler will optimize this trivial loop into fast code.

for (size_t i=0; i<NUM; i++) {
  large_float_array[i] = 0.0f;
}

A way to use the fast memcpy() is to copy the float into the first element of the array, then the first element to the 2nd element, then the first 2 elements to elements 3 and 4, then the first 4 elements to elements 5 through 8, and so on.

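#include <string.h>

// Fill dest with n copies of the object of size element found at src.
// Each pass copies the already-filled prefix, doubling the filled region.
void *memset_size(void *dest, size_t n, const void *src, size_t element) {
  if (n > 0 && element > 0) {
    memcpy(dest, src, element);  // Copy first element
    n *= element;                // Total bytes to fill (overflow is not checked)
    while (n > element) {        // element now counts the bytes filled so far
      size_t remaining = n - element;
      size_t n_this_time = remaining > element ? element : remaining;
      memcpy((char*)dest + element, dest, n_this_time);
      element += n_this_time;
    }
  }
  return dest;
}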

float pi = M_PI;  // M_PI comes from <math.h> on POSIX systems; it is not part of ISO C
float a[10000];
memset_size(a, 10000, &pi, sizeof pi);

As with all optimizations, consider clearly written code first and then employ candidates such as the above in the select cases that warrant it.
