Efficient way to convert a 16-bit short array to a 32-bit int array?

Question

What is the most efficient way to convert an array of unsigned shorts (16 bits per value) into an array of unsigned ints (32 bits per value)?

Answer 1

Copy it.

unsigned short source[]; // …
unsigned int target[]; // …
unsigned short* const end = source + sizeof source / sizeof source[0];
std::copy(source, end, target);

std::copy internally chooses the best copying mechanism for the given input types. In this case, however, there is probably no better way than copying the elements individually in a loop.
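As a minimal sketch of the same idea with explicit sizes, the helper below wraps the std::copy call in a function. The name widen_with_copy and the use of std::vector for the result are assumptions made for illustration; they are not part of the answer above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): widens `count`
// unsigned short values into a freshly allocated vector of unsigned int.
std::vector<unsigned int> widen_with_copy(const unsigned short* source,
                                          std::size_t count)
{
    std::vector<unsigned int> target(count);
    // std::copy assigns element by element, so each unsigned short is
    // implicitly converted to unsigned int with its value preserved.
    std::copy(source, source + count, target.begin());
    return target;
}
```

Calling widen_with_copy(source, n) on an array of n values should give a vector holding the same n values, each stored as an unsigned int.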

Answer 2

Use std::copy in C++:

#include <algorithm> // required for std::copy

unsigned short ushorts[M]; // where M is a positive integer constant
unsigned int   uints[N];   // where N >= M
// ... fill ushorts
std::copy(ushorts, ushorts + M, uints);

And in C, use a manual loop (in fact, you can use a manual loop in both C and C++):

int i = 0;
while( i < M ) { uints[i] = ushorts[i]; ++i; }

Answer 3

Here is an unrolled loop that accesses memory in 64-bit chunks. It might be a little faster than the simple loop, but testing is the only way to know.

This assumes that N is a multiple of four, that short is 16 bits wide, and that working with 64-bit registers is possible.

 typedef union u {
     uint16_t    us[4];
     uint32_t    ui[2];
     uint64_t    ull;
 } u_t;
 ushort_t src[N] = ...;
 uint_t dst[N];

 u_t *p_src = (u_t *) src;
 u_t *p_dst = (u_t *) dst;
 uint_t i;
 u_t tmp, tmp2;
 for(i=0; i<N/4; i++) {
     tmp = p_src[i];    /* Read four shorts in one read access */
     tmp2.ui[0] = tmp.us[0];   /* The union trick avoids complicated shifts that are furthermore dependent on endianness. */
     tmp2.ui[1] = tmp.us[1];   /* The compiler should take care of optimal assembly decomposition. */ 
     p_dst[2*i] = tmp2;  /* Write the two first ints in one write access. */
     tmp2.ui[0] = tmp.us[2];
     tmp2.ui[1] = tmp.us[3];
     p_dst[2*i+1] = tmp2; /* Write the 2 next ints in 1 write access. */
 }

EDIT

So I just tested it on SUN M5000 (SPARC64 VII 2.5 GHz) with GCC 3.4.1 in 64-bit mode on a 4,000,000 element array. The naive implementation is a bit faster. I tried with SUNStudio 12 and with GCC 4.3, but I haven't been able to even compile the program because of the array size.

EDIT2

I managed to compile it now on GCC 4.3. The optimized version is a bit faster than the naive one.

              GCC 3.4          GCC 4.3
naive         11.1 ms          11.8 ms
optimized     12.4 ms          10.0 ms

EDIT3

We can conclude that, as far as C is concerned, an optimized version of the copy loop is not worth the bother: the gain is so small that the risk of error outweighs the benefit.
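Since measuring is the only way to settle the question on a given machine, here is a rough timing sketch for the naive loop. It is not the program used for the numbers above. The names widen_naive and time_widen_ms, the best-of-several-runs policy, and the use of std::chrono are all assumptions made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Plain element-wise loop, i.e. the "naive" variant.
void widen_naive(const unsigned short* src, unsigned int* dst, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

// Hypothetical helper: returns the best wall-clock time in milliseconds
// over `repeats` runs on an n-element array, or -1.0 if repeats <= 0.
double time_widen_ms(std::size_t n, int repeats)
{
    std::vector<unsigned short> src(n);
    for (std::size_t i = 0; i < n; ++i)
        src[i] = static_cast<unsigned short>(i); // wraps to fit unsigned short
    std::vector<unsigned int> dst(n);

    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        widen_naive(src.data(), dst.data(), n);
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best)
            best = ms;
    }
    // Read one result through a volatile to make it less likely that the
    // optimizer discards the copy altogether.
    volatile unsigned int sink = (n != 0) ? dst[n - 1] : 0u;
    (void)sink;
    return best;
}
```

A call such as time_widen_ms(4000000, 10) would mirror the array size mentioned above. Timing a second variant means swapping the call inside the timed region, and the results will depend on compiler, flags, and hardware.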

Answer 4

What about

unsigned short src[N] = ...;
unsigned int dst[N];
unsigned int i;

for (i = 0; i < N; ++i)
    dst[i] = src[i];

For a C++ version, Konrad's or Nawaz's answer is surely better suited.

Answer 5

Initialize an int[] with the same length as the short[].
Iterate over the short[], assigning the i-th element of the short[] to the i-th position of the int[].
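The two steps above can be sketched in C++ as follows. The function name widen_elementwise and the use of std::vector in place of raw arrays are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating the two steps described above.
std::vector<unsigned int> widen_elementwise(
    const std::vector<unsigned short>& shorts)
{
    // Step 1: create the destination with the same length as the source.
    std::vector<unsigned int> ints(shorts.size());
    // Step 2: assign the i-th source element to the i-th destination slot.
    for (std::size_t i = 0; i < shorts.size(); ++i)
        ints[i] = shorts[i];
    return ints;
}
```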

Answer 6

On many architectures a decrementing do-while may be faster than the for and while loops proposed here. Something like:

unsigned short ushorts[M];
unsigned int uints[N];

int i = M-1;
do{
    uints[i] = ushorts[i];
    i--;
} while(i >= 0);

The compiler can take care of most optimizations such as loop unrolling, but generally the above is faster (on many architectures) because:

You get the first iteration for free in a do-while, compared with a while or for loop.
The loop ends after the pass with i = 0. Comparing against zero can save an instruction because the decrement already sets the processor flags. If the loop incremented and ended when i = M, it might need an extra compare instruction to test whether i < M.

There may be faster ways as well, such as doing it entirely with pointer arithmetic. This could turn into a fun exercise of disassembling the code and analyzing to see which appears faster. It is all architecture dependent. Fortunately others have done this work for you with std::copy.
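For comparison when disassembling, here are two hedged sketches: one that does the copy purely with pointer arithmetic, and one that counts down like the loop above but does nothing when the count is zero (the do-while always runs its body once). Both function names are hypothetical, and neither is claimed to be faster; that would have to be measured per architecture.

```cpp
#include <cstddef>

// Hypothetical helper: walks both arrays with pointers instead of an index.
void widen_by_pointer(const unsigned short* src, unsigned int* dst,
                      std::size_t count)
{
    const unsigned short* const end = src + count;
    while (src != end)
        *dst++ = *src++; // widen one element, then advance both pointers
}

// Hypothetical helper: counts down towards zero, testing before the first
// access so that count == 0 touches no memory.
void widen_counting_down(const unsigned short* src, unsigned int* dst,
                         std::size_t count)
{
    while (count != 0) {
        --count;
        dst[count] = src[count];
    }
}
```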

Answer 7

Just copy the address of the short array into a union pointer and access each element of the short array through it, like pTp32[0 ... LEN/2-1].arr[0..1]:

unsigned short shrtArray[LEN]; //..
union type32
{
    short arr[2];
    int value;
};
type32 * pTp32 = (type32*)shrtArray;
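To see what the value member of such a view holds, the sketch below reads two adjacent 16-bit values as one 32-bit value. It uses std::memcpy instead of the pointer cast (a substitution, chosen to sidestep alignment and aliasing concerns) and fixed-width unsigned types instead of short and int; the function name is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterprets two neighbouring 16-bit values as a
// single 32-bit value, the way the union's `value` member would see them.
std::uint32_t view_pair_as_u32(const std::uint16_t* pair)
{
    std::uint32_t value;
    std::memcpy(&value, pair, sizeof value); // copies exactly 4 bytes
    return value;
}
```

For the pair {1, 2} the result combines both inputs into one number, and which half ends up in the high bits depends on the byte order of the machine. The individual shorts stay reachable through arr[0..1], but no separate 32-bit copy of each one is produced.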

7 answers

solution1
14 ACCEPTED 2011-09-06 13:59:49

solution2
9 2011-09-06 14:00:03

solution3
6 2011-09-06 15:04:34

solution4
2 2011-09-06 14:01:04

solution5
1 2011-09-06 14:00:41

solution6
1 2011-09-06 14:35:38

solution7
1 2011-12-29 11:36:03
