Efficient way to convert an array of 16-bit shorts to an array of 32-bit ints?

Question

What is the most efficient way to convert an array of unsigned shorts (16 bits per value) into an array of unsigned ints (32 bits per value)?

Answer 1

Copy it.

unsigned short source[]; // …
unsigned int target[]; // …
unsigned short* const end = source + sizeof source / sizeof source[0];
std::copy(source, end, target);

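std::copy internally selects the best copying mechanism for the given input types. In this case, however, there is probably no better way than copying the elements one by one in a loop.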
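As a rough illustration of the copy above, here is a minimal self-contained sketch. The helper name `widen` and the use of std::vector for the target are assumptions made for this example; they are not part of the original answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: widens `count` unsigned shorts into unsigned ints.
// std::copy assigns element by element, so each value goes through the
// usual implicit integral conversion and keeps its numeric value.
std::vector<unsigned int> widen(const unsigned short* source, std::size_t count) {
    std::vector<unsigned int> target(count);
    std::copy(source, source + count, target.begin());
    return target;
}
```

Calling `widen(source, 4)` on an array holding 0, 1, 255 and 65535 should give a vector with the same four values stored as unsigned int.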

Answer 2

In C++, use std::copy:

#include<algorithm> //must include

unsigned short ushorts[M]; //where M is some const +ve integer
unsigned int   uints[N]; //where N >= M
//...fill ushorts
std::copy(ushorts, ushorts+M, uints);

In C, use a manual loop (in fact, a manual loop works in both C and C++):

int i = 0;
while( i < M ) { uints[i] = ushorts[i]; ++i; }

Answer 3

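Here is an unrolled loop that accesses memory in 64-bit blocks. It may be a little faster than the simple loop, but testing is the only way to know.

It assumes that N is a multiple of 4, that a short is 16 bits wide, and that 64-bit registers are in use.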

 typedef union u {
     uint16_t    us[4];
     uint32_t    ui[2];
     uint64_t    ull;
 } u_t;
 ushort_t src[N] = ...;
 uint_t dst[N];

 u_t *p_src = (u_t *) src;
 u_t *p_dst = (u_t *) dst;
 uint_t i;
 u_t tmp, tmp2;
 for(i=0; i<N/4; i++) {
     tmp = p_src[i];    /* Read four shorts in one read access */
     tmp2.ui[0] = tmp.us[0];   /* The union trick avoids complicated shifts that are furthermore dependent on endianness. */
     tmp2.ui[1] = tmp.us[1];   /* The compiler should take care of optimal assembly decomposition. */ 
     p_dst[2*i] = tmp2;  /* Write the two first ints in one write access. */
     tmp2.ui[0] = tmp.us[2];
     tmp2.ui[1] = tmp.us[3];
     p_dst[2*i+1] = tmp2; /* Write the 2 next ints in 1 write access. */
 }

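EDIT

So I just tested this on a SUN M5000 (SPARC64 VII 2.5 GHz) with GCC 3.4.1 in 64-bit mode, on an array of 4,000,000 elements. The naive implementation is somewhat faster. I also tried SUNStudio 12 and GCC 4.3, but I could not even compile the program because of the array size.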

EDIT2

I managed to compile it with GCC 4.3. There, the optimized version is a bit faster than the naive one.

              GCC 3.4          GCC 4.3
naive         11.1 ms          11.8 ms
optimized     12.4 ms          10.0 ms

EDIT3

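We can conclude from this that, as far as C is concerned, an optimized version of the copy loop is not worth the bother: the gain is so small that the risk of introducing a bug outweighs the benefit.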
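For readers who want to repeat this kind of measurement on their own machine, the following is a minimal timing sketch. It is not the harness used for the figures above; the names `naive_widen`, `time_ms` and `time_naive` are hypothetical, and a single run like this gives only a rough figure.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Plain element-by-element widening loop (the "naive" version).
void naive_widen(const unsigned short* src, unsigned int* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

// Returns the elapsed time of one call to f, in milliseconds.
template <typename F>
double time_ms(F f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Times one naive pass over src, writing into dst (dst must be at least
// as long as src). The caller keeps dst, so the result stays observable.
double time_naive(const std::vector<unsigned short>& src,
                  std::vector<unsigned int>& dst) {
    return time_ms([&] { naive_widen(src.data(), dst.data(), src.size()); });
}
```

To compare variants, time each one several times on the same input and look at the spread rather than a single number, since results vary with compiler and optimization level, as the table above shows.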

Answer 4

What about

unsigned short src[N] = ...;
unsigned int dst[N];

size_t i; /* loop index; size_t comes from <stddef.h> */
for(i=0; i<N; ++i)
    dst[i] = src[i];

For a C++ version, Konrad's or Nawaz's answer is certainly a better fit.

Answer 5

Initialize an int[] with the same length as the short[].
Iterate over the short[], assigning the i-th element of the short[] to the i-th position of the int[].
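The two steps above can be sketched as follows. This is only an illustration: the function name `to_ints` and the use of std::vector in place of raw arrays are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper following the two steps described above.
std::vector<int> to_ints(const std::vector<short>& shorts) {
    // Step 1: create the int array with the same length as the short array.
    std::vector<int> ints(shorts.size());
    // Step 2: assign the i-th short to the i-th int.
    for (std::size_t i = 0; i < shorts.size(); ++i)
        ints[i] = shorts[i];
    return ints;
}
```

Because the element types here are signed, negative values keep their sign when widened.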

Answer 6

On many architectures, a decrementing do-while may be faster than the for and while loops proposed here. Something like:

unsigned short ushorts[M];
unsigned int uints[N];

int i = M-1;
do{
    uints[i] = ushorts[i];
    i--;
} while(i >= 0);

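The compiler can take care of most optimizations, such as loop unrolling, but the code above is usually faster (on many architectures) because:

With a do-while you get the first iteration for free, compared with a while or for loop.
The loop ends when i reaches 0. Testing against 0 can save an instruction, because the zero flag is set automatically. If the loop counted up and ended at i = M, an extra compare instruction might be needed to test i < M.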

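There may be even faster approaches, such as using pointer arithmetic throughout. It could make an interesting exercise to disassemble the code and profile it to see which variant is faster. It all depends on the architecture. Fortunately, others have already done this work for you with std::copy.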
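One caveat about the do-while shown earlier: it always runs its body once, so it reads and writes index -1 when M is 0. A hedged variant that still counts down but tolerates an empty array might look like this. The function name `widen_down` is hypothetical, and whether it is any faster than a counting-up loop depends on the compiler and target.

```cpp
#include <cstddef>

// Hypothetical helper: copies m elements, walking from the last index down to 0.
void widen_down(const unsigned short* ushorts, unsigned int* uints, std::size_t m) {
    // "i-- > 0" tests i before decrementing it, so the body sees
    // m-1, m-2, ..., 0 and is skipped entirely when m == 0.
    for (std::size_t i = m; i-- > 0; )
        uints[i] = ushorts[i];
}
```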

Answer 7

Just copy the address of the short array and access each element of the short array through it, as pTp32[0...LEN-1].arr[0..1]:

unsigned short shrtArray[LEN]; //..
union type32
{
    short arr[2];
    int value;
};
type32 * pTp32 = (type32*)shrtArray;
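Note that such a view does not widen anything: each 32-bit `value` overlays two adjacent shorts instead of holding one short as an int. The sketch below illustrates this under the assumption of 16-bit shorts and 32-bit ints. It uses std::memcpy rather than the pointer cast above (a substitution made here to sidestep aliasing and alignment concerns), and the name `pack_two` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the raw bytes of two adjacent 16-bit values
// into a single 32-bit value, which is what the overlay effectively reads.
std::uint32_t pack_two(const std::uint16_t* pair) {
    std::uint32_t value;
    std::memcpy(&value, pair, sizeof value);
    return value;
}
```

For the pair {1, 2} the result is 0x00020001 on a little-endian machine and 0x00010002 on a big-endian one; in neither case is it the widened value 1.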

7 answers

Answer 1
14 accepted 2011-09-06 13:59:49

Answer 2
9 2011-09-06 14:00:03

Answer 3
6 2011-09-06 15:04:34

Answer 4
2 2011-09-06 14:01:04

Answer 5
1 2011-09-06 14:00:41

Answer 6
1 2011-09-06 14:35:38

Answer 7
1 2011-12-29 11:36:03
