How can I more efficiently traverse a char array that stores {int, short, ushort, ...}?

Question

I have a char data[len] populated from unzipped data read from a binary file. I know the data can only be one of these types: char, uchar, short, ushort, int, uint, float, double, and for each of them I know the exact number of bits needed to represent it ( elesize = {8, 16, 32, 64} ).

I just want to traverse the data and, say, find the max() , min() or the number of occurrences of a given number, and I want to do this without creating another array, because memory is a concern.

I have come up with the following, but it is slow, for example for len == 34560000.

So I was wondering if anyone has a 'one-liner' or a more efficient way of doing this (either C or C++).

char data[len];
double mymax = -std::numeric_limits<double>::max();
for (size_t i=0; i<len; i += elesize)
{
    double x;
    char *r = data+i;
    if (elementtype == "char")
        x = static_cast<double>(*r);
    else if (elementtype == "uchar")
        x = static_cast<double>(*((unsigned char *)r));
    else if (elementtype == "short")
        x = static_cast<double>(*((int16_t *)r));
    else if (elementtype == "ushort")
        x = static_cast<double>(*((uint16_t *)r));
    else if (elementtype == "int")
        x = static_cast<double>(*((int32_t *)r));
    else if (elementtype == "uint")
        x = static_cast<double>(*((uint32_t *)r));
    else if (elementtype == "float")
        x = static_cast<double>(*((float *)r));
    else if (elementtype == "double")
        x = *((double *)r);
    if (x > mymax)
        mymax = x;
}

Answer 1

A template should do nicely:

#include <algorithm>

template <typename T>
T read_and_advance(const unsigned char * & p)
{
  T x;
  unsigned char * const px = reinterpret_cast<unsigned char *>(&x);

  std::copy(p, p + sizeof(T), px);
  p += sizeof(T);

  return x;
}

Usage:

const unsigned char * p = the_data;
unsigned int max = 0;

while (p != the_data + data_length)
{
  max = std::max(max, read_and_advance<unsigned int>(p));
}

Scrap this; I originally thought the question was for C.

~~Here's a macro:~~

 #define READ_TYPE(T, buf, res) do { memcpy(&res, buf, sizeof(T)); buf += sizeof(T); } while (false)

~~Usage:~~

 int max = 0;
 unsigned char * p = data;
 while (true)
 {
   unsigned int res;
   READ_TYPE(unsigned int, p, res);
   if (res > max) max = res;
 }

~~You don't really get around specifying the type, though. In C++ this could be done a bit more elegantly.~~

~~Alternatively you can wrap it all in one:~~

 #define READ_TYPE_AND_MAX(T, buf, max) \
   do { T x; memcpy(&x, buf, sizeof(T)); \
        buf += sizeof(T); \
        if (max < x) max = x; \
   } while (false)

 // Usage:
 unsigned int max = 0;
 unsigned char * p = data;
 while (true) { READ_TYPE_AND_MAX(unsigned int, p, max); }

Answer 2

Given that elementtype is loop-invariant, you had better do the comparison only once, outside the for . By the way, I hope elementtype is of type std::string or something that compares meaningfully to string literals.
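To illustrate the concern about string comparison: if elementtype were a plain const char * , then elementtype == "char" would compare pointer values, not characters. A minimal sketch of one way to avoid that, assuming a hypothetical ElementType enum and parse_element_type helper (neither is from the question), is to compare the contents once and keep the result as an enum:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enum for illustration; the question only has a string.
enum class ElementType { Char, UChar, Short, UShort, Int, UInt, Float, Double, Unknown };

// Hypothetical helper: compares the characters of name (via strcmp), not its address.
// Meant to be called once, before any loop over the data.
ElementType parse_element_type(const char* name)
{
    assert(name != nullptr);
    static const struct { const char* name; ElementType type; } table[] = {
        {"char",   ElementType::Char},   {"uchar",  ElementType::UChar},
        {"short",  ElementType::Short},  {"ushort", ElementType::UShort},
        {"int",    ElementType::Int},    {"uint",   ElementType::UInt},
        {"float",  ElementType::Float},  {"double", ElementType::Double},
    };
    for (const auto& entry : table) {
        if (std::strcmp(name, entry.name) == 0) return entry.type;
    }
    return ElementType::Unknown;
}
```

The resulting enum value can then drive a single switch outside the loop.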

Ultimately, I would write a template function that does the whole processing loop and then call it with the appropriate template argument according to elementtype .
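A rough sketch of that idea, under some assumptions: elementtype is a std::string, the buffer length is a non-zero multiple of the element size, and the names max_in_buffer and max_of are made up for illustration. The type check happens once in the dispatcher, and each instantiated loop works on a single type:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>

// Treats the buffer as packed elements of type T and returns the largest one.
// The comparison is done in T; the conversion to double happens once, at the end.
// Precondition (assumed): len is a non-zero multiple of sizeof(T).
template <typename T>
double max_in_buffer(const char* data, std::size_t len)
{
    assert(len >= sizeof(T) && len % sizeof(T) == 0);
    T best = std::numeric_limits<T>::lowest();
    for (std::size_t i = 0; i < len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));  // avoids misaligned pointer casts
        if (x > best) best = x;
    }
    return static_cast<double>(best);
}

// Hypothetical dispatcher: the string is inspected once, outside every loop.
double max_of(const std::string& elementtype, const char* data, std::size_t len)
{
    if (elementtype == "char")   return max_in_buffer<char>(data, len);
    if (elementtype == "uchar")  return max_in_buffer<unsigned char>(data, len);
    if (elementtype == "short")  return max_in_buffer<std::int16_t>(data, len);
    if (elementtype == "ushort") return max_in_buffer<std::uint16_t>(data, len);
    if (elementtype == "int")    return max_in_buffer<std::int32_t>(data, len);
    if (elementtype == "uint")   return max_in_buffer<std::uint32_t>(data, len);
    if (elementtype == "float")  return max_in_buffer<float>(data, len);
    if (elementtype == "double") return max_in_buffer<double>(data, len);
    throw std::invalid_argument("unknown element type: " + elementtype);
}
```

With the question's variables, a call would look like double mymax = max_of(elementtype, data, len); .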

Answer 3

Put the conditional code outside the loop, so the loop runs fast. Try something like this:

char data[len];
double mymax = -std::numeric_limits<double>::max();
double x;
if (elementtype == "char") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*r);
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "uchar") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*((unsigned char *)r));
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "short")

..etc..etc

Answer 4

As others indicated, you should check the type only once. Then you should call an appropriate sub-function that deals with only one type. You should also not be casting to double for the comparison with mymax when elementtype is not double; otherwise you are needlessly converting to double and comparing doubles. If elementtype is uint, then you should never convert anything to double; just compare with a mymax variable that is also uint.
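As a hedged sketch of such a single-type sub-function, the following keeps min, max and the occurrence count in the element's own type, so no value is ever converted to double. The names Stats and scan are invented here, and the code assumes the buffer length is a non-zero multiple of the element size:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Result of one pass over the buffer, held in the element's native type.
template <typename T>
struct Stats {
    T min;
    T max;
    std::size_t count;  // number of elements equal to the requested value
};

// Scans the buffer as packed elements of type T, without a second array.
// Precondition (assumed): len is a non-zero multiple of sizeof(T).
template <typename T>
Stats<T> scan(const char* data, std::size_t len, T wanted)
{
    assert(len >= sizeof(T) && len % sizeof(T) == 0);
    T first;
    std::memcpy(&first, data, sizeof(T));
    Stats<T> s{first, first, 0};  // seed min and max with the first element
    for (std::size_t i = 0; i < len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));
        if (x < s.min) s.min = x;
        if (x > s.max) s.max = x;
        if (x == wanted) ++s.count;
    }
    return s;
}
```

For uint data, for example, scan<std::uint32_t>(data, len, 7u) would compare unsigned integers only.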

4 answers

Answer 1
1 accepted 2011-11-17 20:06:53

Answer 2
0 2011-11-17 19:50:56

Answer 3
0 2011-11-17 19:54:13

Answer 4
0 2011-11-17 20:13:11
