Which is more efficient: a large array or many scalars?

Question

I'm working in an embedded (PIC) environment, programming in C.

I have to keep track of 1000 values for a variable (history) and return the moving average of those values. I'm wondering whether it will be more efficient in terms of speed, ROM, and RAM usage to use an array or 1000 16-bit variables. Is there a definitive answer to that? Or would I have to just try both and see what works best?

Thanks.

EDIT: Hmm... I already ran into another problem. The compiler limits me to a maximum array size of 256.

EDIT2:

For clarification... the code fires about 1000 times a second. Each time, a value for history[n] (or history_n) is calculated and stored. Each time, I need to calculate the average of the 1000 most recent history values (including the current one). So (history[999] + history[998] + ... + history[1] + history[0]) / 1000; or something to that effect. Obviously, each time I need to kick out the oldest value and add the newest.

EDIT3:

I've reworked the code so that the 256 array size limit is no longer an issue. A sample size of around 100 is now suitable.

Answer 1

Assuming you need to keep the history, and given your 256-element array limit, here's a way to manage it:

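int history1[256];
int history2[256];
int history3[256];
int history4[256];
int *arrays[] = {history1, history2, history3, history4};
int idx = 0;   /* next slot to overwrite, 0..999 */
long sum = 0;  /* long, because 1000 16-bit samples can overflow an int */
int n = 0;     /* number of samples stored so far, capped at 1000 */

int updateAverage(int newValue)
{
  /* Pick the sub-array and the offset within it from the same index. */
  int *target = arrays[idx >> 8] + (idx & 0xFF);

  sum -= *target;      /* drop the oldest value (0 until the buffer fills) */
  *target = newValue;
  sum += newValue;
  if (n < 1000) n++;
  if (++idx >= 1000) idx = 0;  /* wrap around after 1000 samples */
  return (int)(sum / n);
}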

Answer 2

I'm not sure I completely understand your question. Are you asking about the difference between the code generated for 'short history[1000]' and for 'short history1, history2, ..., history1000;'?

Both should use similar amounts of RAM: each entry is going to be stored in a single file register (assuming you're using a 16-bit PIC). The code to calculate the average of the latter is going to be ugly, though, and will likely take quite a bit of ROM, because it has to reference each value separately (rather than just offsetting from a base address).

Edit: The 256-element limit comes from file register paging on the PIC. You can't address a larger array by just offsetting the base register, because you may need to request a page change.

Do you absolutely have to calculate a running average? Or can you do an overall average? If an overall average is okay, then use a variant of Alphaneo's answer: just keep the sum and the number of values collected in two variables, and divide any time you need the average.

Answer 3

If you use an array, the generated code will be much smaller. I'm not sure about this, but I think array access would use less memory, since you don't have to keep track of multiple variables; you just have a pointer to one chunk. If your stack size is an issue, an array on the heap may be the best way to go, since local C variables are stored on the stack.

Answer 4

Calculating the average by storing the values in an array will definitely be more expensive than calculating it at run-time as the values arrive.

Reason 1: If you calculate it at run-time, you just keep adding. For example, look at the following flow:

    init-0: _tempSum_ = 0
    step-1: Read current value to _currVal_
    step-2: Add _currVal_ to _tempSum_
    step-3: Check if we have the required number of values _num_ (e.g. 1000); if not, goto step-1
    step-4: Calculate _avg_ = _tempSum_ / _num_ and return it
    step-5: goto init-0

If you store the values in a temporary array of 1000 values, you will still go through all the steps from init-0 to step-5, except that you also end up using a 1000-value array.

It might be slower, depending on the array access timing, so beware.
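The init-0 to step-5 flow above could be sketched in C roughly as follows. This is only a minimal sketch, assuming 16-bit samples and a 32-bit accumulator; `block_average_feed`, `NUM_SAMPLES`, and the return convention are hypothetical names chosen for illustration. Note that it produces one average per block of `NUM_SAMPLES` readings, not a moving average over the most recent readings.

```c
#include <stdint.h>

#define NUM_SAMPLES 1000u  /* _num_ in the flow above; assumed value */

static uint32_t temp_sum = 0;  /* _tempSum_ */
static uint16_t count = 0;     /* readings accumulated in the current block */

/* Feed one reading (_currVal_). Returns 1 and writes *avg when a block of
   NUM_SAMPLES readings is complete; otherwise returns 0 and leaves *avg alone. */
int block_average_feed(uint16_t curr_val, uint16_t *avg)
{
    temp_sum += curr_val;                        /* step-2 */
    if (++count < NUM_SAMPLES) {                 /* step-3 */
        return 0;
    }
    *avg = (uint16_t)(temp_sum / NUM_SAMPLES);   /* step-4 */
    temp_sum = 0;                                /* step-5: back to init-0 */
    count = 0;
    return 1;
}
```

The 32-bit accumulator is enough here because 1000 readings of at most 65535 each sum to less than 2^32.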

Answer 5

First, you can change your linker file to allow a larger section. You will then have to put your history array in that section using pragmas.

Second, the array method is much better. To improve performance, you will also need a 32-bit integer to keep a running total of the history array.

For each firing of the history function, you subtract the oldest value from HistoryRunningTotal and add in the new history value. You will also need an OldestHistoryIndex variable to keep track of where the newest value will go (overwriting the oldest history entry).
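A minimal sketch of the running-total approach described above might look like this. `HistoryRunningTotal` and `OldestHistoryIndex` are the names from this answer; `History`, `HISTORY_SIZE`, and `HistoryUpdate` are hypothetical names, and the window of 100 samples is an assumption taken from EDIT3 of the question so that the array stays under the 256-element limit without any linker changes.

```c
#include <stdint.h>

#define HISTORY_SIZE 100u  /* assumed window length (see EDIT3) */

static uint16_t History[HISTORY_SIZE];    /* zero-initialised ring buffer */
static uint32_t HistoryRunningTotal = 0;  /* sum of everything in History */
static uint8_t  OldestHistoryIndex = 0;   /* slot holding the oldest value */

/* Store a new value and return the average over the window.
   Until the buffer has filled, the untouched slots count as zeros. */
uint16_t HistoryUpdate(uint16_t newValue)
{
    HistoryRunningTotal -= History[OldestHistoryIndex];  /* drop the oldest */
    History[OldestHistoryIndex] = newValue;              /* overwrite it */
    HistoryRunningTotal += newValue;                     /* add the newest */

    if (++OldestHistoryIndex >= HISTORY_SIZE) {
        OldestHistoryIndex = 0;                          /* wrap around */
    }
    return (uint16_t)(HistoryRunningTotal / HISTORY_SIZE);
}
```

Each call costs one subtraction, one addition, and one division regardless of the window length, which is the point of keeping the running total instead of re-summing the array.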

Answer 6

If you have enough memory for the 1000 values, then why not statically allocate the memory, make a rotating buffer, and move a pointer through the values to calculate the sum (if the running average isn't what you need)? Just keep putting the new value over the oldest value, calculate the sum of the 1000 values, and divide by 1000. So you actually need two pointers: one for inserting the new value and one for iterating over the whole buffer.

Answer 7

The compiler limits arrays to 256? That's pretty bad. That imposes a burden on the programmer that the compiler really should be taking care of. Are you sure there isn't a compiler option for, e.g., a "large" memory model? If not, investigate a different micro and/or compiler.

Or, to keep things small, consider using a small IIR filter rather than a large FIR filter.
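As an illustration of the IIR idea, here is a minimal sketch of a first-order IIR low-pass filter (an exponential moving average) in integer arithmetic. The names `iir_update`, `iir_acc`, and `IIR_SHIFT`, and the smoothing factor of 1/16, are assumptions made for the example. It is not equivalent to a true moving average, since recent samples are weighted more heavily, but it needs only one 32-bit state variable instead of a history buffer.

```c
#include <stdint.h>

#define IIR_SHIFT 4u  /* assumed smoothing factor: alpha = 1/16 */

static uint32_t iir_acc = 0;  /* filter state, scaled by 2^IIR_SHIFT */

/* First-order IIR low-pass:
   y[n] = y[n-1] + (x[n] - y[n-1]) / 2^IIR_SHIFT,
   computed on a scaled accumulator so that only shifts and adds are needed. */
uint16_t iir_update(uint16_t x)
{
    iir_acc = iir_acc - (iir_acc >> IIR_SHIFT) + x;
    return (uint16_t)(iir_acc >> IIR_SHIFT);
}
```

A larger `IIR_SHIFT` smooths more but responds more slowly to changes in the input.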

Anyway, to answer your original question: an array is the logical choice for such an algorithm, for the benefits of sensible (maintainable) code and small code size.

Answer 8

If the array is dynamically allocated, then using the static variables will be faster. If the array is statically allocated, then the compiler should be able to optimize it to be as fast as static variables. Try both on some trivial code and profile it.

8 answers

Answer 1: score 4, accepted, 2009-09-16 02:37:14
Answer 2: score 2, 2009-09-15 23:58:07
Answer 3: score 1, 2009-09-15 23:55:20
Answer 4: score 1, 2009-09-16 00:15:33
Answer 5: score 1, 2009-09-16 12:24:09
Answer 6: score 0, 2009-09-16 02:10:34
Answer 7: score 0, 2009-09-16 23:35:16
Answer 8: score 0, 2009-09-18 14:26:57
