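Efficient reformatting of ring buffer (circular buffer) to continuous array
I have successfully implemented a function that copies an arbitrary number of values, starting from an arbitrary point in a ring buffer, into a contiguous array, but I would like to make it more efficient. Here is a minimal example of my code: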
#include <stdlib.h> // malloc
#include <string.h> // memcpy
#include <iostream>
#include <chrono>
#include <thread>
using namespace std;

/* Foo: stand-in for the function that consumes the data */
void Foo(int * print_array, int print_amount){
    /* Simulate overhead */
    this_thread::sleep_for(chrono::microseconds(1000));
    int sum = 0;
    for (int i = 0; i < print_amount; i++){
        sum += print_array[i]; // Linear operation
        // cout << print_array[i] << " "; // Uncomment to check for correct functionality
    }
}

/* Example function */
int main(){
    /* Initialize ring buffer */
    int ring_buffer_elements = 32; // A largeish size
    int ring_buffer_size = ring_buffer_elements * sizeof(int);
    int * ring_buffer = (int *) malloc(ring_buffer_size);
    for (int i = 0; i < ring_buffer_elements; i++)
        ring_buffer[i] = i; // Fill buffer with ordered numbers

    /* Initialize array */
    int array_elements = 16; // A smaller largeish size
    int array_size = array_elements * sizeof(int);
    int * array = (int *) malloc(array_size);

    /* Set reference pointers */
    int * start_pointer = ring_buffer;
    int * end_pointer = ring_buffer + ring_buffer_elements;

    /* Set moving copy pointer */
    int * copy_pointer = start_pointer;

    /* Set "random" amount to be copied at each iteration */
    int copy_amount = 11;

    /* Set loop amount to check functionality or run time */
    int loop_amount = 1000; // Set lower if checking functionality

    /*** WORKING METHOD ***/
    /* Start timer */
    auto start_time = chrono::high_resolution_clock::now();

    /* "Continuous" loop */
    for (int i = 0; i < loop_amount; i++){
        /* Copy loop */
        for (int j = 0; j < copy_amount; j++){
            array[j] = *copy_pointer; // Copy value from ring buffer
            copy_pointer++; // Move pointer
            if (copy_pointer >= end_pointer)
                copy_pointer = start_pointer; // Reset pointer if reached end of ring buffer
        }
        Foo(array, copy_amount); // Call a function
    }

    /* Check run time */
    chrono::duration<double> run_time_ticks = chrono::high_resolution_clock::now() - start_time;
    double run_time = run_time_ticks.count();

    /* Print result */
    cout << endl << run_time << endl;

    /*** NAIVE METHOD ***/
    /* Reset moving pointer */
    copy_pointer = start_pointer;

    /* Start timer */
    start_time = chrono::high_resolution_clock::now();

    /* "Continuous" loop */
    for (int i = 0; i < loop_amount; i++){
        /* Compute how many elements must be copied after reaching end of ring buffer.
           Subtracting the distance to the end avoids forming a pointer past end_pointer. */
        int copy_remainder = copy_amount - (int)(end_pointer - copy_pointer);

        /* Check if we need to loop back or not */
        if (copy_remainder <= 0){
            Foo(copy_pointer, copy_amount); // Call function
            copy_pointer += copy_amount; // Move pointer
        } else {
            Foo(copy_pointer, copy_amount - copy_remainder); // Call function with part of values from copy pointer
            Foo(start_pointer, copy_remainder); // Call function with remainder of values from start of ring buffer
            copy_pointer = start_pointer + copy_remainder; // Move pointer
        }
    }

    /* Check run time */
    run_time_ticks = chrono::high_resolution_clock::now() - start_time;
    run_time = run_time_ticks.count();

    /* Print result */
    cout << endl << run_time << endl;

    /*** memcpy METHOD ***/
    /* Reset moving pointer */
    copy_pointer = start_pointer;

    /* Initialize size reference */
    int int_size = (int) sizeof(int);

    /* Start timer */
    start_time = chrono::high_resolution_clock::now();

    /* "Continuous" loop */
    for (int i = 0; i < loop_amount; i++){
        /* Compute how many elements must be copied after reaching end of ring buffer.
           Subtracting the distance to the end avoids forming a pointer past end_pointer. */
        int copy_remainder = copy_amount - (int)(end_pointer - copy_pointer);

        /* Check if we need to loop back or not */
        if (copy_remainder <= 0){
            memcpy(array, copy_pointer, copy_amount * int_size); // Use memcpy
            copy_pointer += copy_amount; // Move pointer
        } else {
            memcpy(array, copy_pointer, (copy_amount - copy_remainder) * int_size); // Use memcpy with part of values from copy pointer
            memcpy(array + (copy_amount - copy_remainder), start_pointer, copy_remainder * int_size); // Use memcpy with remainder of values from start of ring buffer
            copy_pointer = start_pointer + copy_remainder; // Move pointer
        }

        /* Call a function */
        Foo(array, copy_amount);
    }

    /* Check run time */
    run_time_ticks = chrono::high_resolution_clock::now() - start_time;
    run_time = run_time_ticks.count();

    /* Print result */
    cout << endl << run_time << endl;

    /* Release the buffers */
    free(array);
    free(ring_buffer);
}
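The ring buffer is used to continuously update a stream of audio data, so the amount of latency introduced must be kept to a bare minimum, which is why I am trying to improve it.
I was thinking that copying the values in the WORKING METHOD is redundant, and that it should be possible to just pass the original ring buffer data. My naive way of doing so was to write using the original data, and to write a second time whenever the data wraps around (see NAIVE METHOD).
In fact, in this minimal example this improvement is orders of magnitude faster. However, in my actual application Foo is replaced by a function that writes to a hardware buffer and has considerable overhead. The end result is slower than the WORKING METHOD code, meaning I should never call it (or Foo in this case) more than once per write of audio data. (Edit: added simulated overhead to Foo to depict this problem accurately.)
So my question is: is there a faster way to copy data from the ring buffer into a single contiguous array?
(Also, the ring buffer never needs to wrap around more than once per write: copy_amount is always smaller than ring_buffer_elements.)
Thanks!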
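Edit: Following Passer By's suggestion, replaced the original code snippet with a minimal example.
Edit 2: Following duong_dajgja's suggestion, added simulated overhead and memcpy. In the example, the memcpy method and the working method perform essentially the same (the latter has a slight edge). In my application, memcpy is about 3-4% faster than the working method when using the smallest possible buffer size. So it is faster, but sadly not by much.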
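Because a read never wraps more than once, the memcpy method always reduces to at most two contiguous segments. As a rough sketch only, that logic could be packaged in a helper that works with indices instead of moving pointers. The name copy_from_ring and its parameters are hypothetical (they do not appear in the code above), and the sketch assumes start < capacity and count <= capacity:

```cpp
#include <algorithm> // std::min
#include <cstddef>   // std::size_t
#include <cstring>   // std::memcpy

// Hypothetical helper: copies `count` ints out of a ring buffer holding
// `capacity` ints, starting at index `start`, into the contiguous `dst`.
// Assumes start < capacity and count <= capacity (at most one wrap).
// Returns the index at which the next read should start.
std::size_t copy_from_ring(const int* ring, std::size_t capacity,
                           std::size_t start, int* dst, std::size_t count)
{
    // Elements available before the physical end of the ring buffer.
    const std::size_t first = std::min(count, capacity - start);
    // Elements that have to come from the start of the ring buffer.
    const std::size_t second = count - first;

    std::memcpy(dst, ring + start, first * sizeof(int));
    if (second > 0)
        std::memcpy(dst + first, ring, second * sizeof(int));

    return (start + count) % capacity;
}
```

With the values from the example, a call would look like next = copy_from_ring(ring_buffer, 32, next, array, 11), followed by a single Foo(array, 11). This does the same work as the memcpy method, so it is not expected to be faster; it only keeps the wrap handling in one place.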
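I would like to suggest a structure for the ring buffer that is actually a plain array.
Since your data is an audio stream, I assume it arrives in order (like time-series data).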
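Suppose buff has a capacity of 100 elements. When buff becomes full and you want to add a new element at the end, first use memmove to shift the region from buff[1] to buff[99] one position to the left, then simply put the new element at buff[99]. Doing this guarantees that the ring buffer is always an array in the correct order (from buff[0] to buff[99]). Now you can simply memcpy the whole region from buff[0] to buff[99] into your destination array.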
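A minimal sketch of this idea, assuming int samples and the capacity of 100 from the example. The struct name SlidingWindow and its members are made up for illustration:

```cpp
#include <cstddef> // std::size_t
#include <cstring> // std::memmove, std::memcpy

// Hypothetical fixed-capacity buffer that always keeps its contents in
// chronological order, with the oldest element at data[0].
struct SlidingWindow {
    static const std::size_t kCapacity = 100; // capacity from the example
    int data[kCapacity];
    std::size_t size = 0;

    // Appends a value; once full, the oldest element is dropped.
    void push(int value) {
        if (size == kCapacity) {
            // Shift data[1..99] down to data[0..98].
            std::memmove(data, data + 1, (kCapacity - 1) * sizeof(int));
            data[kCapacity - 1] = value;
        } else {
            data[size++] = value;
        }
    }

    // Copies the `count` oldest elements into dst; assumes count <= size.
    void copy_to(int* dst, std::size_t count) const {
        std::memcpy(dst, data, count * sizeof(int));
    }
};
```

Reading is then a single memcpy with no wrap handling. The cost moves to the write side: every push into a full buffer shifts capacity - 1 elements, so whether this is a net win depends on how large the buffer is and how often it is written compared to how often it is read.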