
Fast copying between random addresses

I'm developing an application that needs to copy a large amount of data byte by byte from one set of addresses to another. Right now I'm using a for loop split across multiple threads. The arrays can hold anywhere from 100k to 2M elements. It works relatively fast, but not fast enough. Is there a faster way to perform this task?

std::vector<uchar*> src, dst;
// Fill src and dst with pointers; src.size() and dst.size() are equal.

for (std::size_t i = 0; i < src.size(); ++i)
    *dst[i] = *src[i];

UPD: It's an image processing task where each pixel is 8-bit grayscale. Ready-to-use solutions such as OpenCV aren't suitable because they are even slower (up to 20 times). Maybe a GPU solution is possible?

"I'm developing an application which needs to perform a massive copying data byte-by-byte"

That's very unlikely.

The only reason to create a copy is that the data is being modified in some way (and different pieces of code can't just share the same data in a "read only" way); and if the data is being modified in some way then it's very likely that you can merge the modification into the copying.

Maybe you're doing the same changes to all pixels, and it can be (eg) a "read 16 pixels from source, modify 16 pixels, write 16 pixels to destination" loop (where the work involved in modifying the pixels happens in parallel with pre-fetching the next pixels into cache, etc).
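As a minimal sketch of that fused loop, assuming the pixels sit in contiguous buffers and using inversion as a stand-in for whatever the real modification is (invert_pixel and copy_and_invert are hypothetical names, not from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-pixel modification: invert an 8-bit grayscale value.
inline std::uint8_t invert_pixel(std::uint8_t p) {
    return static_cast<std::uint8_t>(255 - p);
}

// Read from src, modify, and write to dst in a single pass, so no separate
// "plain copy" step is needed. Both buffers are contiguous, which leaves an
// optimizing compiler free to process several pixels per instruction.
void copy_and_invert(const std::vector<std::uint8_t>& src,
                     std::vector<std::uint8_t>& dst) {
    assert(src.size() == dst.size());
    const std::uint8_t* in = src.data();
    std::uint8_t* out = dst.data();
    const std::size_t n = src.size();
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = invert_pixel(in[i]);
    }
}
```

Whether the compiler actually vectorizes this depends on the compiler and optimization flags, so it is worth checking the generated code or measuring before relying on it.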

Maybe you're only modifying some pixels, and can do (eg) a lazy if (pointer_to_row[row] == NULL) { pointer_to_row[row] = create_copy_of_row(row); } modify_row(pointer_to_row[row]); to avoid copying all the rows of pixels you don't modify. Maybe you can create a shared memory mapping of the data and let the operating system's "copy on write" virtual memory management take care of the copying for you.
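The lazy per-row copy could look roughly like the sketch below. LazyRowCopy and its members are hypothetical names chosen for illustration, and the image is assumed to be stored as one vector per row:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

using Row = std::vector<std::uint8_t>;

// Reads fall through to the original image until a row is first written;
// only then is that single row copied.
class LazyRowCopy {
public:
    explicit LazyRowCopy(const std::vector<Row>& original)
        : original_(original), copies_(original.size()) {}

    std::uint8_t get(std::size_t row, std::size_t col) const {
        assert(row < copies_.size());
        const Row& r = copies_[row] ? *copies_[row] : original_[row];
        return r[col];
    }

    void set(std::size_t row, std::size_t col, std::uint8_t value) {
        assert(row < copies_.size());
        if (!copies_[row]) {
            // First write to this row: copy it now.
            copies_[row] = std::make_unique<Row>(original_[row]);
        }
        (*copies_[row])[col] = value;
    }

    // Number of rows that have actually been copied so far.
    std::size_t copied_rows() const {
        std::size_t n = 0;
        for (const auto& c : copies_) {
            if (c) ++n;
        }
        return n;
    }

private:
    const std::vector<Row>& original_;  // not owned; must outlive this object
    std::vector<std::unique_ptr<Row>> copies_;
};
```

Rows that are never written cost one null pointer each instead of a full copy.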

Maybe you can have some kind of journal of changes and leave the original data alone (where you might have an int get_pixel(int x, int y) { int temp = check_journal(x, y); if (temp != NOT_PRESENT) return temp; else return get_original_pixel_data(x, y); }).
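One possible shape for such a journal, sketched with a hash map keyed by pixel index (JournaledImage is a hypothetical name, and the hash map is only one of several reasonable containers for this):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Writes go into a journal; the original pixel buffer is never touched.
class JournaledImage {
public:
    JournaledImage(const std::vector<std::uint8_t>& pixels, std::size_t width)
        : pixels_(pixels), width_(width) {
        assert(width > 0 && pixels.size() % width == 0);
    }

    void set_pixel(std::size_t x, std::size_t y, std::uint8_t value) {
        journal_[y * width_ + x] = value;
    }

    // Journal entries take priority over the original data.
    std::uint8_t get_pixel(std::size_t x, std::size_t y) const {
        const std::size_t index = y * width_ + x;
        const auto it = journal_.find(index);
        return it != journal_.end() ? it->second : pixels_[index];
    }

    std::size_t journal_size() const { return journal_.size(); }

private:
    const std::vector<std::uint8_t>& pixels_;  // not owned; must outlive this object
    std::size_t width_;
    std::unordered_map<std::size_t, std::uint8_t> journal_;
};
```

Every read now pays for a lookup, so this only helps when the number of modified pixels is small compared with the image size.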

Maybe you can combine multiple techniques (eg a small journal for each row of pixels, with a lazy "if/when journal for row becomes full, create new row from old row and journal, and reset the journal to empty").
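A rough sketch of that combination for a single row follows. JournaledRow and its capacity parameter are hypothetical, and the journal is a small linear list on the assumption that it stays short:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Row = std::vector<std::uint8_t>;

// A row with a small journal. When the journal is full, a new row is built
// from the old row plus the journal, and the journal is reset to empty.
class JournaledRow {
public:
    JournaledRow(const Row& original, std::size_t capacity)
        : original_(&original), capacity_(capacity) {
        assert(capacity > 0);
    }

    std::uint8_t get(std::size_t x) const {
        for (const auto& e : journal_) {
            if (e.first == x) return e.second;
        }
        return base()[x];
    }

    void set(std::size_t x, std::uint8_t value) {
        for (auto& e : journal_) {
            if (e.first == x) { e.second = value; return; }
        }
        if (journal_.size() == capacity_) flush();
        journal_.emplace_back(x, value);
    }

    // True once the row has been copied at least once.
    bool materialized() const { return has_owned_; }

private:
    const Row& base() const { return has_owned_ ? owned_ : *original_; }

    void flush() {
        Row next = base();  // the only place a full row copy happens
        for (const auto& e : journal_) next[e.first] = e.second;
        owned_ = std::move(next);
        has_owned_ = true;
        journal_.clear();
    }

    const Row* original_;  // not owned; must outlive this object
    Row owned_;
    bool has_owned_ = false;
    std::size_t capacity_;
    std::vector<std::pair<std::size_t, std::uint8_t>> journal_;
};
```

The capacity trades read cost (a longer journal to scan) against how soon a full row copy is forced.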

I moved the entire project to the GPU, using GLSL. The arrays are replaced with 2D samplers. Even a low-end Intel UHD can handle high resolutions at a high frame rate.


 