
ArrayPool Create method giving errors in C#

Basically, I want to copy data from a source file to a target file in Azure Data Lake in parallel, using the ConcurrentAppend API.

Also, I don't want to read the data from the files all at once but in chunks, so I am using buffers for that. I want to create 5 buffers of 1 MB, 5 buffers of 2 MB, and 5 buffers of 4 MB. Whenever a source file arrives, it will use the appropriate buffer according to its size, and I will append to the target using that buffer. I don't want the buffers to exceed 5 in each case/configuration.

I was using the shared ArrayPool for renting buffers. But since I have this requirement that allocation should not exceed 5 arrays in each case (1, 2 and 4 MB), I had to add extra conditions to enforce that limit.

I would rather use a custom pool, which I can create like this:

ArrayPool<byte> pool = ArrayPool<byte>.Create(One_mb, 5);

This ensures that my allocations don't go beyond 5 arrays and that the maximum array size is 1 MB. Similarly, I can create two more buffer pools for the 2 MB and 4 MB cases. This way I won't need the extra conditions to limit it to 5 (see the sketch below).
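For illustration, a minimal sketch of that three-pool setup might look like this (the One_MB constant, the pool names and the size-based selection helper are assumptions for the sketch, not code from the question):

// Requires: using System.Buffers;
const int One_MB = 1024 * 1024;

// One pool per buffer size, each capped at 5 arrays per bucket.
ArrayPool<byte> pool1Mb = ArrayPool<byte>.Create(maxArrayLength: One_MB, maxArraysPerBucket: 5);
ArrayPool<byte> pool2Mb = ArrayPool<byte>.Create(maxArrayLength: 2 * One_MB, maxArraysPerBucket: 5);
ArrayPool<byte> pool4Mb = ArrayPool<byte>.Create(maxArrayLength: 4 * One_MB, maxArraysPerBucket: 5);

// Assumed helper: pick a pool based on the incoming file's length.
ArrayPool<byte> PickPool(long fileLength) =>
    fileLength <= One_MB     ? pool1Mb :
    fileLength <= 2 * One_MB ? pool2Mb : pool4Mb;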

Problem:

When I use these custom pools, I get corrupted data in my target file. Moreover, the target file size gets doubled: if the sum of the input is 10 MB, the target file shows 20 MB.

If I use the same code and rent from the single shared ArrayPool rather than these custom pools, I get the correct result.

What am I doing wrong?

My code: https://github.com/ChahatKumar/ADLS/blob/master/CreatePool/Program.cs

FileStream.Read returns the number of bytes read. This will not necessarily be the size of your array and could very well be smaller (or zero if no bytes were read). The code in your GitHub example is ignoring the return value of Read and making the incorrect assumption that the buffer was filled, by telling the next method to use the entire buffer. Because your arrays are so large, it is possible (and perhaps likely) that you will not fill them with a single call to Read (even if the files are actually that large, FileStream has its own internal buffer and buffer size).

Your method should likely look like the following. Note I pass the actual number of bytes read to ConcurrentAppend (which I assume to be well conforming in that it respects the length argument):

int read;
while ((read = file.Read(buffer1, 0, buffer1.Length)) > 0)
{
    c.ConcurrentAppend(filename, true, buffer1, 0, read);
}
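For completeness, a hedged sketch of the whole rent/read/append/return cycle under the question's setup could look like the following. The pool, the ADLS client c, and the path variables are assumptions based on the question; ConcurrentAppend is assumed to honour the offset and length arguments. Note also that ArrayPool.Rent only guarantees an array of at least the requested size, which is one more reason never to treat the array length as the amount of valid data.

// Requires: using System.Buffers; using System.IO;
byte[] buffer = pool.Rent(One_MB);      // Rent may return an array larger than requested
try
{
    using (FileStream file = File.OpenRead(sourcePath))
    {
        int read;
        while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Append only the bytes actually read, never buffer.Length.
            c.ConcurrentAppend(targetFileName, true, buffer, 0, read);
        }
    }
}
finally
{
    pool.Return(buffer);                // Always hand rented arrays back to the pool
}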
