Read file into ByteArrays of 4 bytes
I would like to know how I could read a file into byte arrays that are 4 bytes long. These arrays will be manipulated and then have to be converted back to a single array ready to be written to a file.
EDIT: Code snippet.
var arrays = new List<byte[]>();
using (var f = new FileStream("file.cfg.dec", FileMode.Open))
{
for (int i = 0; i < f.Length; i += 4)
{
var b = new byte[4];
var bytesRead = f.Read(b, i, 4);
if (bytesRead < 4)
{
var b2 = new byte[bytesRead];
Array.Copy(b, b2, bytesRead);
arrays.Add(b2);
}
else if (bytesRead > 0)
arrays.Add(b);
}
}
foreach (var b in arrays)
{
BitArray source = new BitArray(b);
BitArray target = new BitArray(source.Length);
target[26] = source[0];
target[31] = source[1];
target[17] = source[2];
target[10] = source[3];
target[30] = source[4];
target[16] = source[5];
target[24] = source[6];
target[2] = source[7];
target[29] = source[8];
target[8] = source[9];
target[20] = source[10];
target[15] = source[11];
target[28] = source[12];
target[11] = source[13];
target[13] = source[14];
target[4] = source[15];
target[19] = source[16];
target[23] = source[17];
target[0] = source[18];
target[12] = source[19];
target[14] = source[20];
target[27] = source[21];
target[6] = source[22];
target[18] = source[23];
target[21] = source[24];
target[3] = source[25];
target[9] = source[26];
target[7] = source[27];
target[22] = source[28];
target[1] = source[29];
target[25] = source[30];
target[5] = source[31];
var back2byte = BitArrayToByteArray(target);
arrays.Clear();
arrays.Add(back2byte);
}
using (var f = new FileStream("file.cfg.enc", FileMode.Open))
{
foreach (var b in arrays)
f.Write(b, 0, b.Length);
}
EDIT 2: Here is the Ugly Betty-looking code that accomplishes what I wanted. Now I must refine it for performance...
var arrays_ = new List<byte[]>();
var arrays_save = new List<byte[]>();
var arrays = new List<byte[]>();
using (var f = new FileStream("file.cfg.dec", FileMode.Open))
{
for (int i = 0; i < f.Length; i += 4)
{
var b = new byte[4];
var bytesRead = f.Read(b, 0, b.Length);
if (bytesRead < 4)
{
var b2 = new byte[bytesRead];
Array.Copy(b, b2, bytesRead);
arrays.Add(b2);
}
else if (bytesRead > 0)
arrays.Add(b);
}
}
foreach (var b in arrays)
{
arrays_.Add(b);
}
foreach (var b in arrays_)
{
BitArray source = new BitArray(b);
BitArray target = new BitArray(source.Length);
target[26] = source[0];
target[31] = source[1];
target[17] = source[2];
target[10] = source[3];
target[30] = source[4];
target[16] = source[5];
target[24] = source[6];
target[2] = source[7];
target[29] = source[8];
target[8] = source[9];
target[20] = source[10];
target[15] = source[11];
target[28] = source[12];
target[11] = source[13];
target[13] = source[14];
target[4] = source[15];
target[19] = source[16];
target[23] = source[17];
target[0] = source[18];
target[12] = source[19];
target[14] = source[20];
target[27] = source[21];
target[6] = source[22];
target[18] = source[23];
target[21] = source[24];
target[3] = source[25];
target[9] = source[26];
target[7] = source[27];
target[22] = source[28];
target[1] = source[29];
target[25] = source[30];
target[5] = source[31];
var back2byte = BitArrayToByteArray(target);
arrays_save.Add(back2byte);
}
using (var f = new FileStream("file.cfg.enc", FileMode.Open))
{
foreach (var b in arrays_save)
f.Write(b, 0, b.Length);
}
EDIT 3: Loading a big file into byte arrays of 4 bytes wasn't the smartest idea... I have over 68 million arrays being processed and manipulated. I really wonder if it's possible to load it into a single array and still have the bit manipulation work. :/
Here's another way, similar to @igofed's solution:
var arrays = new List<byte[]>();
using (var f = new FileStream("test.txt", FileMode.Open))
{
for (int i = 0; i < f.Length; i += 4)
{
var b = new byte[4];
var bytesRead = f.Read(b, 0, 4); // the second argument is the offset into b, not into the file
if (bytesRead < 4)
{
var b2 = new byte[bytesRead];
Array.Copy(b, b2, bytesRead);
arrays.Add(b2);
}
else if (bytesRead > 0)
arrays.Add(b);
}
}
//make changes to arrays
using (var f = new FileStream("test-out.txt", FileMode.Create))
{
foreach (var b in arrays)
f.Write(b, 0, b.Length);
}
Here is what you want:
using (var reader = new StreamReader("inputFileName"))
{
using (var writer = new StreamWriter("outputFileName"))
{
char[] buff = new char[4];
int readCount = 0;
while((readCount = reader.Read(buff, 0, 4)) > 0)
{
//manipulations with buff
writer.Write(buff, 0, readCount); // write only the chars actually read
}
}
}
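Note that StreamReader and StreamWriter decode and encode text, so the loop above works on characters rather than raw bytes. For binary files, the same read-manipulate-write loop can be sketched directly on streams. This is only a minimal sketch under stated assumptions: the class name BlockCopy, the method Process and the transform callback are hypothetical names introduced for illustration, not part of any answer here.

```csharp
using System;
using System.IO;

static class BlockCopy
{
    // Hypothetical helper: copies input to output in blocks of up to 4 bytes,
    // handing each block (and its valid length) to transform before writing it.
    public static void Process(Stream input, Stream output, Action<byte[], int> transform)
    {
        var buff = new byte[4];
        while (true)
        {
            // Stream.Read may return fewer bytes than requested, so keep
            // reading until the block is full or the stream ends.
            int filled = 0;
            while (filled < buff.Length)
            {
                int n = input.Read(buff, filled, buff.Length - filled);
                if (n == 0) break;
                filled += n;
            }
            if (filled == 0) break;            // nothing left to process

            transform(buff, filled);           // manipulations with buff
            output.Write(buff, 0, filled);     // write only the valid bytes

            if (filled < buff.Length) break;   // short final block: end of input
        }
    }
}
```

With files, this could be called as BlockCopy.Process(File.OpenRead(inPath), File.Create(outPath), (b, n) => { ... }) inside using blocks, where inPath and outPath are placeholders.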
IEnumerable<byte[]> arraysOf4Bytes = File
.ReadAllBytes(path)
.Select((b,i) => new{b, i})
.GroupBy(x => x.i / 4)
.Select(g => g.Select(x => x.b).ToArray());
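The question also asks for the chunks to be converted back into a single array. A possible way to do that with LINQ is SelectMany, shown here as a small sketch next to the same grouping query. The class name Chunks and the method names Split and Join are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Chunks
{
    // Splits data into arrays of up to 4 bytes; the last one may be shorter
    // when the length is not a multiple of 4.
    public static List<byte[]> Split(byte[] data) =>
        data.Select((b, i) => new { b, i })
            .GroupBy(x => x.i / 4)
            .Select(g => g.Select(x => x.b).ToArray())
            .ToList();

    // Concatenates the chunks back into one array, preserving their order.
    public static byte[] Join(IEnumerable<byte[]> chunks) =>
        chunks.SelectMany(c => c).ToArray();
}
```

This is convenient for small inputs. It allocates one object per byte plus one array per group, so it is unlikely to suit very large files.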
Regarding your "Edit 3" ... I'll bite, although it's really a diversion from the original question.
There's no reason you need Lists of arrays, since you're just breaking up the file into a continuous list of 4-byte sequences, looping through and processing each sequence, and then looping through and writing each sequence. You can do much better.
NOTE: The implementation below does not check for or handle input files whose lengths are not exactly multiples of 4. I leave that as an exercise to you, if it is important.
To directly address your comment, here is a single-array solution. We'll ditch the List objects, read the whole file into a single byte[] array, and then copy out 4-byte sections of that array to do your bit transforms, then put the result back. At the end we'll just slam the whole thing into the output file.
byte[] data;
using (Stream fs = File.OpenRead("E:\\temp\\test.bmp")) {
data = new byte[fs.Length];
fs.Read(data, 0, data.Length);
}
byte[] element = new byte[4];
for (int i = 0; i < data.Length; i += 4) {
Array.Copy(data, i, element, 0, element.Length);
BitArray source = new BitArray(element);
BitArray target = new BitArray(source.Length);
target[26] = source[0];
target[31] = source[1];
// ...
target[5] = source[31];
target.CopyTo(data, i);
}
using (Stream fs = File.OpenWrite("E:\\temp\\test_out.bmp")) {
fs.Write(data, 0, data.Length);
}
All of the ugly initial read code is gone since we're just using a single byte array. Notice I reserved a single 4-byte array before the processing loop to re-use, so we can save the garbage collector some work. Then we loop through the giant data array 4 bytes at a time and copy them into our working array, use that to initialize the BitArrays for your transforms, and then the last statement in the block converts the BitArray back into a byte array and copies it directly back to its original location within the giant data array. This replaces the BitArrayToByteArray method, since you did not provide it. At the end, writing is also easy since it's just slamming out the now-transformed giant data array.
When I ran your original solution I got an OutOfMemory exception on my original test file of 100MB, so I used a 44MB file. It consumed 650MB in memory and ran in 30 seconds. The single-array solution used 54MB of memory and ran in 10 seconds. Not a bad improvement, and it demonstrates how bad holding onto millions of small array objects is.
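If the two BitArray allocations per 4-byte block are still a concern, one possible variation is to apply the same bit shuffle to a uint with a lookup table. This is a sketch, not a measured improvement. It relies on the documented BitArray layout, where bit index k maps to byte k / 8, bit k % 8, counted from the least significant bit, so bit k of a little-endian uint matches source[k]. The table is copied from the target[...] = source[...] assignments in the question. The names BitShuffle, Dest, Permute and TransformInPlace are hypothetical, and leaving trailing bytes untouched when the length is not a multiple of 4 is an assumption about what is wanted.

```csharp
using System;

static class BitShuffle
{
    // Dest[src] is the target bit index for source bit src, copied from the
    // target[...] = source[...] assignments in the question.
    static readonly int[] Dest =
    {
        26, 31, 17, 10, 30, 16, 24,  2, 29,  8, 20, 15, 28, 11, 13,  4,
        19, 23,  0, 12, 14, 27,  6, 18, 21,  3,  9,  7, 22,  1, 25,  5
    };

    // Moves every set bit of value to its destination position.
    public static uint Permute(uint value)
    {
        uint result = 0;
        for (int src = 0; src < 32; src++)
        {
            if (((value >> src) & 1u) != 0)
                result |= 1u << Dest[src];
        }
        return result;
    }

    // Transforms data in place, 4 bytes at a time. Assumption: any trailing
    // 1 to 3 bytes that do not fill a whole block are left unchanged.
    public static void TransformInPlace(byte[] data)
    {
        int whole = data.Length - data.Length % 4;
        for (int i = 0; i < whole; i += 4)
        {
            // Assemble the block little-endian so bit k matches BitArray index k.
            uint v = (uint)data[i]
                   | (uint)data[i + 1] << 8
                   | (uint)data[i + 2] << 16
                   | (uint)data[i + 3] << 24;

            uint r = Permute(v);

            data[i]     = (byte)r;
            data[i + 1] = (byte)(r >> 8);
            data[i + 2] = (byte)(r >> 16);
            data[i + 3] = (byte)(r >> 24);
        }
    }
}
```

In the single-array solution, the body of the for loop would then be replaced by one call to BitShuffle.TransformInPlace(data) before writing the output file.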