I have a large multi-dimensional array that needs to be stored with protobuf. The array could have up to 5120*5120 = 26,214,400 items in it. Protobuf does not support storing multi-dimensional arrays, unfortunately.
As a test, I wrote two functions and an extra class. The class stores an x,y pair that points to a location inside the array (array[x, y]), plus a "value" that holds the data from array[x, y]. I use a List to store these objects.
When I generate a fairly small array (1024*1024) I get an output file that is over 169MB. From my testing, it loads and generates the file extremely fast so there's no issue there. However, the file size is huge - I definitely need to cut down on size.
Is this a normal file size, or do I need to rethink my entire process? Should I compress the data before saving it (zipping the file takes it from 169MB to 6MB)? If so, what's the fastest/easiest way to zip a file in C#?
This is pseudo code that is based on my real code.
using System.Collections.Generic;
using ProtoBuf;

[ProtoContract]
public class Example
{
    [ProtoIgnore]
    public string[,] MyArray { get; set; }

    // protobuf field numbers must be positive, so tags start at 1, not 0.
    [ProtoMember(1)]
    private List<MultiArray> Storage { get; set; }

    // Copy every cell of MyArray into Storage before serializing.
    public void MoveToList()
    {
        Storage = new List<MultiArray>(MyArray.Length);
        for (int x = 0; x < MyArray.GetLength(0); x++)
        {
            for (int y = 0; y < MyArray.GetLength(1); y++)
            {
                Storage.Add(new MultiArray
                {
                    _x = x,
                    _y = y,
                    _value = MyArray[x, y]
                });
            }
        }
    }

    // Rebuild MyArray from Storage after deserializing.
    public void MoveToArray()
    {
        MyArray = new string[1024, 1024]; // dimensions are hard-coded in this example
        for (int i = 0; i < Storage.Count; i++)
        {
            MyArray[Storage[i]._x, Storage[i]._y] = Storage[i]._value;
        }
    }
}

[ProtoContract]
public class MultiArray
{
    [ProtoMember(1)]
    public int _y { get; set; }
    [ProtoMember(2)]
    public int _x { get; set; }
    [ProtoMember(3)]
    public string _value { get; set; }
}
Note: each value must end up at its original x/y position in the array.
I appreciate any suggestions.
I don't know about the storage but this is probably not the right way to do it.
The way you are doing it, you are creating a MultiArray object for every cell of your array.
A simpler and more efficient solution would be to flatten it into a one-dimensional array:
int width = 1024;
int height = 1024;
string[] Storage = new string[width * height];
for (int x = 0; x < width; x++)
{
    for (int y = 0; y < height; y++)
    {
        // Row-major index: height is the stride, so non-square arrays also work.
        Storage[x * height + y] = MyArray[x, y];
    }
}
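To get the values back at the correct x/y, the dimensions have to be stored next to the flat array. Here is a minimal sketch of the round trip. The class and method names (GridFlattener, Flatten, Unflatten) are made up for illustration and are not part of protobuf-net.

```csharp
using System;

public static class GridFlattener
{
    // Copies a 2D array into a 1D array in row-major order:
    // cell [x, y] lands at index x * height + y.
    public static string[] Flatten(string[,] grid)
    {
        int width = grid.GetLength(0);
        int height = grid.GetLength(1);
        var flat = new string[width * height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                flat[x * height + y] = grid[x, y];
        return flat;
    }

    // Rebuilds the 2D array; width and height must be saved alongside the data.
    public static string[,] Unflatten(string[] flat, int width, int height)
    {
        if (flat.Length != width * height)
            throw new ArgumentException("Length does not match dimensions.", nameof(flat));
        var grid = new string[width, height];
        for (int i = 0; i < flat.Length; i++)
            grid[i / height, i % height] = flat[i];
        return grid;
    }

    public static void Main()
    {
        // Small non-square grid so a wrong stride would show up immediately.
        var grid = new string[2, 3];
        for (int x = 0; x < 2; x++)
            for (int y = 0; y < 3; y++)
                grid[x, y] = x + "," + y;

        string[] flat = Flatten(grid);
        string[,] back = Unflatten(flat, 2, 3);
        Console.WriteLine(flat[1 * 3 + 2]); // prints "1,2"
        Console.WriteLine(back[1, 2]);      // prints "1,2"
    }
}
```

With this layout the x and y of each cell are implied by its index, so they no longer need to be written once per cell.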
Ultimately, the protobuf format doesn't have a concept of arrays of higher dimension than one.
At the library level, since you're using protobuf-net, we could have the library do some magic here, essentially treating it as:
message Array_T {
    repeated int32 dimensions = 1;
    repeated T items = 2; // packed when possible
}
(noting that .proto doesn't actually support generics, but that doesn't really matter at the library level)
However, this would be a little awkward from a cross-platform perspective.
But to test whether this would help, you could linearize your 2D array, and see what space it takes.
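One way to run that test is sketched below. It assumes the protobuf-net NuGet package is referenced. The FlatGrid contract and its member names are hypothetical, chosen to mirror the dimensions-plus-items idea above, and the cell contents are placeholder data, so the printed size will not match your real file.

```csharp
using System;
using System.IO;
using ProtoBuf; // protobuf-net NuGet package (assumed to be installed)

// Hypothetical contract: dimensions plus a row-major list of cell values.
[ProtoContract]
public class FlatGrid
{
    [ProtoMember(1)] public int Width { get; set; }
    [ProtoMember(2)] public int Height { get; set; }
    // Cell [x, y] is stored at index x * Height + y.
    [ProtoMember(3)] public string[] Items { get; set; }
}

public static class FlatGridDemo
{
    public static void Main()
    {
        var grid = new FlatGrid { Width = 1024, Height = 1024 };
        grid.Items = new string[grid.Width * grid.Height];
        for (int i = 0; i < grid.Items.Length; i++)
            grid.Items[i] = "cell" + (i % 100); // placeholder, not the real contents

        using (var stream = new MemoryStream())
        {
            // Serialize to memory and report how many bytes were written.
            Serializer.Serialize(stream, grid);
            Console.WriteLine("Serialized size: " + stream.Length + " bytes");

            // Read it back and spot-check one cell by its x/y position.
            stream.Position = 0;
            FlatGrid loaded = Serializer.Deserialize<FlatGrid>(stream);
            int index = 5 * loaded.Height + 7; // cell [5, 7]
            Console.WriteLine(loaded.Items[index] == grid.Items[index]);
        }
    }
}
```

Comparing this size with the per-cell object version on the same data should show how much of the 169MB is structural overhead and how much is the strings themselves.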
In your case, I suspect the real problem (re the size) is the quantity of strings. Protobuf writes string contents every time, without any attempt at lookup tables. It may also be worth checking what the sum total of string lengths (in UTF-8 bytes) is for your array contents.
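A rough sketch of that check follows, using only the .NET base class library. It also includes a GZip helper, since the question reports that zipping takes the file from 169MB to 6MB, which suggests the string contents are highly repetitive. The names PayloadStats, TotalUtf8Bytes and GZip are made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class PayloadStats
{
    // Sum of the UTF-8 encoded lengths of all non-null cells.
    public static long TotalUtf8Bytes(string[,] grid)
    {
        long total = 0;
        foreach (string s in grid)
        {
            if (s != null)
                total += Encoding.UTF8.GetByteCount(s);
        }
        return total;
    }

    // Compresses a buffer with GZip and returns the compressed bytes.
    public static byte[] GZip(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final compressed block.
            using (var gzip = new GZipStream(output, CompressionLevel.Fastest))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    public static void Main()
    {
        var grid = new string[2, 2];
        grid[0, 0] = "abc";    // 3 bytes in UTF-8
        grid[0, 1] = "\u00e9"; // 2 bytes in UTF-8
        grid[1, 0] = null;     // skipped
        grid[1, 1] = "";       // 0 bytes
        Console.WriteLine(TotalUtf8Bytes(grid)); // prints 5

        // Repetitive input compresses well; the exact size depends on the runtime.
        byte[] raw = Encoding.UTF8.GetBytes(new string('x', 100000));
        byte[] packed = GZip(raw);
        Console.WriteLine(raw.Length + " -> " + packed.Length + " bytes");
    }
}
```

If the UTF-8 total is close to the file size, the strings dominate and changing the array layout alone will not shrink the file much. In that case, wrapping the output stream in a GZipStream while serializing is probably the simpler fix.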