In my application I have a lot of small files stored in a fixed path structure. I am creating a container file that holds all of them, starting with a header that records information such as each file's offset and size. I write to this file with a BinaryWriter. Many of the files are duplicates, and each duplicate should be added to the container only once. To detect them, I compute a hash value for each file and compare it with the hash values already stored in a DataTable. This works as it should, but I want to know whether it is good practice, because there could be a huge amount of data. Are there better or more performant ways to achieve my goal?
Here is my current code:
// I parsed through my files and created my header
// all file paths were added to my tileList
DataTable dtImageInfos = new DataTable();
dtImageInfos.Columns.Add("tilename", typeof(String));
dtImageInfos.Columns.Add("hash", typeof(String));
dtImageInfos.Columns.Add("offset", typeof(long));
foreach (String tile in tileList)
{
    FileInfo f = new FileInfo(tile);
    int tileSize = Convert.ToInt32(f.Length);
    if (tileSize <= MAX_CHECK_SIZE)
    {
        Image tileImg = Image.FromFile(tile);
        String tileHash = createHashForImage(tileImg);
        DataTable dtCheck = dtImageInfos.Clone();
        if (dtImageInfos.Rows.Count > 0)
            dtImageInfos.AsEnumerable().Where(t => t.Field<String>("hash").Equals(tileHash))
                .CopyToDataTable(dtCheck, LoadOption.OverwriteChanges);
        if (dtCheck.Rows.Count == 0)
        {
            writer.Write(tileOffset);
            DataRow drNew = dtImageInfos.NewRow();
            drNew["tilename"] = tile;
            drNew["hash"] = tileHash;
            drNew["offset"] = tileOffset;
            dtImageInfos.Rows.Add(drNew);
            tileOffset += tileSize;
        }
        else
        {
            DataRow drCheck = dtCheck.Rows[0];
            writer.Write((long)drCheck["offset"]);
        }
    }
    else
    {
        writer.Write(tileOffset);
        DataRow drNew = dtImageInfos.NewRow();
        drNew["tilename"] = tile;
        drNew["hash"] = "";
        drNew["offset"] = tileOffset;
        dtImageInfos.Rows.Add(drNew);
        tileOffset += tileSize;
    }
    writer.Write(tileSize);
}
foreach (DataRow drTile in dtImageInfos.Rows)
{
    byte[] tileData = File.ReadAllBytes(drTile["tilename"].ToString());
    writer.Write(tileData);
}
I hope I have explained my issue clearly. Thanks in advance.
With a recent version of .NET (the ZipFile class was added in .NET Framework 4.5) you can avoid a lot of overhead by just using a zip file.
System.IO.Compression.ZipFile.CreateFromDirectory(folderPathWithTiles, containerPath);
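If you want more control than the one-liner gives you, for example to skip recompressing tiles that are already compressed images, you could add the entries one by one. The following is only a minimal sketch: the class and method names (TileArchive, Pack) are made up for illustration, and storing entries with CompressionLevel.NoCompression is my assumption about what suits image tiles, not something I have measured on your data. On .NET Framework it needs references to the System.IO.Compression and System.IO.Compression.FileSystem assemblies.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of the original question or of .NET itself.
static class TileArchive
{
    // Packs every file below folderPathWithTiles into a new zip at containerPath.
    // containerPath should lie outside folderPathWithTiles.
    public static void Pack(string folderPathWithTiles, string containerPath)
    {
        using (ZipArchive archive = ZipFile.Open(containerPath, ZipArchiveMode.Create))
        {
            foreach (string file in Directory.EnumerateFiles(
                         folderPathWithTiles, "*", SearchOption.AllDirectories))
            {
                // Entry name = path relative to the root folder, with forward slashes.
                string entryName = file.Substring(folderPathWithTiles.Length)
                    .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
                    .Replace('\\', '/');

                // Assumption: tiles are already compressed images, so store them as-is.
                archive.CreateEntryFromFile(file, entryName, CompressionLevel.NoCompression);
            }
        }
    }
}
```

Calling TileArchive.Pack(folderPathWithTiles, containerPath) produces an archive whose entry names are the relative tile paths. Note that a zip archive does not deduplicate identical files by itself.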
You can access the individual files (tiles) in your zip archive (the container file) with the ZipFile.OpenRead method.
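using (ZipArchive tiles = ZipFile.OpenRead(containerPath))
{
    // GetEntry returns null if no entry with this name exists
    ZipArchiveEntry tile = tiles.GetEntry(relativeTilePath);
    using (Stream tileStream = tile.Open())
    {
        Image tileImage = Image.FromStream(tileStream);
        // Use tileImage here: the stream must stay open for the lifetime of the Image
    }
}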
ZipFile: http://msdn.microsoft.com/en-us/library/system.io.compression.zipfile.aspx
ZipArchive: http://msdn.microsoft.com/en-us/library/system.io.compression.ziparchive.aspx
ZipArchiveEntry: http://msdn.microsoft.com/en-us/library/system.io.compression.ziparchiveentry.aspx
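If you would rather keep your own container format, the part most likely to hurt with a huge number of files is the hash lookup: filtering the whole DataTable for every tile is a linear scan each time. A Dictionary keyed by the hash gives a constant-time lookup on average. Below is a rough sketch of that idea, not a drop-in replacement. The names TileIndex and AssignOffsets are invented for illustration, and it hashes the raw file bytes with SHA-256 instead of calling your createHashForImage, so only byte-identical files count as duplicates. It also hashes every file, without your MAX_CHECK_SIZE cut-off.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, shown only to illustrate the Dictionary-based lookup.
static class TileIndex
{
    // Returns one data offset per input file; byte-identical files share an offset.
    // uniqueFiles receives, in order, the files whose data must actually be written.
    public static long[] AssignOffsets(IList<string> files, long firstOffset,
                                       List<string> uniqueFiles)
    {
        var offsetByHash = new Dictionary<string, long>();
        var offsets = new long[files.Count];
        long nextOffset = firstOffset;

        using (SHA256 sha = SHA256.Create())
        {
            for (int i = 0; i < files.Count; i++)
            {
                // Assumption: tiles are small enough to read fully into memory.
                byte[] data = File.ReadAllBytes(files[i]);
                string hash = Convert.ToBase64String(sha.ComputeHash(data));

                long offset;
                if (!offsetByHash.TryGetValue(hash, out offset))
                {
                    // First time this content is seen: reserve space for it.
                    offset = nextOffset;
                    offsetByHash.Add(hash, offset);
                    uniqueFiles.Add(files[i]);
                    nextOffset += data.Length;
                }
                offsets[i] = offset;
            }
        }
        return offsets;
    }
}
```

You would then write offsets[i] and the file size into the header for each tile, and afterwards append the data of the files in uniqueFiles, much like your second loop does now.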