
Reading binary data from any type of file in C#

I need to read data from different types of files (wav, dll, etc.) for a compression algorithm. The algorithm itself is more or less sorted out, but I'm having a problem reading from non-text files. What I need to do is read the ASCII value of each character in the file and then apply my algorithm to what I've read.

I've used this for reading (path is a string holding the path of the file, and abc is a byte[]):

if (path != "") {
    abc = File.ReadAllBytes(path);
}

It works just fine for text files (doc, txt, .m, etc.), but if I try this on a dll file I get the following error: Value was either too large or too small for an unsigned byte. I've also tried declaring abc as a string, using File.ReadAllText, and then converting each character in the string to a byte value, but I get the same error.

I know that a wav file, for example, appears to be made up of special characters if you open it in a text editor, and my guess so far is that the ASCII value of some of those characters is beyond 255, which may lead to the error. However, I don't know whether that is in fact the case, and I'm a bit stuck on how to sort out my issue. If anyone has any idea I would much appreciate it. It would also be nice if you could stick to the language used (C#). Thanks!

A byte is a value between 0 and 255. Every file on your computer consists of a number of bytes, regardless of whether it is a wave file, a dll file, a text file or even a file without an extension. You can call File.ReadAllBytes on any file, and every byte returned holds a value between 0 and 255.
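To check this on your own files, here is a minimal sketch that reads a file with File.ReadAllBytes and reports the smallest and largest byte value it contains. The class name ByteRangeDemo, the helper MinMax and the use of a command-line argument for the path are my own choices for illustration, not something from the question.

```csharp
using System;
using System.IO;

static class ByteRangeDemo
{
    // Returns the smallest and largest byte value found in the data.
    public static (byte Min, byte Max) MinMax(byte[] data)
    {
        if (data.Length == 0) throw new ArgumentException("empty input");
        byte min = byte.MaxValue, max = byte.MinValue;
        foreach (byte b in data)
        {
            if (b < min) min = b;
            if (b > max) max = b;
        }
        return (min, max);
    }

    static void Main(string[] args)
    {
        // args[0] is assumed to be the path of any file (wav, dll, txt, ...).
        byte[] abc = File.ReadAllBytes(args[0]);
        var (min, max) = MinMax(abc);
        Console.WriteLine($"{abc.Length} bytes, min={min}, max={max}");
    }
}
```

Whatever file you point it at, both numbers stay within 0 to 255, because the element type of the array is byte.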

ASCII is a character set that covers the values 0 to 127. There are ASCII extensions, or code pages, that define 256 possible values. Not all values can be displayed, though: a portion of ASCII and of these extensions consists of control characters, which have no printable representation.

There are no ASCII characters beyond 255; the characters you see are the text editor trying to make the best of the bytes it was given.
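A small sketch of why ASCII text is a poor container for arbitrary bytes: decoding bytes with Encoding.ASCII and encoding the result again does not give the original data back. The sample bytes and the names AsciiLossDemo and RoundTripThroughAscii are made up for this illustration.

```csharp
using System;
using System.Text;

static class AsciiLossDemo
{
    // Decodes bytes as ASCII text, then encodes that text back to bytes.
    public static byte[] RoundTripThroughAscii(byte[] data)
    {
        string text = Encoding.ASCII.GetString(data);
        return Encoding.ASCII.GetBytes(text);
    }

    static void Main()
    {
        // Two values inside the ASCII range and two above it.
        byte[] original = { 0x41, 0x7F, 0x80, 0xFF };
        byte[] roundTripped = RoundTripThroughAscii(original);
        Console.WriteLine(BitConverter.ToString(original));
        Console.WriteLine(BitConverter.ToString(roundTripped));
    }
}
```

With the default Encoding.ASCII, the bytes above 0x7F come back as 0x3F ('?'), so the original values are lost. That is harmless for real text and fatal for a wav or dll file.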

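The error you get comes from converting some value to a byte, which in C# is the unsigned type and allows values between 0 and 255 (the signed type, sbyte, allows -128 to 127). It is thrown when the value being converted falls outside 0 to 255, for example a character whose code is above 255 or a negative number. Reading the file itself is not the problem: most wave files will certainly contain byte values above 127, and those are perfectly valid bytes.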
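Since the conversion code was not posted, the following is only a guess at how the exception can arise: Convert.ToByte throws an OverflowException when it is handed a char whose code does not fit in 0 to 255. Text decoded from a binary file by File.ReadAllText can easily contain such characters. The names OverflowDemo and ThrowsOnConvert are hypothetical.

```csharp
using System;

static class OverflowDemo
{
    // Returns true if Convert.ToByte rejects the character with an OverflowException.
    public static bool ThrowsOnConvert(char c)
    {
        try
        {
            Convert.ToByte(c);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(ThrowsOnConvert('A'));       // code 65 fits in a byte
        Console.WriteLine(ThrowsOnConvert('\u20AC'));  // code 8364 does not fit
    }
}
```

If your code does something like this for every character of a string, working directly on the byte[] from File.ReadAllBytes avoids the conversion altogether.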

In short: you can't use ASCII to represent every possible value for a byte. You could use an ASCII extension to hold the byte's value but it would not make sense for a non-textual file (the 'A's you see when you open a .wav file in a text editor are not meant to be 'A's).

If you do want to continue down the path you have chosen, you'll have to post the code where you convert the bytes to an unsigned byte or ASCII value. But you probably should try to "convert" your algorithm into a binary one.
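To give an idea of what a "binary" version of an algorithm looks like, here is a rough sketch that works on byte[] from start to finish and never goes through characters or strings. It uses simple run-length encoding purely as a stand-in, since I don't know what your compression algorithm is; the names RunLengthSketch, Encode and Decode are made up.

```csharp
using System;
using System.Collections.Generic;

static class RunLengthSketch
{
    // Encodes data as (count, value) pairs; runs are capped at 255 so the count fits in one byte.
    public static byte[] Encode(byte[] data)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < data.Length)
        {
            byte value = data[i];
            int run = 1;
            while (i + run < data.Length && data[i + run] == value && run < 255)
            {
                run++;
            }
            output.Add((byte)run);
            output.Add(value);
            i += run;
        }
        return output.ToArray();
    }

    // Expands (count, value) pairs back into the original bytes.
    public static byte[] Decode(byte[] encoded)
    {
        var output = new List<byte>();
        for (int i = 0; i + 1 < encoded.Length; i += 2)
        {
            for (int n = 0; n < encoded[i]; n++)
            {
                output.Add(encoded[i + 1]);
            }
        }
        return output.ToArray();
    }

    static void Main()
    {
        byte[] data = { 0, 0, 0, 200, 255, 255 };
        byte[] packed = Encode(data);
        Console.WriteLine(BitConverter.ToString(packed));
        Console.WriteLine(BitConverter.ToString(Decode(packed)));
    }
}
```

The input could just as well be the array returned by File.ReadAllBytes; values such as 200 and 255 need no special handling because nothing is ever treated as a character.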

Use the following code:

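using (FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
{
   byte[] buffer = new byte[fileStream.Length];
   int offset = 0, read;
   // Read may return fewer bytes than requested, so loop until the buffer is full.
   while (offset < buffer.Length &&
          (read = fileStream.Read(buffer, offset, buffer.Length - offset)) > 0)
   {
      offset += read;
   }
   return buffer;
}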

I tried reading kernel32.dll and user32.dll with this code, and it worked just fine.


 