Reading hard disk sector raw data - Why hex?

I'm trying to read a hard disk sector to get the raw data. After searching a lot, I found that some people store the raw sector data as hex and others as char.

Which is better, and why? Which will give me better performance?

I'm trying to write it in C++, and the OS is Windows.

For clarification:

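#include <iostream>
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main() {
    DWORD nRead = 0;
    char buf[512];  // one 512-byte sector

    // Open the first physical drive for raw reading (typically needs administrator rights).
    HANDLE hDisk = CreateFileA("\\\\.\\PhysicalDrive0",
        GENERIC_READ, FILE_SHARE_READ,
        NULL, OPEN_EXISTING, 0, NULL);
    if (hDisk == INVALID_HANDLE_VALUE) {
        std::cerr << "CreateFile failed, error " << GetLastError() << '\n';
        return 1;
    }

    // Seek to byte offset 0xA00 (sector 5 with 512-byte sectors) and read one sector.
    SetFilePointer(hDisk, 0xA00, NULL, FILE_BEGIN);
    if (!ReadFile(hDisk, buf, sizeof buf, &nRead, NULL)) {
        std::cerr << "ReadFile failed, error " << GetLastError() << '\n';
        CloseHandle(hDisk);
        return 1;
    }
    // Print only the bytes that were actually read, as raw characters.
    for (DWORD currentpos = 0; currentpos < nRead; currentpos++) {
        std::cout << buf[currentpos];
    }
    CloseHandle(hDisk);
    std::cin.get();
    return 0;
}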

Consider the above code as written by someone else, not me.

Notice the datatype: char buf[512];. The data is stored as char and hasn't been converted into hex.

Raw data is just "raw data": you store it as it is, you do not convert it. So there is no performance issue here. At most, the difference is in how you represent the raw data in a human-readable format. In general:

  • representing it in char format makes it easier to see whether it contains some text,
  • while hex is better for representing numeric data (in case it follows some kind of pattern).

In your specific case: char just means 1 byte, so you can be sure you are storing your data in a 512-byte buffer. Allocating that space in terms of an integer type's size makes things unnecessarily complicated.
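To make the "same bytes, two representations" point concrete, here is a minimal sketch. The helper name dump_row is my own invention, not something from the question; it assumes the bytes are already sitting in a buffer such as the 512-byte buf above, and shows one row both as hex and as characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original post): formats a run of bytes
// as hex pairs followed by the same bytes shown as characters.
std::string dump_row(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::string hex, text;
    char tmp[4];  // two hex digits, one space, terminating NUL
    for (std::size_t i = 0; i < len; ++i) {
        std::snprintf(tmp, sizeof tmp, "%02X ", static_cast<unsigned>(data[i]));
        hex += tmp;
        // Bytes outside printable ASCII are shown as '.' so the output stays readable.
        const bool printable = data[i] >= 0x20 && data[i] < 0x7F;
        text += printable ? static_cast<char>(data[i]) : '.';
    }
    return hex + "|" + text + "|";
}
```

The buffer itself is never converted; only the printed form changes. Calling it on 16 bytes at a time gives roughly what a hex viewer shows.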

You have got yourself confused.

The data on a disk is stored as binary, just a very long stream of ones and zeros.

The reason it is read in hex or char format is that it is easier to work with.

decimal: 36
char:    $ (one way of representing this value, here its ASCII character)
hex:     24
binary:  100100

The binary is the raw bit stream you would read from the disk or memory. Hex is a shorthand representation of it: the two are completely interchangeable, and one hex digit simply represents four bits. Decimal is just yet another way to represent that value.

The char, however, is a little bit tricky, because it depends on which mapping you pick. Above I have taken the standard ASCII value, which gives '$'. Equally, I could have decided to use the characters 0-9 to represent the values 0-9 and a-z to represent the values 10-35, as in base 36; under that scheme 35 would be 'z' and 36 would need two symbols, "10".
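If it helps, the same idea can be sketched in a few lines of C++. The function name describe is hypothetical, and the char line assumes an ASCII-compatible character set, which is what you get on Windows for values below 128.

```cpp
#include <bitset>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: renders one byte value in four notations.
// Nothing is converted; the same value is just printed in different ways.
std::string describe(int value) {
    assert(value >= 0 && value <= 255);  // must fit in one byte
    std::ostringstream out;
    out << "decimal: " << std::dec << value << '\n'
        << "hex:     " << std::hex << value << '\n'
        << "binary:  " << std::bitset<8>(static_cast<unsigned long long>(value)) << '\n'
        << "char:    " << static_cast<char>(value) << '\n';
    return out.str();
}
```

For the value 36 this yields 36, 24, 00100100 and '$'. The binary line is padded to a full 8 bits because a byte is the unit you actually read.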

As to why 'char' is used when dealing with bytes: it is because the C++ 'char' type is just a single byte (which is normally 8 bits).

I will also point out the problem with negative numbers. When you have an integer that is signed (can be positive or negative), the first (most significant) bit represents a large negative value, such that if all bits are one the value is -1. For example, with four bits so it is easy to see:

0010 = +2
1000 = -8
0110 = +6
1110 = -2

The key to this problem is that it all comes down to how you interpret or represent the binary values. The same sequence of bits can be represented more or less any way you want.
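The signed/unsigned point matters in practice when you print a char buffer as hex. Here is a rough sketch with eight bits instead of four; the helper names are mine. It assumes the usual two's complement behaviour, which C++20 guarantees and which earlier compilers provide in practice.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Reads 8 bits as a signed (two's complement) value.
int as_signed(unsigned bits) {
    assert(bits <= 0xFF);  // only one byte's worth of bits is expected
    return static_cast<std::int8_t>(bits);
}

// Reads the same 8 bits as an unsigned value.
int as_unsigned(unsigned bits) {
    assert(bits <= 0xFF);
    return static_cast<int>(bits);
}

// Formats a byte taken from a char buffer as two hex digits. Converting to
// unsigned char first avoids sign extension on platforms where char is signed.
std::string to_hex(char byte) {
    char tmp[3];  // two hex digits plus terminating NUL
    std::snprintf(tmp, sizeof tmp, "%02X",
                  static_cast<unsigned>(static_cast<unsigned char>(byte)));
    return tmp;
}
```

The bit pattern 11111110 reads as -2 when signed and as 254 when unsigned, yet it prints as FE either way. Without the unsigned char step, a signed char holding that byte is widened to a negative int first and commonly shows up as FFFFFFFE.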

I'm guessing you're talking about the final data being written to some file. The reason to use hex is that it's easier to read and harder to mess up. Generally, if someone is doing some sort of human analysis on the sector, they're going to use a hex editor on the raw data anyway, so if you output it as hex you skip the need for a hex viewer/editor.

For instance, on DOS/Windows you have to make sure you open the file in binary mode if you're going to write the raw characters, because text mode translates newline bytes. You might also have to make sure that the operating system doesn't mess with the character format anywhere in between.
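As a rough illustration of the binary-mode point, here is a sketch using standard C++ streams. The helper names write_raw and read_raw are made up for this example; the only essential part is the std::ios::binary flag.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: writes the bytes unchanged. std::ios::binary stops the
// runtime from translating newline bytes (0x0A) when running on Windows.
bool write_raw(const std::string& path, const std::vector<char>& data) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();
    return static_cast<bool>(out);  // false if opening or writing failed
}

// Hypothetical helper: reads the whole file back, again in binary mode.
std::vector<char> read_raw(const std::string& path) {
    assert(!path.empty());
    std::ifstream in(path, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

With the flag in place, a sector containing bytes like 0x0A, 0x0D or 0x1A should come back byte for byte. In text mode on Windows, 0x0A is expanded to 0x0D 0x0A on write, and 0x1A may be treated as end of file on read.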
