
Reading Bytes from a File using C

I have written a program in C that reads a byte at a specific memory address in its own address space.

It works like this:

  1. First, it reads a DWORD from a file.
  2. Then it uses this DWORD as a memory address and reads a byte from that address in the current process's address space.

Here is a summary of the code:

FILE *fp;
char buffer[4];

fp=fopen("input.txt","rb");

// buffer will store the DWORD read from the file

fread(buffer, 1, 4, fp);

printf("the memory address is: %x", *buffer);

// I have to do all these type castings so that it prints only the byte example:
// 0x8b instead of 0xffffff8b

printf("the byte at this memory address is: %x\n", (unsigned)(unsigned char)(*(*buffer)));

// And I perform comparisons this way

if((unsigned)(unsigned char)(*(*buffer)) == 0x8b)
{
    // do something
}

While this program works, I would like to know whether there is another way to read the byte at a specific memory address and compare it, because each time I have to write out all the type casts.

Also, when I try to write the byte to a file using the following syntax:

// fp2 is the file pointer for the output file
fwrite(fp2, 1, 1, (unsigned)(unsigned char)(*(*buffer)));

I get these warnings:

test.c(64) : warning C4047: 'function' : 'FILE *' differs in levels of indirection from 'unsigned int'
test.c(64) : warning C4024: 'fwrite' : different types for formal and actual parameter 4

Thanks.

Take note of the declaration of fwrite:

size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);

This means the warnings in the last part of your question arise because fwrite expects a pointer to the data as its first argument and the FILE * as its last. You should pass a pointer to the character rather than the character's value.

You can remove the extra type casts by assigning the pointer you read from the file to another variable of the correct type.

Examples to think about:

#include <stdio.h>
int main(void) {
  union {
    char buffer[8];
    char *character;
    long long number;
  } indirect;
  /* indirect is a single 8-byte variable (on a typical 64-bit platform)
   * that can be accessed as either a character array, a character
   * pointer, or as an 8-byte integer! */
  char *x = "hi";
  long long y;
  char *z;
  printf("stored in the memory beginning at x: '%s'\n", x); /* 'hi' */
  /* sizeof yields a size_t, so print it with %zu */
  printf("bytes used to represent the pointer x: %zu\n", sizeof(x)); /* 8 */
  /* %p expects a void pointer */
  printf("exact value (memory location) of (pointed to by) the pointer x: %p\n", (void *)x); /* e.g. 4006c8 */
  y = (long long) x;
  printf("%llx\n", y); /* e.g. 4006c8 */
  z = (char *) y;
  printf("%s\n", z); /* 'hi' */
  /* the cool part--we can access the exact same 8 bytes of data
   * in three different ways, as a 64-bit character pointer,
   * as an 8-byte character buffer, or as
   * an 8-byte integer */
  indirect.character = z;
  printf("%s\n", indirect.character); /* 'hi' */
  /* %.8s stops after 8 bytes, since the buffer has no guaranteed terminator */
  printf("%.8s\n", indirect.buffer); /* binary garbage which is the raw pointer */
  printf("%lld\n", indirect.number); /* e.g. 4196040 */
  return 0;
}

By the way, reading arbitrary locations from memory seems concerning. (You say that you are reading from a specific memory address within the program's own address space, but how do you make sure of that?)

You can use the C language union construct to represent an alias for your type, as shown:

typedef union {
      char bytes[4];   /* a member cannot be named "char", which is a keyword */
      char *pointer;
   } alias;

alias buffer;

This assumes a 32-bit architecture (you could adjust the 4 at compile time, but would then also need to change the fread() byte count).

Then, you can simply use *(buffer.pointer) to reference the contents of the memory location.

From your question, the application is not clear, and the technique seems error-prone. How do you take into account the movement of addresses in memory as things change? It may be worth using the linker maps to extract symbolic information for locations, so that you can avoid absolute addresses.

    fp=fopen("input.txt","rb");

The file has a .txt extension, yet you are reading it as a binary file. Please name files accordingly. On Windows, give binary files a .bin extension. On Linux, file extensions do not matter.

    // buffer will store the DWORD read from the file

    fread(buffer, 1, 4, fp);

If you want to read 4 bytes, declare an unsigned int variable and read 4 bytes into it, as shown below:

    fread(&uint, 1, 4, fp);

Why do you want to use a character array? That is incorrect.

    printf("the memory address is: %x", *buffer);

What are you trying to do here? buffer is an array of char (it decays to a pointer to its first element), so the above statement prints the hex value of the first character in the array, not a memory address. The above statement is equivalent to

    printf("the memory address is: %x", buffer[0]);

    (*(*buffer))

How is this working? Aren't there any compiler warnings or errors? Is it Windows or Linux? (*buffer) is a char, and dereferencing it again should produce a compiler error unless it is properly cast, which I see you are not doing.
