Writing functions to binary files?

Something that made me fairly curious: since C++ lets you pass a function as an argument under the right circumstances, it seems that whatever internal code implements that function can be pointed to, and could therefore be written to a binary file and read back as executable code.

This is obviously coming from someone who, despite a strong background in C++, isn't familiar with the intricate internals of how memory is managed in the heap, and especially how executable machine code fits into the picture.

I assume that since it's possible to pass around a reference to a function, it's possible to get the data it points to and write it somewhere. I don't know.

Anyone want to tell me if this is possible? If so, can you give an example? If not, please tell me why! I love learning more in-depth about how C++ actually works internally.

Twenty years ago your suggestion would have been fresh and usable. People saved memory by loading code from a file on demand, calling it, and then unloading it. That technique was called overlays. To a certain extent it is still usable, but only in a form standardized by the platform and managed through the platform's API. The mechanism behind shared libraries (.so on POSIX systems, .dll on Windows) is that the library file records where certain functions are and what their names are, along with data describing how the stack and data segment should be initialized. The system can do the loading automatically when the program is loaded. Otherwise you can load the library manually and obtain a pointer to a function, e.g. with LoadLibrary() and GetProcAddress() on Windows, or dlopen() and dlsym() on Linux.

The reason it isn't possible now in a high-level language is security: protection from malicious code in the data segment. The run-time library usually handles this. It is still possible using assembler, but you will be up against antivirus software and system security measures. With careful programming you could, I suppose, write your own "linker" that creates your own library format and loads it.
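As a rough illustration of the manual route mentioned above (dlopen() and dlsym()), here is a minimal sketch. It assumes a Linux system with glibc, where the math library is available as libm.so.6; the helper name call_from_library and the UnaryMath alias are made up for this example. On older toolchains you may need to link with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>    // POSIX: dlopen, dlsym, dlclose, dlerror
#include <stdexcept>
#include <string>

// Signature we expect the loaded symbol to have (an assumption of this sketch).
using UnaryMath = double (*)(double);

// Hypothetical helper: load `library`, look up `symbol`, call it with `x`.
double call_from_library(const char* library, const char* symbol, double x) {
    assert(library != nullptr && symbol != nullptr);

    void* handle = dlopen(library, RTLD_LAZY);
    if (handle == nullptr) {
        const char* why = dlerror();
        throw std::runtime_error(std::string("dlopen failed: ") + (why ? why : "unknown"));
    }

    void* address = dlsym(handle, symbol);
    if (address == nullptr) {
        const char* why = dlerror();
        std::string message = std::string("dlsym failed: ") + (why ? why : "unknown");
        dlclose(handle);
        throw std::runtime_error(message);
    }

    // The loader only hands back an address; the caller must know the real type.
    auto fn = reinterpret_cast<UnaryMath>(address);
    double result = fn(x);
    dlclose(handle);
    return result;
}
```

Calling call_from_library("libm.so.6", "cos", 0.0) would look up cos in the math library and call it. Note that the library file carries the symbol names and relocation data that a raw dump of a function's bytes lacks.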

No, you can't really do this. There are a whole lot of reasons, but here's a simple and intuitive one: functions may call other functions. If you were able to write a function to disk and restore it, that would not account for its dependencies (functions it calls, global variables it updates, etc.). It won't work.

If you want to read functions from disk, it is better to express them in a scripting language like Lua. This is a proven solution which is used in many commercial products such as video games and Adobe Lightroom.
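Embedding Lua needs an external library, so the sketch below swaps in a much simpler stand-in for the same idea: store a description of the function (here just its name) rather than its machine code, and resolve it against functions the program already contains. Everything in it (registry, save_operation, load_operation, and the two sample operations) is hypothetical and only for illustration.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

using Operation = int (*)(int, int);

int add(int a, int b) { return a + b; }
int multiply(int a, int b) { return a * b; }

// Table mapping a stable, storable name to code that lives in this executable.
const std::map<std::string, Operation>& registry() {
    static const std::map<std::string, Operation> table{
        {"add", add},
        {"multiply", multiply},
    };
    return table;
}

// "Save" a function by writing its name, not its bytes.
void save_operation(std::ostream& out, const std::string& name) {
    if (registry().count(name) == 0) {
        throw std::invalid_argument("unknown operation: " + name);
    }
    out << name << '\n';
}

// "Load" a function by reading the name and looking it up again.
Operation load_operation(std::istream& in) {
    std::string name;
    if (!std::getline(in, name)) {
        throw std::runtime_error("no operation name in stream");
    }
    auto it = registry().find(name);
    if (it == registry().end()) {
        throw std::runtime_error("unknown operation: " + name);
    }
    return it->second;
}
```

With a std::stringstream (or a file stream) standing in for the disk, saving "multiply" and loading it back yields a pointer you can call as usual. A scripting language generalizes this by storing whole programs instead of a single name.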

While a function pointer is the entry point of a function, and that memory can be read and therefore copied, the first problem is that there is no reliable means of determining the length of that code, so you cannot know for certain how much to copy to get the entire function and only that.
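To make that concrete, here is a small sketch that copies a caller-chosen number of bytes starting at a function's entry point. It assumes a mainstream desktop platform (e.g. x86-64 with GCC or Clang) where code pages are readable and where converting a function pointer to an object pointer is supported; the standard only makes that conversion conditionally supported. The names sample_function and peek_code are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

int sample_function(int x) { return x + 1; }

// Copy `count` bytes starting at the entry point of `fn`.
// Nothing here knows where the function ends: `count` is a guess supplied by
// the caller, which is exactly the problem described above.
template <typename Fn>
std::vector<unsigned char> peek_code(Fn* fn, std::size_t count) {
    std::vector<unsigned char> bytes(count);
    // Function-pointer to object-pointer conversion: conditionally supported.
    std::memcpy(bytes.data(), reinterpret_cast<const void*>(fn), count);
    return bytes;
}
```

peek_code(&sample_function, 4) returns the first four bytes at the entry point, whatever they happen to be for your compiler and flags. Ask for too few and the function is truncated; ask for too many and you pick up padding or the start of whatever the linker placed next.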

The other issue is: to what practical end? Depending on the platform, the code may not be relocatable and will contain references to other code. The binary contains no symbolic information; the best you can do is disassemble it, but outside the context of the entire linked executable that may not be very useful.

If your aim is to separate functions from the primary executable, and to be able to load and run them later, then that is what DLLs and shared libraries are for.

If you just want to observe the binary relating to a function, that is best done in a debugger. It will have a disassembly view that shows the raw binary (in hexadecimal), the assembly code with symbolic names, and the corresponding C source. This makes a lot more sense if your aim is merely to investigate how source code relates to binary machine code.

Below is how you could possibly do what you are asking, even if there is no practical reason for doing it. It makes assumptions about the behaviour of the compiler that may not be valid in some cases. It assumes, for example, that the compiler will place adjacent functions contiguously in memory and at increasing addresses, so that function2 immediately follows function1 in memory. Here function2 serves only as an end marker for function1 and may be a dummy.

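#include <stddef.h>
#include <stdio.h>

int function1()
{
    /* ... */

    return 0 ;
}

/* Dummy function: its address serves only as an end marker for function1. */
void function2()
{
}

int main()
{
    /* Assumes function2 is placed immediately after function1 in memory.
       Converting function pointers to char* is not guaranteed by the standard. */
    ptrdiff_t function1_length = (char*)function2 - (char*)function1 ;
    if( function1_length <= 0 )
    {
        return 1 ;  /* the layout assumption does not hold */
    }

    FILE* fp = fopen( "function1.bin", "wb" ) ;
    if( fp == NULL )
    {
        return 1 ;  /* could not create the output file */
    }

    fwrite( (const void*)function1, (size_t)function1_length, 1, fp ) ;
    fclose( fp ) ;

    return 0 ;
}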
