Is it practical to create a C language addon for anonymous functions?

Question

I know that C compilers are capable of taking standalone code and generating standalone shellcode out of it for the specific system they are targeting.

For example, given the following in anon.c:

int give3() {
    return 3;
}

I can run

 gcc anon.c -o anon.obj -c
 objdump -D anon.obj

which gives me (on MinGW):

anon1.obj:     file format pe-i386


Disassembly of section .text:

00000000 <_give3>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   b8 03 00 00 00          mov    $0x3,%eax
   8:   5d                      pop    %ebp
   9:   c3                      ret    
   a:   90                      nop
   b:   90                      nop

So I can make main like this:

main.c

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    uint8_t shellcode[] = {
        0x55,
        0x89, 0xe5,
        0xb8, 0x03, 0x00, 0x00, 0x00,
        0x5d, 0xc3,
        0x90,
        0x90
    };

    int (*p_give3)() = (int (*)())shellcode;
    printf("%d.\n", (*p_give3)());
}

My question is: is it practical to automate the process of converting a self-contained anonymous function, one that does not refer to anything outside its own scope or its arguments?

e.g.:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    uint8_t shellcode[] = [@[
        int anonymous() {
            return 3;
        }
    ]];

    int (*p_give3)() = (int (*)())shellcode;
    printf("%d.\n", (*p_give3)());
}

Which would compile the text into shellcode and place it into the buffer?

The reason I ask is that I really like writing C, but writing pthreads code and callbacks is incredibly painful. As soon as you go one step above C to get the notion of "lambdas", you lose your language's ABI (e.g., C++ has lambdas, but everything you do in C++ is suddenly implementation-dependent), and "Lisp-like" scripting add-ons (e.g., plugging in Lisp, Perl, JavaScript/V8, or any other runtime that already knows how to generalize callbacks) make callbacks very easy, but also much more expensive than tossing shellcode around.

If this is practical, then functions that are called only once can be placed in the body of the function that calls them, reducing global-scope pollution. It also means you do not need to generate the shellcode manually for each system you are targeting: each system's C compiler already knows how to turn self-contained C into assembly, so why do its job for it and ruin the readability of your own code with a bunch of binary blobs?

So the question is: is this practical (for functions that are perfectly self-contained, e.g. even if they want to call puts, puts has to be given as an argument or inside a hash table/struct in an argument)? Or is there some issue preventing this from being practical?
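To make "perfectly self-contained" concrete, here is a minimal sketch of a function that reaches the outside world only through its arguments. The names callback_env, emit_twice and demo_emit_twice are hypothetical, chosen for illustration; the text to print is also passed in, since a string literal would live in a data section outside the function's own bytes.

```c
#include <stdio.h>

/* Hypothetical bundle of the external functions a callback is allowed to use. */
struct callback_env {
    int (*put_line)(const char *text);
};

/* Refers only to its arguments: no globals, no direct libc calls, no literals. */
static int emit_twice(const struct callback_env *env, const char *text)
{
    if (env->put_line(text) < 0)
        return -1;
    return env->put_line(text) < 0 ? -1 : 0;
}

/* Caller side: wires the real puts into the environment struct. */
int demo_emit_twice(const char *text)
{
    struct callback_env env = { .put_line = puts };
    return emit_twice(&env, text);
}
```

Calling demo_emit_twice("hi") prints the line twice and returns 0 on success.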

Answer 1

Apple has implemented a very similar feature in clang, where it's called "blocks". Here's a sample:

#include <stdio.h>

int main(int argc, char **argv)
{
    int (^blk_give3)(void) = ^(void) {
        return 3;
    };

    printf("%d.\n", blk_give3());

    return 0;
}

More information:

Answer 2

I know that C compilers are capable of taking standalone code, and generate standalone shellcode out of it for the specific system they are targeting.

Turning source into machine code is what compilation is. Shellcode is machine code with specific constraints, none of which apply to this use case. You just want ordinary machine code, like compilers generate when they compile functions normally.

AFAICT, what you want is exactly what you get from static int foo(int x){ ...; } and then passing foo as a function pointer, i.e. a block of machine code with a label attached, in the code section of your executable.
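As a minimal sketch of that suggestion (the names compare_ints and sort_ints are illustrative, not from the question): a file-local static function handed to qsort as an ordinary function pointer. The compiler puts its machine code in the executable's code section, so no byte arrays are involved.

```c
#include <stdlib.h>

/* Named, but file-local: `static` keeps it out of the global namespace. */
static int compare_ints(const void *va, const void *vb)
{
    const int *a = va, *b = vb;
    return (*a > *b) - (*a < *b);   /* negative, zero, or positive */
}

/* Sorts arr in ascending order, passing compare_ints as the callback. */
void sort_ints(int *arr, size_t len)
{
    qsort(arr, len, sizeof arr[0], compare_ints);
}
```

The only cost compared with an anonymous function is that the callback needs a name and lives a few lines away from its single use.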

Jumping through hoops to get compiler-generated machine code into an array is not even close to worth the portability downsides (esp. in terms of making sure the array is in executable memory).

It seems the only thing you're trying to avoid is having a separately-defined function with its own name. That's an incredibly small benefit that doesn't come close to justifying anything like what you're suggesting in the question. AFAIK, there's no good way to achieve it in ISO C11, but:

Some compilers support nested functions as a GNU extension:

This compiles (with gcc6.2). On Godbolt, I used -xc to compile it as C, not C++. It also compiles with ICC17, but not clang3.9.

#include <stdlib.h>

void sort_integers(int *arr, size_t len)
{
  int bar(){return 3;}  // gcc warning: ISO C forbids nested functions [-Wpedantic]

  int cmp(const void *va, const void *vb) {
    const int *a=va, *b=vb;       // taking const int* args directly gives a warning, which we could silence with a cast
    return *a > *b;
  }

  qsort(arr, len, sizeof(int), cmp);
}

The asm output is:

cmp.2286:
    mov     eax, DWORD PTR [rsi]
    cmp     DWORD PTR [rdi], eax
    setg    al
    movzx   eax, al
    ret
sort_integers:
    mov     ecx, OFFSET FLAT:cmp.2286
    mov     edx, 4
    jmp     qsort

Notice that no definition for bar() was emitted, because it's unused.

Programs with nested functions built without optimization will have executable stacks (for reasons explained below). So if you use this and care about security, make sure you build with optimization enabled.

BTW, nested functions can even access variables in their parent (like lambdas). Changing cmp into a function that does return len results in this highly surprising asm:

__attribute__((noinline)) 
void call_callback(int (*cb)()) {
  cb();
}

void foo(int *arr, size_t len) {
  int access_parent() { return len; }
  call_callback(access_parent);
}

## gcc5.4
access_parent.2450:
    mov     rax, QWORD PTR [r10]
    ret
call_callback:
    xor     eax, eax
    jmp     rdi
foo:
    sub     rsp, 40
    mov     eax, -17599
    mov     edx, -17847
    lea     rdi, [rsp+8]
    mov     WORD PTR [rsp+8], ax
    mov     eax, OFFSET FLAT:access_parent.2450
    mov     QWORD PTR [rsp], rsi
    mov     QWORD PTR [rdi+8], rsp
    mov     DWORD PTR [rdi+2], eax
    mov     WORD PTR [rdi+6], dx
    mov     DWORD PTR [rdi+16], -1864106167
    call    call_callback
    add     rsp, 40
    ret

I just figured out what this mess is about while single-stepping it: those MOV-immediate instructions are writing machine code for a trampoline function to the stack, and passing that as the actual callback.

gcc must ensure that the ELF metadata in the final binary tells the OS that the process needs an executable stack (note that readelf -l shows GNU_STACK with RWE permissions). So nested functions that access variables outside their scope prevent the whole process from having the security benefits of NX stacks. (With optimization disabled, this still affects programs that use nested functions that don't access stuff from outer scopes, but with optimization enabled gcc realizes that it doesn't need the trampoline.)

The trampoline (from gcc5.2 -O0 on my desktop) is:

   0x00007fffffffd714:  41 bb 80 05 40 00       mov    r11d,0x400580   # address of access_parent.2450
   0x00007fffffffd71a:  49 ba 10 d7 ff ff ff 7f 00 00   movabs r10,0x7fffffffd710   # address of `len` in the parent stack frame
   0x00007fffffffd724:  49 ff e3        rex.WB jmp r11 
    # This can't be a normal rel32 jmp, and indirect is the only way to get an absolute near jump in x86-64.

   0x00007fffffffd727:  90      nop
   0x00007fffffffd728:  00 00   add    BYTE PTR [rax],al
   ...

(Trampoline might not be the right terminology for this wrapper function; I'm not sure.)

This finally makes sense, because r10 is normally clobbered without saving by functions. There's no register that foo could set that would be guaranteed to still have that value when the callback is eventually called.

The x86-64 SysV ABI says that r10 is the "static chain pointer", but C/C++ don't use that (which is why r10 is treated like r11, as a pure scratch register).

Obviously, a nested function that accesses variables in the outer scope can't be called after the outer function returns, e.g. if call_callback held onto the pointer for future use from other callers, you would get bogus results. When the nested function doesn't access the outer scope, gcc doesn't do the trampoline thing, so it works just like a separately-defined function, and you get a function pointer you can pass around arbitrarily.
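For comparison, and not something the answer itself proposes, the usual ISO C way to give a callback access to the caller's variables is to pass a context pointer explicitly, which needs no trampoline and no executable stack. The sketch below reworks the foo/call_callback example under that assumption; callback_fn, len_ctx, access_len and foo_with_ctx are hypothetical names.

```c
#include <stddef.h>

/* Callback type that receives its captured state as an explicit argument. */
typedef int (*callback_fn)(void *ctx);

static int call_callback(callback_fn cb, void *ctx)
{
    return cb(ctx);
}

/* The "captured variables", spelled out as a struct. */
struct len_ctx {
    size_t len;
};

static int access_len(void *ctx)
{
    const struct len_ctx *c = ctx;
    return (int)c->len;
}

int foo_with_ctx(size_t len)
{
    struct len_ctx c = { len };          /* lives in this stack frame */
    return call_callback(access_len, &c);
}
```

The same lifetime rule applies: the context lives in foo_with_ctx's frame, so the callback must not be invoked after that function returns.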

Answer 3

It seems possible, but unnecessarily complicated:

anon.c

 int anon() { return 3; }

main.c

...
uint8_t shellcode[] = {
#include "anon.shell"
};

int (*p_give3)() = (int (*)())shellcode;
printf("%d.\n", (*p_give3)());

makefile:

anon.shell: anon.c
	gcc anon.c -o anon.obj -c
	objdump -D anon.obj | extractShellBytes.py > anon.shell

Where extractShellBytes.py is a script you write that prints only the raw, comma-separated code bytes from the objdump output.
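The answer leaves that script unwritten. As a rough sketch of its core step, written in C to match the rest of the examples, the function below pulls the code bytes out of one line of objdump -D output. It assumes the layout shown in the question (address, colon, hex bytes separated by single spaces, then a wider gap before the mnemonic); parse_objdump_line and hex_value are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Copies the code bytes of one disassembly line into out (at most cap bytes).
 * Returns the number of bytes found; header lines yield 0. */
size_t parse_objdump_line(const char *line, uint8_t *out, size_t cap)
{
    const char *p = strchr(line, ':');   /* bytes follow the "addr:" field */
    size_t n = 0;

    if (p == NULL)
        return 0;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;

    while (n < cap) {
        int hi = hex_value(p[0]);
        int lo = (hi >= 0) ? hex_value(p[1]) : -1;
        if (hi < 0 || lo < 0)
            break;                        /* not a two-digit hex byte */

        char after = p[2];
        if (after != ' ' && after != '\t' && after != '\n' && after != '\0')
            break;                        /* part of a longer word, not a byte */

        out[n++] = (uint8_t)(hi * 16 + lo);

        /* Only a single space followed by another hex digit continues the run. */
        if (after != ' ' || hex_value(p[3]) < 0)
            break;
        p += 3;
    }
    return n;
}
```

A full tool would loop over every line and print each byte as a 0x.. literal followed by a comma, producing text suitable for the #include in main.c.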

3 answers

Answer 1 (score 5), 2016-11-11 19:17:59
Answer 2 (score 4, accepted), 2016-11-16 07:08:25
Answer 3 (score 1), 2016-11-11 19:10:53
