
Is there any operation in C analogous to this assembly code?

Today, I played around with incrementing function pointers in assembly code to create alternate entry points to a function:

.386
.MODEL FLAT, C
.DATA
    INCLUDELIB MSVCRT
    EXTRN puts:PROC
    HLO DB "Hello!", 0
    WLD DB "World!", 0
.CODE
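    dentry PROC
        push offset HLO     ; 5 bytes (opcode + imm32)
        call puts           ; 5 bytes (opcode + rel32)
        add esp, 4          ; 3 bytes; caller cleans up the cdecl argument
        push offset WLD     ; starts 13 bytes into dentry (5 + 5 + 3)
        call puts
        add esp, 4
        ret
    dentry ENDP
    main PROC
        lea edx, offset dentry
        call edx            ; normal entry: prints both strings
        lea edx, offset dentry
        add edx, 13         ; skip the first three instructions
        call edx            ; alternate entry: prints only "World!"
        ret
    main ENDP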
END

(I know, technically this code is invalid since it calls puts without the CRT being initialized, but it works without any assembly or runtime errors, at least on MSVC 2010 SP1.)

Note that in the second call to dentry I took the address of the function in the edx register, as before, but this time I incremented it by 13 bytes before calling the function.

The output of this program is therefore:

C:\Temp>dblentry
Hello!
World!
World!

C:\Temp>

The first output of "Hello!\nWorld!" is from the call to the very beginning of the function, whereas the second output ("World!" alone) is from the call starting at the "push offset WLD" instruction.

I'm wondering if this kind of thing exists in languages that are meant to be a step up from assembler, like C, Pascal or FORTRAN. I know C doesn't let you increment function pointers, but is there some other way to achieve this kind of thing?

You can use the longjmp function: http://www.cplusplus.com/reference/csetjmp/longjmp/

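It's a fairly horrible function, but it gets close to what you seek: paired with setjmp, it lets execution resume at a saved point inside a function that is still active.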
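A minimal sketch of how that might look, reproducing the question's "Hello! / World! / World!" sequence. The names run_demo, jump_back, resume_point and log_buf are made up for illustration, and the text is collected in a buffer rather than printed. Note the limits: this re-enters a point in a function that has not returned yet. It does not give you a second callable entry point, and calling longjmp after the function that called setjmp has returned is undefined behaviour.

```c
#include <setjmp.h>
#include <string.h>

static jmp_buf resume_point;   /* hypothetical name: context saved by setjmp */
static char log_buf[64];       /* hypothetical: collects the "output" */

/* Jumps back to the point saved in resume_point. */
static void jump_back(void)
{
    longjmp(resume_point, 1);
}

/* Builds "Hello!\nWorld!\nWorld!\n" by running the tail of the
   function twice: once normally, once after a longjmp. */
const char *run_demo(void)
{
    /* volatile so the value is well defined after the longjmp */
    volatile int reentries = 0;

    log_buf[0] = '\0';
    strcat(log_buf, "Hello!\n");

    /* setjmp returns 0 when called directly, and the value passed to
       longjmp (1 here) when control arrives via longjmp. */
    if (setjmp(resume_point) != 0) {
        reentries++;
    }

    strcat(log_buf, "World!\n");

    if (reentries == 0) {
        jump_back();           /* re-run everything after the setjmp */
    }
    return log_buf;
}
```

Calling run_demo() once gives the same text as the two calls in the assembly version, but only because the jump happens while run_demo is still on the stack.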

AFAIK you can only write functions with multiple entry-points in asm.

You can put labels on all the entry points, so you can use normal direct calls instead of hard-coding the offsets from the first function-name.

This makes it easy to call them normally from C or any other language.

If you're worried about confusing tools (or humans) that don't expect function bodies to overlap, think of the earlier entry points as separate functions that fall through into the body of the next one.


You might do this if the early entry-points do a tiny bit of extra stuff, and then fall through into the main function. It's mainly going to be a code-size saving technique (which might improve I-cache / uop-cache hit rate).
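In plain C, the closest portable approximation I can think of is a switch whose cases fall through, with the "entry point" passed in as a parameter. This is only a sketch: the names dentry_from, dentry_world, ENTRY_HELLO and ENTRY_WORLD are invented for illustration, and it is still one function with one real entry point, so the compiler decides what the machine code looks like.

```c
#include <stdio.h>

/* Hypothetical selector for where execution should start. */
enum entry { ENTRY_HELLO, ENTRY_WORLD };

/* One shared body; the switch picks the starting point and the
   earlier case falls through into the later one.
   Returns the number of lines printed. */
int dentry_from(enum entry where)
{
    int lines = 0;

    switch (where) {
    case ENTRY_HELLO:
        puts("Hello!");
        lines++;
        /* fall through */
    case ENTRY_WORLD:
        puts("World!");
        lines++;
        break;
    }
    return lines;
}

/* Thin wrappers so callers see two ordinary functions. */
int dentry(void)       { return dentry_from(ENTRY_HELLO); }
int dentry_world(void) { return dentry_from(ENTRY_WORLD); }
```

Calling dentry() and then dentry_world() prints Hello!, World!, World!, matching the assembly program. Whether the wrappers get inlined or end up as small stubs that jump into the shared body is up to the compiler.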


Compilers tend to duplicate code between functions instead of sharing large chunks of common implementation between slightly different functions.

However, in C you can probably get close, at the cost of only one extra jmp, with something like:

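int bigfunc(int x);  /* declared first so foo and bar can call it */

int foo(int a) { return bigfunc(a + 1); }
int bar(int a) { return bigfunc(a + 2); }

int bigfunc(int x) { /* a lot of code */ return x; /* placeholder result */ }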

See a real example on the Godbolt compiler explorer

foo and bar tail-call bigfunc, which is slightly worse than having bar fall through into bigfunc. (Having foo jump over bar into bigfunc is still good, especially if bar isn't that trivial.)


Jumping into the middle of a function isn't safe in general, because non-trivial functions usually need to save and restore some registers: the prologue pushes them and the epilogue pops them. If you jump in past the prologue, the pops in the epilogue will unbalance the stack (i.e. they pop the return address into a register, and the ret then returns to a garbage address).

See also Does a function with instructions before the entry-point label cause problems for anything (linking)?
