
Heap buffer overflow on sprintf

I'm getting a heap-buffer-overflow error on this code:

// ast.c
char *not_last_prefix = malloc(strlen(next_prefix) + 4); // line 204

sprintf(not_last_prefix, "%s│  ", next_prefix); // line 206
=================================================================
==3394==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000279 at pc 0x7f0d9e6d7715 bp 0x7fff975bcf60 sp 0x7fff975bc6f0
WRITE of size 11 at 0x602000000279 thread T0
    #0 0x7f0d9e6d7714 in vsprintf (/lib/x86_64-linux-gnu/libasan.so.5+0x9e714)
    #1 0x7f0d9e6d7bce in sprintf (/lib/x86_64-linux-gnu/libasan.so.5+0x9ebce)
    #2 0x55708e40b909 in print_ast_impl src/ast.c:206
    #3 0x55708e40b7ef in print_ast src/ast.c:192
    #4 0x55708e4112ad in main src/main.c:50
    #5 0x7f0d9e46f1e2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x271e2)
    #6 0x55708e40a5cd in _start (/home/michael/Code/Baby-C/debug/bcc+0x65cd)

0x602000000279 is located 0 bytes to the right of 9-byte region [0x602000000270,0x602000000279)
allocated by thread T0 here:
    #0 0x7f0d9e746ae8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dae8)
    #1 0x55708e40b8cd in print_ast_impl src/ast.c:204
    #2 0x55708e40b7ef in print_ast src/ast.c:192
    #3 0x55708e4112ad in main src/main.c:50
    #4 0x7f0d9e46f1e2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x271e2)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib/x86_64-linux-gnu/libasan.so.5+0x9e714) in vsprintf
Shadow bytes around the buggy address:
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff8000: fa fa 00 fa fa fa 02 fa fa fa 00 00 fa fa 00 00
  0x0c047fff8010: fa fa 02 fa fa fa 00 00 fa fa 00 00 fa fa 02 fa
  0x0c047fff8020: fa fa 00 00 fa fa 00 00 fa fa 02 fa fa fa 02 fa
  0x0c047fff8030: fa fa 02 fa fa fa 02 fa fa fa 02 fa fa fa 02 fa
=>0x0c047fff8040: fa fa 02 fa fa fa fd fa fa fa 00 01 fa fa 00[01]
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==3394==ABORTING

Everything I can find suggests that I'm not allocating enough space for the result of the sprintf, but I can't see how that could be the case. I allocate space for the length of next_prefix, 3 bytes for the "│  " that follows it, and 1 for the NUL terminator. The resulting string should fit. What am I missing here?

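The problem is that the string literal "│  " is not 3 bytes long, but 5. The vertical bar is not the ASCII character | but a Unicode character that UTF-8 encodes as three bytes, and the two spaces add two more. This matches the ASan report: the allocated region is 9 bytes, so strlen(next_prefix) is 5, while sprintf writes 5 + 5 + 1 = 11 bytes.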
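To see the byte count directly, here is a minimal sketch that dumps the bytes of a string. The helper name print_bytes is made up for illustration, and the literal is written with explicit escapes (assuming the bar is U+2502, encoded as E2 94 82) so the result does not depend on the source file's encoding.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each byte of s in hex and returns the
 * byte count, which is the same value strlen(s) reports. */
size_t print_bytes(const char *s)
{
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++) {
        printf("%02X ", (unsigned char)s[i]);
    }
    printf("\n");
    return n;
}
```

Calling print_bytes("\xE2\x94\x82  ") should show three bytes for the bar followed by two 20 bytes for the spaces, five in total.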

To avoid problems like this, assign the literal to a const char * and take its length with strlen instead of counting bytes by hand:

const char *separator = "│  ";
char *not_last_prefix = malloc(strlen(next_prefix) + strlen(separator) + 1); // +1 for the NUL terminator
sprintf(not_last_prefix, "%s%s", next_prefix, separator);
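Another option, not from the original answer, is to let snprintf compute the size for you: since C99, calling it with a NULL buffer and size 0 returns the number of bytes the formatted string would need, excluding the terminator. The sketch below wraps that in a helper; join_prefix is a hypothetical name, not something from the question's code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated "prefix + separator"
 * string, or NULL on failure. The caller must free the result. */
char *join_prefix(const char *prefix, const char *separator)
{
    /* First pass: measure only, nothing is written. */
    int needed = snprintf(NULL, 0, "%s%s", prefix, separator);
    if (needed < 0) {
        return NULL;
    }

    size_t size = (size_t)needed + 1; /* +1 for the NUL terminator */
    char *out = malloc(size);
    if (out == NULL) {
        return NULL;
    }

    /* Second pass: bounded write into the exactly sized buffer. */
    snprintf(out, size, "%s%s", prefix, separator);
    return out;
}
```

Because the size comes from the same format string that does the write, the two cannot drift apart when the separator changes.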

The problem, as was pointed out to me, was that my format string contained a Unicode character. I wrongly assumed that mallocing one more byte would solve the problem - turns out UTF-8 characters can be as many as 4 bytes long! The good news is that you can check exactly how many bytes a character takes up from its code point using this simple table.

Character code (decimal) | Bytes used
-------------------------|------------
0-127                    | 1 byte
128-2047                 | 2 bytes
2048-65535               | 3 bytes
65536-1114111            | 4 bytes

In my case, the vertical bar character I was using (│) is Unicode U+2502 (decimal 9474), which falls in the 2048-65535 range and so takes up 3 bytes!
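The table translates directly into code. This is a rough sketch with a made-up function name, utf8_encoded_length. It only mirrors the ranges above, so it does not reject the surrogate range U+D800 to U+DFFF, which valid UTF-8 never encodes.

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes UTF-8 uses for code point cp,
 * following the table's ranges. Returns 0 if cp is above U+10FFFF.
 * Simplification: surrogates (U+D800..U+DFFF) are not rejected. */
int utf8_encoded_length(uint32_t cp)
{
    if (cp <= 0x7F) {        /* 0-127 */
        return 1;
    }
    if (cp <= 0x7FF) {       /* 128-2047 */
        return 2;
    }
    if (cp <= 0xFFFF) {      /* 2048-65535 */
        return 3;
    }
    if (cp <= 0x10FFFF) {    /* 65536-1114111 */
        return 4;
    }
    return 0;                /* not a Unicode code point */
}
```

For the box-drawing bar, utf8_encoded_length(0x2502) gives 3, which together with the two spaces accounts for the 5-byte separator.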

