I can disassemble an object file as shown below. But I'd like to dump just the raw instruction bytes (55, 48, ...) of a specific function, e.g. add4, to a file in binary format.
I could write a program to parse the output of otool. But is there an easier way to do so?
My OS is Mac OS X.
$ cat add.c
long x;
long add2(long num) {
return num + 2;
}
long add4(long num) {
return num + 4;
}
$ clang -c -o add.o add.c
$ otool -tvjV add.o
add.o:
(__TEXT,__text) section
_add4:
0000000000000000 55 pushq %rbp
0000000000000001 48 89 e5 movq %rsp, %rbp
0000000000000004 48 89 7d f8 movq %rdi, -0x8(%rbp)
0000000000000008 48 8b 7d f8 movq -0x8(%rbp), %rdi
000000000000000c 48 83 c7 04 addq $0x4, %rdi
0000000000000010 48 89 f8 movq %rdi, %rax
0000000000000013 5d popq %rbp
0000000000000014 c3 retq
0000000000000015 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:_add4(%rax,%rax)
000000000000001f 90 nop
_add2:
0000000000000020 55 pushq %rbp
0000000000000021 48 89 e5 movq %rsp, %rbp
0000000000000024 48 89 7d f8 movq %rdi, -0x8(%rbp)
0000000000000028 48 8b 7d f8 movq -0x8(%rbp), %rdi
000000000000002c 48 83 c7 02 addq $0x2, %rdi
0000000000000030 48 89 f8 movq %rdi, %rax
0000000000000033 5d popq %rbp
0000000000000034 c3 retq
You can use nm -nU add.o to get the symbol addresses. Search for the symbol of interest and take its address and the address of the symbol that follows it. That gives you the start and (roughly) the length of the symbol. Then you can use any hex-dumping tool to read just that portion of the file.
For example:
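# Address of _add4 and of the symbol that follows it (nm -n sorts by address)
exec 3< <(nm -nU add.o | grep -A1 -w _add4 | cut -d ' ' -f 1)
read start <&3
read end <&3
exec 3<&-
# File offset of the first __TEXT section
offset=$(otool -lV add.o | grep -A3 -w "segname __TEXT" | grep -m1 offset | cut -c 12-)
# Limit the dump to the symbol's span; with no following symbol, dump to end of file
if [ -n "$end" ] ; then length_arg="-n $(( 0x$end - 0x$start ))" ; fi
hexdump -C -s $((0x$start + $offset)) $length_arg add.o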
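The script above prints a hex dump, while the question asks for the raw bytes in a file. The following is a minimal sketch of the same start/length arithmetic, written so it can be tried without an object file. The helper names symbol_span and extract_bytes are hypothetical, and the sample symbol table is an assumption built from the addresses in the question's listing, not captured nm output.

```shell
# Hypothetical helper: read `nm -n`-style lines ("address type name") on stdin
# and print "START LENGTH" in decimal for symbol $1. LENGTH is omitted when
# no symbol follows, because the end of the function is then unknown.
symbol_span() {
  awk -v sym="$1" '
    function hex(s,    i, n) {   # hex string to number, without awk extensions
      n = 0
      s = tolower(s)
      for (i = 1; i <= length(s); i++)
        n = n * 16 + index("0123456789abcdef", substr(s, i, 1)) - 1
      return n
    }
    found && !done { print start, hex($1) - start; done = 1 }
    $3 == sym      { start = hex($1); found = 1 }
    END            { if (found && !done) print start }
  '
}

# Hypothetical helper: copy LENGTH bytes starting at byte OFFSET of FILE to OUT.
extract_bytes() {   # usage: extract_bytes FILE OFFSET LENGTH OUT
  dd if="$1" of="$4" bs=1 skip="$2" count="$3" 2>/dev/null
}

# Sample symbol table (assumed, matching the addresses in the question).
sample='0000000000000000 T _add4
0000000000000020 T _add2'
printf '%s\n' "$sample" | symbol_span _add4   # prints: 0 32
```

With a real object file, the start would be added to the section offset found with otool, for example extract_bytes add.o $((start + offset)) "$length" add4.bin. Note that the 32 bytes reported for _add4 include the 11 bytes of nop padding at 0x15 to 0x1f in the question's listing; the function itself is 21 bytes long. This is why the length is only approximate.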
You can use objdump and then extract the opcode part. It can be done as follows.
$ objdump -d add.o | grep add4 -A10 | cut -f 2 | grep -v ':'
The -v flag for grep tells it to print all lines not containing a colon.
Output:
55
48 89 e5
48 89 7d f8
48 8b 45 f8
48 83 c0 04
5d
c3
The -A10 tells grep to print 10 lines after the match.
Now, to output this into a file, we first format the opcodes as hex escapes like '\x45'. The above output can contain runs of spaces and a trailing space on each line, so we remove them first, as they would interfere with the later sed substitutions.
$ objdump -d add.o | grep add4 -A10 | cut -f 2 | grep -v ':' | sed 's/  */ /g' | sed 's/ $//g'
Add the '\x' prefix, first in place of the spaces in between and then before the first hex byte on each line.
$ objdump -d add.o | grep add4 -A10 | cut -f 2 | grep -v ':' | sed 's/  */ /g' | sed 's/ $//g' | sed 's/ /\\x/g' | sed 's/^/\\x/g'
\x55
\x48\x89\xe5
\x48\x89\x7d\xf8
\x48\x8b\x45\xf8
\x48\x83\xc0\x04
\x5d
\xc3
Collapse it all into a single line and add quotes.
$ objdump -d add.o | grep add4 -A10 | cut -f 2 | grep -v ':' | sed 's/  */ /g' | sed 's/ $//g' | sed 's/ /\\x/g' | sed 's/^/\\x/g' | tr -d '\n' | sed 's/^/"/g' | sed 's/$/"/g'
"\x55\x48\x89\xe5\x48\x89\x7d\xf8\x48\x8b\x45\xf8\x48\x83\xc0\x04\x5d\xc3"
Now we have a C-style string; we pass it to printf and redirect the output to a file.
$ printf $(objdump -d add.o | grep add4 -A10 | cut -f 2 | grep -v ':' | sed 's/  */ /g' | sed 's/ $//g' | sed 's/ /\\x/g' | sed 's/^/\\x/g' | tr -d '\n' | sed 's/^/"/g' | sed 's/$/"/g') | sed 's/^"//g' | sed 's/"$//g' > add4.bin
The last two seds after the printf remove the quotes, which remain in printf's output because quote characters produced by a command substitution are passed through as literal text rather than parsed as shell quoting.
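How the shell treats quotes that come out of a command substitution can be checked without objdump. The sketch below is only an illustration: the variable name quoted is hypothetical, and an octal escape stands in for '\x55' on the assumption that the local printf may not support \x escapes.

```shell
# The substitution produces six literal characters: "\125" (quotes included).
# \125 is the octal escape for byte 0x55; octal is used because POSIX printf
# does not guarantee \x escapes.
quoted=$(printf '%s' '"\125"')

# Unquoted on purpose, as in the printf command above: the shell hands the
# quote characters to printf as data, so they surround the decoded byte.
printf $quoted   # prints: "U"
echo
```

It follows that leaving out the two sed steps that add the quotes should make the two that strip them unnecessary.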
Hexdumping the file we get:
$ hexdump -C add4.bin
00000000 55 48 89 e5 48 89 7d f8 48 8b 45 f8 48 83 c0 04 |UH..H.}.H.E.H...|
00000010 5d c3 |].|
00000012
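The escape-building steps can also be replaced by a direct conversion of each hex pair to a byte. What follows is a minimal sketch, not part of the answer's pipeline: hex_to_bin is a hypothetical helper, and it assumes its input consists only of whitespace-separated two-digit hex values, as in the opcode listing above.

```shell
# Hypothetical helper: read whitespace-separated hex byte pairs on stdin
# and write the corresponding raw bytes to stdout.
hex_to_bin() {
  for byte in $(cat); do
    # Convert through an octal escape, which POSIX printf guarantees;
    # \x escapes are a common extension rather than a portable feature.
    printf "\\$(printf '%03o' "0x$byte")"
  done
}

# Example with the first two and last two instructions from the listing.
printf '55\n48 89 e5\n5d\nc3\n' | hex_to_bin | od -An -tx1
```

Appended to the filtering part of the pipeline, this would read objdump -d add.o | grep add4 -A10 | cut -f 2 | grep -v ':' | hex_to_bin > add4.bin, provided the filtered output really contains nothing but hex pairs. Where xxd is installed, xxd -r -p performs the same conversion.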