
How do I disassemble raw 16-bit x86 machine code?

I'd like to disassemble the MBR (first 512 bytes) of a bootable x86 disk that I have. I have copied the MBR to a file using:

dd if=/dev/my-device of=mbr bs=512 count=1

Any suggestions for a Linux utility that can disassemble the file mbr?

You can use objdump. According to this article the syntax is:

objdump -D -b binary -mi386 -Maddr16,data16 mbr

The GNU tool is called objdump, for example:

objdump -D -b binary -m i8086 <file>

I like ndisasm for this purpose. It comes with the NASM assembler, which is free and open source and included in the package repositories of most Linux distros.

ndisasm -b16 -o7c00h -a -s7c3eh mbr

Explanation - from the ndisasm manpage:

  • -b = Specifies 16-, 32- or 64-bit mode. The default is 16-bit mode.
  • -o = Specifies the notional load address for the file. This option causes ndisasm to get the addresses it lists down the left hand margin, and the target addresses of PC-relative jumps and calls, right.
  • -a = Enables automatic (or intelligent) sync mode, in which ndisasm will attempt to guess where synchronisation should be performed, by means of examining the target addresses of the relative jumps and calls it disassembles.
  • -s = Manually specifies a synchronisation address, such that ndisasm will not output any machine instruction which encompasses bytes on both sides of the address. Hence the instruction which starts at that address will be correctly disassembled.
  • mbr = The file to be disassembled.
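One plausible way to arrive at a sync address like 7c3eh is to decode the jump at the very start of the sector by hand. The sketch below is only an illustration: it assumes the sector begins with a short (EB) or near (E9) jump, as many FAT-style boot sectors do, and the function name first_jump_target is made up here. Not every MBR starts this way, in which case it returns None.

```python
import struct

LOAD_ADDR = 0x7C00  # address at which the BIOS loads a boot sector


def first_jump_target(sector: bytes, origin: int = LOAD_ADDR):
    """Return the target of a leading JMP rel8 (EB) or JMP rel16 (E9), else None."""
    if len(sector) >= 2 and sector[0] == 0xEB:
        # Short jump: signed 8-bit displacement, relative to the next instruction.
        rel = struct.unpack_from("<b", sector, 1)[0]
        return (origin + 2 + rel) & 0xFFFF
    if len(sector) >= 3 and sector[0] == 0xE9:
        # Near jump: signed 16-bit displacement, relative to the next instruction.
        rel = struct.unpack_from("<h", sector, 1)[0]
        return (origin + 3 + rel) & 0xFFFF
    return None


# A sector starting with EB 3C 90 jumps over 0x3C bytes of data.
print(hex(first_jump_target(bytes([0xEB, 0x3C, 0x90]))))  # prints 0x7c3e
```

The same arithmetic explains -o: a byte at file offset N is listed at address 7c00h + N, so the sync address 7c3eh corresponds to file offset 3eh.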

starblue and hlovdal both have parts of the canonical answer. If you want to disassemble raw i8086 code, you usually want Intel syntax, not AT&T syntax, too, so use:

objdump -D -Mintel,i8086 -b binary -m i386 mbr.bin
objdump -D -Mintel,i386 -b binary -m i386 foo.bin    # for 32-bit code
objdump -D -Mintel,x86-64 -b binary -m i386 foo.bin  # for 64-bit code

If your code is ELF (or a.out (or (E)COFF)), you can use the short form:

objdump -D -Mintel,i8086 a.out  # disassembles the entire file
objdump -d -Mintel,i8086 a.out  # disassembles only code sections

For 32-bit or 64-bit code, omit the ,i8086; the ELF header already includes this information.

ndisasm, as suggested by jameslin, is also a good choice, but objdump usually comes with the OS and can deal with all architectures supported by GNU binutils (a superset of those supported by GCC), and its output can usually be fed into GNU as (ndisasm's can usually be fed into nasm, though, of course).

Peter Cordes suggests that "Agner Fog's objconv is very nice. It puts labels on branch targets, making it a lot easier to figure out what the code does. It can disassemble into NASM, YASM, MASM, or AT&T (GNU) syntax."

Multimedia Mike already found out about --adjust-vma; the ndisasm equivalent is the -o option.

To disassemble, say, sh4 code (I used one binary from Debian to test), use this with GNU binutils (almost all other disassemblers are limited to one platform, such as x86 with ndisasm and objconv):

objdump -D -b binary -m sh -EL x

The -m is the machine, and -EL means Little Endian (for sh4eb use -EB instead), which is relevant for architectures that exist in either endianness.

Try this command:

sudo dd if=/dev/sda bs=512 count=1 | ndisasm -b16 -o7c00h -

If you're just looking to use a disassembler, then objdump is one choice. The disassembler that comes with the nasm assembler is ndisasm. You can also run "debug.exe" in DOSBox on Linux, provided you get hold of a copy of the program. It also does disassembly, as well as controlled execution; i.e. simulation of the CPU itself - which is also important, even when doing disassembly, for reasons I'm about to describe.

Fake86 has a CPU emulator. You may be able to hack it into doing disassembly by:

  • (a) having it show the instruction instead of simulating it,
  • (b) having it not take conditional jumps or invoke calls, but instead stack the target address as a new entry point to disassemble from (i.e., in effect, taking both branches and encapsulating subroutines),
  • (c) having it stop the current disassembly at an unconditional jump or return,
  • (d) having it accept one, two or more entry points to start with,
  • (e) ideally, having it also accept base addresses for data segments, and
  • (f) getting it to do a hex dump of all the areas not processed as data or code segments (as these are usually where indirect jumps or calls, or indirectly-accessed data segments, land).
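Items (a)-(d) and (f) amount to a worklist traversal. The following is a minimal sketch of that idea, not code taken from Fake86 or DAS: the names decode, trace and unreached are made up for illustration, and the "decoder" is a toy that knows only a handful of 8086 opcodes (nop, cli, sti, cld, ret, short and near jmp, near call, and the short conditional jumps). A real tool would need a full opcode table; here, any opcode outside the toy table simply ends the current path.

```python
import struct

ONE_BYTE = {0x90, 0xFA, 0xFB, 0xFC}  # nop, cli, sti, cld (toy subset only)


def decode(code, off, addr):
    """Toy decoder: (length, branch_target_or_None, falls_through), or None if unknown."""
    op = code[off]
    if op in ONE_BYTE:
        return 1, None, True
    if op == 0xC3:                        # ret: the path ends here
        return 1, None, False
    if op == 0xEB or 0x70 <= op <= 0x7F:  # jmp short / jcc short
        if off + 2 > len(code):
            return None
        rel = struct.unpack_from("<b", code, off + 1)[0]
        return 2, (addr + 2 + rel) & 0xFFFF, op != 0xEB
    if op in (0xE8, 0xE9):                # call near / jmp near
        if off + 3 > len(code):
            return None
        rel = struct.unpack_from("<h", code, off + 1)[0]
        return 3, (addr + 3 + rel) & 0xFFFF, op == 0xE8
    return None


def trace(code, origin, entry_points):
    """Follow control flow from the entry points; return {address: instruction length}."""
    pending = list(entry_points)          # (d): any number of entry points
    insns = {}
    while pending:
        addr = pending.pop()
        while origin <= addr < origin + len(code) and addr not in insns:
            decoded = decode(code, addr - origin, addr)
            if decoded is None:
                break
            length, target, falls_through = decoded
            insns[addr] = length          # (a): record instead of simulating
            if target is not None:
                pending.append(target)    # (b): stack the target as a new entry point
            if not falls_through:
                break                     # (c): stop at unconditional jump or return
            addr += length
    return insns


def unreached(code, origin, insns):
    """(f): addresses never reached as code, i.e. candidates for a hex dump."""
    covered = {a for start, n in insns.items() for a in range(start, start + n)}
    return [a for a in range(origin, origin + len(code)) if a not in covered]
```

For example, with the bytes EB 02 FF FF 74 01 90 C3 loaded at 7C00h and a single entry point at 7C00h, trace reaches the instructions at 7C00h, 7C04h, 7C06h and 7C07h, and unreached reports 7C02h and 7C03h as data.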

This gets to the other sense of your query: "I want to make a disassembler". The source for ndisasm is available, and it handles many of the descendants of the 8086, not just the 8086 itself (which seriously clutters it, if all you want is an 8086 or even 80386 disassembler), but it is not self-contained and has a heavy dependency on the rest of the distribution.

Its main talking point is that it uses octal digits for the opcodes - which better fits the 80x86 - as I pointed out on USENET in 1995 in comp.lang.asm... and (in fact) nasm's creation was a direct response to that. So, it's potentially more transparent and you may want to keep the source handy as a check and comparison, if you're making your own disassembler.
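The octal point is easiest to see on a single instruction. This is a small illustrative sketch (the helper names split_modrm and octal are made up here): the ModRM byte is split 2-3-3 into mod, reg and rm fields, which is exactly how a byte splits into three octal digits.

```python
# 16-bit register numbering used by the reg and rm fields when mod == 3.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def split_modrm(byte: int):
    """Split a ModRM byte into its (mod, reg, rm) fields: 2, 3 and 3 bits."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def octal(byte: int) -> str:
    """Format a byte as three octal digits."""
    return format(byte, "03o")


# 89 D8 encodes "mov ax, bx" in 16-bit mode: opcode 89 (MOV r/m16, r16), ModRM D8.
mod, reg, rm = split_modrm(0xD8)
print(octal(0x89), octal(0xD8))             # prints: 211 330
print(mod, reg, rm, REG16[reg], REG16[rm])  # prints: 3 3 0 bx ax
```

In hex, D8 hides the operands; in octal, 330 reads directly as mod 3 (register form), source register 3 (bx), destination register 0 (ax).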

You can also run the debug.exe program on itself.

You could also try to run ndisasm on debug.exe, after stripping out the 0x200-byte .EXE file header to make it a raw binary, and after extracting the entry point address CS:IP and stack pointer address SS:SP from it (80x86 stacks grow down, so the stack segment is nominally SS:0 to SS:(SP-1)). The EXE for debug.exe has no relocations, so you're okay treating the code as raw binary.
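The header fields mentioned here can be pulled out with a short script. This is a minimal sketch, assuming the standard 28-byte DOS MZ header layout; the function name parse_mz_header and the dictionary keys are made up for illustration. It reads the header size from the file instead of assuming 0x200, and it ignores the page-count fields, so any trailing data stays attached to the image.

```python
import struct


def parse_mz_header(data: bytes) -> dict:
    """Extract entry point, initial stack and load image from a DOS MZ executable."""
    if len(data) < 28 or data[:2] not in (b"MZ", b"ZM"):
        raise ValueError("not a DOS MZ executable")
    # Signature followed by thirteen little-endian 16-bit fields.
    (_magic, _last_page_bytes, _pages, nreloc, header_paras,
     _min_alloc, _max_alloc, ss, sp, _checksum, ip, cs,
     _reloc_offset, _overlay) = struct.unpack_from("<2s13H", data, 0)
    header_size = header_paras * 16   # header size is stored in 16-byte paragraphs
    return {
        "header_size": header_size,
        "relocations": nreloc,        # 0 means the image can be treated as raw binary
        "entry": (cs, ip),            # CS is relative to the load segment
        "stack": (ss, sp),            # SS is relative to the load segment
        "image": data[header_size:],  # what you would write out and feed to ndisasm
    }
```

The "image" bytes are what would be written to a file for ndisasm, and the IP value from "entry" tells you where within that image disassembly should start (offset CS*16 + IP).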

But you won't get anything that's clearly recognizable, since the program is self-modifying - more precisely: self-extracting. You'll get a (barely) compressed code image (about 5/6 compression ratio) followed by a loader routine.

You have to run emulation on it, e.g. by running debug.exe on debug.exe to emulate its unpacking routine, to get it to extract itself, and then you dump the unpacked program image and disassemble that. There is a "relocation table" at the end of the loader routine, so it does actually have relocations in it - it's just that they're applied when the program unpacks itself, rather than by the OS when the EXE file is loaded.

And then you've just disassembled a disassembler that also happens to do CPU emulation, like Fake86 does - but only for the 8086. You'll have to make the absolute addresses relative (using the original relocation table as a guide), to make it re-assemblable. Once you do that, you can work on the source. The opcode table is in clear view (if you display it as text) in both the packed and unpacked versions of debug.exe.

There's also DosDebug up on GitHub. It handles everything up to "80586" (or "Pentium") and "80686": it flags a generation "6" for some instructions; e.g. the conditional "cmov" operations are handled by it, as well as their "fcmov" floating point versions. DosDebug is in 8086 assembly and is best suited to being assembled with jwasm. You might be able to run nasm on it, I don't know. I never tried.

I might port the DAS disassembler to the x86, since items (a)-(f) are already incorporated into DAS's design. I've only ever ported it to the 8051, 6800, 6809 and 8080/8085 (and Z80) up to now; but the transition from 8085 to 8086 is relatively small. To that end, I might hack something out of Fake86. That's mostly abandonware, now, since the author replaced it by XTulator, as Fake86 was written when the programmer was relatively new to C. You might also be able to hack something directly out of DosDebug's opcode tables (their "instr.*" files).

