
What is the difference between assembly and binary?

I have trouble understanding the difference between assembly and binary. I just need to understand the relation between a linked binary and assembly.

Assembly is basically binary code written in a form that humans can read. The assembler takes the assembly code and translates it line by line into the corresponding machine code (bit patterns).

Imagine a table with one row for each possible assembly statement. On the left of each row is the statement itself, and on the right are the corresponding bits that the computer can understand.

That being said, assemblers also have extra functionality such as macros, but the main functionality is the one described above.
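As a rough sketch of the table idea, here is a lookup written in Python that covers only a handful of operand-free x86 instructions with standard one-byte encodings. The function name assemble and the ';' comment handling are illustrative assumptions, not how any real assembler is built.

```python
# A tiny "table" assembler: each operand-free statement maps to fixed bytes.
# The mnemonics are real x86 single-byte instructions; everything else
# (names, comment syntax handling) is made up for this sketch.
OPCODE_TABLE = {
    "nop": bytes([0x90]),   # no operation
    "ret": bytes([0xC3]),   # near return
    "hlt": bytes([0xF4]),   # halt
    "int3": bytes([0xCC]),  # breakpoint
}


def assemble(source: str) -> bytes:
    """Translate the source line by line using the table."""
    output = bytearray()
    for line in source.splitlines():
        statement = line.split(";")[0].strip().lower()  # drop comments
        if not statement:
            continue  # skip blank lines
        output += OPCODE_TABLE[statement]
    return bytes(output)


program = """
nop      ; do nothing
int3     ; breakpoint
ret      ; return to caller
"""
machine_code = assemble(program)
print(machine_code.hex(" "))  # 90 cc c3
```

Real instructions with operands need more than a flat lookup, because the operand values have to be packed into the bytes as well.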

For programmers, binary is just a numbering system: base 2, which consists of 0s and 1s. All computers work with these binary digits (0 and 1), and they perceive instructions as sets of such numbers. They do not directly understand human-written code, which is generally written in a high-level programming language such as Python or Java.

Machine instructions are obviously not really human-readable: most people can't figure out the operational difference between 100010001... and 010001000... just by looking at a binary or hex representation of the instruction bytes. These instructions are just machine code.

For instance, in the x86-16 architecture, the machine code for loading a value from memory into a register can be written as the hex bytes 8B 0E 34 12. Here 8B means mov r16, r/m16, and 0E (the ModRM byte) specifies which register is the destination (in this case CX) and which memory or register source operand to use, via a 2-bit addressing-mode field and a 3-bit base register field (in this specific case there is no base register, just a 16-bit absolute displacement). The remaining bytes 34 12 are that displacement, 1234h, stored little-endian.

P.S. Just to be clear, hex is used here to represent the machine code. It is easy to translate it to binary, "10001011000011100011010000010010", and this is what you referred to as binary. Hex is just a text serialization format for binary numbers, like a string of ASCII 0s and 1s, but more compact.
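To illustrate that hex and binary are two spellings of the same bytes, this short Python sketch converts the instruction bytes from the example into a bit string and back into hex. The variable names are arbitrary.

```python
# The four instruction bytes from the example, parsed from their hex spelling.
machine_code = bytes.fromhex("8B 0E 34 12")

# Write each byte as 8 binary digits and join them into one string.
bits = "".join(format(byte, "08b") for byte in machine_code)
print(bits)  # 10001011000011100011010000010010

# Going back: every group of 4 bits is exactly one hex digit.
hex_again = format(int(bits, 2), "08X")
print(hex_again)  # 8B0E3412
```

Eight hex digits carry the same information as 32 binary digits, which is why hex is the usual way to print machine code.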

Assembly is higher-level than machine code and makes such binary/hex instructions readable for humans. For example, the machine code 8B 0E 34 12 would be decoded / disassembled to MOV CX, [1234H].
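The decoding described above can be sketched in Python. This is a deliberately narrow illustration: it assumes 16-bit mode and handles only opcode 8B with the absolute [disp16] addressing form. The function name is hypothetical and does not come from any real disassembler.

```python
# 16-bit register names in x86 encoding order (register number = list index).
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def disassemble_mov_r16_disp16(code: bytes) -> str:
    """Decode 'mov r16, [disp16]' only; reject everything else."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x8B:
        raise ValueError("only opcode 8B (mov r16, r/m16) is handled")

    mod = modrm >> 6            # 2-bit addressing-mode field
    reg = (modrm >> 3) & 0b111  # 3-bit destination register field
    rm = modrm & 0b111          # 3-bit base register / memory field

    # In 16-bit mode, mod=00 with r/m=110 means "no base register,
    # just a 16-bit absolute displacement".
    if (mod, rm) != (0b00, 0b110):
        raise ValueError("only the absolute [disp16] form is handled")

    disp = int.from_bytes(code[2:4], "little")  # 34 12 -> 0x1234
    return f"MOV {REG16[reg]}, [{disp:04X}H]"


print(disassemble_mov_r16_disp16(bytes.fromhex("8B 0E 34 12")))
# MOV CX, [1234H]
```

Changing only the ModRM byte to 16 (binary 00 010 110) would select DX instead of CX, which shows how the register choice lives in those three middle bits.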

The tag wiki starts off by more or less answering this question. You should read it.

An assembler assembles human-readable assembly language into the bytes of a binary file. The asm source can specify bytes directly, in hex or whatever. In x86 NASM syntax, you can use a db 0x30 statement to assemble that byte into the current output position.

You can also use mnemonics for machine instructions, e.g. add eax, [rdi + rdx*4], to ask an Intel-syntax x86 assembler to emit the bytes that encode that instruction. The assembler then figures out the shortest (or only) way to encode that instruction into machine code, and puts those bytes in the output.
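To give a feel for what "encoding an instruction" involves, here is a Python sketch that builds the bytes for add r32, [base + index*scale]. It assumes 64-bit mode, no displacement, and only registers that need no REX prefix. RSP and RBP are left out of the table on purpose because they have special meanings in this encoding. The function and table names are made up for this example.

```python
# Register numbers as used in the ModRM and SIB bytes. RSP/RBP (4 and 5)
# are omitted because they select special addressing forms in this encoding.
REG_NUMBER = {
    "eax": 0, "ecx": 1, "edx": 2, "ebx": 3, "esi": 6, "edi": 7,
    "rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsi": 6, "rdi": 7,
}
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def encode_add_r32_base_index(dst: str, base: str, index: str, scale: int) -> bytes:
    """Encode 'add dst, [base + index*scale]' (hypothetical helper)."""
    opcode = 0x03  # add r32, r/m32
    # mod=00 (no displacement), reg=destination, r/m=100 (a SIB byte follows)
    modrm = (0b00 << 6) | (REG_NUMBER[dst] << 3) | 0b100
    # SIB byte: scale (2 bits), index register (3 bits), base register (3 bits)
    sib = (SCALE_BITS[scale] << 6) | (REG_NUMBER[index] << 3) | REG_NUMBER[base]
    return bytes([opcode, modrm, sib])


encoded = encode_add_r32_base_index("eax", "rdi", "rdx", 4)
print(encoded.hex(" "))  # 03 04 97
```

A real assembler also has to consider displacements, prefixes, and alternative encodings before picking the shortest one; this sketch hard-codes a single form.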

There are further complications. For example, modern object file formats have multiple sections (like .text and .data), and you can select which section your bytes will be assembled into. So you can keep constants near the code that uses them in the source without actually mixing code and data in the final binary.

For example, see this Godbolt link. In the right-hand pane, you can see the binary and the corresponding asm source.

Binary is not only used as a number system to represent a "number"; it can also represent other objects, such as characters. Take "2" as an example. When you treat it as a number, it is a number and you can do arithmetic with it. But it might also be someone's ID: you call him No. 2, yet you don't calculate with it, because in that role it is really a character.
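A small Python sketch of that distinction, assuming ASCII for the character encoding: the number 2 and the character "2" are stored as different bit patterns and support different operations.

```python
number = 2   # the quantity two
label = "2"  # the character '2', e.g. part of an ID

number_bits = format(number, "08b")
label_bits = format(ord(label), "08b")  # ord() gives the character code (50)

print(number_bits)      # 00000010
print(label_bits)       # 00110010
print(number + number)  # 4   (arithmetic)
print(label + label)    # 22  (concatenation, not arithmetic)
```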

Binary and assembly match essentially one to one, which means that what you write in assembly is actually binary.

For example, before we had assembly, if you wanted to add one and one, you might need to:

1. Load 1 into the accumulator.

2. Add 1 to the value in the accumulator.

3. Store the result at an address.

But you can only use binary instructions to represent that. What can you do? The only thing you can do is use combinations of 0 and 1 to represent what you need to do. Let's say 0001 means load, 0010 means add, and 0011 means store, so you might write something like:

0001 000000001

0010 000000001

0011 000000101   (000000101 is the location where the contents of the accumulator are stored)

That is quite a mess, so, clever as you are, you come up with a good idea: use readable words to represent the instructions, like this:

0001 -> load

0010 -> add

0011 -> store

So you can write it in assembly:

load  1

add   1

store 5

which is easy-to-understand assembly! (Of course, you can write the numbers in hex form for brevity.)
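The toy instruction set above is small enough to assemble and disassemble in a few lines of Python. This sketch assumes the layout used in the listing (a 4-bit opcode followed by a 9-bit operand); the function names are invented for illustration.

```python
# Mnemonic <-> opcode table for the made-up machine in this answer.
OPCODES = {"load": 0b0001, "add": 0b0010, "store": 0b0011}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def assemble_line(line: str) -> str:
    """'load 1' -> '0001 000000001' (4-bit opcode, 9-bit operand)."""
    mnemonic, operand = line.split()
    return f"{OPCODES[mnemonic]:04b} {int(operand):09b}"


def disassemble_line(word: str) -> str:
    """'0011 000000101' -> 'store 5'."""
    opcode_bits, operand_bits = word.split()
    return f"{MNEMONICS[int(opcode_bits, 2)]} {int(operand_bits, 2)}"


program = ["load 1", "add 1", "store 5"]
binary = [assemble_line(line) for line in program]
for word in binary:
    print(word)
# 0001 000000001
# 0010 000000001
# 0011 000000101

round_trip = [disassemble_line(word) for word in binary]
print(round_trip == program)  # True
```

The round trip works in both directions because each mnemonic corresponds to exactly one opcode, which is the one-to-one relationship this answer describes.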

As you can see, when you translate it, 0001 is actually not a number, but 00000001 is. So 0001 is just a notation, and assembly replaces that character-like notation with something easier to read. 00000001 really is a number, and you can write it in any other form; coincidentally, it is 1 in decimal and 1 in hex too :)
