Bare metal RISC-V CPU - how does the processor know which address to start fetching instructions from?

I am designing my own RISC-V CPU and have been able to implement a few instructions.

I have installed the RV32I version of the GCC toolchain, so I now have the assembler riscv32-unknown-elf-as available.

I'm trying to assemble a program with just one instruction:

# simple.asm
add x5,x6,x7

I assemble this and then run objdump on the result with these commands:

riscv32-unknown-elf-as simple.asm -o simple
riscv32-unknown-elf-objdump -D simple

This prints out the following:

new:     file format elf32-littleriscv


Disassembly of section .text:

00000000 <.text>:
   0:   007302b3                add     t0,t1,t2

Disassembly of section .riscv.attributes:

00000000 <.riscv.attributes>:
   0:   2d41                    jal     0x690
   2:   0000                    unimp
   4:   7200                    flw     fs0,32(a2)
   6:   7369                    lui     t1,0xffffa
   8:   01007663                bgeu    zero,a6,0x14
   c:   00000023                sb      zero,0(zero) # 0x0
  10:   7205                    lui     tp,0xfffe1
  12:   3376                    fld     ft6,376(sp)
  14:   6932                    flw     fs2,12(sp)
  16:   7032                    flw     ft0,44(sp)
  18:   5f30                    lw      a2,120(a4)
  1a:   326d                    jal     0xfffff9c4
  1c:   3070                    fld     fa2,224(s0)
  1e:   615f 7032 5f30          0x5f307032615f
  24:   3266                    fld     ft4,120(sp)
  26:   3070                    fld     fa2,224(s0)
  28:   645f 7032 0030          0x307032645f

My questions are:

  1. What is going on here? I thought I'd have a simple single line of hex, but there's a lot more going on.
  2. How do I instruct my processor to start reading the instructions at a certain memory address? It looks like objdump also doesn't know where the instructions will begin.

Just to be clear, I'm treating my processor as bare metal at this point. I am imagining I will hardcode in the processor that the instructions start at memory address X, data is available at memory address Y, and the stack is available at memory address Z. Is this correct, or is this the wrong approach?

How does the processor know which address to start fetching instructions from?

The CPU itself has a hard-wired address that it fetches from on reset / power-on. Usually a system is designed with ROM or flash at that physical address. (That ROM might contain code for an ELF program loader, which will respect the ELF entry-point metadata, or you could just link a flat binary with the right code at the start of the binary.)
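
As a rough illustration of the hard-wired reset address, here is a minimal Python sketch of the reset and fetch behaviour only. The reset address 0x80, the 256-byte memory, and the names RESET_VECTOR and ToyCore are assumptions chosen for illustration; the RISC-V specification leaves the reset address up to the implementation.

```python
RESET_VECTOR = 0x80   # assumed: address this hypothetical core fetches from after reset
MEM_SIZE = 0x100      # assumed: 256 bytes of memory


class ToyCore:
    """Hypothetical model of just the reset + fetch behaviour of a core."""

    def __init__(self, memory: bytearray):
        self.memory = memory
        self.pc = 0
        self.reset()

    def reset(self) -> None:
        # In hardware this constant would be the reset value of the PC register.
        self.pc = RESET_VECTOR

    def fetch(self) -> int:
        # RV32I instructions are 32 bits wide and stored little-endian.
        word = int.from_bytes(self.memory[self.pc:self.pc + 4], "little")
        self.pc += 4
        return word


memory = bytearray(MEM_SIZE)
memory[0x80:0x84] = bytes.fromhex("b3027300")  # add x5,x6,x7 (0x007302b3)
core = ToyCore(memory)
first_instruction = core.fetch()
print(hex(first_instruction))
```

The point of the sketch is that nothing in the program image tells the core where to start: the core always starts at its own fixed address, and the image has to be laid out to match.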

What is going on here? I thought I'd have a simple single line of hex, but there's a lot more going on.

Your objdump -D disassembles all ELF sections, not just .text. As you can see, there is only one instruction in the .text section, and if you had used objdump -d that is all you would see. The .riscv.attributes section holds metadata, not code; objdump -D just decodes its bytes as if they were instructions. (I normally use objdump -drwC, although -w (no line-wrapping) is probably irrelevant for RISC-V, unlike x86 where a single instruction can be long.)
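
To see that the second section is data rather than code, the values objdump printed can be turned back into bytes and read as text. This is a small sketch using the numbers from the listing in the question; the fixed offsets used to slice out the vendor name and ISA string are assumptions that fit this particular dump, not a general attribute-section parser.

```python
# 16- and 32-bit values exactly as printed by objdump for .riscv.attributes.
# objdump prints each parcel as a little-endian number, so the bytes of every
# token are reversed to recover the order they have in the file.
tokens = [
    "2d41", "0000", "7200", "7369", "01007663", "00000023",
    "7205", "3376", "6932", "7032", "5f30", "326d", "3070",
    "615f", "7032", "5f30", "3266", "3070", "645f", "7032", "0030",
]
raw = b"".join(bytes.fromhex(t)[::-1] for t in tokens)

format_version = raw[0:1]              # first byte of the section
vendor = raw[5:11]                     # NUL-terminated vendor name
arch = raw[17:-1].decode("ascii")      # ISA string without its trailing NUL
print(format_version, vendor, arch)
```

The "instructions" in that part of the listing are just these characters decoded as if they were machine code.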

Would it be possible to pass the file I compiled above as is to my processor?

Not in the way you're probably thinking. Also note that you chose a misleading file name for the output: as produces an object file (normally .o), not an executable. You could link with ld into a flat binary, or link and then objcopy the .text section out of the result.

(You could in theory put a whole ELF executable or even an object file into ROM such that the .text section happens to start where the CPU will fetch from, but nothing will look at the metadata bytes. So the ELF entry-point address metadata in an ELF executable would be irrelevant.)

Difference between a .o and an executable: a .o just has relocation metadata for the linker to fill in actual addresses, absolute for la pseudo-instructions, or relative for auipc in cases like multiple .o files where one references a symbol from another. (Otherwise the relative displacement could be calculated at assemble time, not left for link time.)

So if you had code that used any labels for memory addresses, you'd need the linker to fill in those relocation entries in your code. Then you could objcopy some sections out of a linked ELF executable. Or use a linker script to set the layout for your flat binary.
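
To make "fill in actual addresses" a little more concrete, this sketch shows the arithmetic behind one common kind of fix-up: splitting a PC-relative offset into the upper 20 bits for auipc and the signed lower 12 bits for a following addi. The function name and the example addresses are hypothetical, and patching the values into the instruction words is left out.

```python
def split_pcrel(pc: int, target: int) -> tuple[int, int]:
    """Return (hi20, lo12) such that (hi20 << 12) + lo12 == target - pc."""
    offset = target - pc
    # addi sign-extends its 12-bit immediate, so the upper part is rounded
    # to the nearest multiple of 4096 rather than truncated.
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    # In the instruction word, hi20 would be stored as hi20 & 0xFFFFF
    # and lo12 as lo12 & 0xFFF.
    return hi20, lo12


# Hypothetical example: code at 0x80 referring to a data word at 0x0.
hi20, lo12 = split_pcrel(pc=0x80, target=0x0)
print(hi20, lo12)
```

Until the final addresses are known, the assembler cannot compute these values across files, which is why it leaves a relocation entry for the linker.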

For your simple case with only an add, no la or anything, there are no relocation entries, so the .text section in the .o is the same as in a linked executable.

Also tricky to get right with objcopy is static data, e.g. .data and .bss sections. If you copy just the .text section to a flat binary, you won't have data anywhere. (But in a ROM, you'd need a startup function that copies static initializers from ROM to RAM for .data, and zeros the .bss space. If you want to write the asm source with a normal-looking .data section with non-zero values, you'd want your build scripts to figure out the size to copy so your startup function can use it, instead of having to do all that manually.)
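
The startup work described above can be simulated in a few lines. This is only a sketch of the idea in Python: the function name, the parameter names, and every address in the example are assumptions, and on real hardware the same copy and clear loops would be written in assembly or C with the addresses supplied by the linker script.

```python
def startup(mem: bytearray, data_rom: int, data_ram: int, data_size: int,
            bss_start: int, bss_size: int) -> None:
    """Simulate what a bare-metal startup routine does before the program runs."""
    # Copy the initial values of .data from where they are stored (ROM)
    # to where the program expects them at run time (RAM).
    mem[data_ram:data_ram + data_size] = mem[data_rom:data_rom + data_size]
    # Clear .bss so zero-initialised variables really start at zero.
    mem[bss_start:bss_start + bss_size] = bytes(bss_size)


mem = bytearray(b"\xff" * 0x200)                # pretend RAM powers up with garbage
mem[0x100:0x104] = (27).to_bytes(4, "little")   # ROM copy of ".word 27"
startup(mem, data_rom=0x100, data_ram=0x0, data_size=4, bss_start=0x4, bss_size=8)
print(mem[0:12].hex(" "))
```

The sizes passed in here are exactly the values the build scripts would need to provide to the real startup function.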

@PeterCordes' answer set me on the right path. I finally figured out how to generate a raw memory dump file that I can use.

The steps are as follows:

  1. Modify the assembly file to have a .text and a .data section and a _start label. My simple.asm file now looks as follows:

     .globl _start
     .text
     _start:
         add x5,x6,x7
     .data
     L1: .word 27
  2. Assemble the .asm to a .o file using the following command:

     riscv32-unknown-elf-as simple.asm -o simple.o
  3. Create a linker script for the specific processor. I followed this amazing video which walks through the process of creating a linker script from scratch. For now, I just need .text and .data sections. So my linker script (mycpu.ld) is as shown below:

     OUTPUT_FORMAT("elf32-littleriscv", "elf32-littleriscv", "elf32-littleriscv")
     ENTRY(_start)

     MEMORY
     {
       DATA (rwx) : ORIGIN = 0x0,  LENGTH = 0x80
       INST (rx)  : ORIGIN = 0x80, LENGTH = 0x80
     }

     SECTIONS
     {
       .data : { *(.data) } > DATA
       .text : { *(.text) } > INST
     }
  4. Generate the ELF file using riscv32-unknown-elf-gcc, which automatically calls riscv32-unknown-elf-ld:

     riscv32-unknown-elf-gcc -nostdlib -T mycpu.ld -o simple.elf simple.o
  5. Create a raw binary or hex file from the .elf file, which I will use to populate the contents of the memory (note that -O binary writes raw bytes, despite the .hex file name):

     riscv32-unknown-elf-objcopy -O binary simple.elf simple.hex
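
Because objcopy -O binary writes raw bytes rather than hex text, a memory model that expects one 32-bit word per line in hexadecimal needs a conversion step. Here is a minimal sketch, assuming 32-bit little-endian words; the helper name is hypothetical, and the text format your simulator or HDL testbench accepts is something to check against your own tools.

```python
def binary_to_hex_words(data: bytes) -> list[str]:
    """Return one 8-digit hex string per 32-bit little-endian word."""
    # Pad to a whole number of words so the last word is complete.
    padded = data + bytes(-len(data) % 4)
    return [
        format(int.from_bytes(padded[i:i + 4], "little"), "08x")
        for i in range(0, len(padded), 4)
    ]


# Stand-in for the file contents, using the layout from the linker script:
# .data (.word 27) at 0x0 and .text (add x5,x6,x7) at 0x80.
image = (27).to_bytes(4, "little") + bytes(0x7C) + bytes.fromhex("b3027300")
words = binary_to_hex_words(image)
print(words[0], words[0x80 // 4])
```

With the real file, the image would come from reading simple.hex in binary mode, for example with pathlib.Path("simple.hex").read_bytes().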

The final simple.hex contains the following (viewed with hexyl):

┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 1b 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │•0000000┊00000000│
│00000010│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000020│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000030│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000040│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000050│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000060│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000070│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │00000000┊00000000│
│00000080│ b3 02 73 00             ┊                         │וs0    ┊        │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

where b3 02 73 00 is the little-endian byte sequence of 0x007302b3, the encoding of add x5,x6,x7. The 1b at address 0 is the .word 27 from the .data section.
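
The encoding can be checked by hand. This sketch packs the R-type fields for add x5,x6,x7 as laid out in the RV32I base instruction formats; the helper name encode_r_type is hypothetical.

```python
def encode_r_type(funct7: int, rs2: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the six R-type fields into a 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


# add rd=x5, rs1=x6, rs2=x7: opcode 0110011, funct3 000, funct7 0000000
word = encode_r_type(funct7=0b0000000, rs2=7, rs1=6, funct3=0b000, rd=5, opcode=0b0110011)
print(f"{word:08x}", word.to_bytes(4, "little").hex(" "))
```

The word matches what objdump printed for the .text section, and its little-endian bytes match what hexyl shows at offset 0x80.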

And that's it! Big thanks to @PeterCordes for his help! :)
