Why are RISC-V S-B and U-J instruction types encoded in this way?

I am reading the book "Computer Organization and Design RISC-V Edition", and I came across the encodings for the SB and UJ instruction types.

Both of these types have strangely encoded immediate fields.

The SB type splits the immediate field into two parts. This makes sense, since all instruction encodings have to be similar. But I cannot understand why the immediate field is encoded in the following way:

imm[12, 10:5], imm[4:1, 11]

instead of

imm[11:5], imm[4:0]

The UJ type also has a strangely encoded immediate field:

imm[20,10:1,11,19:12]

instead of

imm[19:0]

Can anyone explain this?

The chosen encodings line up very nicely with the other encodings, simplifying the hardware at the expense of software that has to generate instructions, software that has to decode instructions, and programmers learning or working with RISC-V ;).

The S-Format breaks up the immediate into imm[11:5] and imm[4:0]. The reason this immediate is broken up is to keep the other fields, namely the register fields rs2 and rs1, in the same position as the two source register fields in R-Type instructions. (Compared with MIPS, which did something similar but not as completely, this obviates a mux of register-name width (e.g. 5 bits wide) and several extra wires, as well as a control signal.)
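To make the fixed register positions concrete, here is a minimal Python sketch. The helper names `rs1` and `rs2` are made up for illustration, and the three instruction words were assembled by hand for this example. It extracts both source registers from an R-type, an S-type and an SB-type word with the same shifts and masks, without looking at the opcode first:

```python
# Hypothetical helpers for illustration; bit positions follow the RV32I base formats.
def rs1(inst: int) -> int:
    """rs1 sits in inst[19:15] in every format that has it."""
    return (inst >> 15) & 0x1F


def rs2(inst: int) -> int:
    """rs2 sits in inst[24:20] in every format that has it."""
    return (inst >> 20) & 0x1F


ADD = 0x002081B3  # add x3, x1, x2   (R-type)
SW = 0x0020A423   # sw  x2, 8(x1)    (S-type)
BEQ = 0x00208863  # beq x1, x2, 16   (SB-type)

for word in (ADD, SW, BEQ):
    print(f"{word:#010x}: rs1=x{rs1(word)} rs2=x{rs2(word)}")
```

In hardware the equivalent of these two functions is just wiring: the register file read ports can be connected straight to those instruction bits.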

The S-Format allows for a 12-bit immediate.

The (S)B-Type for branches, by contrast, uses a 13-bit immediate, though the last bit (the least significant bit) of the 13-bit immediate is always zero, so it is not stored. So it actually needs to encode 12 bits, just like the S-Format, but because they are shifted in actual usage (left by one, i.e. *2), all the bits are essentially off by one bit position compared with the S-Format immediate. (Shifting is not hard or slow, but it costs silicon real estate. Typically, such a shift by a constant amount would be done by simply wiring the input bits to offset output bit positions rather than by using a dedicated shifter like the one we would see in an ALU; however, this is still immediate- and datapath-sized wiring, so ~12 to 32+ extra wires.)

In order not to have to shift (as much as possible of) the stored part of the immediate, and so as to line up nicely with the immediate in the S-Format, the LSB position that is not stored (bit 0 in the S-Format) is used to store bit 11 of the SB-Format immediate. This way bits 10:1 line up exactly with the S-Format immediate.

But why not put bit 12 of the branch immediate there instead, which would keep one more bit (i.e. 11:1) in alignment with the S-Format? Because the highest bit encoded in the immediate of the instruction is used to sign-extend the immediate to 32 bits (for RV32, or 64 bits for RV64, or 128 for RV128: lots of wires). So, by keeping the sign bit in the same place as in the S-Format 12-bit immediate, the same sign-extension hardware can be shared (with the same pros and cons described at the top). Hence the choice to store bit 11, the next most significant bit of the SB-Type immediate, in the bit 0 position (relative to the S-Format).

The cost for SB (given S already) is only two or so (1-bit) wires, one 1-bit mux, and a 1-bit control signal, which is minimal compared to the alternatives.
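A software decoder shows the same thing. The sketch below is only an illustration: `sign_extend`, `imm_s` and `imm_b` are hypothetical names, and the test words were assembled by hand. The two decoders treat inst[30:25] and inst[11:8] identically; they differ only in where inst[31] and inst[7] end up.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def imm_s(inst: int) -> int:
    raw = (
        (((inst >> 31) & 0x1) << 11)     # inst[31]    -> imm[11] (sign)
        | (((inst >> 25) & 0x3F) << 5)   # inst[30:25] -> imm[10:5]
        | (((inst >> 8) & 0xF) << 1)     # inst[11:8]  -> imm[4:1]
        | ((inst >> 7) & 0x1)            # inst[7]     -> imm[0]
    )
    return sign_extend(raw, 12)


def imm_b(inst: int) -> int:
    raw = (
        (((inst >> 31) & 0x1) << 12)     # inst[31]    -> imm[12] (sign)
        | (((inst >> 25) & 0x3F) << 5)   # inst[30:25] -> imm[10:5], same as S
        | (((inst >> 8) & 0xF) << 1)     # inst[11:8]  -> imm[4:1],  same as S
        | (((inst >> 7) & 0x1) << 11)    # inst[7]     -> imm[11]; imm[0] is always 0
    )
    return sign_extend(raw, 13)


print(imm_s(0x0020A423))  # sw  x2, 8(x1)
print(imm_b(0x00208863))  # beq x1, x2, 16
print(imm_b(0xFE000EE3))  # beq x0, x0, -4
```

The one bit that moves (inst[7]) is what the 1-bit mux mentioned above selects.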

See the following presentation, slide 46, titled "RISC-V Immediate Encoding" and subtitled "Why is it so confusing???!".

The UJ-Type does something similar, keeping the sign bit in the same bit position as the sign bit of other instructions, while aligning as many of the other bits as possible with other formats.

See slide 60 of the same presentation.
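For completeness, here is a sketch of decoding the UJ-Type immediate in Python. The function name `imm_j` is made up for this example and the instruction words were assembled by hand. The field order in the instruction word is imm[20], imm[10:1], imm[11], imm[19:12], reading from inst[31] down to inst[12]:

```python
def imm_j(inst: int) -> int:
    """Reassemble the 21-bit jump offset (imm[0] is implicitly 0)."""
    raw = (
        (((inst >> 31) & 0x1) << 20)      # inst[31]    -> imm[20] (sign)
        | (((inst >> 12) & 0xFF) << 12)   # inst[19:12] -> imm[19:12], same place as in U
        | (((inst >> 20) & 0x1) << 11)    # inst[20]    -> imm[11]
        | (((inst >> 21) & 0x3FF) << 1)   # inst[30:21] -> imm[10:1], same place as in I
    )
    # Two's-complement sign extension from bit 20.
    return raw - (1 << 21) if raw & (1 << 20) else raw


print(imm_j(0x008000EF))  # jal x1, 8
print(imm_j(0xFFDFF06F))  # jal x0, -4
```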

The official RISC-V spec does an excellent job of explaining every design choice in the instruction set and why something is done in that specific way. When in doubt, you just need to have a look at it.

The rationale for the instruction encoding is described in chapter 2.2 - Base Instruction Formats. It is all about making instruction decoding simpler and faster by

  • Sharing the decoding units between different instruction formats
  • Putting immediate bits at specific positions to remove the need for a hardware shifter and reduce fan-out while decoding

The RISC-V ISA keeps the source (rs1 and rs2) and destination (rd) registers at the same position in all formats to simplify decoding. Except for the 5-bit immediates used in CSR instructions (Chapter 9), immediates are always sign-extended, and are generally packed towards the leftmost available bits in the instruction and have been allocated to reduce hardware complexity. In particular, the sign bit for all immediates is always in bit 31 of the instruction to speed sign-extension circuitry.


Decoding register specifiers is usually on the critical paths in implementations, and so the instruction format was chosen to keep all register specifiers at the same position in all formats at the expense of having to move immediate bits across formats (a property shared with RISC-IV aka. SPUR [11]).

Look at the instruction encodings and you'll see that just a single decoder is needed for each of rs1, rs2 and rd in any instruction format that needs them, and that bit 31 is always the sign bit of the immediate regardless of its length, for fast sign extension.

[Figure: RISC-V instruction encoding]
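The point about bit 31 can be sketched in a few lines of Python. The names `sign_fill` and `imm_i` are hypothetical. All the upper bits of an immediate are copies of inst[31], so they can be produced before the rest of the instruction has been decoded; only the position where the fill starts depends on the format (bit 11 for I and S, bit 12 for B, bit 20 for J):

```python
def sign_fill(inst: int, from_bit: int) -> int:
    """Return an integer whose bits from_bit and above are all copies of inst[31]."""
    return -((inst >> 31) & 1) << from_bit


def imm_i(inst: int) -> int:
    """I-type immediate: sign fill from bit 11 upward, inst[30:20] below it."""
    return sign_fill(inst, 11) | ((inst >> 20) & 0x7FF)


print(imm_i(0x00500093))  # addi x1, x0, 5
print(imm_i(0xFFF00093))  # addi x1, x0, -1
```

Python integers are unbounded, so the fill here stands in for however many upper bits XLEN requires.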

Now focus on the immediates and you'll see that they're arranged in "weird" orders, but those orders also allow decoders to be shared between formats. For example, bits 10:5 are at the same place (inst[30:25]) in every format that has them (I, S, B and J). The same goes for bits 19:12 in U/J, and for bits 4:1 in S/B (and likewise in I/J). Those two pairs are actually almost the same, except that the immediate is shifted left by one bit in J and B. By interleaving the bits that way, most of the hard work of shifting is left to the assembler, simplifying the hardware even more.
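One way to check the overlap is to write each format down as a map from immediate bit to the instruction bit it is taken from, and then compare the maps. The sketch below does that in Python; the helpers `field`, `layout` and `shared` are made up for this example, and the tables are transcribed from the immediate layouts in the base ISA spec:

```python
def field(imm_hi: int, imm_lo: int, inst_hi: int) -> dict:
    """Map imm[imm_hi:imm_lo] onto consecutive instruction bits ending at inst_hi."""
    return {imm_hi - k: inst_hi - k for k in range(imm_hi - imm_lo + 1)}


def layout(*fields: dict) -> dict:
    merged = {}
    for f in fields:
        merged.update(f)
    return merged


# immediate bit -> instruction bit, per format
I = layout(field(11, 0, 31))
S = layout(field(11, 5, 31), field(4, 0, 11))
B = layout(field(12, 12, 31), field(10, 5, 30), field(4, 1, 11), field(11, 11, 7))
U = layout(field(31, 12, 31))
J = layout(field(20, 20, 31), field(10, 1, 30), field(11, 11, 20), field(19, 12, 19))


def shared(a: dict, b: dict) -> list:
    """Immediate bits that both formats read from the same instruction bit."""
    return sorted(bit for bit in a if bit in b and a[bit] == b[bit])


for name, (a, b) in {"S/B": (S, B), "U/J": (U, J), "I/J": (I, J), "I/S": (I, S)}.items():
    print(name, shared(a, b))
```

Every immediate bit that appears in a `shared` list needs no mux between those two formats.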

2.3 Immediate Encoding Variants

The only difference between the S and B formats is that the 12-bit immediate field is used to encode branch offsets in multiples of 2 in the B format. Instead of shifting all bits in the instruction-encoded immediate left by one in hardware as is conventionally done, the middle bits (imm[10:1]) and sign bit stay in fixed positions, while the lowest bit in S format (inst[7]) encodes a high-order bit in B format.

Similarly, the only difference between the U and J formats is that the 20-bit immediate is shifted left by 12 bits to form U immediates and by 1 bit to form J immediates. The location of instruction bits in the U and J format immediates is chosen to maximize overlap with the other formats and with each other.


Sign-extension is one of the most critical operations on immediates (particularly for XLEN>32), and in RISC-V the sign bit for all immediates is always held in bit 31 of the instruction to allow sign-extension to proceed in parallel with instruction decoding.

Although more complex implementations might have separate adders for branch and jump calculations and so would not benefit from keeping the location of immediate bits constant across types of instruction, we wanted to reduce the hardware cost of the simplest implementations. By rotating bits in the instruction encoding of B and J immediates instead of using dynamic hardware muxes to multiply the immediate by 2, we reduce instruction signal fanout and immediate mux costs by around a factor of 2. The scrambled immediate encoding will add negligible time to static or ahead-of-time compilation. For dynamic generation of instructions, there is some small additional overhead, but the most common short forward branches have straightforward immediate encodings.
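On the assembler side, the "scrambling" amounts to a handful of shifts and masks. The following Python sketch uses hypothetical function names and produces only the immediate bits of the instruction word. It also checks one reading of the last sentence of the quote: for an even, non-negative offset below 2048, the B-format immediate bits come out identical to what the plain S-format split would produce.

```python
def encode_s_imm(imm: int) -> int:
    """Place a 12-bit S-format immediate into inst[31:25] and inst[11:7]."""
    imm &= 0xFFF
    return (((imm >> 5) & 0x7F) << 25) | ((imm & 0x1F) << 7)


def encode_b_imm(offset: int) -> int:
    """Place a 13-bit branch offset into the B-format immediate bits."""
    if offset % 2 or not -4096 <= offset < 4096:
        raise ValueError("branch offset must be even and fit in 13 signed bits")
    imm = offset & 0x1FFF
    return (
        (((imm >> 12) & 0x1) << 31)     # imm[12]   -> inst[31]
        | (((imm >> 5) & 0x3F) << 25)   # imm[10:5] -> inst[30:25]
        | (((imm >> 1) & 0xF) << 8)     # imm[4:1]  -> inst[11:8]
        | (((imm >> 11) & 0x1) << 7)    # imm[11]   -> inst[7]
    )


# beq x1, x2, 16: rs2=2, rs1=1, funct3=0, opcode=0x63
beq = encode_b_imm(16) | (2 << 20) | (1 << 15) | 0x63
print(f"{beq:#010x}")

# Short forward branches: same bits as the S-format split.
print(all(encode_b_imm(o) == encode_s_imm(o) for o in range(0, 2048, 2)))
```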

If you're interested, you can find more discussion on the official GitHub page.
