
Adding large numbers in FPGA in one clock cycle

If I have a VHDL adder which adds two numbers together:

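library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- provides the signed type and its "+" operator

entity adder is
    port(
        clk : in std_logic;
        sync_rst : in std_logic;
        signal_A_in : in signed(31 downto 0);
        signal_B_in : in signed(31 downto 0);
        result_out : out signed(31 downto 0)
    );
end adder;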

I have two options. One is to sum signal_A_in and signal_B_in concurrently, like so:

architecture rtl of adder is

begin

result_out <= signal_A_in + signal_B_in;

end rtl;

The other is to perform the addition in a clocked process, like so:

architecture rtl of adder is

begin

myproc1 : process(clk)  -- the reset is synchronous, so only clk is needed in the sensitivity list
begin
    if clk = '1' and clk'event then
        if sync_rst='1' then
            result_out <= (others=>'0');
        else
            result_out <= signal_A_in + signal_B_in;
        end if;
    end if;
end process;

end rtl;

So option B will have a single clock cycle of delay compared to option A. However, does it guarantee that the result will be ready in one clock cycle (i.e. that it meets timing)? The reason I am asking is that I am getting a timing failure on my design, which uses option A (concurrent summation). I believe that this approach is OK for smaller numbers because the combinatorial logic delay is lower, but as the numbers get larger the delay increases and the design fails timing. How does the synthesis tool cope with this, and does putting the expression in a clocked process solve the issue?

When you write something like signal_A_in + signal_B_in, that is combinatorial logic for an adder. Each FPGA takes a different amount of time for signals to propagate through the wires to and from the adder, and through the adder itself.

When you do something like

if clk = '1' and clk'event then
    result_out <= signal_A_in + signal_B_in;

As you noted, you are now creating a 1-cycle delay by inferring a register. So now, no matter what, your path ends right after your adder, which sends its result into a register called result_out. That is why your timing improved. For example, as shown, the path likely covers just your adder, which gives you plenty of time, so you pass timing. (But be careful: adding a register != guaranteed to meet timing.)

Timing is worse in your first example, and it fails, because it does not infer a register. Now your signal not only needs to get through the signal_A_in + signal_B_in adder logic within the clock cycle, but ALSO through whatever result_out is driving (maybe more adders, other logic somewhere else, etc.). Your timing path is AT LEAST as long as your adder, and probably longer, since you didn't break up the path with a register.
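To illustrate the point about path length, here is a minimal sketch that registers both the inputs and the output, so the only logic between two registers is the adder itself. The entity name adder_reg_io and the internal signals a_r and b_r are hypothetical, chosen for this example, and the result now appears two clock cycles after the inputs are applied.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical variant of the adder with registered inputs and output.
entity adder_reg_io is
    port(
        clk         : in  std_logic;
        sync_rst    : in  std_logic;
        signal_A_in : in  signed(31 downto 0);
        signal_B_in : in  signed(31 downto 0);
        result_out  : out signed(31 downto 0)
    );
end adder_reg_io;

architecture rtl of adder_reg_io is
    -- Input registers (illustrative names, not from the original post)
    signal a_r : signed(31 downto 0);
    signal b_r : signed(31 downto 0);
begin

    process(clk)
    begin
        if rising_edge(clk) then
            if sync_rst = '1' then
                a_r        <= (others => '0');
                b_r        <= (others => '0');
                result_out <= (others => '0');
            else
                -- Cycle 1: capture the inputs, cutting off any upstream logic
                a_r <= signal_A_in;
                b_r <= signal_B_in;
                -- Cycle 2: the register-to-register path is only the adder
                result_out <= a_r + b_r;
            end if;
        end if;
    end process;

end rtl;
```

With this structure, whatever drives signal_A_in and signal_B_in, and whatever result_out drives, sits on separate timing paths from the adder. Whether the adder path itself meets timing still depends on the device and the clock period.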

Often, even larger adders are done not in 0 cycles (combinatorial logic) or 1 cycle (with a registered output) but over N cycles as a pipelined operation.
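A pipelined adder could look roughly like the sketch below, which splits the 32-bit addition into two 16-bit halves over two clock cycles. This is only an illustration under assumptions: the entity name adder_pipe2, the internal signal names, and the choice to split at bit 16 are all hypothetical, and whether such a split actually helps depends on the target device and clock rate.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical two-stage pipelined 32-bit adder (names are illustrative).
entity adder_pipe2 is
    port(
        clk         : in  std_logic;
        sync_rst    : in  std_logic;
        signal_A_in : in  signed(31 downto 0);
        signal_B_in : in  signed(31 downto 0);
        result_out  : out signed(31 downto 0)
    );
end adder_pipe2;

architecture rtl of adder_pipe2 is
    -- Stage 1 registers: low-half sum with its carry in bit 16,
    -- plus delayed copies of the high halves of both operands
    signal low_sum_r : unsigned(16 downto 0);
    signal a_high_r  : unsigned(15 downto 0);
    signal b_high_r  : unsigned(15 downto 0);
begin

    process(clk)
        variable high_sum : unsigned(15 downto 0);
        variable sum32    : unsigned(31 downto 0);
    begin
        if rising_edge(clk) then
            if sync_rst = '1' then
                low_sum_r  <= (others => '0');
                a_high_r   <= (others => '0');
                b_high_r   <= (others => '0');
                result_out <= (others => '0');
            else
                -- Stage 1: add the low 16 bits, widened to 17 bits to keep the carry
                low_sum_r <= resize(unsigned(signal_A_in(15 downto 0)), 17)
                           + resize(unsigned(signal_B_in(15 downto 0)), 17);
                a_high_r  <= unsigned(signal_A_in(31 downto 16));
                b_high_r  <= unsigned(signal_B_in(31 downto 16));

                -- Stage 2: add the delayed high halves plus the stage 1 carry
                high_sum := a_high_r + b_high_r + low_sum_r(16 downto 16);
                sum32    := high_sum & low_sum_r(15 downto 0);
                result_out <= signed(sum32);
            end if;
        end if;
    end process;

end rtl;
```

Each stage now contains only a 16-bit carry chain, at the cost of a two-cycle latency. The halves are added as raw two's-complement bits, so the 32-bit result wraps on overflow just as signal_A_in + signal_B_in does in the original code. A new pair of operands can still be accepted every clock cycle.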

This is mostly for you, the human, to fix, but some synthesis tools can do small retiming of circuits to help.
