Review of Register Transfer Level (RTL) abstraction

This part of the lesson covers the limitations that propagation delay causes to RTL version of the same circuits as the previous section.

In the previous lesson we learned about the propagation delay, which is an intrinsic characteristic of each logic gate. In this lesson we are going to cover the same topic at a slightly different abstraction level, the so called Register Transfer Level, or RTL in short. This is a design abstraction, that focuses on sampling the inputs and outputs of a circuit with registers, as you can see in the simulations below, which contains the same chips as the last one but with sampled inputs and outputs. You will hopefully understand how this abstracted structure, is all over the hardware architecture of processors.

Now let us explore how this new abstraction level influences how the chips work.

RTL of a simple gate

Focusing on the nand gate: pause the simulation and inspect its clock signal, and the input and outputs on both sides of the registers. When an input changes, it is not passed to the inner circuit, in this case the nand gate, until the next rising edge of the clock signal, which updates the input registers' ouputs after one step.

After that the nand gate is free to propagate and calculate its output in one step: such value is however not immediately transfered to the actual circuit output, it will do so only at the next rising edge of the clock signal, which will update the output register, that after one further step will update the actual circuit output.

Given that the input is detected at a given rising edge of the clock signal, the ideal case is when the correct output comes out of the output register at the very next rising edge of the clock signal, that is when the propagation delay of the circuit contained within the input and output registers is shorter than the clock period.

This is critically important: if we want for example to put more than one of this cells in series to one another, each one must compute its own output within one clock cycle, that is the rising edge of the clock signal immediately after the one that detected the new inputs, in order sample the correct output. If this does not happen, the wrong value is passed onto the next section of the circuit and the computation fails.

This being said, we can now see the propagation delay of a chip under a different perspective: it becomes a limit, to the maximum clock frequency that the corresponding RTL representation can handle without error, where by error we mean the output being sampled at any later point, than the rising edge immediately after the one that detected the different input. We can call this limitation a "sampling constraint".

Let us try then to find the maximum clock frequency of each of the circuits in the simulation panel. The clock component can be controlled by the number above it, which represents how many steps until the clock changes value, or half of the clock period as measured in steps. So, for example two consecutive rising edges of a clock set at number 5 are 10 steps apart, and that value, the clock period of 10 steps, is the maximum allowed propagation delay of the chip contained inbetween the registers, that are clocked by such signal.

Let us then tackle each circuit. We found out earlier that the propagation delay of the nand port is just one step. Therefore even a clock set to 1, which has a two long clock period, should present the correct output after just one clock cycle. If we verify that, we see that no matter when we change the input, the output changes exactly one step after the next rising edge, to the one which detected the new input. The extra step is due to the propagation delay of the output register itself.

RTL of a simple boolean circuit

We can do the same thing with the xnor, expecting the minimum clock period to be 6, that is the clock component set to 3. The way to test the worst case scenario is to use one known input, with the maximum expected propagation delay, in our case zero zero to one one, and change it at the same time as a rising edge of the clock. We see that even in that case, the circuit still functions properly.

RTL of a complex circuit

Regarding the incrementer, it was your task to find its exact propagation delay in the last exercise, so we are not going to spoil the solution. We are therefore going to use this chip as an example of what happens if you got your estimate wrong. If you overestimated the delay you are not going to see the error with the signal monitors, as the correct output will just wait to be sampled at the output register for a few steps. If you underestimated the delay however, it will be very evident from the signals, and that is the case we are covering now.

Let us say that you estimated the propagation delay of the incrementer to be 20 steps (we are taking a particularly low estimate just to be sure that it is very very wrong). That means that we expect that a clock set with number 10 will sample the correct output, at the rising edge immediately following the one that detected the new input.

Set the input radix and the monitors to decimal and the clock to 10. If we set the input to zero, and the change is detected at this rising edge, we can see the circuit comfortably outputting one just one step after the next rising edge. Remember that that extra step is due to the propagation delay of the output register.

However, if we input for example 63, we see that the next rising edge yields 56 instead of 64. What is more is that the next rising edge is sampling the value 32, again instead of 64. Finally, on the third rising edge after the one that sampled the new input we get 64. This behaviour happens because the propagation delay of the incrementer with input 63, is between 40 and 60 steps (between the second and third rising edge from the one that sampled the input) and at each rising edge before the result is ready, the output register is sampling some random number that comes out of the incrementer chip, while it is propagating. This is in a nutshell why in general, the clock period must be slow enough to allow for the full propagation of the circuit at any input.

Wrap Up Exercises

  1. Reformulate the result of the previous exercise, that is the estimate of the propagation delay for the incrementer, as a constraint to the maximum clock speed of its RTL circuit form.

Please write your answers on a document, to then submit as PDF in the "Assignments" section. Mind that all answers from the whole Circuit Timing section are to be written on the same document. Feel free to add images to your answers, if it helps to explain something more concisely.