Input and Output Delays in PrimeTime
Following the review of setup and hold time analysis in the previous post, we will now cover timing analysis at the primary pins of the chip. The IO pins are constrained using input and output delays. We will look at an example of constraining the IOs in a design using Synopsys PrimeTime, a widely used static timing analysis tool.
Input and Output constraints for a chip
In the figure above, RXD (the data) is clocked by RXC (period = 8 ns), and both are primary inputs to the chip. The input specification for this interface in our chip is 3 ns setup and 1 ns hold. The RXC input clock is buffered through the clock tree and clocks an output flop. TXD and TXC are primary outputs of the chip, and the output specification is a max delay of 5 ns and a min delay of 1 ns.
In PrimeTime, the set_input_delay command specifies the data arrival time at the port with respect to a clock. The set_output_delay command specifies the data required time at the port with respect to a clock.
We need to define the clock at the input using create_clock and make it a propagated clock so that PrimeTime can propagate the clock through all the buffers to the external TXC output.
create_clock -name rxc -period 8 -waveform {0 4} [get_ports RXC]
set_propagated_clock [get_clocks rxc]
To constrain the RXD input with respect to the RXC clock for setup, we define the maximum allowable external delay (i.e. period - setup requirement = 8 - 3 = 5 ns).
set_input_delay 5 -max -clock [get_clocks rxc] [get_ports RXD]
For the hold constraint, we specify the hold requirement as is.
set_input_delay 1 -min -clock [get_clocks rxc] [get_ports RXD]
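The two input delay values can also be derived from the interface numbers rather than typed in by hand. The snippet below is a minimal sketch of that, assuming the 8 ns period, the 8 - 3 = 5 ns setup derivation and the 1 ns hold figure used above. The variable names are illustrative only, and the script is meant to be sourced in the PrimeTime shell after the rxc clock has been created.

```tcl
# Interface numbers taken from the text; variable names are illustrative.
set period   8.0 ;# RXC period in ns
set in_setup 3.0 ;# setup requirement of the RXD/RXC interface in ns
set in_hold  1.0 ;# hold requirement of the RXD/RXC interface in ns

# Maximum external delay that still leaves the required setup time.
set in_delay_max [expr {$period - $in_setup}]
# The hold requirement is applied as is.
set in_delay_min $in_hold

set_input_delay $in_delay_max -max -clock [get_clocks rxc] [get_ports RXD]
set_input_delay $in_delay_min -min -clock [get_clocks rxc] [get_ports RXD]
```

Keeping the numbers in variables makes it harder for the constraint and the specification to drift apart when the interface timing changes.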
For output constraints, we need to create a generated clock at the output pin TXC. This is necessary because, without the generated clock, the latency of the TXC clock source is discarded, which is incorrect.
create_generated_clock -name txc -source [get_ports RXC] -divide_by 1 [get_ports TXC]
The output port TXD is constrained with delays external to the chip i.e.
set_output_delay 3 -max -clock [get_clocks txc] [get_ports TXD]
set_output_delay -1 -min -clock [get_clocks txc] [get_ports TXD]
Note the negative sign on the min output delay. A larger hold requirement means that you must specify a more negative output delay. If you apply the hold requirement as a positive amount, the constraint will be incorrectly relaxed by double that amount. This could cause the hold check to pass when there is really a violation.
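The sign convention is easier to see when the output delays are computed from the specification. This is a sketch only, assuming the output specification is read as a maximum and minimum clock-to-output delay at TXD relative to TXC; the variable names are hypothetical.

```tcl
# Output specification from the text; variable names are illustrative.
set period  8.0 ;# TXC period in ns (same as RXC, since txc is divide_by 1)
set out_max 5.0 ;# maximum allowed output delay in ns
set out_min 1.0 ;# minimum required output delay in ns

# Setup side: whatever is left of the period after the max delay.
set out_delay_max [expr {$period - $out_max}]
# Hold side: the requirement enters with a negative sign.
set out_delay_min [expr {-$out_min}]

set_output_delay $out_delay_max -max -clock [get_clocks txc] [get_ports TXD]
set_output_delay $out_delay_min -min -clock [get_clocks txc] [get_ports TXD]
```

With these numbers the script applies 3 ns for -max and -1 ns for -min, matching the two commands above.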
Always be careful while defining input/output constraints and check the reports to ensure that the IOs are constrained correctly.
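A few report commands help with that check. The list below is a suggestion rather than a complete sign-off flow, and the exact options may vary between PrimeTime releases, so treat it as a starting point and confirm each command against the tool's man pages.

```tcl
# Flag unconstrained endpoints, ports without delays and similar problems.
check_timing -verbose

# Show the input and output delays actually attached to the IO ports.
report_port -verbose [get_ports {RXD TXD}]

# Setup (max) and hold (min) paths starting at the constrained input.
report_timing -delay_type max -from [get_ports RXD]
report_timing -delay_type min -from [get_ports RXD]

# Setup (max) and hold (min) paths ending at the constrained output.
report_timing -delay_type max -to [get_ports TXD]
report_timing -delay_type min -to [get_ports TXD]
```

In the output reports, check that the external delay values and the launching or capturing clock (rxc or txc) are the ones you intended.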
Setup and Hold times
Many designers are familiar with setup and hold time definitions - however, few can identify correctly the launch and capture edges and the slack/violation between two flops during timing analysis. In this post, we will cover setup/hold times in a design with clear examples.
Setup time is defined as the minimum amount of time BEFORE the clock’s active edge by which the data must be stable for it to be latched correctly. Any violation in this minimum required time causes incorrect data to be captured and is known as setup violation.
Hold time is defined as the minimum amount of time AFTER the clock’s active edge during which the data must be stable. Any violation in this required time causes incorrect data to be latched and is known as hold violation.
The setup time in a design determines the maximum frequency at which the chip can run without any timing failures. Factors affecting the setup analysis are the clock period Tclk, Clock to Q propagation delay of the launch flop Tck->q, negative clock skew Tskew, required setup time of the capture flop Tfs and combinational logic delay Tcomb between the two flops being timed. The following condition must be satisfied.
Tfs <= Tclk – Tck->q – Tskew – Tcomb
Hold analysis depends on the Tck->q, combinational logic delay, the clock skew and the hold time requirement Tfh of the capture flop. It is independent of the frequency of the clock. The condition below must be satisfied.
Tck->q + Tskew + Tcomb >= Tfh
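The two conditions can be written as small Tcl procedures, which is handy for quick what-if checks. This is a sketch under the sign convention of the formulas above (Tskew reduces the setup budget and helps hold). The procedure names and the sample numbers are illustrative and not taken from any tool.

```tcl
# Setup condition: Tfs <= Tclk - Tck->q - Tskew - Tcomb
proc setup_met {tclk tckq tskew tcomb tfs} {
    return [expr {$tfs <= $tclk - $tckq - $tskew - $tcomb}]
}

# Hold condition: Tck->q + Tskew + Tcomb >= Tfh (no dependence on Tclk)
proc hold_met {tckq tskew tcomb tfh} {
    return [expr {$tckq + $tskew + $tcomb >= $tfh}]
}

# Smallest clock period that still satisfies the setup condition.
proc min_period {tckq tskew tcomb tfs} {
    return [expr {$tckq + $tskew + $tcomb + $tfs}]
}

# Sample values in ns: Tclk=8, Tck->q=0.1, Tskew=0.25, Tcomb=5, Tfs=Tfh=0.1
puts "setup met: [setup_met 8.0 0.1 0.25 5.0 0.1]"
puts "hold met:  [hold_met 0.1 0.25 5.0 0.1]"
puts [format "min period: %.2f ns" [min_period 0.1 0.25 5.0 0.1]]
```

For the sample values both checks return 1 and the minimum period comes out as 5.45 ns, which is what sets the maximum frequency.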
Consider the figure below depicting a flop to flop path in the same domain with some combinational logic between them. We will now calculate the setup and hold time slacks in the design based on the given timing parameters.
Setup and Hold time illustration - Full cycle transfer
For setup checks in single cycle paths, the relevant clock edges are shown in the figure above. The data required time for the capture flop B to meet setup is
Data Required time = (Clock Period + Clock Insertion Delay + Clock Skew - Setup time of the flop) = 8 + 2 + 0.25 -0.1 = 10.15 ns
The data arrival time from the launch flop is
Data Arrival time = (Clock Insertion Delay + CK->Q Delay of the launch flop + Combinational logic Delay) = 2 + 0.1 + 5 = 7.1 ns.
Setup slack is
Setup Margin = Data Required Time - Data Arrival Time = 10.15 - 7.10 = 3.05 ns
Similarly for hold checks assuming the hold time requirement of the flop B is 100 ps, the data expected time is
Data expected time = (Clock Insertion Delay + Clock skew + Hold time requirement of flop) = 2 + 0.25 +0.1 = 2.35 ns.
So the hold time slack is
Hold Margin = Data Arrival time - Data expected time = 7.10 - 2.35 = 4.75 ns
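The full cycle arithmetic can be reproduced in a few lines of Tcl. The variable names are illustrative, and the values are the ones used in the worked example (8 ns period, 2 ns insertion delay, 0.25 ns skew, 0.1 ns setup, hold and CK->Q, 5 ns combinational delay).

```tcl
# Timing parameters in ns, taken from the worked example.
set period    8.0
set insertion 2.0
set skew      0.25
set setup     0.1
set hold      0.1
set ckq       0.1
set comb      5.0

# Data arrival time at flop B.
set arrival [expr {$insertion + $ckq + $comb}]
# Latest time the data may arrive and still meet setup.
set setup_required [expr {$period + $insertion + $skew - $setup}]
# Earliest time the data may change without violating hold.
set hold_expected [expr {$insertion + $skew + $hold}]

puts [format "setup slack = %.2f ns" [expr {$setup_required - $arrival}]]
puts [format "hold slack  = %.2f ns" [expr {$arrival - $hold_expected}]]
```

This prints a setup slack of 3.05 ns and a hold slack of 4.75 ns (7.10 - 2.35).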
Consider the case where the clock to flop B is inverted (or the flop is negative edge triggered). In this case, the relevant edges for setup/hold are as shown in the figure below.
Setup and Hold time illustration - Half cycle transfer
In this scenario, the setup margin considering all the other parameters to be the same is
Data Required time = (half_clock_period + clock insertion delay + clock skew - Setup time required for flop B) = 4 + 2 + 0.25 - 0.1 = 6.15 ns
Since the Data Arrival time remains the same, there is a setup violation of
Setup violation = 6.15 ns - 7.10 ns = -0.95 ns
There is no hold violation since the data arrival time remains the same but the data expected time is any time after (Clock skew + Hold time requirement of flop B)
Data expected time = 0.25 + 0.1 = 0.35 ns
Hold Margin = 7.10 - 0.35 = 6.75 ns
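To compare the full cycle and half cycle setup checks side by side, the capture edge can be made a parameter. This is a sketch with a hypothetical setup_slack procedure; it covers only the setup check and uses the same numbers as the example.

```tcl
# Setup slack for a capture edge at time capture_edge (ns after the launch edge).
proc setup_slack {capture_edge insertion skew setup arrival} {
    return [expr {$capture_edge + $insertion + $skew - $setup - $arrival}]
}

# Data arrival time: insertion delay + CK->Q delay + combinational delay.
set arrival [expr {2.0 + 0.1 + 5.0}]

puts [format "full cycle: %.2f ns" [setup_slack 8.0 2.0 0.25 0.1 $arrival]]
puts [format "half cycle: %.2f ns" [setup_slack 4.0 2.0 0.25 0.1 $arrival]]
```

The full cycle case gives 3.05 ns of slack, while moving the capture edge to 4 ns gives 6.15 - 7.10 = -0.95 ns, i.e. a setup violation.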
All clear now? We will cover IO constraints next.
Asynchronous and Synchronous Resets
Designing power-up reset sequence and reset structures in a chip is a critical task and there are many issues one needs to be aware of. Incorrect reset generation can cause intermittent failures that are hard to debug and in some cases can also make a chip DOA (Dead on Arrival).
In this post, we will look at asynchronous/synchronous resets, reset synchronizers and also factors that may affect the reset sequence in a chip. Most of the information presented here is derived from the excellent paper by Cliff Cummings et al., and the reader is strongly recommended to read the paper at leisure.
Resets can be either synchronous or asynchronous, and each flip-flop has a timing window around the active clock edge during which the reset must not change. Recovery time is the minimum time the reset must be stable BEFORE the active clock edge (analogous to setup time), while removal time is the minimum time the reset must be stable AFTER the active clock edge (analogous to hold time).
Advantages of using synchronous resets are:
- The reset takes effect only on the active clock edge.
- Fewer gates (although the saving is negligible)
- Timing analysis is easier as the reset is synchronous to the clock.
A major disadvantage of synchronous reset design is that the clock should be running at the time of reset. In some chips, this may not be feasible due to gated clocks or due to requirements of the design.
Advantages of using Asynchronous resets are :
- No extra logic on the datapath making timing closure easier
- No clock required at the time of reset
The problem with asynchronous resets is that they can cause flops to go metastable, hence care must be taken at the time of assertion or deassertion of reset. Another issue is that timing analysis should include checks for recovery and removal times.
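In PrimeTime, the recovery and removal checks can be listed alongside the usual setup and hold reports. The command below is a sketch, assuming the report_constraint options shown are available in your release; confirm them in the tool's man pages before relying on the result.

```tcl
# List recovery and removal violations on asynchronous set/reset pins.
report_constraint -all_violators -recovery -removal
```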
Reset Synchronizer
A novel technique to overcome the issues with asynchronous resets is to use Reset synchronizers. A reset synchronizer ensures that the reset removal does not cause any metastable problems – it resets the design asynchronously ( i.e. without a running clock) while the deassertion is synchronous!
Reset synchronizer
A reset synchronizer circuit is shown above. The two flops form a dual stage synchronizer that synchronizes the reset to the clock. On assertion of the chip reset, the synchronizer output drives the internal reset to the flops in the design. Deassertion can only happen on an active clock edge. An important point to note is that the second flop in the synchronizer cannot go metastable, as its input and output are both low when the reset is removed.
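The behaviour can be illustrated with a small cycle-level model in plain Tcl. This is a sketch, not RTL. It assumes the common arrangement in which the D input of the first flop is tied high, both flops are cleared asynchronously by the active-low chip reset, and the output of the second flop is the internal reset. The procedure name and the state encoding are hypothetical.

```tcl
# State is the list {q1 q2}; q2 is the internal (active-low) reset.
# rst_n is the chip reset (0 = asserted), clk_edge is 1 on an active clock edge.
proc sync_step {state rst_n clk_edge} {
    lassign $state q1 q2
    if {!$rst_n} {
        # Asynchronous assertion: both flops clear without needing a clock.
        return [list 0 0]
    }
    if {$clk_edge} {
        # First flop samples the tied-high input, second flop samples q1.
        return [list 1 $q1]
    }
    # No clock edge and reset released: state holds.
    return $state
}

set s [list 1 1]
set s [sync_step $s 0 0] ; puts "assert, no clock:  $s"
set s [sync_step $s 1 0] ; puts "release, no clock: $s"
set s [sync_step $s 1 1] ; puts "first clock edge:  $s"
set s [sync_step $s 1 1] ; puts "second clock edge: $s"
```

In this model the state goes to 0 0 as soon as the reset is asserted, stays at 0 0 when the reset is released without a clock, then becomes 1 0 and 1 1 on the next two clock edges. At the first edge after release the second flop samples a low input while its output is already low, which is why it does not change.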
The two flops in the reset synchronizer should not be made scannable for DFT and a bypass mux is added at the output of the reset synchronizer to control the reset in test modes. Also note that a separate reset synchronizer will be required for each clock domain.
Another important requirement in many designs with multiple clock domains is the sequencing of resets, i.e. the reset in one clock domain must be deasserted before the reset in another clock domain. The author has come across designs where this requirement was neglected or overlooked, causing critical issues in silicon. The circuit below, built from reset synchronizers, illustrates this.
Reset Sequencer (resetb is deasserted only later than reseta)