Timing closure is a growing concern for FPGA designers, particularly with the recent introduction of multi-million-gate architectures fabricated at the 90 nm and 65 nm technology nodes. It is not sufficient for a timing-closure solution (that is, the entire flow, including synthesis) merely to meet the required timing; such a solution must also minimize the number of time-consuming synthesis-place-route iterations and provide results that remain stable across multiple physical synthesis runs and during final routing.
The designers' end goal is to ensure that timing will be met at the end of the FPGA design flow and to implement and debug the FPGA system as quickly as possible, even when design changes and specification modifications have to be incorporated along the way.
Placement and the availability of routing resources play a huge role in the designers' ability to meet eventual performance goals in their FPGA implementation. Simultaneous logic synthesis and physical placement optimizations allow designers to rapidly drive towards – and concurrently lock down – timing performance in their FPGA. That performance is more readily ensured when the synthesis tool passes legalized placement information to the FPGA vendor's back-end tools.
Problems with conventional synthesis technologies
It is no secret that wire delays dominate cell delays in modern silicon chips. In the case of FPGAs implemented at the 90 nm technology node, for example, wire delays can account for 80-to-90% of each delay path. This causes a problem with conventional FPGA synthesis solutions, because only the cell delays are known, while the wire delays – which cannot be fully characterized until after place-and-route – are estimated.
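To put a rough number on this, the following minimal sketch shows how an error in the wire estimate carries through to the total path delay. The function name and the 25% estimation error are assumptions chosen for illustration; only the 80-to-90% wire share comes from the text above.

```python
def path_delay_error(wire_fraction, wire_estimate_error):
    """Fractional error in total path delay when only the wire part is misestimated.

    Assumes cell delays are known exactly, so the path error is the wire
    error scaled by the share of the path that is wire delay.
    """
    return wire_fraction * wire_estimate_error


# Hypothetical case: the wire-delay estimates are off by 25%.
for wire_fraction in (0.8, 0.9):
    error = path_delay_error(wire_fraction, 0.25)
    print(f"wire share {wire_fraction:.0%}: total path delay off by {error:.1%}")
```

Under these assumptions, a 25% error in the wire estimate becomes a 20 to 22.5% error in the total path delay, because almost none of the path is covered by the accurately known cell delays.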
As illustrated in Fig 1, the spread between the delays estimated after logic synthesis and the actual delays following place-and-route increases with each technology node and also with the size of the design.
Fig 1. There is increasing uncertainty in delay distribution as designs use newer technology nodes.
The result of these inaccurate delay estimations is a timing-closure nightmare, because the critical paths seen by the synthesis engine do not match the critical paths seen by place-and-route. In many cases, based on these inaccurate delay estimations, true critical paths may be under-optimized (thereby degrading performance), while non-critical paths may be over-optimized (thereby degrading area). Ultimately, this results in a significant increase in the number of synthesis-place-route iterations with poor convergence. Also, it leads to instability, because minor changes to the RTL can result in large, unpredictable changes to the results from synthesis and place-and-route.
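The mismatch between the two views of the critical path can be illustrated with a small sketch. All path names and delay values below are hypothetical; the point is only that ranking paths by slack gives a different "most critical" path depending on whether estimated or post-route wire delays are used.

```python
# Hypothetical paths: known cell delay plus wire delay, all in ns.
# "est_wire" is the synthesis-time estimate; "real_wire" is the post-route value.
paths = {
    "path_a": {"cell": 1.0, "est_wire": 3.0, "real_wire": 6.5},
    "path_b": {"cell": 1.2, "est_wire": 4.5, "real_wire": 4.0},
    "path_c": {"cell": 0.8, "est_wire": 2.0, "real_wire": 2.2},
}

CLOCK_NS = 6.0  # assumed clock period


def slack(path, wire_key, clock_ns):
    """Slack = clock period minus (cell delay + wire delay); negative means failing."""
    return clock_ns - (path["cell"] + path[wire_key])


def rank_by_slack(all_paths, wire_key, clock_ns):
    """Return path names ordered from most critical (least slack) to least critical."""
    return sorted(all_paths, key=lambda name: slack(all_paths[name], wire_key, clock_ns))


est_order = rank_by_slack(paths, "est_wire", CLOCK_NS)
real_order = rank_by_slack(paths, "real_wire", CLOCK_NS)

print("critical path seen by synthesis:      ", est_order[0])
print("critical path seen by place-and-route:", real_order[0])
```

In this made-up data set the synthesis engine would spend its effort on path_b, while the path that actually fails timing after routing is path_a, which is the under-optimization and over-optimization pattern described above.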
Some considerations associated with FPGA architectures
The majority of today's conventional FPGA logic synthesis solutions are derived from technology that was originally designed with ASIC synthesis in mind. In the case of physically aware synthesis for ASICs, for example, placement and synthesis are combined in a single pass. The reason this works so well in the ASIC world is that routing tracks are built to order. This means that the delays associated with the final placed-and-routed design tend to correlate reasonably well with those estimated by the synthesis engine.
One way to think of this is that working with an ASIC is like living in an undeveloped area (such as a desert) with a large budget. In this case you can build new roads as you need them and – generally speaking – each road can be as fast as you wish. Thus, if you wished to construct a path from your home to the office, this path could be implemented as a point-to-point super-fast highway. Based on this, it would be easy for you to accurately estimate the time it will take for you to travel to work and back.
By comparison, working with an FPGA is similar to living in the middle of a large, well-established city, whose "interconnects" are a mixture of small back streets, medium-sized roads, and really fast highways. In that case, when you are returning home from the office, you don't necessarily aim to drive the shortest distance. Instead, you may commence your journey by heading the "wrong way" until you can get on a highway, at which point you may reverse your direction and pass the office again as you head toward your home.
Planning on traveling in a straight line is great if nothing goes wrong. For example, if you wish to travel from point A to point B in a car and you plan on taking a certain road, you may estimate that the trip will take only 1 hour assuming good weather and no significant amount of traffic. But suppose some fog comes in and it starts to rain, plus it turns out that a major festival is taking place and there are cars everywhere. There may be lots of alternative routes available: some offering a short detour but with lots of tight curves, while others are straighter but longer – which route offers the best solution?
A similar situation occurs with regard to synthesis-place-route – basing one's estimations on a straight-line route for a signal is great if nothing gets in the way and nothing goes wrong. In this case, the logic synthesis will work and the place-and-route will closely match its results. In a real-world design, however, multiple gates and routes end up fighting for the same resources. This means that some placement-signal combinations will have to "take a detour", but which ones have to take the detour and which detour should they take?
The situation is further confused by the fact that having two logical functions in close proximity to each other on an FPGA does not necessarily imply the availability of a fast connection between them. Although this is somewhat non-intuitive – and it is dependent on the available routing resources – one may actually achieve better routing and timing results by placing the logic functions further apart. This is why ASIC-derived synthesis technologies return less-than-optimal results when applied to FPGA architectures. It is also one reason why design flows based on these technologies require large numbers of time-consuming iterations between the front-end (synthesis) and back-end (place-and-route) engines in order to achieve correlation and timing closure.
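The "further apart can be faster" effect can be sketched with a toy model of segmented routing. The segment spans and delays below are hypothetical and do not describe any particular device; the sketch assumes a route must be assembled from fixed-length segments that exactly cover the distance, with short "back street" segments and a longer "highway" segment.

```python
# Hypothetical routing segments: span in tiles -> delay in picoseconds.
SEGMENTS_PS = {
    1: 300,  # short local segment ("back street")
    6: 500,  # longer segment spanning six tiles ("highway")
}


def min_route_delay_ps(distance):
    """Smallest total delay to cover exactly `distance` tiles using the segments above.

    Simple dynamic programme: best[d] is the cheapest way to cover d tiles.
    """
    best = [0] + [float("inf")] * distance
    for d in range(1, distance + 1):
        for span, delay in SEGMENTS_PS.items():
            if span <= d:
                best[d] = min(best[d], best[d - span] + delay)
    return best[distance]


for distance in (5, 6):
    print(f"{distance} tiles apart: {min_route_delay_ps(distance)} ps")
```

With these assumed numbers, two functions placed five tiles apart need five short segments (1500 ps), while two functions placed six tiles apart can use a single long segment (500 ps). A delay estimate based purely on distance would get this ordering wrong.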
Just to complicate the situation still further, routing resource contentions that force a reroute over a slower overall path might well occur downstream in the design flow – during final routing – if the design is not placed with a view to route availability. It's not uncommon to see a 0.5 ns increase in a path delay because the path had to be rerouted for lack of available routing resources. This problem is most easily solved by performing routing resource estimates during a combined synthesis and placement optimization process and then – having determined that there are adequate routes available for the placed design – passing that exact placement to the FPGA vendor tools for final routing. In effect, preserving the placement generated during physical synthesis ensures that timing goals will be met downstream.
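As a rough illustration of what a routing resource estimate during placement might look like, the sketch below counts how many nets would cross each tile of a grid and flags tiles where demand exceeds capacity. It is a deliberately simplified model, not how any vendor tool works: it assumes two-pin nets, a single horizontal-then-vertical route per net, and a uniform per-tile capacity, and all names and coordinates are hypothetical.

```python
from collections import Counter


def l_shaped_tiles(src, dst):
    """Tiles crossed by a horizontal-then-vertical route from src to dst."""
    (x0, y0), (x1, y1) = src, dst
    step_x = 1 if x1 >= x0 else -1
    step_y = 1 if y1 >= y0 else -1
    tiles = [(x, y0) for x in range(x0, x1 + step_x, step_x)]
    tiles += [(x1, y) for y in range(y0 + step_y, y1 + step_y, step_y)]
    return tiles


def overused_tiles(nets, capacity):
    """Return {tile: demand} for every tile whose estimated demand exceeds capacity."""
    demand = Counter()
    for src, dst in nets:
        demand.update(l_shaped_tiles(src, dst))
    return {tile: count for tile, count in demand.items() if count > capacity}


# Hypothetical placed design: each net is a (source tile, destination tile) pair.
nets = [
    ((0, 0), (3, 0)),
    ((1, 0), (3, 1)),
    ((2, 0), (2, 2)),
]

print(overused_tiles(nets, capacity=2))
```

For this placement, tile (2, 0) is wanted by all three nets but is assumed to hold only two routes, so at least one net would have to take a detour during final routing. Catching that while placement can still be adjusted is the idea behind estimating routing resources before handing the placement to the back-end tools.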