Fujitsu Semiconductor successfully implemented the design using SystemC and the
AXI3 TLM IP models, and used Cadence C-to-Silicon Compiler HLS to
generate RTL. The design passed simulation and functional verification,
described in more detail below. They then compared the implementation, quality
of results (QoR), and performance against the hand-written RTL design.

The line count of the SystemC model is almost 1/3 the size of the
hand-written RTL code for this design, which is significant because
there were over 10,000 lines of RTL. Note that the line count for the
SystemC model only represents customer-written code, since the AXI3 TLM
model was provided within a SystemC library and is design-independent.
For the hand-written RTL code, there was no reusable AXI3 code
available. The large line count reduction with the TLM-based approach
significantly reduced the coding effort and enabled designers to
concentrate on exploring and optimizing core functionality.

To compare performance between the models, Fujitsu Semiconductor measured
average throughput using six different types of data transfers that
cover the various types of burst transfers the design needs to perform.
In all cases, the performance of the HLS-generated RTL was better than
that of the hand-written RTL and, on average, the HLS-generated model
had 35% better performance than the hand-written RTL.
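
As a minimal sketch of how such a comparison could be tallied, the C++ below computes the per-case improvement for each transfer type and then takes the arithmetic mean. This is an assumption about the averaging method; the source does not say whether the 35% figure is a mean of per-case improvements or a ratio of mean throughputs. The names `ThroughputSample`, `improvement_percent`, `mean_improvement_percent`, and `hls_wins_everywhere` are hypothetical, and no measured values from the study are reproduced here.

```cpp
#include <cassert>
#include <vector>

// Throughput of one transfer type under both flows, in any consistent unit.
// Hypothetical type: the source does not publish per-case numbers.
struct ThroughputSample {
    double hand_rtl;  // hand-written RTL
    double hls_rtl;   // HLS-generated RTL
};

// Percentage by which the HLS result exceeds the hand-written result.
double improvement_percent(const ThroughputSample& s) {
    assert(s.hand_rtl > 0.0);
    return (s.hls_rtl - s.hand_rtl) / s.hand_rtl * 100.0;
}

// Arithmetic mean of the per-case improvements (assumed averaging method).
double mean_improvement_percent(const std::vector<ThroughputSample>& samples) {
    assert(!samples.empty());
    double sum = 0.0;
    for (const ThroughputSample& s : samples) {
        sum += improvement_percent(s);
    }
    return sum / static_cast<double>(samples.size());
}

// True only if the HLS result is strictly better in every case.
bool hls_wins_everywhere(const std::vector<ThroughputSample>& samples) {
    for (const ThroughputSample& s : samples) {
        if (s.hls_rtl <= s.hand_rtl) return false;
    }
    return true;
}
```

With six samples, one per transfer type, `hls_wins_everywhere` corresponds to the "in all cases" claim and `mean_improvement_percent` to the "on average" claim.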

The reason for this improvement was that Fujitsu Semiconductor was able to take advantage of
the higher abstraction level of the SystemC model and explore a range of
micro-architecture implementations in C-to-Silicon Compiler, ultimately
finding a more efficient micro-architecture than what had been
implemented in RTL. With traditional RTL-based design entry, this type
of exploration is almost impossible.
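
The selection step of such an exploration can be sketched in a few lines of C++. This is only an illustration under stated assumptions: it supposes each explored micro-architecture has already been synthesized and summarized as a throughput and an area figure, and that the goal is the highest throughput within an area budget. The `Candidate` type, the `pick_best` function, and the area-budget criterion are hypothetical and do not describe C-to-Silicon Compiler's interface or Fujitsu Semiconductor's actual selection criteria.

```cpp
#include <cassert>
#include <vector>

// One explored micro-architecture, summarized by two figures of merit.
// Hypothetical type; the units only need to be consistent across candidates.
struct Candidate {
    double throughput;  // higher is better
    double area;        // lower is better
};

// Returns the index of the highest-throughput candidate whose area does not
// exceed area_budget, or -1 if no candidate fits. Ties keep the earlier one.
int pick_best(const std::vector<Candidate>& candidates, double area_budget) {
    assert(area_budget >= 0.0);
    int best = -1;
    for (int i = 0; i < static_cast<int>(candidates.size()); ++i) {
        const Candidate& c = candidates[i];
        if (c.area > area_budget) continue;  // does not fit the budget
        if (best < 0 || c.throughput > candidates[best].throughput) {
            best = i;
        }
    }
    return best;
}
```

The point of the sketch is that once candidates are cheap to generate from a single high-level model, comparing them reduces to a simple search; with hand-written RTL, each candidate would require a separate manual implementation.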

Fujitsu Semiconductor used Cadence C-to-Silicon Compiler to generate RTL from
the SystemC model and Cadence RTL Compiler to generate the gate-level
netlist using their own production technology library. Table 2 shows the
area comparison between the HLS-generated RTL and the hand-written RTL
using the implementation with eight logical channels, across different
Table 2: Area Comparison

Fujitsu Semiconductor utilized clock gating optimization in both C-to-Silicon
Compiler and RTL Compiler to reduce dynamic power, then compared the
dynamic power consumption results of each flow by simulating at the gate
level. Table 3 shows the dynamic power reduction from the SystemC flow
versus the hand-written RTL flow.

Table 3: Power Comparison at 400 MHz
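
For readers less familiar with why clock gating lowers dynamic power, the standard first-order CMOS switching-power model is a useful reminder. The symbols below are not from the original text: \alpha is the switching activity factor, C the switched capacitance, V_{DD} the supply voltage, and f the clock frequency. Clock gating reduces the effective \alpha of the clock network and the registers it drives, while f stays fixed. The second expression is the usual way a relative reduction between two flows would be computed, shown here as an assumption about how a power comparison of this kind is normally expressed.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V_{DD}^{2} \, f
\qquad
\text{reduction (\%)} = \frac{P_{\text{hand}} - P_{\text{HLS}}}{P_{\text{hand}}} \times 100
```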