The current design flow of digital automotive ASIC parts commonly starts
at the register-transfer level (RTL) using VHDL or Verilog (henceforth
referred to as HDL descriptions). We therefore define these models as
IL models in the digital domain. For transitions between Simulink models
and these HDL descriptions, there are two kinds of solutions, which are
depicted in Figure 2.
Figure 2: Transitions in the digital domain.
One kind of solution is the high-level synthesis (HLS) transition (see
arrow 1 in Figure 2), which refines a behavioral description of an SL
model into a cycle-accurate description of an IL model by performing
scheduling, allocation and binding. The other kind is the mapping
transition (see arrow 2 in Figure 2), which transforms an SL model that
is already refined down to an RTL description into an IL model without
performing any refinement. We focus on mapping transitions, because
high-level synthesis transitions make it difficult to formally verify
the resulting HDL descriptions (see (2)).
A mapping transition from Simulink models described at RTL to HDL is
realized by MathWorks HDL Coder. HDL Coder is compatible with Simulink
models containing a subset of the standard Simulink blockset, Stateflow
models and MATLAB code. Additionally, HDL Coder offers the possibility
to explore the trade-off between area and timing. On the one hand, it
provides different implementations for Simulink blocks. On the other
hand, it can modify the SL model by applying distributed pipelining,
streaming and resource sharing to subsystems.
We want to assist the developer in developing Simulink designs that
result in efficient HDL designs. To this end, we developed two
optimization approaches. They were derived from our experience in
developing Simulink SL models representing datapath-oriented
applications. These optimizations significantly improved our SL design
flow.
Signals in digital signal processing designs are often processed at
different sample rates: for example, analog signals are sampled at a
very high rate to improve the signal-to-noise ratio, while internal
computation at lower rates reduces the computational effort in the
design. Transitions between different sample rates are realized by
decimation and interpolation of signals. Systems containing multiple
sample rates are called multirate systems.
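As a minimal illustration of the two rate transitions, the following Python sketch decimates a sample sequence by keeping every n-th sample and interpolates it again by repeating each sample. The function names are hypothetical, and the zero-order hold used for interpolation is an assumption of this sketch. Practical decimators usually add a low-pass filter, which is omitted here.

```python
def decimate(samples, factor):
    """Keep every `factor`-th sample (no anti-aliasing filter in this sketch)."""
    return samples[::factor]


def interpolate_hold(samples, factor):
    """Repeat each sample `factor` times (zero-order hold, an assumption here)."""
    return [s for s in samples for _ in range(factor)]


fast = [0, 1, 2, 3, 4, 5, 6, 7]    # e.g. a signal sampled at 10 kHz
slow = decimate(fast, 2)           # the same signal at 5 kHz
held = interpolate_hold(slow, 2)   # back at 10 kHz, each value held twice
print(slow)  # [0, 2, 4, 6]
print(held)  # [0, 0, 2, 2, 4, 4, 6, 6]
```

Note that the held signal runs at the high rate again but still changes only at the low rate.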
Simulink is capable of modeling multirate systems. Decimation or
interpolation of signals can be achieved, for example, by using rate
transition blocks or counters and switches. While developers often model
decimation and interpolation of signals correctly based on their signal
processing knowledge, the sample rates of blocks may end up higher than
the rates of the connected signals. Because HDL Coder translates
Simulink sample rates directly into clock rates, this results in
unnecessarily high clock rates in the HDL description.
To avoid unnecessarily high clock rates, we developed Timing
Optimization, which checks a Simulink model for blocks whose connected
signals change at a lower rate than the blocks' sample rates. When the
sample rate of such a block is adjusted to the alteration rate of the
connected signals, HDL Coder runs the resulting HDL components at a
correspondingly lower clock rate. This results in relaxed timing
constraints for logic synthesis and reduced power consumption.
Furthermore, if a block such as a derivative block contains delay blocks
and its sample rate can be reduced by a certain factor, then the number
of delays can be decreased by the same factor. This leads to fewer
registers in the HDL description. The alteration rate of signals is
determined by simulation. After changing the sample rates, an automated
verification is executed by simulating the model and comparing the
system's outputs. This is necessary since some combinations of blocks
might lead to a different behavior when their sample rate is reduced.
Figure 3 shows an example of a system of two Simulink blocks. The value
R is the rate at which a signal changes, which is also depicted in the
signal sequences next to the signals. S is the sample rate of a block,
which determines the clock rate used in the HDL description. The
decimation block reduces the incoming signal rate by a factor of 2. The
processing block then receives a signal with a change rate of 5 kHz and
outputs a signal with the same rate, yet this block is executed with a
sample rate of 10 kHz. Therefore, the processing block's sample rate can
be reduced by a factor of 2.
Figure 3: Timing Optimization
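The check behind the Figure 3 example can be sketched in a few lines of Python. This is not the tool's implementation, only an illustration under stated assumptions: signal traces are available as lists recorded at the block's sample rate, the function names `alteration_factor` and `block_reduction_factor` are hypothetical, and the sample values are made up. The sketch finds the largest factor at which all signals connected to a block change, and derives the reduced sample rate from it.

```python
from functools import reduce
from math import gcd


def alteration_factor(trace):
    """Return k such that the trace only changes at indices that are multiples of k."""
    changes = [i for i in range(1, len(trace)) if trace[i] != trace[i - 1]]
    if not changes:
        return 1  # no change observed: conservatively keep the current rate
    return reduce(gcd, changes)


def block_reduction_factor(connected_traces):
    """A block may only slow down as far as all of its connected signals allow."""
    return reduce(gcd, (alteration_factor(t) for t in connected_traces))


# Traces of the processing block, recorded at its 10 kHz sample rate.
# The sample values themselves are made up for illustration.
block_rate_hz = 10_000
input_trace = [3, 3, 5, 5, 1, 1, 4, 4]
output_trace = [6, 6, 10, 10, 2, 2, 8, 8]

factor = block_reduction_factor([input_trace, output_trace])
new_rate_hz = block_rate_hz // factor
print(factor, new_rate_hz)  # 2 5000
```

Because the factor is derived only from observed traces, a stimulus that never exercises a faster-changing scenario would yield too large a factor, which is why the subsequent verification run is needed.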
This optimization identifies characteristics of Simulink models that
lead to inefficient HDL descriptions and improves them. The optimization
depends on the signal rates observed during simulation. Thus it is
crucial that the testbench stimuli of the SL model cover all specified
scenarios. Furthermore, this optimization enables the quick assembly of
an SL model from existing Simulink subsystems and the adaptation of
these subsystems to the alteration rates of the incoming signals.
At the beginning of a common system-level design flow, Simulink models
representing digital hardware often use floating-point signals. An
efficient hardware implementation requires a fixed-point representation
of the design that supports the required value range and precision of
every signal.
The Fixed-Point Tool, part of Simulink Fixed Point, assists the
developer in adapting the integer bits of signals to the value ranges
used. We extend this functionality by providing an optimization that
reduces the fractional bits of signals. Since modifying the fractional
bits of signals influences their precision, we consider the impact on
the output behavior of the overall system. This also allows the
identification and reduction of blocks that have little or no impact on
the system's behavior. We are looking for the most efficient hardware
implementation that does not change the system's behavior.
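The roles of integer and fractional bits can be illustrated with a small Python sketch. It assumes a signed fixed-point format with a separate sign bit, round-to-nearest and saturation on overflow. These choices and the function name `quantize` are assumptions of this sketch and are not taken from the Fixed-Point Tool.

```python
def quantize(value, int_bits, frac_bits):
    """Round `value` onto a signed fixed-point grid and saturate at the range limits."""
    step = 2.0 ** -frac_bits           # resolution, set by the fractional bits
    lowest = -(2.0 ** int_bits)        # value range, set by the integer bits
    highest = 2.0 ** int_bits - step
    rounded = round(value / step) * step
    return min(max(rounded, lowest), highest)


print(quantize(3.14159, 2, 4))  # 3.125   (resolution 1/16)
print(quantize(3.14159, 2, 2))  # 3.25    (fewer fractional bits, larger error)
print(quantize(5.0, 2, 4))      # 3.9375  (too few integer bits, value saturates)
```

Removing fractional bits saves hardware but enlarges the quantization error, so whether a reduction is acceptable has to be judged on the system's outputs.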
For this purpose, we use the stochastic optimization method Simulated
Annealing. We explore the solution space by increasing and decreasing
the integer and fractional bits of randomly chosen blocks. The resulting
solution specifies word lengths for every block, while the resource
consumption should be minimal and no deviation of the system's output
signals is allowed.
Figure 4 shows the Simulated Annealing algorithm for our specific
problem. We start with an initial SL model that contains signal values
that lead to the desired signal processing behavior. This behavior is
used as the reference behavior during the optimization. Next, the model
is rated by estimating the hardware resource consumption of the
resulting digital component. After that, it is decided whether the
solution is accepted. Better solutions are always accepted, and worse
solutions are accepted with a probability that depends on the current
temperature. This prevents the algorithm from getting stuck in local
optima. Furthermore, it is checked whether the model violates the
allowed signal deviation. If the model is not accepted, a new model is
created by modifying the fixed-point type of randomly chosen blocks. If
the model is accepted, it replaces the current model. After that,
depending on the number of iterations, the temperature is reduced. If no
new model was accepted for a defined number of iterations, the algorithm
stops. Otherwise, a new model is created and the loop starts again.
Figure 4: Simulated Annealing flowchart
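The loop of Figure 4 can be sketched in Python as follows. This is a toy model, not our implementation. A model is reduced to a list of word lengths, one per block. The cost function `sum` stands in for the resource estimation. The `violates` check stands in for simulating the model and comparing its outputs with the reference behavior. The Metropolis acceptance rule, the geometric cooling schedule and all parameter values are assumptions of this sketch.

```python
import math
import random


def anneal(initial, cost, violates, rng, t_start=5.0, cooling=0.95,
           steps_per_temp=20, patience=200):
    """Minimize `cost` over word-length lists without violating the deviation check."""
    current, current_cost = list(initial), cost(initial)
    best, best_cost = list(current), current_cost
    temperature, stale, iteration = t_start, 0, 0
    while stale < patience:  # stop if nothing was accepted for `patience` iterations
        iteration += 1
        # Create a new model by changing the word length of one random block.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = max(0, candidate[i] + rng.choice((-1, 1)))
        delta = cost(candidate) - current_cost
        # Better models are always accepted, worse ones with a probability
        # that shrinks as the temperature falls (Metropolis rule, assumed).
        accept = delta < 0 or (
            delta > 0 and rng.random() < math.exp(-delta / temperature))
        if accept and not violates(candidate):
            current, current_cost = candidate, current_cost + delta
            stale = 0
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            stale += 1
        if iteration % steps_per_temp == 0:
            temperature *= cooling  # geometric cooling (assumed)
    return best, best_cost


# Hypothetical constraint: each block needs at least this many bits to keep
# the outputs unchanged. In practice this is found by simulation.
required = [3, 5, 2, 4]


def violates(model):
    return any(bits < need for bits, need in zip(model, required))


best, best_cost = anneal([8, 8, 8, 8], sum, violates, random.Random(0))
print(best, best_cost)
```

In this toy setting the constraint is separable per block, so the search is expected to end at the minimal word lengths. A real model couples the blocks through the simulated output behavior, which is why a stochastic search is used rather than a per-block reduction.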
Exploration at SL in the Simulink environment has the advantage that
the impact of changes to the digital part's functionality on the overall
system is directly visible. As with Timing Optimization, the outcome of
this optimization strongly depends on the SL testbench stimuli.