# System-level design of mixed-signal ASICs using Simulink: Efficient transitions to EDA environments


The current design flow for digital automotive ASIC parts commonly starts at the register-transfer level (RTL) using VHDL or Verilog (henceforth referred to as HDL descriptions). We therefore define these models as IL models in the digital domain. For transitions between Simulink models and these HDL descriptions, there are two different kinds of solutions, which are depicted in Figure 2.

A mapping transition from Simulink models described at RTL to HDL is realized by MathWorks HDL Coder. HDL Coder is compatible with Simulink models that contain a subset of the standard Simulink blockset, Stateflow models, and MATLAB code. It also lets the developer explore the trade-off between area and timing in two ways: it provides different implementations for individual Simulink blocks, and it can modify the SL model by applying distributed pipelining, streaming, and resource sharing to subsystems.

**Optimizations**

We want to assist developers in creating Simulink designs that result in efficient HDL designs. To that end, we developed two optimization approaches, derived from our experience in developing Simulink SL models for datapath-oriented applications. These optimizations significantly improved our SL design flow.

**Timing optimization**

Signals in digital signal processing designs are often processed at different sample rates. For example, analog signals are sampled at a very high rate to improve the signal-to-noise ratio, while internal computation at lower rates reduces the computational effort in the design. Transitions between different sample rates are realized by decimation and interpolation of signals. Systems that contain multiple sample rates are called multirate systems.

Simulink is capable of modeling multirate systems. Decimation or interpolation of signals can be achieved, for example, with rate transition blocks or with counters and switches. Developers usually model decimation and interpolation correctly based on their signal processing knowledge, but the sample rates of blocks can still end up higher than the rates at which their connected signals change. Because HDL Coder translates Simulink sample rates directly into clock rates, this results in unnecessarily high clock rates in the HDL description.

To avoid unnecessarily high clock rates, we developed Timing Optimization, which checks a Simulink model for blocks whose connected signals change at a lower rate than the blocks' sample rates. When the sample rate of such a block is adjusted to the rate at which its connected signals change, HDL Coder runs the resulting HDL components at a correspondingly lower clock rate. This relaxes the timing constraints for logic synthesis and reduces power consumption. Furthermore, if a block such as a derivation block contains delay blocks and its sample rate can be reduced by a certain factor, then the number of delays can be decreased by the same factor, which leads to fewer registers in the HDL description. The rate at which signals change is determined by simulation. After the sample rates have been changed, an automated verification step simulates the model and compares the system's outputs. This is necessary because some combinations of blocks might behave differently when their sample rate is reduced.

Figure 3 shows an example of a system of two Simulink blocks. The value R is the rate at which a signal changes, which is also depicted in the signal sequences next to the signals. S is the sample rate of a block, which determines the clock rate used in the HDL description. The decimation block reduces the incoming signal rate by a factor of 2. The processing block therefore receives a signal with a change rate of 5 kHz and outputs a signal with the same rate, yet it is executed with a sample rate of 10 kHz. The processing block's sample rate can thus be reduced by a factor of 2.
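The idea behind the Figure 3 example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tool: the trace `samples`, the helper names `reduction_factor` and `run_block`, and the gain-of-3 processing block are all hypothetical, and the sketch assumes that value changes are aligned with sample index 0. It estimates the factor from a simulated trace and then repeats the simulation at the reduced rate to compare outputs, mirroring the verification step described above.

```python
from functools import reduce
from math import gcd


def reduction_factor(samples):
    """Largest integer k such that every observed value change falls on a
    multiple of k samples (assumes changes are aligned with index 0).
    Returns 1 if the trace never changes, since it then gives no evidence."""
    changes = [i for i in range(1, len(samples)) if samples[i] != samples[i - 1]]
    if not changes:
        return 1
    return reduce(gcd, changes)


def run_block(samples, step):
    """Hypothetical processing block (gain of 3) that is evaluated only every
    `step` samples; its output is held between evaluations."""
    out, held = [], None
    for i, x in enumerate(samples):
        if i % step == 0:
            held = 3 * x
        out.append(held)
    return out


# Input of the processing block, observed at its 10 kHz sample rate.
# After decimation by 2 each value is held for two samples (made-up data).
samples = [3, 3, 7, 7, 1, 1, 4, 4]
block_rate_hz = 10_000

factor = reduction_factor(samples)          # 2 for this trace
reduced_rate_hz = block_rate_hz // factor   # 5000 Hz

# Verification: the block must produce the same outputs at the reduced rate.
outputs_match = run_block(samples, 1) == run_block(samples, factor)
```

A trace in which consecutive held values happen to be equal would make the estimated factor too large, which is one reason the output comparison after the change is needed.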

**Word-length optimization**

At the beginning of a common system-level design flow, Simulink models representing digital hardware often use floating-point signals. An efficient hardware implementation, however, needs a fixed-point representation of the design that supports the required value range and precision of every signal.

The MathWorks Fixed-Point Tool, which is part of Simulink Fixed Point, assists the developer in adapting the integer bits of signals to the value ranges that are actually used. We extend this functionality with an optimization that reduces the fractional bits of signals. Since modifying the fractional bits of a signal changes its precision, we consider the impact on the output behavior of the overall system. This also makes it possible to identify and reduce blocks that have little or no impact on the system's behavior. We are looking for the most efficient hardware implementation that doesn't change the system's behavior.
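The criterion "fewer fractional bits, but no change at the system's outputs" can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the one-signal datapath `system`, its gain of 1.5, the output format with 2 fractional bits, the test inputs, and the simple downward scan, which is a stand-in for the stochastic search described next and not the authors' method.

```python
import math


def quantize(value, frac_bits):
    """Truncate `value` to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return math.floor(value * scale) / scale


def system(x, internal_frac_bits, out_frac_bits=2):
    """Hypothetical datapath: quantize the internal signal, apply a gain of 1.5,
    then quantize the result to the fixed output format."""
    internal = quantize(x, internal_frac_bits)
    return quantize(1.5 * internal, out_frac_bits)


def smallest_fraction_length(inputs, reference_bits):
    """Walk the internal fraction length downward from `reference_bits` and stop
    at the first setting whose outputs differ from the reference behavior."""
    reference = [system(x, reference_bits) for x in inputs]
    best = reference_bits
    for bits in range(reference_bits - 1, -1, -1):
        if [system(x, bits) for x in inputs] != reference:
            break
        best = bits
    return best


# Made-up stimulus; in practice this would be the model's simulation input.
inputs = [0.1, 0.37, 0.52, 0.9]
bits = smallest_fraction_length(inputs, reference_bits=12)
```

The result is only as trustworthy as the stimulus: a fraction length that leaves the outputs unchanged for these inputs may still change them for others.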

For this purpose, we use the stochastic optimization method Simulated Annealing. We explore the solution space by increasing and decreasing the integer and fractional bits of randomly chosen blocks. The resulting solution specifies word lengths for every block; its resource consumption should be minimal, and no deviation of the system's output signals is allowed.

Figure 4 shows the Simulated Annealing algorithm for our specific problem. We start with an initial SL model that contains signal values that lead to the desired signal processing behavior; this behavior serves as the reference for the rest of the optimization. In each iteration, the model is rated by estimating the hardware resource consumption of the resulting digital component, and it is checked whether the model violates the allowed signal deviation. It is then decided whether the solution is accepted: better solutions are always accepted, and worse solutions are accepted with a probability that depends on the current temperature, which prevents the algorithm from getting stuck in local optima. If the model is accepted, it replaces the current model. After that, the temperature is reduced depending on the number of iterations. If no new model has been accepted for a defined number of iterations, the algorithm stops. Otherwise, a new model is created by modifying the fixed-point type of randomly chosen blocks, and the loop starts again.
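The loop described above can be sketched generically in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the cost function (sum of word lengths as a stand-in for a resource estimate), the `violates` check (a hypothetical per-block minimum word length standing in for the simulated output comparison), and all parameter values are invented for illustration.

```python
import math
import random


def anneal(initial, cost, violates, neighbor, t_start=5.0, cooling=0.95,
           steps_per_temp=20, patience=200, seed=0):
    """Simulated annealing that stops after `patience` iterations without an
    accepted model. `initial` is assumed to satisfy the constraint."""
    rng = random.Random(seed)
    current = best = initial
    temperature, idle, iteration = t_start, 0, 0
    while idle < patience:
        iteration += 1
        candidate = neighbor(current, rng)
        delta = cost(candidate) - cost(current)
        # Better models are always accepted; worse ones with a probability
        # that shrinks as the temperature falls. Violating models never are.
        accepted = not violates(candidate) and (
            delta < 0 or rng.random() < math.exp(-delta / temperature))
        if accepted:
            current, idle = candidate, 0
            if cost(current) < cost(best):
                best = current
        else:
            idle += 1
        if iteration % steps_per_temp == 0:
            temperature *= cooling  # cool down depending on the iteration count
    return best


# Hypothetical problem: word lengths of three blocks. MIN_BITS stands in for
# "the output deviates from the reference if this block gets any narrower".
MIN_BITS = (6, 9, 4)


def cost(word_lengths):
    return sum(word_lengths)  # crude stand-in for a hardware resource estimate


def violates(word_lengths):
    return any(w < m for w, m in zip(word_lengths, MIN_BITS))


def neighbor(word_lengths, rng):
    """Change the word length of one randomly chosen block by one bit."""
    i = rng.randrange(len(word_lengths))
    changed = list(word_lengths)
    changed[i] += rng.choice((-1, 1))
    return tuple(changed)


initial = (16, 16, 16)
best = anneal(initial, cost, violates, neighbor)
```

In a real flow, `violates` would simulate the modified model and compare its outputs against the reference behavior, which makes each iteration far more expensive than in this toy setup.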


Author

Andreas Mauderer 4/3/2015 8:02:19 AM

Because of that, a further constraint was added in the later development of the Word Length Optimization so that critical integer bits are not reduced and reliable designs are generated. This leads to the following results: the area consumption of the resulting decimation filters is 27 percent higher than that of the hand-coded design, while the power consumption is 12 percent higher. Although the power and area consumption of the resulting design is now higher, the design is still much more efficient than the design generated without the developed optimizations. Furthermore, the additional area and power consumption of the generated designs can still be compensated by the reduction in design effort that the presented design flow provides.
