Emerging process technology will enable
the integration of dozens of processors onto a single chip. Such
high chip densities support the integration of complete information
and communication system-on-chip (SoC) designs. These SoCs include
the complete information-processing path starting from the analog
domain and the physical transport of bits, over network layer
processing, and down to user-level multimedia
Unfortunately, the ease with which technology can integrate
system specifications conflicts with the design process that brings
these specifications into an implementation. This conflict is
caused by each design discipline, such as DSP algorithm design,
network performance simulation, hardware synthesis, and embedded
software design, having its own tool and/or favorite design
environment. The result of this lack of common tools and design
environments is a system-level design flow consisting of a
patchwork of scripts, translators, and tools.
Successful SoC design faces several major obstacles today. There is
no single system-level environment that designers
can use throughout the design flow. While algorithms might be
designed at a high level in C, you still have to synthesize gates
from an HDL description. Each manual translation between formats, no
matter how small, is a potential error source.
Until now, system specification and initial design have
traditionally been based on natural-language documents and informal
block diagrams, possibly supplemented by point tools, such as
Matlab, for a more detailed exploration of specific design aspects.
An executable model for the complete system is typically only
constructed at the register-transfer (RT) level. However, this
approach is not feasible for a 100-million-transistor chip that you
need to design in a few months. There is a need for executable
system models at a much higher level of abstraction.
There is also a lack of reuse mechanisms. Intensive reuse of
silicon-intellectual-property (IP) blocks is the key to designing
complex systems in a short time. Reuse is not just an issue of
standards, but essentially requires new methodologies. IP block
architectural and timing constraints have to be taken into account
at the design's start. More specifically, these constraints must be
modeled at the system level.
In many cases, designers have insufficient control over the
design process. They have to accept the design produced by a
synthesis tool. The problem is that the synthesis tool is sold as
a closed box that gives designers no visibility into how it does
its job. In addition, there is no systematic verification
strategy. For most designs, there are as many testbenches as there
are tools used in the design flow.
At design-flow phases where the design representation undergoes
drastic changes, such as during a transition from Matlab to
Verilog, the development of corresponding or equivalent testbenches
is extremely hard, if it is possible at all.
Why Executable Modeling is Needed at Different Abstraction Levels
The need for a common language framework in which every aspect
of an SoC design can be captured is only one part of a successful
design. Several factors, inherent to the growing complexity of
advanced information and communication systems, complicate the
specification and design.
Typically, SoCs are increasingly heterogeneous, requiring
different modeling paradigms at the conceptual and implementation
levels. At the conceptual level, different models, such as dataflow
and synchronous, are appropriate for different parts of the system.
At the implementation level, the system—consisting of
general-purpose processors, application-specific processors,
on-chip busses, memories, reconfigurable hardware, and dedicated
hardware—has very substantial software content. Consequently,
you need a mix of formalisms to efficiently model such
heterogeneous systems at different abstraction levels.
The exponential increase of data and media communications
systems requires the integration of high-speed networking onto
silicon. This integration introduces a new breed of problems in the
design flow. Dynamic memory management and dynamic processes are
needed to implement such functionality. This requirement makes
timing (or scheduling) and resource issues harder to analyze, but
at the same time makes these issues more important.
The vast majority of new designs are variations or combinations
of existing designs. When a complete system is integrated on a
chip, you can no longer create new versions by replacing some components.
Instead, you need to modify a more abstract model of the system.
Documentation of the design decisions and trade-offs made for the
original design is necessary to guide such modifications.
Finally, advanced complex systems are often created under
changing requirements and even in the presence of unstable
standards. It is usually difficult or even impossible to completely
specify the required system functionality before starting the
design effort. IMEC developed its design environment, SoC++,
starting from the belief that system design should be based on an
executable model in which the designer can concurrently refine
functionality, structure, and timing from concept to
implementation. The executable model should support different
modeling styles and be easily extensible.
SoC++ is based on object-oriented system design using the
standard ANSI C/C++ language. SoC++ provides a powerful and
flexible modeling environment supporting different abstraction
levels. The modeling primitives to efficiently construct executable
system-level descriptions are provided by C++ extended with class
libraries. A time-aware multi-thread library, TIPSY, supports
concurrency and time in an executable model of an SoC.
SoC++ Design Flow
System design starts with a system-requirement specification
phase followed by an architecture-definition phase. The system is
divided into subsystems. For every subsystem, an analysis is made.
These phases are typically covered by OMT/UML-based
methodologies. The next step is designing every subsystem. Here,
hardware and software design follow very different design flows.
However, C++-based environments exist for both, resulting
in executable specifications.
Figure 1: System-level design flow
The starting point in IMEC's SoC++ is a unified model to
represent both hardware and software, allowing a fast exploration
of hardware/software partitioning alternatives (Figure 2). This
unified modeling has the advantage that switching a process from
hardware to software implementation does not require a code
rewrite. The model is executable and can be refined to either
hardware or software. It is the most abstract—yet
executable—level of description and is called uncommitted
parallel processes. Uncommitted means that the model has maximum
parallelism (all threads run virtually in parallel on different
processors), zero execution time, and unlimited communication
resources (a dedicated channel for every communication).
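The idea that repartitioning does not require a code rewrite can be illustrated with a minimal C++ sketch. It is not the SoC++ API: `Process`, `Target`, `simulate`, and `commit` are hypothetical names chosen for illustration, and the model is reduced to pure functions. The point is only that the behavior is written once, while the hardware or software choice is an annotation on top of it.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// All names here are hypothetical; this is not the SoC++ API.
enum class Target { Uncommitted, Hardware, Software };

struct Process {
    std::string name;
    std::function<int(int)> behavior;  // functional description, target independent
    Target target;                     // implementation annotation only
};

// Simulate the uncommitted model: execution takes zero time and the
// target annotation is ignored, so only functionality is observed.
std::vector<int> simulate(const std::vector<Process>& system, int input) {
    std::vector<int> outputs;
    for (const Process& p : system) outputs.push_back(p.behavior(input));
    return outputs;
}

// Moving a process to hardware or software changes the annotation,
// never the behavior code.
void commit(Process& p, Target t) {
    assert(t != Target::Uncommitted);
    p.target = t;
}
```

In this sketch, simulating the system before and after calling `commit` on its processes yields the same outputs, which is the property the unified model relies on during partitioning exploration.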
Figure 2: SoC++ design flow
The goal of the implementation-refinement process is to arrive
at a fully committed model. Within this model, all processes are
annotated with realistic execution times, processes are allocated
to processors with priorities defined, and all inter-process
communication channels are allocated to a communication resource.
The refinement process maps intra-processor communications onto an
RTOS-based implementation, whereas inter-processor communications
require hardware-software interfaces.
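The definition of a fully committed model amounts to a checklist, which can be sketched in C++ as follows. The record types and field names (`ProcessAnnotation`, `ChannelAnnotation`, `Model`) are assumptions made for this example and do not come from SoC++; a negative value stands for an annotation that has not been made yet.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical annotation records; negative values mean "not decided yet".
struct ProcessAnnotation {
    std::string name;
    double exec_time_us;  // realistic execution time
    int processor;        // processor the process is allocated to
    int priority;         // priority on that processor
};

struct ChannelAnnotation {
    std::string name;
    int resource;         // communication resource carrying the channel
};

struct Model {
    std::vector<ProcessAnnotation> processes;
    std::vector<ChannelAnnotation> channels;
};

// Names of all processes and channels that still block full commitment.
std::vector<std::string> uncommitted_items(const Model& m) {
    std::vector<std::string> open;
    for (const ProcessAnnotation& p : m.processes)
        if (p.exec_time_us < 0 || p.processor < 0 || p.priority < 0)
            open.push_back(p.name);
    for (const ChannelAnnotation& c : m.channels)
        if (c.resource < 0) open.push_back(c.name);
    return open;
}

bool fully_committed(const Model& m) { return uncommitted_items(m).empty(); }

// Allocate one channel to a communication resource.
void allocate_channel(Model& m, std::size_t index, int resource) {
    assert(index < m.channels.size() && resource >= 0);
    m.channels[index].resource = resource;
}
```

A refinement step then consists of filling in one more annotation and re-running the simulation, until `uncommitted_items` returns an empty list.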
From the fully committed communicating-process model, you need
to generate application-software code for every software processor.
For hardware targets, SoC++ automatically generates synthesizable
VHDL or Verilog code. Throughout this design process, you can
simulate the complete system—including models of the hardware
at different abstraction levels—to verify functionality,
timing, and performance (Figure 3). To support this easy
construction of executable system specifications, C++ has been
extended with different class libraries, such as OCAPI, SoCOS, and
TIPSY.
Figure 3: Levels of abstraction in SoC++
Hardware Design of Complex Digital Systems
As discussed previously, many bottlenecks exist in the hardware
design of complex digital systems. To solve these problems, IMEC
developed the OCAPI library. Applications of OCAPI include hardware
design of DSP functions and protocol processors.
Capturing behavior at a high level is essential for doing
algorithmic design and exploration. OCAPI initially captures
behavior as a dataflow description, much in the same way as an
environment like Synopsys' COSSAP or Cadence's SPW captures
behavior. One difference is that the system description in OCAPI is
a C++ program, where other design environments are
OCAPI also supports detailed hardware
design, similar to traditional HDL environments. To provide this
capability, OCAPI includes a set of objects that allow describing
hardware at the RT level. In addition, these objects can be
co-simulated with a high-level dataflow description. Furthermore,
you can automatically translate a C++ description made in terms of
OCAPI objects to VHDL. This avoids making manual translations
between equivalent design representations. In addition, the
translation includes the generation of VHDL testbenches and test
vectors, which you can use to repeat HDL simulations along with the
OCAPI supports incremental refinement, allowing a smooth
transition from pure behavioral descriptions down to architecture
descriptions. You can co-simulate dataflow and architecture
descriptions. In addition, you can also freely mix floating-point
and fixed-point datatypes.
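Mixing floating-point and fixed-point datatypes in one description can be illustrated with a C++ template. The sketch below does not use OCAPI: `Fixed`, `to_double`, and `fir` are hypothetical helpers written for this example. It shows a single FIR description that can be instantiated with `double` for algorithmic exploration and with a fixed-point type for a refinement closer to hardware.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal signed fixed-point value with FRAC fractional bits
// (hypothetical helper, not an OCAPI type; no overflow handling).
template <int FRAC>
class Fixed {
public:
    Fixed() : raw_(0) {}
    Fixed(double v) : raw_(std::llround(v * static_cast<double>(1LL << FRAC))) {}
    double to_double() const { return static_cast<double>(raw_) / (1LL << FRAC); }
    Fixed operator+(Fixed o) const { return from_raw(raw_ + o.raw_); }
    // The product is rescaled by division, which truncates toward zero.
    Fixed operator*(Fixed o) const { return from_raw(raw_ * o.raw_ / (1LL << FRAC)); }
private:
    static Fixed from_raw(std::int64_t r) { Fixed f; f.raw_ = r; return f; }
    std::int64_t raw_;
};

inline double to_double(double v) { return v; }
template <int FRAC>
double to_double(Fixed<FRAC> v) { return v.to_double(); }

// One FIR description; T selects floating-point or fixed-point arithmetic.
template <typename T>
std::vector<double> fir(const std::vector<double>& coeff,
                        const std::vector<double>& x) {
    assert(!coeff.empty());
    std::vector<double> y;
    for (std::size_t n = 0; n < x.size(); ++n) {
        T acc = T(0.0);
        for (std::size_t k = 0; k < coeff.size() && k <= n; ++k)
            acc = acc + T(coeff[k]) * T(x[n - k]);
        y.push_back(to_double(acc));
    }
    return y;
}
```

Calling `fir<double>` and `fir<Fixed<8> >` on the same stimulus gives two result vectors that can be compared directly, which is one simple way to judge whether a chosen word length is sufficient.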
Effective IP Reuse
In traditional hardware-design environments, reuse focuses on
the structural level, which creates some problems. A component is
made reusable by matching it to a standard interface. This
interface defines input/output signals, these signals' timing
interface, and other characteristics. With this interface, the
component's internals are hidden as designer IP, while the
component can still be reused. However, reuse is a matter of
reusing functionality, not structure. A component can often be
almost reused, but requires additional encapsulation to match the
desired behavior for the new system. For instance, a digital filter
can have the ideal characteristics and performance for a modem
system, but will contain a serial coefficient programming input
instead of the required parallel one for the modem.
Next, structural reuse seals the behavior of a component in a
closed box behind the reuse interface. You can only manipulate this
reused behavior indirectly through this interface. For instance,
introducing a wait state in the operation of a memory controller
might require cumbersome interface manipulations. Current
hardware-development environments are good at capturing,
simulating, and synthesizing hardware components. These
environments, however, do a poor job of manipulating the same
descriptions. For example, VHDL defines a component as an entity
with a well-defined port set. It is not possible to strip
the entity's ports depending on an external design condition.
SoC++ solves these reuse limitations by using the
object-oriented features of C++. In an object-oriented environment,
reuse is a natural way of doing design. For hardware design, you
can have datapath elements as register objects. You can also have
more abstract control elements, such as finite-state-machine
objects. You can create new objects either by relying on genuine
C++ mechanisms such as inheritance, object composition, or
templates, or by manipulating an existing object hierarchy. In this
way, the reuse interface moves from a structural level to a
behavioral level—to the set of objects you use to describe
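The digital-filter example mentioned earlier in this section can be turned into a small C++ sketch of behavioral reuse through inheritance. The classes `SerialCoeffFilter` and `ParallelCoeffFilter` are hypothetical and are not part of SoC++ or OCAPI. The existing filter only accepts coefficients serially, and the derived class adds the parallel load the modem needs without touching the filter behavior itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Existing block (hypothetical): coefficients are shifted in one at a time.
class SerialCoeffFilter {
public:
    explicit SerialCoeffFilter(std::size_t taps) : coeff_(taps, 0.0), next_(0) {
        assert(taps > 0);
    }
    virtual ~SerialCoeffFilter() {}
    void shift_in_coefficient(double c) {
        coeff_[next_] = c;
        next_ = (next_ + 1) % coeff_.size();
    }
    std::size_t taps() const { return coeff_.size(); }
    // Direct-form FIR over the most recent samples (newest sample first).
    double step(double sample) {
        history_.insert(history_.begin(), sample);
        if (history_.size() > coeff_.size()) history_.pop_back();
        double acc = 0.0;
        for (std::size_t k = 0; k < history_.size(); ++k)
            acc += coeff_[k] * history_[k];
        return acc;
    }
private:
    std::vector<double> coeff_;
    std::size_t next_;
    std::vector<double> history_;
};

// Reuse by inheritance: the derived class adds a parallel coefficient
// load on top of the unchanged serial behavior. It assumes the serial
// write position is at tap 0, as it is for a freshly built filter.
class ParallelCoeffFilter : public SerialCoeffFilter {
public:
    explicit ParallelCoeffFilter(std::size_t taps) : SerialCoeffFilter(taps) {}
    void load_coefficients(const std::vector<double>& all) {
        assert(all.size() == taps());
        for (double c : all) shift_in_coefficient(c);
    }
};
```

The same adaptation could also be written with object composition or a template parameter; inheritance is used here only because it keeps the example short.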
SoCOS: Middleware for SoC Design
The SoCOS library supports modeling, simulation, and analysis of
SoC designs during the system-level design phases. The library also
supports an efficient refinement path for embedded software. SoCOS
provides the glue to combine different computational models at
mixed abstraction levels in an executable specification—the
library serves as middleware enabling a seamless SoC design.
Embedded software makes SoC designs essentially dynamic, which
is why an efficient SoC modeling environment must include dynamic
behavior. Such behavior is analogous to the services an operating
system offers in the software world, hence the term System-on-Chip
Operating System (SoCOS).
In addition, the designer needs to model the real-time aspects
of the system in an early phase of the design. SoCOS supports
executable modeling of both functional and timing behavior of
dynamic real-time multi-threaded systems. A system model in SoCOS
consists of communicating processes. Each process is executed in a
separate thread in the OS. You can create processes statically or
dynamically. Processes have communication ports connected to
communication channels. You can also statically or dynamically
create these communication channels.
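A rough C++ sketch can make the process, port, and channel structure more concrete. It is not the SoCOS API: `Channel`, `Port`, `Process`, and `System` are hypothetical names, and the sketch deliberately replaces the one-thread-per-process execution described above with simple cooperative rounds so that it stays self-contained. What it does show is that processes and channels can be created both up front and while the model is running.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

struct Channel {                        // unbounded FIFO of integer tokens
    std::deque<int> fifo;
};
typedef std::shared_ptr<Channel> Port;  // a port is a handle on a channel

class System;
struct Process {
    Port in, out;
    std::function<void(Process&, System&)> body;  // one activation
};

class System {
public:
    Port make_channel() { return std::make_shared<Channel>(); }

    // Usable before simulation (static) or from a process body (dynamic).
    void spawn(Port in, Port out, std::function<void(Process&, System&)> body) {
        assert(static_cast<bool>(body));
        pending_.push_back(Process{in, out, std::move(body)});
    }

    // Activate every process once. Processes spawned during a round are
    // parked in pending_ and start running in the next round.
    void round() {
        for (Process& p : pending_) processes_.push_back(std::move(p));
        pending_.clear();
        for (Process& p : processes_) p.body(p, *this);
    }

    std::size_t process_count() const {
        return processes_.size() + pending_.size();
    }
private:
    std::vector<Process> processes_, pending_;
};
```

A process body that calls `make_channel` and `spawn` on the `System` it receives creates new channels and processes dynamically, for example one worker per incoming request.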
SoCOS supports a combination of several computational models in
a single executable specification. A designer can distinguish
asynchronous, reactive, and synchronous models. Asynchronous models
communicate with the rest of the system without any relationship to
a system-wide clock. Reactive models contain processes that are
triggered by an event on a communication channel. These models are
very similar to interrupt routines in software and are dynamic by
nature. Synchronous models communicate with the rest of the system
on clock edges. SoCOS allows simulating OCAPI hardware models
together with SoCOS software models. In addition, SoCOS supports
performance verification and refinement of both concurrency and
communication aspects of a design.
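The difference between the reactive and the synchronous model can be sketched in a few lines of C++. The classes `EventChannel` and `ClockedSignal` are hypothetical and only illustrate the two communication styles; they are not SoCOS classes.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Reactive style: a write to the channel immediately triggers the
// attached handlers, much like an interrupt routine in software.
class EventChannel {
public:
    void on_event(std::function<void(int)> handler) {
        assert(static_cast<bool>(handler));
        handlers_.push_back(std::move(handler));
    }
    void write(int value) {
        for (std::function<void(int)>& h : handlers_) h(value);
    }
private:
    std::vector<std::function<void(int)> > handlers_;
};

// Synchronous style: a written value becomes visible to readers only
// at the next clock edge.
class ClockedSignal {
public:
    ClockedSignal() : current_(0), next_(0) {}
    void write(int value) { next_ = value; }
    int read() const { return current_; }
    void clock_edge() { current_ = next_; }
private:
    int current_, next_;
};
```

An asynchronous model would correspond to a plain queue between processes, with no handler and no clock edge involved.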
After refining the scheduling and communication of the embedded
software, you generate RTOS-based embedded software code (Figure
4). In this step, every SoCOS system call, present in the
communicating process model, is replaced by a corresponding piece
of code based on an RTOS library. In this translation, you
guarantee the behavior to be consistent, while the implementation
overhead of SoCOS is replaced by an efficient RTOS implementation.
Finally, the embedded software code is co-simulated in the system
model using an OCAPI library. OCAPI provides the functionality of a
typical RTOS to the application code. This functionality is
implemented on top of the underlying SoCOS simulation environment.
This approach can be used to co-simulate either existing software
code or generated code with the rest of the system model. The
simulation is useful both as a final verification of
embedded-software functionality and as a reference before
transferring the software to the target processor.
Figure 4: Software refinement path
Supporting Concurrency and Time
Modeling of functionality, structure, and timing of a
heterogeneous system at different levels of abstraction requires a
mix of features that cannot be found in any existing language. In
addition, plain C++ system models are often inflexible due to the
centralized representation of concurrency and time—typically
a loop in the main program. To solve this, IMEC extended C++ with
the class library TIPSY (TImed Parallel SYstem modeling). TIPSY
supports a decentralized notion of concurrency and time in an
executable model of an SoC.
A limitation in traditional languages and environments for
system modeling is that they usually stress the specification of
functionality at a high level of abstraction without taking into
account architecture and timing. However, industrial experience
shows that architecture and timing are equally important at the
beginning of the design process. TIPSY encourages modeling of
architecture and timing in an initial abstract model and then
supports concurrent refinement of functionality, architecture, and
timing. This approach to system design is similar to solving a
jigsaw puzzle: key pieces such as border pieces and pieces with
easily recognizable features are placed first with the remaining
holes filled later.
TIPSY models concurrency by using non-preemptive multithreading.
Threads or tasks can dynamically spawn and kill other tasks, and
can block (suspend) until resumed by another task or a timeout. For
efficiency reasons, each task has a local copy of the current time
and tasks synchronize their local times only when they communicate.
Inter-task communication is based on shared data, for the practical
reason that it is the basic communication mechanism available in a
multi-threading environment. Shared data allows an efficient
implementation of other communication primitives. Before accessing
shared data, a task must synchronize its local copy of the current
time with that of the other tasks. This ensures that shared data
accesses are executed in correct time order.
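The rule that tasks keep a local time and synchronize only around shared data can be approximated by the following C++ sketch. It is not TIPSY code: `Task`, `delay`, and `run` are hypothetical, and each task is simplified to a list of run-to-completion steps that each begin with a shared-data access. Because local times never decrease, always running the task with the smallest local time guarantees that no earlier access can still appear.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Task {
    long local_time;                                 // task-local copy of "now"
    std::vector<std::function<void(Task&)> > steps;  // never preempted
    std::size_t next;                                // index of the next step
};

// Advancing time touches only the local copy; no global synchronization.
void delay(Task& t, long dt) {
    assert(dt >= 0);
    t.local_time += dt;
}

// Cooperative kernel: every step starts with a shared-data access, so the
// kernel always resumes the runnable task with the smallest local time.
// Ties are resolved in favor of the task listed first.
void run(std::vector<Task>& tasks) {
    for (;;) {
        Task* pick = nullptr;
        for (Task& t : tasks)
            if (t.next < t.steps.size() &&
                (!pick || t.local_time < pick->local_time))
                pick = &t;
        if (!pick) return;  // all tasks have finished
        std::size_t i = pick->next++;
        pick->steps[i](*pick);
    }
}
```

If every step records the task's local time in a shared log before calling `delay`, the log comes out in non-decreasing time order even though no task ever consulted a global clock.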
TIPSY is the foundation for a rich set of C++ libraries
supporting different modeling paradigms that can be easily combined
using TIPSY's common notion of concurrency and time (Figure 5).
Moreover, TIPSY supports the modeling of RTOS schedulers and
preemption in a very efficient way. This becomes increasingly
important, since a growing part of an SoC design
consists of software running on an embedded processor under the
control of an RTOS.
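To give an impression of what modeling an RTOS scheduler with preemption involves, here is a generic C++ sketch of fixed-priority preemptive scheduling in model time. It is not how TIPSY implements it; `Job` and `simulate_fixed_priority` are hypothetical, and time advances in unit ticks for simplicity.

```cpp
#include <cassert>
#include <vector>

struct Job {
    long release;    // model time at which the job becomes ready
    long remaining;  // execution time still needed, in ticks
    int priority;    // larger value means higher priority
    long finish;     // completion time, filled in by the simulation
};

// Tick-based fixed-priority preemptive scheduling on one processor.
// At every tick the ready job with the highest priority runs, so a newly
// released high-priority job preempts a running low-priority one.
void simulate_fixed_priority(std::vector<Job>& jobs) {
    long now = 0;
    for (;;) {
        Job* running = nullptr;
        bool pending = false;
        for (Job& j : jobs) {
            assert(j.remaining >= 0);
            if (j.remaining == 0) continue;
            pending = true;
            if (j.release <= now &&
                (!running || j.priority > running->priority))
                running = &j;
        }
        if (!pending) return;  // every job has completed
        if (running && --running->remaining == 0) running->finish = now + 1;
        ++now;                 // the processor idles if nothing is ready
    }
}
```

With a low-priority job of 5 ticks released at time 0 and a high-priority job of 2 ticks released at time 2, the high-priority job finishes at time 4 and the preempted low-priority job at time 7.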
Executable specifications based on TIPSY give the designer early
feedback about the functional and timing correctness of the design.
These specifications also allow the designer to check the
consistency of the system before and after design refinements.
Figure 5: TIPSY is the foundation for multiple modeling
SoC++ Design of an ADSL Modem
Throughout its development, the applicability of the SoC++
environment has been tested on several demonstrator designs that have
been brought to working implementations. This is an essential
factor in the success of SoC++.
The demonstrator designs are related to IMEC's Intelligent Home
project. The designs include a cable modem, an image-compression
unit, a wireless LAN modem, and an Internet appliance. Several
embedded-software design projects are finished, including the design
of an industrial-strength firmware simulation of an ADSL
(Asymmetric Digital Subscriber Line) modem.
The ADSL modem (Figure 6) contains a number of DSP functions,
implemented in hardware, and a number of initialization and
synchronization functions, implemented in software, on an embedded
ARM processor. The hardware functions are modeled in approximately
15,000 lines of C++ code. The software functions are modeled using
a reactive computational model in approximately 20,000 lines of C++
code. The model allows verification of the correct operation of the
initialization and run-time behavior of the ADSL system. The modem
initialization sequence takes 10 seconds in real time. The model
execution of these 10 seconds takes less than 30 minutes of CPU
time. Because of its efficiency, you can use the model to explore
different embedded-software implementations.
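The quoted figures put an upper bound on the slowdown of the model relative to real time. Writing T_CPU for the CPU time of the simulation and T_real for the simulated real time (both symbols are introduced here for illustration):

```latex
\[
\frac{T_{\mathrm{CPU}}}{T_{\mathrm{real}}}
  < \frac{30 \times 60\,\mathrm{s}}{10\,\mathrm{s}} = 180
\]
```

In other words, the executable model runs less than 180 times slower than the real modem during initialization.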
Figure 6: Architecture of an executable model of ADSL
Future work in SoC++ concentrates on two areas. First, SoC++
needs to improve system-level specification paradigms. While SoC
design already has a rich set of specification paradigms, the
coupling among these paradigms is not worked out very well, and is
restricted to co-simulation. IMEC focuses on co-design with
different paradigms and approaches this problem as object-oriented
The second area of desired innovation relates to refinement
strategies for mixed hardware/software systems. Recently there has
been increasing diversification of SoC architectures. New classes
of processors and processing systems are announced every few weeks.
These systems go beyond the classic view of layered, hierarchical
software and hardware systems. The systems contain reconfigurable
elements in their basic set of operators, which allow tuning of
communication and computing architectures at runtime. The future
SoC++ environment will introduce support for runtime
reconfiguration both at the hardware and software design
About the Author
Katrien Marent has an engineering degree in
microelectronics. She joined IMEC in 1992 as an analog designer,
specializing in the design of low-noise readout electronics for
high-energy physics. She is currently a scientific editor at IMEC
and responsible for authoring and editing the research
organization's numerous technical documents and publications.