In the coming decade, SDRs will drive all types of wireless devices, answering the exploding demand for multi-standard, high-throughput wireless communication. Such SDRs will have rigorous constraints for energy consumption, real-time processing, low-cost fabrication, and short time-to-market design. They will be implemented on multi-purpose, multi-processor System-on-Chips (MPSoCs). But the design of SDRs on MPSoCs brings about a dramatic increase in the complexity of hardware and software design.
IMEC is an independent research center, focusing on next-generation chip technology and on the enabling technologies for ambient intelligence. IMEC's research bridges the gap between fundamental research at universities and development in the industry. IMEC has its headquarters in Leuven (Belgium), and employs more than 1500 people, including 500 industrial residents and guest researchers. One of the research domains of IMEC concerns solving the technology bottlenecks for future wireless, multi-mode, multimedia devices.
A team at IMEC recently cleared this design complexity hurdle. They designed and demonstrated an SDR on an MPSoC, using advanced methods such as electronic system-level (ESL) design and co-emulation. The team first created a high-level virtual model of the SDR MPSoC. Then, each component of the platform was incrementally refined to the register-transfer level (RTL), with each step verified through co-simulation and co-emulation. State-of-the-art processor design tools were used to further model one of the critical low-power processors of the MPSoC. The ESL tools were also used to optimize the data transfers between the processing cores so that they meet the tight timing constraints of baseband processing. The ESL tools helped to achieve an efficient design, especially through architectural exploration and early performance assessment.
Designing SDR MPSoCs calls for ESL tools
In general, an SDR is a reconfigurable and programmable hardware platform that can potentially tune to any frequency band and receive any modulation. The SDR solution that IMEC envisions has a reconfigurable front-end combined with a (re)programmable baseband platform. It targets 802.11n, 802.16e, and 3GPP-LTE.
Potentially, an SDR consumes much more energy than dedicated hardware solutions. To overcome this, a trade-off had to be made for each functional unit between hardware and software solutions, carefully weighing the resulting energy efficiency. Hardware abstractions could only be introduced when the impact on the overall energy consumption was low, or when the extra flexibility could be exploited for improved energy management. These considerations naturally led to an MPSoC architecture in which the various tasks could be implemented on different cores, providing the necessary performance and flexibility at a minimum cost.
Traditionally, chips are designed close to the physical level. For MPSoCs, this will not do, as the physical design does not allow early verification and tuning. ESL is a design approach that raises the abstraction to the highest level of the target platform. This allows the hardware and software choices to be verified early, long before a complete RTL model or a silicon prototype is available. Therefore, IMEC chose to use advanced ESL design methods. Another consideration was that commercial SDR MPSoCs will be designed under rigorous time-to-market, energy, and real-time processing constraints, which will also call for ESL design tools.
ESL allows high-level, selective abstraction
The IMEC team first created a virtual model of the target SDR platform. This is a transaction-level model (TLM), specified in SystemC. For each component, a suitable abstraction level was chosen, depending on the IP availability, the complexity, and the speed requirements for the simulation. A major plus of the tools used is that components defined on different abstraction levels can still be co-simulated. This allows designing and refining each component independently.
With that early version of the virtual SDR ready, work was started on the hardware-dependent software. This software, including the hardware abstraction layer, enables basic simulated hardware/software validation. It also provides early feedback on the performance of the hardware/software choices and interaction. At this stage, this feedback could already be used to optimize the interconnect and the handling of interrupt events.
In the next step, the development of the functional software was initiated. The programmable and non-programmable cores on the platform are either custom processing units modeled with Application-Specific Instruction-set Processor (ASIP) development kits or third-party IPs, for example ARM Instruction Set Simulators (ISS). Each ISS is placed in a SystemC container with a bus interface and is simulated in the context of the SystemC platform simulation. In that way, multiple cores and their synchronization can be debugged concurrently.
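The idea of wrapping each ISS in a container with a bus interface can be illustrated with a small, self-contained sketch. It is written in plain Python rather than SystemC so that it runs without a simulator installation, and everything in it (`SharedMemory`, `TinyISS`, `CoreContainer`, `run`, the three-instruction toy ISA) is a hypothetical stand-in, not the IMEC platform or any real ISS API. It only shows how two instruction-set simulators can be stepped in lockstep against a shared interconnect so that their interaction can be observed together.

```python
class SharedMemory:
    """Hypothetical stand-in for the platform bus plus one shared memory."""
    def __init__(self):
        self.words = {}

    def read(self, address):
        return self.words.get(address, 0)

    def write(self, address, value):
        self.words[address] = value


class TinyISS:
    """Toy instruction-set simulator: one accumulator, three instructions."""
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.acc = 0

    def done(self):
        return self.pc >= len(self.program)

    def step(self, bus):
        op, arg = self.program[self.pc]
        if op == "load":
            self.acc = bus.read(arg)
        elif op == "addi":
            self.acc += arg
        elif op == "store":
            bus.write(arg, self.acc)
        else:
            raise ValueError(f"unknown instruction: {op}")
        self.pc += 1


class CoreContainer:
    """Wraps an ISS and connects it to the bus (assumed wrapper shape)."""
    def __init__(self, name, iss, bus):
        self.name = name
        self.iss = iss
        self.bus = bus

    def clock(self):
        if not self.iss.done():
            self.iss.step(self.bus)


def run(containers):
    """Step all cores in lockstep, one instruction per core per cycle."""
    cycles = 0
    while any(not c.iss.done() for c in containers):
        for c in containers:
            c.clock()
        cycles += 1
    return cycles


memory = SharedMemory()
producer = CoreContainer("producer", TinyISS([("addi", 41), ("store", 0x10)]), memory)
# The consumer idles for two cycles so that the producer's store lands first.
consumer = CoreContainer(
    "consumer",
    TinyISS([("addi", 0), ("addi", 0), ("load", 0x10), ("addi", 1), ("store", 0x14)]),
    memory,
)
cycles = run([producer, consumer])
print(cycles, memory.read(0x14))
```

In a real platform the two idle instructions would be replaced by a proper synchronization mechanism such as an interrupt or a semaphore; stepping both cores in one simulation is what makes such synchronization bugs visible.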
Throughout the design and refinement of the virtual SDR platform, the model of the bus was kept at cycle accurate level. This bus offered a correct and well-defined interface to attach units of various abstraction levels (for example peripherals). These units could then be verified and refined, starting with a model of their functional behavior only. Later, exact timing information was added to match the real requirements. The next step, exploring the platform to optimize its performance, could then be performed independently from the rest of the platform development.
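A minimal sketch of this refinement path, assuming a very simplified bus model: a unit is first attached as a purely functional model and later replaced by a timed model behind the same interface. The classes `Transaction`, `FunctionalMemory`, `TimedMemory`, and `Bus`, the address map, and the latency values are all illustrative assumptions. The sketch is written in Python rather than SystemC, and the bus only counts cycles coarsely instead of being truly cycle accurate.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One bus transfer: a read or a write of a single word."""
    address: int
    data: int = 0
    is_write: bool = False


class FunctionalMemory:
    """Functional behavior only: no timing information yet."""
    def __init__(self):
        self.cells = {}

    def transport(self, txn):
        if txn.is_write:
            self.cells[txn.address] = txn.data
        else:
            txn.data = self.cells.get(txn.address, 0)
        return 0  # cycles consumed by the unit


class TimedMemory(FunctionalMemory):
    """Refined model: same behavior plus a fixed access latency."""
    def __init__(self, latency_cycles):
        super().__init__()
        self.latency_cycles = latency_cycles

    def transport(self, txn):
        super().transport(txn)
        return self.latency_cycles


class Bus:
    """Routes transfers by address range and accumulates elapsed cycles."""
    def __init__(self, cycles_per_transfer=1):
        self.targets = []  # list of (base, size, unit)
        self.cycles_per_transfer = cycles_per_transfer
        self.cycles = 0

    def attach(self, base, size, unit):
        self.targets.append((base, size, unit))

    def transport(self, txn):
        for base, size, unit in self.targets:
            if base <= txn.address < base + size:
                local = Transaction(txn.address - base, txn.data, txn.is_write)
                self.cycles += self.cycles_per_transfer + unit.transport(local)
                txn.data = local.data
                return
        raise ValueError(f"no unit mapped at address {txn.address:#x}")


bus = Bus()
bus.attach(0x0000, 0x1000, FunctionalMemory())            # still untimed
bus.attach(0x1000, 0x1000, TimedMemory(latency_cycles=3))  # already refined

bus.transport(Transaction(0x0010, data=42, is_write=True))
bus.transport(Transaction(0x1004, data=7, is_write=True))
read_fast = Transaction(0x0010)
read_slow = Transaction(0x1004)
bus.transport(read_fast)
bus.transport(read_slow)
print(read_fast.data, read_slow.data, bus.cycles)
```

Because both memory models expose the same `transport` call, swapping the functional model for the timed one changes the cycle count but not the software-visible behavior, which is the property that lets each unit be refined independently.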
Platform exploration consists of gradually identifying and removing bottlenecks. With a virtual model of the platform, the IMEC team could start platform exploration early in the design flow. Had they designed the SDR directly at the HDL level, it would have been much more difficult and time-consuming to evaluate changes in the interconnect and platform architecture and to relate them to changes in performance.
Figure 1. Design schema of the SDR.
To explore the platform, IMEC examined the performance by simulating the real behavior of the selected IPs and bus connectivity. The interconnect architecture was optimized by introducing various bus architectures and progressively searching for the ideal configuration. The team assessed, for example, whether a further segmentation of the interconnect would improve the performance, or on which interconnect segment a particular unit should be placed to get the best access time. At each step, the bus throughput and utilization were evaluated. In addition, a simple software test bench was developed on the ARM ISS for profiling and interconnect stress tests. The test bench emulates the interconnect requests at maximum load.
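The kind of comparison described here can be sketched with a deliberately crude model. The function `evaluate`, the unit and segment names, and the request mix below are all hypothetical; the sketch assumes that segments operate in parallel, that each segment serializes its own transfers, and that requests arrive back to back (maximum load). It is not the analysis the team ran, only an illustration of how throughput and utilization can be compared across candidate interconnect configurations.

```python
from collections import Counter


def evaluate(placement, requests, cycles_per_transfer=1):
    """Estimate throughput and per-segment utilization at maximum load.

    placement maps each unit to the interconnect segment it sits on;
    requests lists the units accessed by the stress test.
    """
    busy = Counter()
    for unit in requests:
        busy[placement[unit]] += cycles_per_transfer
    total_cycles = max(busy.values())  # the busiest segment sets the pace
    return {
        "total_cycles": total_cycles,
        "throughput": len(requests) / total_cycles,
        "utilization": {seg: b / total_cycles for seg, b in busy.items()},
    }


# Hypothetical stress-test traffic: 6 memory, 3 DMA, and 1 UART access.
requests = ["memory"] * 6 + ["dma"] * 3 + ["uart"]

candidates = {
    "single": {"memory": "seg0", "dma": "seg0", "uart": "seg0"},
    "split": {"memory": "seg0", "dma": "seg1", "uart": "seg1"},
}
results = {name: evaluate(p, requests) for name, p in candidates.items()}
best = min(results, key=lambda name: results[name]["total_cycles"])

for name, result in results.items():
    print(name, result)
print("best:", best)
```

In this toy case the split interconnect finishes the same traffic in fewer cycles because the memory accesses no longer compete with the peripheral accesses. A real exploration would also account for bridge latency between segments, arbitration, and burst transfers.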
Gradual RTL refinement with co-simulation and co-emulation
Gradually, and for each unit separately, IMEC refined the design down to the RT level. The rest of the platform was kept at TLM level, serving as a test bench to verify the RTL units. In this co-simulation setup, the TLM platform simulator in SystemC was the master, calling the RTL simulator to simulate the units already at RT level.
In this way, the TLM and RTL blocks could be co-simulated in the early stage of the RTL refinement, when most components were still defined at TLM level. However, as more blocks were specified at RT level, the speed of the co-simulations rapidly became a bottleneck, reducing the amount of validation that could be achieved. To overcome this issue, the team then changed the test setup, employing co-emulation.
Usually, RTL modules are verified with pure hardware emulation. However, this would require developing synthesizable RTL test benches. Co-emulation avoids this step. It is a new verification technique in which SystemC TLM blocks are simulated on a PC, and the RTL blocks are simulated on a dedicated emulation station. The two simulation environments communicate via transactors. These transactors, or communication pipes, are implemented as function calls between the SystemC platform and the SystemVerilog Direct Programming Interface (DPI) of the RTL simulation.
During a transaction, the simulation control is temporarily transferred from the emulator station to the SystemC environment on the PC. Simultaneously, the emulator hardware is stopped through clock gating. Next, the SystemC simulator executes the function called through the transactor. This function may, in turn, trigger events that start other processes or threads. When the original function finishes, the emulator station resumes functioning. As the emulator hardware clock was stopped during the transaction, the emulator sees the SystemC function returning immediately (the execution time is zero).
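The zero-time behavior of such a transaction can be illustrated with a small sketch. It is a conceptual model in Python, not the DPI mechanism itself: `Emulator`, `Transactor`, `clock_edge`, and `host_checksum` are hypothetical names, and the 1000 suppressed clock edges simply stand for the wall-clock time that passes on the PC while the emulator clock is gated.

```python
class Emulator:
    """Hypothetical emulator model: counts clock cycles unless gated."""
    def __init__(self):
        self.cycle = 0
        self.clock_enabled = True

    def clock_edge(self):
        """One edge of the free-running clock source."""
        if self.clock_enabled:
            self.cycle += 1


class Transactor:
    """Hands control to a host-side function while the clock is gated."""
    def __init__(self, emulator):
        self.emulator = emulator

    def call(self, host_function, *args):
        self.emulator.clock_enabled = False  # stop the emulator clock
        try:
            return host_function(*args)      # runs on the host side
        finally:
            self.emulator.clock_enabled = True  # emulator resumes


emulator = Emulator()
transactor = Transactor(emulator)


def host_checksum(words):
    # Stand-in for a function executed by the host simulator. The clock
    # source keeps toggling during the call, but the gate suppresses it.
    for _ in range(1000):
        emulator.clock_edge()
    return sum(words) & 0xFFFF


for _ in range(5):
    emulator.clock_edge()                    # normal emulation: 5 cycles

cycle_before = emulator.cycle
checksum = transactor.call(host_checksum, [1, 2, 3])
cycles_during_call = emulator.cycle - cycle_before

emulator.clock_edge()                        # emulation continues afterwards
print(checksum, cycles_during_call, emulator.cycle)
```

From the emulator's point of view no cycle elapsed between issuing the call and receiving the result, which matches the zero execution time described above.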
Design of a low-power pre-synchronization ASIP
In burst-based communication, for example in IEEE 802.11 or IEEE 802.16, the burst detection functions have high duty cycles. They thus need an ultra-low-power implementation, but at the same time they should remain programmable so that they can support various modes. For its SDR MPSoC, IMEC designed a dedicated low-power pre-synchronization ASIP targeting IEEE 802.11a/n and IEEE 802.16e synchronization at a 20 MHz input rate.
The ASIP was implemented in three steps. First, the processor was modeled in LISA (Language for Instruction-Set Architecture). Then, RTL code was generated, synthesized, and profiled in a gate-level power simulation. Finally, a backend experiment was carried out to ensure timing closure.
IMEC chose a tool set that enabled the generation of software development tools, such as an assembler, linker, and instruction-set simulator very early in the design process. The processor micro-architecture can then be co-optimized with the kernel software. Moreover, the tools offer strong support for platform integration (by generating a wrapper for SystemC-based virtual platform modeling) and good-quality automated RTL code generation.
The resulting processor delivers a theoretical maximum performance of 5 GOPS (32-bit equivalent) at a peak power of 25 mW. The energy efficiency is thus 200 MOPS/mW (fully loaded). IEEE 802.11a synchronization (20 MHz) requires only 630 MOPS. The processor consumes 7.17 mW when executing this kernel (79.5 MOPS/mW). The more demanding IEEE 802.16e synchronization (20 MHz) requires 1838 MOPS. For this kernel the estimated average power is 15.86 mW (115.89 MOPS/mW). The achieved energy efficiency is 2 to 4 times higher than in typical SDR baseband processors. This ASIP is ready for low-power packet detection, enabling energy-aware MPSoC SDR platforms.
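The efficiency figures follow from dividing the required operation rate by the power drawn. The short sketch below redoes that arithmetic from the numbers quoted above; the helper name `mops_per_mw` is an illustrative assumption. Dividing the quoted 630 MOPS by 7.17 mW gives about 87.9 MOPS/mW rather than the quoted 79.5 MOPS/mW, so one of the three quoted values for that kernel may have been rounded or measured under different conditions. The sketch reports the computed ratios and does not try to resolve which figure was intended.

```python
def mops_per_mw(mops, power_mw):
    """Energy efficiency in MOPS per milliwatt."""
    return mops / power_mw


# Figures quoted in the text: (operation rate in MOPS, power in mW).
peak_efficiency = mops_per_mw(5000, 25)          # 5 GOPS at 25 mW
wlan_efficiency = mops_per_mw(630, 7.17)         # IEEE 802.11a kernel
wimax_efficiency = mops_per_mw(1838, 15.86)      # IEEE 802.16e kernel

# Fraction of the theoretical peak performance each kernel needs.
wlan_load = 630 / 5000
wimax_load = 1838 / 5000

print(f"peak:    {peak_efficiency:.2f} MOPS/mW")
print(f"802.11a: {wlan_efficiency:.2f} MOPS/mW at {wlan_load:.1%} load")
print(f"802.16e: {wimax_efficiency:.2f} MOPS/mW at {wimax_load:.1%} load")
```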
ESL tools assisted in overcoming MPSoC design hurdles
With the ESL tools from CoWare, the IMEC team was able to overcome the complexity hurdles in designing an SDR on an MPSoC. Designing the MPSoC in a more traditional way, closer to the hardware level, would have been prohibitively expensive and time-consuming. Throughout the development, IMEC used the CoWare Platform Architect and CoWare Processor Designer tool families extensively.
The most notable advantages and efficiency gains came from the ability to build a virtual platform at an abstract level in SystemC. The possibility to co-simulate and co-emulate units refined at RT level in the context of the complete platform also proved to be a major plus. It enabled early verification and trade-off of the hardware and software choices that had been made. It also allowed architecture alternatives to be explored without having to build an RTL platform, or even a silicon prototype.
The ability to profile parameters on the virtual platform using bus transaction traces and analysis tools also provided early feedback on the design. In particular, it helped the team understand the complex interactions between the various units of the MPSoC before a fully functional system was available.
About the Authors:
Bart Van Poucke is a Technical Business Manager within IMEC. Bart is responsible for IMEC's technical business relations in the field of Nomadic Embedded Systems. Bart obtained his electrical engineering degree at the KIHO (Gent, Belgium) in 1996. He can be reached at: Bart.Vanpoucke@imec.be
Bruno Bougard has been a researcher at IMEC since 2000. He has an M.Sc. in Electrical Engineering from the Polytechnic University of Mons, Belgium (2000) and a Ph.D. in Electrical Engineering from the K.U.Leuven, Belgium (2006). Bruno can be reached at: Bruno.Bougard@imec.be
Jan Provoost is a scientific editor at IMEC, reporting on IMEC's breakthroughs in international scientific magazines and newsletters. Jan has a Master's degree in Languages (1989) and a Master's degree in Information Science (1993), both from the K.U.Leuven, Belgium. He can be reached at: Jan.Provoost@imec.be