In 1999 Altera set out to develop the Nios processor, a product that would provide the functionality of a microcontroller within a PLD. However, first we had to better understand the needs of microcontroller users to ensure that our product met them. One thing we noticed about the microcontroller market was that many of the products consisted of different mixes of peripherals and memory combined with the same processing unit.
For many designers of embedded devices, a large part of the decision over which exact microcontroller to use hinges on which peripheral or peripherals are offered in that product. We perceived this as an advantage for our Nios product, since in principle any user could create any custom peripheral desired by designing it into the PLD. However, this concept was not a common one among designers, and we did not want them relying solely on custom peripherals that they would have to build. As a result, one of our challenges was to create a set of peripherals and interfaces that together would form a viable embedded controller, and to provide a simple, intuitive way for engineers to connect them and any custom peripherals to the processor. Without this, the advantage of implementing an entire system on a programmable chip would be accepted only slowly among the microcontroller-using community.
In addition to the potential confusion over peripheral content and custom peripherals, we faced some of the usual questions about how to design the product. Should we utilize a 16- or 32-bit instruction-set data path? How much RAM and ROM should we include? What method should we use to connect peripherals to the processor?
As we did our research, it became clear that a fixed answer to any of these questions limited the application of the Nios processor, making the product less attractive. On the other hand, by taking advantage of the flexibility of the implementation platform (programmable logic), we could instead develop a microcontroller that offered many aspects of user configurability.
Having decided to develop a configurable microcontroller, we set about determining which parameters the user could control. We figured that supporting both 16- and 32-bit data paths would keep the potential user base very broad; although an 8-bit data path version is also possible, the trend in the market is clearly toward 16- and 32-bit processors and away from their 8-bit cousins.
To accommodate both 16- and 32-bit data paths, we chose a 16-bit instruction set, which would lead to a smaller memory footprint and would also align with the broader availability of low-cost 16-bit flash memories for boot code. This choice also required only one memory access per instruction for both 16- and 32-bit data path implementations of the Nios processor, whereas a 32-bit instruction set would have required two accesses per instruction in a 16-bit data path implementation.
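The fetch-count argument can be checked with a short calculation. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the instruction bus is as wide as the data path and that each memory access transfers one bus-width word.

```python
def fetches_per_instruction(instruction_bits: int, data_path_bits: int) -> int:
    """Memory accesses needed to fetch one instruction.

    Assumption (not from the article): one access moves exactly one
    data-path-wide word, so the count is a ceiling division.
    """
    if instruction_bits <= 0 or data_path_bits <= 0:
        raise ValueError("widths must be positive")
    return -(-instruction_bits // data_path_bits)  # ceiling division


# Compare the two instruction widths on both data path widths.
for instruction_bits in (16, 32):
    for data_path_bits in (16, 32):
        count = fetches_per_instruction(instruction_bits, data_path_bits)
        print(f"{instruction_bits}-bit instruction on a "
              f"{data_path_bits}-bit data path: {count} access(es)")
```

Under these assumptions, only the combination of a 32-bit instruction and a 16-bit data path needs more than one access, which matches the reasoning above.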
We wanted the Nios processor to be perceived as a mainstream processor architecture and not a toy for hobbyists. To reinforce that perception, we added several sophisticated processor features, including a large windowed register file for fast context switching, an integrated and vectored interrupt controller and dynamic bus sizing to accommodate memories that are narrower than the processor data path.
We also included a feature-rich instruction set with both full- and partial-width register-indirect addressing (with and without offset). The instruction set also supports 5/16-bit immediate values by providing several arithmetic and logical instructions that take a 5-bit immediate value as an operand. A 5-bit immediate value represents a constant from 0 to 31; to represent a constant value that requires from 6 to 16 bits, a PFX instruction is available. The 11-bit K register is set by the PFX instruction and is concatenated with the 5-bit immediate value of the next instruction executed to allow the use of a 16-bit immediate value.
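The concatenation of the K register with the 5-bit immediate can be modeled in a few lines of Python. This is a sketch, not the processor's actual definition: the function names are hypothetical, and it assumes that K supplies the upper 11 bits and the instruction's immediate field supplies the lower 5 bits.

```python
from typing import Tuple

IMM5_MASK = 0x1F   # 5-bit immediate field carried by the instruction
K_MASK = 0x7FF     # 11-bit K register loaded by PFX


def needs_prefix(value: int) -> bool:
    """True when a constant does not fit in the 5-bit field (0 to 31)."""
    return value > IMM5_MASK


def split_immediate(value: int) -> Tuple[int, int]:
    """Split a 16-bit constant into (K, imm5).

    Assumption: K holds the upper 11 bits, imm5 the lower 5 bits.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("constant must fit in 16 bits")
    return (value >> 5) & K_MASK, value & IMM5_MASK


def effective_immediate(k: int, imm5: int) -> int:
    """Concatenate K and the 5-bit immediate into one 16-bit value."""
    return ((k & K_MASK) << 5) | (imm5 & IMM5_MASK)


k, imm5 = split_immediate(1000)
print(needs_prefix(1000), k, imm5, effective_immediate(k, imm5))
```

In this model a constant such as 1000 needs a prefix because it exceeds 31; it splits into K = 31 and a 5-bit field of 8, and concatenating the two recovers the original value.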
In keeping with our decision to provide the user with a flexible configuration, we allow the user to determine the size of the Nios processor register file. This decision presents a classic trade-off because the register file is built from memory resources that are embedded in the PLD. A larger register file leads to fewer accesses of off-chip memory and, as a result, generally faster operation. A smaller register file may lead to slower microcontroller operation, but it frees more embedded memory resources for the rest of the user's design. In the Nios processor, the register file can be a maximum of 512 registers deep. To illustrate the memory usage in an actual PLD, consider the Apex EP20K600E. It has 311,296 total embedded memory bits, which means that a Nios processor with a maximized register file can be implemented in it, with 294,912 embedded memory bits still available to the rest of the design.
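The memory budget for the Apex EP20K600E can be reproduced with simple arithmetic. The sketch below uses hypothetical function names, and it assumes 32-bit registers, a width inferred from the figures quoted here (311,296 minus 294,912 is 16,384 bits, or 512 registers of 32 bits each). The depths other than 512 are illustrative values only.

```python
APEX_EP20K600E_MEMORY_BITS = 311_296  # total embedded memory bits (from the text)
MAX_REGISTER_FILE_DEPTH = 512         # maximum depth (from the text)


def register_file_bits(depth: int, register_width_bits: int = 32) -> int:
    """Embedded memory bits consumed by the register file.

    The 32-bit default is an assumption inferred from the quoted figures.
    """
    return depth * register_width_bits


def memory_left_for_design(total_bits: int, depth: int,
                           register_width_bits: int = 32) -> int:
    """Embedded memory bits remaining after the register file is placed."""
    used = register_file_bits(depth, register_width_bits)
    if used > total_bits:
        raise ValueError("register file does not fit in embedded memory")
    return total_bits - used


# Illustrative depths; only the 512-register maximum comes from the text.
for depth in (128, 256, MAX_REGISTER_FILE_DEPTH):
    left = memory_left_for_design(APEX_EP20K600E_MEMORY_BITS, depth)
    print(f"{depth} registers: {register_file_bits(depth)} bits used, "
          f"{left} bits left")
```

Even at the maximum depth, the register file in this model takes only about 5 percent of the device's embedded memory.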
To generate the Nios processor design after all the user settings were determined, we developed a Perl script that would output a VHDL or Verilog file that could then be compiled using a PLD development tool like Altera's Quartus II. This Perl script later became a component of the SOPC Builder, a tool that would handle not only the creation of the processor design, but its integration with other elements like memory and peripherals to arrive at a complete microcontroller solution.
To demonstrate the power of building custom peripherals and hardware accelerators, we chose two commonly used functions that would benefit a great deal from hardware implementation: bit shifting and multiplication. The Nios processor instruction set includes several instructions to perform bit shifts and other data manipulations. To accelerate shift operations, we decided to include the option to have the SOPC Builder create a custom bit shifter. To use it, the designer activates it within the SOPC Builder and chooses a shift size. Then, during program execution, instructions that require a shift by that number of bits (or fewer) are performed in a single clock cycle, resulting in up to a 32-times speed improvement.
Multiplication can be a resource-intensive function compared with other logical functions if high-speed operation is desired, large bit widths are involved, or both. With this in mind, we developed two methods of hardware-accelerated multiplication for the Nios processor. The first method is called MSTEP, which performs a 1-bit-per-clock multiplication using additional PLD logic resources. The user indicates whether or not the MSTEP function is to be built by using a switch in the SOPC Builder tool. The second method is via an integer multiply unit, which performs a 16 x 16-bit multiplication in two clock cycles. To take advantage of either of these multiplier units, the user selects it in the SOPC Builder. When the system is created, a custom software library is compiled so that the appropriate assembly language calls are generated when the user's C code is compiled.
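To give a feel for what a 1-bit-per-clock multiplication involves, the sketch below models a generic unsigned shift-and-add multiplier in Python. It illustrates the general technique only and is not the actual MSTEP instruction semantics; the function name, the step counter and the unsigned 16-bit operands are assumptions made for the example.

```python
INTEGER_MULTIPLY_CYCLES = 2  # 16 x 16-bit multiply unit (from the text)


def multiply_bit_serial(a: int, b: int, width: int = 16):
    """Unsigned shift-and-add multiply, one multiplier bit per step.

    Returns (product, steps). Each loop iteration stands in for one
    clock of a bit-serial multiplier; this is a generic model, not MSTEP.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    steps = 0
    for bit in range(width):
        if (b >> bit) & 1:       # examine one multiplier bit
            product += a << bit  # add the shifted multiplicand
        steps += 1
    return product, steps


product, steps = multiply_bit_serial(1234, 5678)
print(product == 1234 * 5678, steps, INTEGER_MULTIPLY_CYCLES)
```

In this model a 16 x 16-bit product takes 16 steps, against the two clock cycles quoted for the integer multiply unit, which is the kind of trade-off between logic resources and speed that the two options expose.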
While the examples of the multiplier and bit shifter demonstrate the advantages of building custom functions and peripherals to achieve specific performance goals, we felt that our job in bringing this capability to the embedded designer wasn't complete. After all, in a typical microcontroller and microprocessor environment, peripherals require interrupts, priorities and addressing, as well as some sort of connection scheme, like a bus. And in an off-the-shelf microcontroller, these elements have already been thought out by the microcontroller's designers. By allowing for the development of custom peripherals, we were also potentially forcing the user to do extra work.
The answer, we felt, was to devise an intelligent, automated way of generating all those elements that an embedded designer would expect from an off-the-shelf microcontroller. So we made the SOPC Builder able to automatically generate the logic that would handle the interrupt requests from the peripherals and grant them access to the processor when necessary.
The SOPC Builder incorporates this logic into the hardware description file that it outputs. In addition, having received the addresses of the peripherals and the data structures of their registers from the user, it's easy for the SOPC Builder to generate several other items to aid the designer. These include custom libraries of routines for the peripherals in the system, a complete memory map and header files for the user to include in C code. Through partnership and co-development effort, the user can compile these files using an industry-standard tool: the GNU Pro compiler from Red Hat.
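The idea of turning a peripheral list into a memory map and a C header can be sketched as follows. This is not the SOPC Builder's actual output format: the `Peripheral` record, the macro naming scheme, the example addresses and the interrupt numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Peripheral:
    name: str                  # hypothetical instance name
    base: int                  # base address supplied by the user
    span: int                  # bytes of address space occupied
    irq: Optional[int] = None  # interrupt number, if the peripheral has one


def ordered_memory_map(peripherals: List[Peripheral]) -> List[Peripheral]:
    """Sort peripherals by base address and reject overlapping ranges."""
    ordered = sorted(peripherals, key=lambda p: p.base)
    for lower, upper in zip(ordered, ordered[1:]):
        if lower.base + lower.span > upper.base:
            raise ValueError(f"{lower.name} overlaps {upper.name}")
    return ordered


def generate_header(peripherals: List[Peripheral],
                    guard: str = "SYSTEM_MAP_H") -> str:
    """Emit C preprocessor definitions for each peripheral (made-up format)."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for p in ordered_memory_map(peripherals):
        macro = p.name.upper()
        lines.append(f"#define {macro}_BASE 0x{p.base:08X}")
        lines.append(f"#define {macro}_SPAN 0x{p.span:X}")
        if p.irq is not None:
            lines.append(f"#define {macro}_IRQ {p.irq}")
        lines.append("")
    lines.append(f"#endif /* {guard} */")
    return "\n".join(lines) + "\n"


# Hypothetical system: names, addresses and interrupt numbers are invented.
EXAMPLE_SYSTEM = [
    Peripheral("uart0", 0x0400, 0x20, irq=17),
    Peripheral("timer0", 0x0440, 0x20, irq=16),
    Peripheral("boot_rom", 0x0000, 0x400),
]
print(generate_header(EXAMPLE_SYSTEM))
```

Because the tool already knows every base address and interrupt assignment, a generator of this kind can also catch address conflicts before any hardware is built, which is part of the work an off-the-shelf microcontroller's designers would otherwise have done.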