As FPGA, ASIC, and system on chip (SoC) based digital systems increase in size and complexity, so does the importance of hardware functional verification tasks. Hardware simulation has been the cornerstone for functional verification and will remain so for years to come. Today, it is estimated that 40-70% of the design cycle for ICs is spent in verification. Companies are taking the necessary steps in their verification efforts by adopting proven verification methodologies and using various tools to cut their verification time.
Traditionally, engineers have used a number of high-level languages such as C, C++, and Perl, along with Verilog, to address their functional verification needs. While one can argue about the strengths and weaknesses of each language, none of them was purpose-built for hardware verification. The Jeda Hardware Verification Language (HVL), a verification platform from Jeda Technologies Inc., was developed out of the need for better verification tools, based upon real-world experience of validating complex digital systems.
Simulation based hardware verification is really a software problem. Therefore, adherence to well understood and widely practiced current software methodologies was the cornerstone for Jeda's creation. Jeda is simple, so it can be easily learned by novice engineers; familiar, so that experienced engineers can easily pick it up; object oriented, to take advantage of modern software development methodologies and code reuse; multi-threaded, to address the concurrent nature of hardware systems; and aspect-oriented, to maximize code reuse.
2 Additional background
Although Jeda borrows best practices and constructs from C, C++, Java, AspectJ and Verilog, it provides an additional level of constructs that specifically address hardware verification and modeling needs. These include:
- Strict type checking and automatic garbage collection support
- Native clock and cycle handling
- Native concurrent programming support
- Timed expression
2.1 Garbage collection
In Jeda, common user programming errors seldom occur, because of the automatic garbage collection and strict type checking built into the system. For instance, Jeda's automatic garbage collector decouples the user from manually addressing memory allocation/de-allocation issues. If performed by the user, such memory allocation/de-allocation tasks, which are very prone to user programming errors, usually result in memory leaks and cause simulation problems which are hard to debug.
2.2 Native clock handling
Jeda can evaluate a statement relative to any clock defined in the testbench. This provides a concise way to construct transactors and checkers that are flexible and work well in a multi-clock environment.
For example, the following statement will be executed after 5 cycles:
@5 data = get_new_data() ;
In this case, the default clock and its default edge (posedge) are used. A specific clock signal and edge, such as the negative edge, can also be given:
@5 (negedge tck) data = tdi ;
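As a rough analogy only (Python, not Jeda syntax), a cycle delay of this kind can be modeled with a generator that yields the number of clock cycles it wants to wait. The names `simulate` and `driver` are hypothetical, and the sketch assumes a single clock with no edge selection.

```python
def simulate(process, num_cycles):
    """Step a generator-based process on one clock.

    The process yields the number of cycles it wants to wait (assumed >= 1)
    and is resumed with the cycle count at which it wakes up.
    """
    wake_cycle = next(process)              # run up to the first wait
    for cycle in range(1, num_cycles + 1):
        if cycle == wake_cycle:
            try:
                wake_cycle = cycle + process.send(cycle)
            except StopIteration:
                return                      # process finished


def driver(result):
    cycle = yield 5                         # analogous to "@5": wait 5 cycles
    result["data"] = ("new_data", cycle)    # stands in for get_new_data()


result = {}
simulate(driver(result), num_cycles=10)
print(result)   # {'data': ('new_data', 5)}
```

The assignment runs only when the simulated clock reaches cycle 5, which is the behavior the `@5` prefix expresses in a single token.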
2.3 Native concurrent programming support
Jeda supports fork/join, fork/join_none, and fork/join_any, allowing multithreading capability. For example, the following code spawns a thread that monitors the value of data and reports an error if the condition is exceeded.
The concept of a thread-creating mechanism using fork/join_none and fork/join_any was invented by Jeda's creators while they were developing Vera at Sun Microsystems in 1994.
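The difference between the three join variants can be sketched in Python with a thread pool. This is an analogy rather than Jeda code: the `fork` helper, its `mode` argument, and the `fast`/`slow` tasks are hypothetical names introduced for illustration.

```python
import threading
from concurrent.futures import (
    ThreadPoolExecutor, wait, ALL_COMPLETED, FIRST_COMPLETED,
)


def fork(tasks, mode="join"):
    """Start every task in its own thread, then block according to mode."""
    pool = ThreadPoolExecutor(max_workers=len(tasks))
    futures = [pool.submit(task) for task in tasks]
    if mode == "join":            # wait for all threads
        wait(futures, return_when=ALL_COMPLETED)
    elif mode == "join_any":      # wait for the first thread to finish
        wait(futures, return_when=FIRST_COMPLETED)
    elif mode != "join_none":     # join_none: do not wait at all
        raise ValueError(mode)
    pool.shutdown(wait=False)     # let unfinished threads keep running
    return futures


gate = threading.Event()

def fast():
    return "fast"

def slow():
    gate.wait(timeout=5)          # blocks until the parent opens the gate
    return "slow"

futures = fork([fast, slow], mode="join_any")
first_done, second_done = futures[0].done(), futures[1].done()
gate.set()                        # now let the slow thread finish
results = [f.result() for f in futures]
```

With `join_any` the parent resumes as soon as `fast` completes while `slow` is still running; with `join_none` it would resume immediately, and with `join` only after both had finished.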
2.4 Timed expression
Jeda provides another powerful and compact concurrent programming primitive called timed expression that provides a mechanism to evaluate an expression within a multi-cycle window. For example, the following code fragment states that if the equality test for data0 and data1 yields a "true" anywhere between cycles 5-20, then do_it will be executed.
if( @5,20 ( data0 == data1 ) ) do_it() ;
Without the timed expression, the same functionality is expressed as:
Jeda extends the timed expression capability with concurrent logical AND and logical OR evaluation of timed expressions. This makes it possible to construct complex temporal logic evaluation schemes while maintaining a simple notation. For example, the following if statement checks the concurrent logical OR of two timed expressions:
if( p_or( @5,10( data == data0), @5,10( data == data1) ) ) ..
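To make the window semantics concrete, here is a small Python model that evaluates a timed expression over a recorded trace instead of a live simulation. It is a sketch under assumptions: the window bounds are taken to be inclusive, `timed`, `p_and` and the trace layout are hypothetical names, and evaluating over a finished trace sidesteps the concurrency that Jeda handles natively.

```python
def timed(trace, first, last, predicate):
    """True if predicate holds in any cycle of the window first..last (inclusive)."""
    return any(predicate(sample) for sample in trace[first:last + 1])


def p_or(*timed_results):
    """Logical OR over already evaluated timed expressions."""
    return any(timed_results)


def p_and(*timed_results):
    """Logical AND over already evaluated timed expressions."""
    return all(timed_results)


# One sample per clock cycle; index 0 is the cycle in which evaluation starts.
trace = [{"data": cycle, "data0": 7, "data1": 30} for cycle in range(25)]

hit0 = timed(trace, 5, 10, lambda s: s["data"] == s["data0"])   # true at cycle 7
hit1 = timed(trace, 5, 10, lambda s: s["data"] == s["data1"])   # never true
print(p_or(hit0, hit1), p_and(hit0, hit1))   # True False
```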
3 Tutorial general flow
The flow of this tutorial resembles the steps typically taken by design teams: construct a hardware model then construct a testbench. In the modeling phase we will demonstrate how Jeda is used to construct a stimuli-generator for a 4x4 crossbar, as well as the full implementation of the crossbar. In the testbench section, we demonstrate how Jeda is used to construct a Jeda-only verification framework to validate the crossbar.
Code re-use is also a prevalent theme of this tutorial. The tutorial shows that, with careful up-front thought, the Jeda code written for the modeling exercise can be fully re-used in the testbench phase. Doing so cuts verification time through code re-use and minimizes risk by relying on an already tested model.
4 DUT description
Throughout this tutorial we will use a 4x4 crossbar switch as the Device Under Test (DUT), as shown in Figure 4.1. In this section we discuss the protocol for the crossbar switch and provide a timing diagram for its input (Rx) port.
The function for this crossbar is summarized as follows:
- DUT has 4 input ports (Rx0-3) and 4 output ports (Tx0-3)
- Each input and output port has a queue associated with it
- A cell (128-bit data) is received at the input and placed into the queue
- Cells from input queue are transferred to the destination output queue via the crossbar switch
- The crossbar switch can only transfer one cell to a specific output queue at a given cycle
- Each transmit (Tx) and receive (Rx) port has the same set of signals: RDY, VLD, DST, DATA, and CLK, as defined in Table 4.1.
Figure 4.1 -- 4x4 crossbar switch DUT
Table 4.1 -- Transmit port signals
Each transmit port uses a source-synchronous clock (CLK), and each receiver synchronizes to the clock provided at its input. A single cell transfer is complete when RDY is asserted and, on the following cycle, VLD is asserted together with the payload. In other words, one more cell may still be sent in the cycle after the receiver negates RDY. Figure 4.2 shows the timing diagram of this handshake.
Figure 4.2 -- Transmit timing diagram
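The handshake rule can be checked on sampled waveforms with a few lines of Python. This is an illustrative model rather than part of the tutorial's code: it assumes one sample per CLK cycle and treats a cell as accepted when VLD is high in a cycle and RDY was high in the previous one. The function name `accepted_cells` is hypothetical.

```python
def accepted_cells(rdy, vld, data):
    """Return the payloads accepted under the RDY/VLD handshake.

    rdy, vld and data hold one sample per clock cycle. A cell is accepted in
    cycle n when VLD is asserted in cycle n and RDY was asserted in cycle n-1.
    """
    cells = []
    for n in range(1, len(vld)):
        if rdy[n - 1] and vld[n]:
            cells.append(data[n])
    return cells


#        cycle:   0    1    2    3    4
rdy  = [          1,   1,   0,   0,   1]
vld  = [          0,   1,   1,   1,   0]
data = [       None, "A", "B", "C", None]

print(accepted_cells(rdy, vld, data))   # ['A', 'B']
```

Cell B is still accepted in cycle 2 even though RDY has just been negated, because RDY was high in cycle 1; cell C is stalled.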
5 Modeling the DUT with Jeda
Hardware architectural modeling is used by designers to simulate and prove their design assumptions early in the design process. By using Jeda, designers can create either untimed or cycle accurate hardware models easily without resorting to using other procedural languages.
The two major enablers that make Jeda ideal for modeling are:
- Clocks: Jeda provides the capability to specify and use either single or multi-clock domain environments without needing additional simulation licenses.
- Multi-precision data types: Jeda provides double and float data types that open the possibilities of reusing C/C++ based algorithms inside Jeda models.
The modeling tutorial below decomposes the modeling task into three distinct models: the receiver, the transmitter, and the switch.
5.1 Crossbar primitive classes
Our goal here is to build the fundamental building blocks that will help modularize the crossbar Jeda model. For this, we first define two class constructs shown in Listing 5.1 and Listing 5.2.
The class data_port provides an abstraction for a generic input and output port definition that will connect to the DUT. The ports are defined using Jeda's signal type, that is, a pointer to a port instead of hardwired port name.
The class data_cell defines the basic data element sent and received on the crossbar. It defines a member function RND() to randomize the cell's destination ID and payload.
Jeda's RND() function implements a Mersenne Twister random sequence generator that is fully controllable by the user. The member function compare() returns 1 if the destination ID and payload of the input data_cell dt match those of the cell, and 0 otherwise.
Listing 5.2 -- Data cell class (File:utils/data_cell_0.j)
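For readers without the Jeda source at hand, the behavior described for data_cell can be approximated in Python. This is an analogy, not a transcription of Listing 5.2: the field names `dst` and `payload`, the 2-bit destination range, and the method name `rnd` are assumptions. Python's `random.Random` also happens to be a Mersenne Twister generator, so a seeded instance gives the same kind of user-controlled reproducibility.

```python
import random
from dataclasses import dataclass


@dataclass
class DataCell:
    dst: int = 0        # destination port, assumed to be 0..3 for a 4x4 switch
    payload: int = 0    # 128-bit cell data

    def rnd(self, rng):
        """Randomize destination ID and payload from a caller-supplied generator."""
        self.dst = rng.randrange(4)
        self.payload = rng.getrandbits(128)

    def compare(self, other):
        """Return 1 if destination ID and payload both match, else 0."""
        same = (self.dst, self.payload) == (other.dst, other.payload)
        return 1 if same else 0


rng = random.Random(2003)            # fixed seed makes the sequence repeatable
sent = DataCell()
sent.rnd(rng)
received = DataCell(sent.dst, sent.payload)
corrupted = DataCell(sent.dst, sent.payload ^ 1)   # flip one payload bit
print(sent.compare(received), sent.compare(corrupted))   # 1 0
```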
5.2 Transmitter and receiver models
The structure of the rx_model and tx_model is illustrated in Figure 5.3. Each port in the model makes use of the data_port class defined in the previous section. Using Jeda's list type, we construct a data_cell list to model the tx_queue and rx_queue for transmit and receive data buffering. From a timing perspective, enqueuing and dequeuing a data_cell on its queue do not both happen on the core clock; they run in different clock domains.
Figure 5.3 -- Tx and Rx model
Although the two models are very similar, the rx_model differs slightly: it provides a flow control mechanism to back-pressure cell arrivals on the input queue. The model negates RDY when the number of data_cells in the input queue reaches the high water mark (rx_stop_num) and re-asserts RDY when the number falls to the low water mark (rx_enable_num).
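The high/low water mark behavior is a simple hysteresis, sketched below in Python. Only rx_stop_num and rx_enable_num come from the text; the class name `RxFlowControl` and its methods are hypothetical, and clocking is ignored.

```python
from collections import deque


class RxFlowControl:
    def __init__(self, rx_stop_num, rx_enable_num):
        self.rx_stop_num = rx_stop_num        # high water mark
        self.rx_enable_num = rx_enable_num    # low water mark
        self.rx_queue = deque()
        self.rdy = True

    def _update_rdy(self):
        if len(self.rx_queue) >= self.rx_stop_num:
            self.rdy = False                  # back-pressure the sender
        elif len(self.rx_queue) <= self.rx_enable_num:
            self.rdy = True                   # queue has drained enough
        # between the two marks RDY keeps its previous value

    def enqueue(self, cell):
        self.rx_queue.append(cell)
        self._update_rdy()

    def dequeue(self):
        cell = self.rx_queue.popleft()
        self._update_rdy()
        return cell


rx = RxFlowControl(rx_stop_num=3, rx_enable_num=1)
history = []
for cell in ("c0", "c1", "c2"):
    rx.enqueue(cell)
    history.append(rx.rdy)
for _ in range(2):
    rx.dequeue()
    history.append(rx.rdy)
print(history)   # [True, True, False, False, True]
```

Keeping the two thresholds apart prevents RDY from toggling on every cell when the queue hovers around a single limit.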
5.3 Model implementation details
In this section we will briefly discuss the composition of the tx_model and rx_model and demonstrate how to construct Jeda models. The following two considerations were taken into account when developing Jeda models:
- We intend to re-use all the models in the testbench phase. As a result the models were all constructed using Jeda's class mechanism.
- We use Jeda's aspect programming features to implement debug print messages outside the models.
The skeletal tx_model is shown in Listing 5.3 and is constructed using class to maximize its re-use. The listing only shows member function definitions without their full code implementation.
Listing 5.3 -- The tx_model framework
As we have discussed earlier, Jeda provides the ability to write models with multiple clock domains. The two functions shown in Listing 5.4 show that the enqueuing of the cell into tx_queue (line 15) is in a different clock domain versus dequeuing of tx_queue (line 7). The fork/join_none statement is used here to spawn a thread which checks enqueued cells from the transmitter queue and sends them out on the transmit port.
Listing 5.4 -- tx_model sample member functions
The rx_model is constructed in a similar way to the tx_model, as shown in Listing 5.5.
Listing 5.5 -- rx_model (utils/tx_rx_1.j)
A full listing of the rx_loop and receive_cell functions is provided in Listing 5.6. Lines 27-30 show another example of the concurrent threading mechanism briefly touched upon in earlier sections. Once the thread is started, it simply waits for the arrival of a cell and then determines whether to stall further cell arrivals (line 29).
Listing 5.6 -- Sample member functions for rx_model
In receive_cell() we demonstrate how you can use Jeda's built-in function for bus pipelining. In line 38, the previous cycle value of RDY and current cycle value of VLD are evaluated to determine if cells are stalled. Notice that no additional code is needed here to keep track of previous values for RDY and VLD.
5.4 Switch model
The implementation of the crossbar_model per Listing 5.7 is as follows:
- Line 4: Jeda's semaphore class type is used to sequentially order the arbitration for the transmit queue.
- Lines 6-22: member function new() creates 4 copies of rx_model and 4 copies of tx_model respectively.
- Lines 24-40: the crossbar switch implementation, where xbar_loop is the top-level function. Whenever a cell is received on a receive port, the switch schedules it to be sent out on the output port determined by the data_cell ID. The transmit semaphore get (line 25) and semaphore put (line 28) guarantee that requests are sequentially ordered as they enter the switch and that only one cell is processed in a given cycle. Jeda's timed expression capability provides a concise way to sample the incoming signal with respect to the clock of the transmit port. Each transmit port clock runs in its own clock domain.
Listing 5.7 -- Crossbar Switch Model (File:model/switch_model.j)
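The role of the semaphore in the arbitration can be illustrated with Python threads. This is a sketch and not the Jeda listing: it assumes one semaphore per output queue, omits clocks entirely, and the names `Crossbar`, `max_in_flight` and `send_burst` are hypothetical.

```python
import threading
from collections import deque


class Crossbar:
    def __init__(self, num_ports=4):
        self.tx_queues = [deque() for _ in range(num_ports)]
        # One semaphore per output (an assumption) serializes access to it.
        self.tx_sems = [threading.Semaphore(1) for _ in range(num_ports)]
        self.in_flight = [0] * num_ports
        self.max_in_flight = [0] * num_ports

    def xbar_transfer(self, dst, cell):
        with self.tx_sems[dst]:                   # semaphore get ... put
            self.in_flight[dst] += 1
            self.max_in_flight[dst] = max(self.max_in_flight[dst],
                                          self.in_flight[dst])
            self.tx_queues[dst].append(cell)      # the actual transfer
            self.in_flight[dst] -= 1


def send_burst(xbar, src, count):
    for i in range(count):
        xbar.xbar_transfer(dst=2, cell=(src, i))  # every input targets output 2


xbar = Crossbar()
threads = [threading.Thread(target=send_burst, args=(xbar, src, 50))
           for src in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(xbar.tx_queues[2]), xbar.max_in_flight[2])   # 200 1
```

Even with four inputs contending for the same output, no more than one transfer is ever inside the guarded region, and no cell is lost.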
6 Testbench construction using Jeda
In this section we discuss the construction of a self-checking testbench. We will again use the crossbar model developed in the previous section as our DUT. Building both the testbench and the DUT in Jeda allows us to use Jeda in standalone mode without needing other simulators. Typically, the DUT would be implemented in Verilog.
6.1 Switch tester class
The testbench is implemented using the class switch_tester, as shown in Figure 6.4. The tx_model and rx_model are re-used in this testbench to drive and receive cells, respectively.
Figure 6.4 -- switch_tester class
The data flow into and out of the DUT is summarized as follows:
- send_rand_data generates random data_cells for each tx_model. At the same time, it also sends the same data_cell into a 4x4 check_queue whose entries correspond to the 4 input and 4 output ports.
- tx_model drives the data_cells into the DUT.
- Each thread of check_loop() implements a self-checking thread that compares the cells on the output of the DUT against expected results from check_queue.
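The checking scheme in the list above can be sketched as a small Python scoreboard. This is an analogy under assumptions, not the switch_tester code: it assumes that cells travelling from one input to one output stay in order, so an observed cell must match the head of one of the four per-source queues for that output. The class and method names are hypothetical.

```python
from collections import deque


class Scoreboard:
    def __init__(self, num_ports=4):
        # check_queue[src][dst] holds the cells expected from src at dst.
        self.check_queue = [[deque() for _ in range(num_ports)]
                            for _ in range(num_ports)]
        self.errors = 0

    def expect(self, src, dst, payload):
        """Called when a cell is driven into the DUT."""
        self.check_queue[src][dst].append(payload)

    def check(self, dst, payload):
        """Called when a cell appears on DUT output dst."""
        for per_source in self.check_queue:
            queue = per_source[dst]
            if queue and queue[0] == payload:
                queue.popleft()
                return True
        self.errors += 1          # cell was not expected here (or is out of order)
        return False

    def pending(self):
        return sum(len(q) for row in self.check_queue for q in row)


sb = Scoreboard()
sb.expect(0, 2, 0xA)
sb.expect(1, 2, 0xB)
sb.expect(0, 2, 0xC)
observed = [sb.check(2, p) for p in (0xB, 0xA, 0xC, 0xD)]
print(observed, sb.errors, sb.pending())   # [True, True, True, False] 1 0
```

Cells from different inputs may interleave freely at the output, but an unexpected cell such as 0xD is flagged, and any cells left in the queues at the end of the test indicate drops.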
The following code fragment provides an overview of the switch_tester.
Listing 6.8 -- Switch Tester (File:utils/switch_tester_2.j)
The two points illustrated in the code listing above are:
- tx_model and rx_model are reused without any modification, highlighting how Jeda's object-oriented programming feature is used for developing re-usable and easily modifiable code.
- Line 5: Jeda's multi-dimensional array capability permits an easy way of building a queue checker.
6.2 Testbench top structure
Listing 6.9 illustrates how to connect the basic building blocks discussed thus far to build our Jeda testbench.
Listing 6.9 -- Testbench Jeda implementation (File:suites/switch_test_top.j)
An engineer familiar with structured languages, such as C/C++ or Java, can quickly go through the Jeda code to understand its functionality. One major consideration in designing Jeda was to minimize the learning curve.
6.3 Example test cases
Although one can come up with a number of tests to validate the crossbar, we only present here one directed test. The directed corner test in Listing 6.10, fast_in, tests the DUT's flow control mechanism by making the cell arrival rate faster than the cell leaving rate.
Listing 6.10 -- Fast In Test (File:tests/fast_in.j)
7 Aspect-oriented programming
Aspect-oriented programming (AOP) is a new programming paradigm that provides explicit language support to extract "crosscutting concerns" (behavior that cuts across the typical divisions of program functionalities). AOP is a well researched field of study with demonstrable implementations such as AspectJ, the AOP extension of Java developed at Xerox PARC.
You may ask what AOP has to do with verification. The answer is: a lot. Object-oriented facilities in Jeda allow us to divide verification tasks into manageable modular components while working at a higher level of abstraction. Although verification testbenches are developed with the full intent of maximizing their reuse, one cannot anticipate how such code will be used in the future. AOP is designed to address specifically this type of problem.
With AOP, additional functions can be added to the original Jeda code without modifying the base code, as illustrated in Figure 7.5. The additional function is woven into the original code, and the mechanism is suitable for expressing the "crosscutting concerns" that span multiple classes (such as logging for debugging, performance measurement, and coverage measurement).
Figure 7.5 -- Weaving aspect code into objects
Jeda's AOP routines provide you with a flexible way to implement functions such as:
- Debug message logging
- Performance measurement
- Coverage measurement
- Error injection
In this section we will demonstrate how Jeda is used for writing debug message logging and performance measurements.
7.1 Debug message logging with AOP
Debug messages keep track of the interaction of processes, sequences of events, and changing parameter values. After the debugging phase is over, these messages are no longer needed. Typically, engineers write special functions to enable or disable logging, or enclose such messages in the conditional compilation directive #ifdef. As verification code grows in complexity and size, its readability diminishes because the sheer number of messages clutters the original code.
The aspect code in Listing 7.11 demonstrates how to write a self-contained logging aspect block. So far, neither our DUT model nor switch_tester contains any debug message logging capability.
Listing 7.11 -- Debug_messages aspect (aspect/debug_messages.j)
The aspect block has a structure similar to a class but cannot be extended. The new code block in lines 6-19 is executed right before the member function rx_model.rx_loop.receive_cell returns. The code insertion points in Jeda's aspect mechanism are called pointcuts. A pointcut determines the code insertion point either through an exact match of a function name or through a regular expression pattern.
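The weaving idea can be mimicked in Python by wrapping every method whose name matches a regular expression. This is an analogy to illustrate pointcuts and return advice, not Jeda's aspect syntax: `weave_after`, the stand-in `RxModel` class and the `log` list are hypothetical.

```python
import functools
import re


def weave_after(cls, pattern, advice):
    """Attach advice to the return of every method of cls whose name matches pattern."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and re.fullmatch(pattern, name):
            setattr(cls, name, _with_return_advice(attr, advice, name))


def _with_return_advice(func, advice, name):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)
        advice(self, name, result)      # runs just before the call returns
        return result
    return wrapper


class RxModel:                          # base code: contains no logging at all
    def receive_cell(self, cell):
        return cell

    def rx_loop(self):
        return None


log = []
weave_after(RxModel, r"receive_.*",
            lambda obj, name, result: log.append(f"{name} returned {result!r}"))

rx = RxModel()
rx.receive_cell(0x5A)
rx.rx_loop()                            # does not match the pattern, not logged
print(log)   # ['receive_cell returned 90']
```

Leaving out the `weave_after` call restores the silent base class, which mirrors how logging is enabled only when the aspect is linked in.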
The following two advantages are derived from using such a logging scheme:
- Debug message printing is off by default and is turned on simply by linking the Jeda aspect code with the testbench code.
- Debug messages that affect multiple classes can be defined in a single location, thus making the modification of such messages easy.
7.2 Performance measurement with AOP
Performance measurements are important during the modeling phase of a design. In our crossbar modeling we want to collect statistics such as optimal queue sizes or number of arbitration conflicts. In the testbench phase we don't want to collect such statistics since it would affect our runtime performance during long regressions. In such instances Jeda's AOP mechanism provides an excellent way of implementing performance measurement checks which can be turned on for modeling and turned off during the testbench phase.
Figure 7.6 -- switch_check aspect
The example outlined here checks how many conflicts are observed on the crossbar transmission as shown in Figure 7.6. The Aspect block is shown in Listing 7.12. The advice xbar_transfer_call() and xbar_transfer_ret() are attached to the call and return point of xbar_transfer() function. They measure the number of cycles between the call and return pair. At the end of simulation the block reports the measurement numbers collected.
Listing 7.12 -- switch_check aspect (aspect/switch_model_check.j)
The aspect block in the above example illustrates in line 13 the call pointcut mechanism. Here, the aspect code block specified in lines 14-21 is executed before the function call switch.xbar_transfer.
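The call/return measurement can be sketched in Python with a wrapper that plays the role of the two advices. This is an analogy, not the switch_check aspect itself: the `Clock` class, the `wait_cycles` argument, and the rule that any transfer taking more than zero cycles counts as a conflict are assumptions made for the example.

```python
import functools


class Clock:
    def __init__(self):
        self.cycle = 0


class SwitchCheck:
    def __init__(self, clock):
        self.clock = clock
        self.latencies = []

    def attach(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = self.clock.cycle                  # call advice
            try:
                return func(*args, **kwargs)
            finally:                                  # return advice
                self.latencies.append(self.clock.cycle - start)
        return wrapper

    def report(self):
        """End-of-simulation summary of the collected measurements."""
        return {
            "transfers": len(self.latencies),
            "conflicts": sum(1 for n in self.latencies if n > 0),
            "max_wait": max(self.latencies, default=0),
        }


clock = Clock()

def xbar_transfer(wait_cycles):
    clock.cycle += wait_cycles      # stands in for waiting on a busy output


check = SwitchCheck(clock)
xbar_transfer = check.attach(xbar_transfer)
for wait_cycles in (0, 3, 1):
    xbar_transfer(wait_cycles)
print(check.report())   # {'transfers': 3, 'conflicts': 2, 'max_wait': 3}
```

Because the measurement lives entirely in the wrapper, the unwrapped function carries no overhead when the check is not attached.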
8 Conclusion
This tutorial illustrates how to use Jeda for modeling and re-usable testbench construction. Jeda was built from years of experience in validating complex hardware systems. It embodies the best practices of software engineering while adding new features to address hardware verification and modeling needs. At the end of the day, using Jeda while employing good verification methodologies will increase the productivity of an engineer.
The full examples, along with instructions on how to run the sample tests, are available online.
Atsushi Kasuya, CTO, co-founded Jeda Technologies in December 2002 to lead product definition, direction and development efforts. As the architect of the Jeda platform, Mr. Kasuya provides the authoritative direction and expansion of this platform. Previously, he worked as a senior staff verification engineer at Juniper Networks. Prior to Juniper, Atsushi spent 6+ years at Sun Microsystems, along with the other co-founders. During his tenure at Sun Microsystems, Atsushi invented the industry's first Hardware Verification Language, Vera, for which he holds patents.
Teshager Tesfaye, director of technology, is responsible for identifying and leading the development of complementary technologies for the Jeda-X platform. Prior to joining Jeda Technologies in December 2002, Tesh worked as a staff verification engineer at Juniper Networks. Before that he was at Sun Microsystems, where he worked as a verification engineer on Sun's multi-processing system ASICs along with Eugene Zhang and Atsushi Kasuya.
Ennis Hawk, director of business development, has designed and managed electronic product projects employing digital/analog/RF technologies for the instrumentation, consumer, computing and telecom markets. Ennis has worked at Juniper Networks, Axil Computer, Sun Microsystems and Schlumberger.