A number of good alternatives exist for emulating software operation during architecture exploration. System modeling experience shows that three types of software modeling work quite well. At the statistical level, a delay value for each function is sufficient to trigger traffic on the bus and the memory devices. At the hardware level, an application-specific instruction allocation, called an instruction-mix table, provides an extremely accurate representation of a software task. The last method is to annotate performance-intensive portions of the code and generate an instruction trace during execution. This technique is well suited to testing the architecture's behavior for a benchmark or set of benchmarks, and to evaluating how a piece of code will behave in a multi-core environment.
The first approach requires a table with the name of each task and its associated delay. During execution, the processor model looks up the task supplied by the RTOS (A_Task_Name in Table 2) and delays the processor based on the number and type of instructions in that task.
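The table lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processor model's actual implementation: the instruction count of 1,000, the per-type cycle costs and the names TaskEntry, TASK_TABLE, CYCLES_PER_TYPE and task_delay_cycles are hypothetical. Only the percentages are taken from the article, which quotes them for My_Task_1 in its discussion of Table 2.

```python
from dataclasses import dataclass
from typing import Dict

# Assumed cycle cost per instruction type (hypothetical values, not from the article).
CYCLES_PER_TYPE: Dict[str, int] = {
    "integer": 1,
    "floating_point": 4,
    "logical": 1,
    "load_store": 3,
    "branch": 2,
}


@dataclass(frozen=True)
class TaskEntry:
    instruction_count: int       # total instructions in the task (assumed)
    mix_percent: Dict[str, int]  # instruction type -> percentage of the task


# One row per task, keyed by the task name the RTOS hands to the processor model.
TASK_TABLE: Dict[str, TaskEntry] = {
    "My_Task_1": TaskEntry(
        instruction_count=1000,
        mix_percent={
            "integer": 10,
            "floating_point": 48,
            "logical": 10,
            "load_store": 7,
            "branch": 25,
        },
    ),
}


def task_delay_cycles(task_name: str) -> int:
    """Look up the task and return how many cycles the processor stays busy."""
    entry = TASK_TABLE[task_name]
    # Sum count * percent * cost in integers, then scale by 100 once at the end.
    weighted = sum(
        entry.instruction_count * percent * CYCLES_PER_TYPE[kind]
        for kind, percent in entry.mix_percent.items()
    )
    return round(weighted / 100)


print(task_delay_cycles("My_Task_1"))  # prints 2830
```

With these assumed costs, the floating-point share dominates the delay (480 instructions at 4 cycles each), which is the kind of sensitivity the percentage table makes easy to explore.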
Table 2: Instruction mix table for a software task
The application-specific instruction allocation technique is the most versatile and can be used for software testing, hardware verification and architecture optimization. As shown in Table 2, each software task or thread has a number of instructions and a percentage for each type of instruction. In the case of My_Task_1, we have 10% integer, 48% floating-point, 10% logical, 7% load-store and 25% branch instructions. This table is fed into a software generator block that generates the instruction sequence using an intelligent algorithm. This sequence is used for hardware testing, thus providing a more realistic test of the platform architecture.
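To make the idea of a software generator concrete, here is a minimal sketch that turns one row of a percentage table into an instruction-type sequence. The article does not describe the generator's algorithm, so the approach below (exact per-type counts followed by a seeded shuffle) is an assumption chosen for simplicity, and the names MY_TASK_1_MIX and generate_sequence are hypothetical.

```python
import random
from typing import Dict, List

# Percentages quoted in the article for My_Task_1.
MY_TASK_1_MIX: Dict[str, int] = {
    "integer": 10,
    "floating_point": 48,
    "logical": 10,
    "load_store": 7,
    "branch": 25,
}


def generate_sequence(mix_percent: Dict[str, int],
                      instruction_count: int,
                      seed: int = 0) -> List[str]:
    """Return a shuffled list of instruction types that matches the mix."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("mix percentages must sum to 100")

    # Whole instructions per type, rounded down.
    counts = {k: instruction_count * p // 100 for k, p in mix_percent.items()}

    # Hand any leftover instructions to the types with the largest remainders,
    # so the total always equals instruction_count.
    leftover = instruction_count - sum(counts.values())
    by_remainder = sorted(
        mix_percent,
        key=lambda k: (instruction_count * mix_percent[k]) % 100,
        reverse=True,
    )
    for kind in by_remainder[:leftover]:
        counts[kind] += 1

    sequence = [kind for kind, n in counts.items() for _ in range(n)]
    # Seeded shuffle: the same seed reproduces the same sequence.
    random.Random(seed).shuffle(sequence)
    return sequence


sequence = generate_sequence(MY_TASK_1_MIX, instruction_count=200)
print(len(sequence), sequence.count("floating_point"))  # prints 200 96
```

Changing the seed yields a different ordering with the same mix, and changing a percentage changes the mix without touching any application code.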
Table 3 shows the output for My_Task_1. To get an accurate distribution of the instruction types within a code structure, use a good decompiler or profiler such as Hex-Rays, Intel VTune or Boomerang. The number of tasks or threads will differ based on the application. This degree of flexibility in the simulated instruction sequence is hard to achieve using an instruction set simulator (ISS) but fairly easy using a good software generator.
Moreover, you can run the tasks in order, in random order or based on the input request. This mechanism can provide a lot more variety in terms of cache accesses, hit-miss ratio, bus activity and pipeline flushes. You can study the impact of a different task instruction mix on your architecture by simply modifying the percentage table. This is quick to do and is not locked to a specific code implementation. If you look at the generated output for My_Task_1, you can see the diversity in the instruction sequence, which allows for a much broader level of architecture testing.
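The three ways of ordering tasks mentioned above can be sketched as a small Python generator. This is only an illustration of the idea: the mode names "in_order", "random" and "on_request", the function schedule_tasks and the task names in the usage lines are assumptions, not part of any tool described in this article.

```python
import random
from typing import Iterator, List, Optional


def schedule_tasks(task_names: List[str],
                   mode: str,
                   rounds: int = 1,
                   requests: Optional[List[str]] = None,
                   seed: int = 0) -> Iterator[str]:
    """Yield task names in the order the processor model should run them."""
    if mode == "in_order":
        # Walk the table top to bottom, once per round.
        for _ in range(rounds):
            yield from task_names
    elif mode == "random":
        # Same number of task activations, but drawn at random (seeded).
        rng = random.Random(seed)
        for _ in range(rounds * len(task_names)):
            yield rng.choice(task_names)
    elif mode == "on_request":
        # Run whatever the incoming requests ask for, in arrival order.
        known = set(task_names)
        for name in requests or []:
            if name not in known:
                raise KeyError(f"unknown task: {name}")
            yield name
    else:
        raise ValueError(f"unknown mode: {mode}")


tasks = ["My_Task_1", "My_Task_2"]  # My_Task_2 is a hypothetical second task
print(list(schedule_tasks(tasks, "in_order", rounds=2)))
# prints ['My_Task_1', 'My_Task_2', 'My_Task_1', 'My_Task_2']
print(list(schedule_tasks(tasks, "on_request", requests=["My_Task_2"])))
# prints ['My_Task_2']
```

Each mode exercises the caches, bus and pipeline differently, so running the same percentage table under all three is a cheap way to widen test coverage.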
Table 3: Instruction sequence output associated with the first line of the instruction mix table
To view and simulate a model that uses this application-specific instruction-mix table, go to http://www.mirabilisdesign.com/new/software/demo/Partitioning/SoC/Power_Perf.htm. Accept all security warnings and the model will load in the Web page as an applet. Click GO to run the simulation. You can also change a parameter in the model view and click GO again; the changes will appear in the reports.
A recent TechTalk at http://youtu.be/_csv53LlXp8 by Robert Juliano, Ph.D., Sr. Director of Applications at Mirabilis Design, covers a similar topic.
The instruction-mix table method of software emulation offers the most advantages for architecture exploration. Using this approach, the designer can view the depth of the pipeline, identify the cause of a stall, and evaluate the impact of power management algorithms, the operation of the memory hierarchy, the performance slowdown of load/store requests and the quality of the cache coherency algorithm. The simulation reports provide significant visibility into the architecture operation and allow the system throughput to be optimized.
A number of other approaches can also be used for architecture exploration, but their inputs are extremely hard to generate. These include hand-annotating specific sections of the code, generating a bus trace with a list of instructions, and tapping the operating system for cache accesses. These approaches are implementation-specific but can be targeted at a timing-intensive function. So, the next time you are doing architecture exploration, look at your options for software emulation to test the architecture. Look beyond the ISS. Look at the instruction-mix table.
About the author:
Deepak Shankar is the founder of Mirabilis Design, a systems engineering software solutions provider. Mr. Shankar has been involved with architecture exploration of embedded systems, semiconductors and real-time software for over 20 years. At Mirabilis Design, he has developed new methodologies and solutions to streamline the validation of system specifications, make architecture exploration extremely accurate and accelerate the systems engineering process. Prior to Mirabilis Design, he worked at Cadence, Spincircuit and Memcall in technical, marketing and executive management roles. Mr. Shankar has published over 30 articles in technical journals around the world and has been the lead speaker at events held by the IEEE and other organizations. He has an MS in Electronics from Clemson University, an MBA from the University of California, Berkeley and a BS in Electronics and Communication from Coimbatore Institute of Technology.