Design Article
ACE’ing the verification of a cache coherent system using UVM
Peer Mohammed, Romondy Luo, Ray Varghese, Parag Goel, Amit Sharma & Satyapriya Acharya
6/25/2012 10:54 AM EDT
UVM-based ACE verification IP
One example of a suite of UVM-based verification components that provides a complete verification solution for the ACE protocol is the Synopsys VIP for AMBA AXI. The AXI ACE VIP provides a system environment component with a configurable number of ACE master and slave agents, a system monitor and an interconnect component, as illustrated in Figure 7. The VIP leverages most of the functionality mentioned in the previous sections, as well as the UVM resource mechanism, to provide the configurability and the sophisticated stimulus generation required in the ACE context.

Figure 7: AXI ACE system environment
The master agent generates constrained random ACE coherent transactions, and responds to the ACE snoop transactions concurrently. It also allocates cache lines and performs cache state transitions to the various cache states based on the transactions it sends and receives by using a built-in cache model. The user has back-door access to the cache model through built-in application programming interfaces (APIs) to allocate, de-allocate or query the cache lines. The slave agent responds to read/write requests and models the memory for the system. It also supports ACE-Lite requirements through simple configuration parameters. The interconnect environment component receives coherent transactions from the initiating master, and generates appropriate snoop transactions to the other masters based on domain information. It then responds to the coherent transactions based on responses received through snoop transactions.
The master and slave agents each instantiate a port monitor, which remains available when the agents are configured in passive mode. These monitors perform port-level transaction checks, signal stability checks and checks on the sequencing between ACE coherent and snoop transactions. Another key component of the ACE solution is the system monitor, which performs system-level checks, coherency checks and data integrity checks. As certain checks are dependent on design behavior, the system monitor also provides hooks to implement design-specific checks. The built-in coverage supports ACE coherent and snoop transaction coverage. The cache state transition coverage helps to validate whether the master’s cache has transitioned through all the legal cache states. The coverage can be used in conjunction with the ACE verification planner to track the verification progress.
Solving cache coherency challenges
Stimulus generation
The aforementioned components are complemented by a library of configurable ACE sequences that, woven together, form virtual sequences to further aid in scenario creation at the block, cluster or system level across various masters and the interconnect. Additionally, the UVM sequence library enables the user to control the different permutations by which atomic and hierarchical sequences can be stitched together to create the complex scenarios depicted earlier.
Creating custom rules for the sequence library would help not only to streamline multiple sequences in different simulations but also to avoid redundancy and move progressively toward convergence of all interesting system-level scenarios. Again, in such scenarios, the sequences have to be aware of the functional configuration to enable reconfiguration based on the system-level requirements.
Creating configurable sequences
There might be specific requirements when the sequences’ constraints or properties depend on the values in the configuration object. The UVM resource mechanism is used in the ACE sequences to bring in configurability, as shown in Figure 8.

Figure 8: Configurable sequence
Though the hierarchical UVM configuration mechanism is designed around components, a non-component object can access a configuration field through a component handle. In the case of sequences, ‘m_sequencer’ is the handle to the sequencer that executes the sequence. It is a built-in member that every uvm_sequence inherits from uvm_sequence_item. The configuration parameter can be accessed in a hierarchical context through the ‘m_sequencer’ handle as shown below:
uvm_config_db#(int)::get(m_sequencer, "", "item_count", item_count);
The ‘set’ of the parameter is as follows:
uvm_config_db#(int)::set(this, "env.agent.seqr", "item_count", 20);
Therefore, when parameters change in a dynamic environment, the ACE sequences can reconfigure themselves to meet the generation requirements at that point in time. Thus, for different master and slave components that may support a subset or full ACE, ACE-Lite, AXI4 or AXI3 protocol and work with different bus widths or clock frequencies, the sequences can be reconfigured to work with each of their associated sequencers.
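A minimal sketch of such a configurable sequence follows. The names demo_item and demo_cfg_seq are hypothetical and are not part of the VIP; only the uvm_config_db call mirrors the text above. The sketch assumes that a test or environment has already set "item_count" for the sequencer on which the sequence runs, and it falls back to a default otherwise.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type, used only for illustration.
class demo_item extends uvm_sequence_item;
  rand bit [31:0] addr;
  `uvm_object_utils(demo_item)

  function new(string name = "demo_item");
    super.new(name);
  endfunction
endclass

// Hypothetical sequence that sizes itself from the configuration database.
class demo_cfg_seq extends uvm_sequence #(demo_item);
  `uvm_object_utils(demo_cfg_seq)

  int item_count = 1;  // default used when nothing has been configured

  function new(string name = "demo_cfg_seq");
    super.new(name);
  endfunction

  virtual task body();
    // Look up "item_count" in the context of the sequencer running this sequence.
    if (!uvm_config_db#(int)::get(m_sequencer, "", "item_count", item_count)) begin
      `uvm_info(get_type_name(), "item_count not configured; using default", UVM_LOW)
    end

    repeat (item_count) begin
      demo_item req_item = demo_item::type_id::create("req_item");
      start_item(req_item);
      if (!req_item.randomize()) begin
        `uvm_error(get_type_name(), "randomization of req_item failed")
      end
      finish_item(req_item);
    end
  endtask
endclass
```

Because the lookup happens inside body(), each run of the sequence picks up whatever value is in the database at that time, which is what allows the same sequence to behave differently on different sequencers.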
Hierarchical sequence stitching and sequence libraries
The functionalities supported by the protocol range from those that can be mapped to atomic transactions to those that run into hundreds of lines of testbench code. The sequence collection has a rich set of functionality: there are sequences to initiate all the possible coherent transactions, including sequences that do not cause a snoop of any cached masters, sequences that must cause a snoop of the cached masters that can hold a copy of the cache line, sequences that must cause a snoop of any of the cached masters that can hold a copy of the cache line, and more.
Given the functionality that UVM provides, it is much more convenient to stitch together low-level, proven or validated scenarios to create more complex ones. This is how the ACE higher level and virtual sequences are built up. Let’s take a look at how custom user scenarios can be built using the sequence collection.
In this example, all the cache line states associated with a ReadClean transaction need to be tested. This requires cache line initialization followed by cache line invalidation, then a basic ReadClean. A cache line initialization sequence initializes the cache line states of a master's cache and its peers' caches to a set of random but valid states. This ensures that all the different cache line state transitions for a coherent transaction initiated by a master are verified. A cache line invalidation sequence invalidates cache lines of a master. This may be required for non-speculative load transactions. A basic ReadClean sequence initiates ReadClean transactions over a given set of addresses. The basic steps are:
- Address selection - Choose the set of addresses on which to test the sequence (user configurable).
- Cache line initialization - Bring the cache line states to random but valid states for all masters.
- Cache line invalidation - A master issuing load transactions may need to invalidate its cache lines before initiating them, unless the transactions are speculative.
- Basic ReadClean - Initiate a particular transaction type from one master.

Figure 9: ReadClean coherent command – a basic flow
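The four steps above can be sketched as a parent sequence that runs three child sequences in order over one address set. All names here (hs_item, hs_step_seq, hs_readclean_flow_seq and the HS_* enumeration) are hypothetical stand-ins, not the VIP's own sequences. The sketch also assumes that a driver interprets the 'kind' field, whereas a real environment would perform cache initialization and invalidation through the cache model.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical operation kinds, one per step of the flow.
typedef enum {HS_CACHE_INIT, HS_CACHE_INVALIDATE, HS_READCLEAN} hs_kind_e;

// Hypothetical transaction type.
class hs_item extends uvm_sequence_item;
  hs_kind_e  kind;
  bit [31:0] addr;
  `uvm_object_utils(hs_item)
  function new(string name = "hs_item");
    super.new(name);
  endfunction
endclass

// Low-level sequence: issues one operation of a given kind per address.
class hs_step_seq extends uvm_sequence #(hs_item);
  `uvm_object_utils(hs_step_seq)
  hs_kind_e  kind;
  bit [31:0] addrs[];

  function new(string name = "hs_step_seq");
    super.new(name);
  endfunction

  virtual task body();
    foreach (addrs[i]) begin
      hs_item item = hs_item::type_id::create("item");
      start_item(item);
      item.kind = kind;
      item.addr = addrs[i];
      finish_item(item);
    end
  endtask
endclass

// Higher-level sequence: stitches the low-level steps together.
class hs_readclean_flow_seq extends uvm_sequence #(hs_item);
  `uvm_object_utils(hs_readclean_flow_seq)

  // Step 1, address selection: randomize this sequence or assign addrs directly.
  rand bit [31:0] addrs[];
  constraint c_addrs { addrs.size() inside {[1:4]}; }

  function new(string name = "hs_readclean_flow_seq");
    super.new(name);
  endfunction

  // Create and run one child sequence on the same sequencer as the parent.
  task run_step(string step_name, hs_kind_e kind);
    hs_step_seq step = hs_step_seq::type_id::create(step_name);
    step.kind  = kind;
    step.addrs = addrs;
    step.start(m_sequencer, this);
  endtask

  virtual task body();
    run_step("init_seq",      HS_CACHE_INIT);        // step 2
    run_step("inval_seq",     HS_CACHE_INVALIDATE);  // step 3
    run_step("readclean_seq", HS_READCLEAN);         // step 4
  endtask
endclass
```

If the parent sequence is started without being randomized and without addrs being assigned, the address set is empty and no items are sent.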
A complete verification scenario (like that shown in Figure 7) can be mimicked using the nested sequences as explained in Figure 4. With the hierarchical approach, it becomes relatively easy to model any scenario generation requirements regardless of how complicated they are. The same approach when combined with the virtual sequences helps to leverage this functionality across multiple interfaces and is highly relevant in the system context. For example, there are multiple virtual sequences that are part of the library and perform a combination of different sequential coherent transactions from different masters to the same slave.
// Write into M0’s local cache. Data is now dirty in local cache
M0 initiating MAKEUNIQUE to addr1
// Write data into memory. Data is now clean in local cache. Data in cache matches data in memory
M0 initiating WRITECLEAN to addr1
// Read data into M1’s local cache. Gets clean data from M0
M1 initiating READSHARED to addr1
The above sequence tests that the interconnect can:
- Initiate snoop transactions correctly;
- Fetch data from snooped masters and provide to another master; and
- Interact with main memory correctly.
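A virtual sequence for a scenario of this kind might look like the sketch below. The class names, the VS_* enumeration and the address value are hypothetical. The sketch assumes that the WriteClean is issued by M0, the master that holds the dirty line, and that the test assigns the two sequencer handles before starting the virtual sequence.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical coherent transaction kinds used by this scenario.
typedef enum {VS_MAKEUNIQUE, VS_WRITECLEAN, VS_READSHARED} vs_kind_e;

// Hypothetical transaction type.
class vs_item extends uvm_sequence_item;
  vs_kind_e  kind;
  bit [31:0] addr;
  `uvm_object_utils(vs_item)
  function new(string name = "vs_item");
    super.new(name);
  endfunction
endclass

// Atomic sequence: a single coherent transaction to a single address.
class vs_single_seq extends uvm_sequence #(vs_item);
  `uvm_object_utils(vs_single_seq)
  vs_kind_e  kind;
  bit [31:0] addr;

  function new(string name = "vs_single_seq");
    super.new(name);
  endfunction

  virtual task body();
    vs_item item = vs_item::type_id::create("item");
    start_item(item);
    item.kind = kind;
    item.addr = addr;
    finish_item(item);
  endtask
endclass

// Virtual sequence: coordinates two master sequencers.
class vs_dirty_share_vseq extends uvm_sequence;
  `uvm_object_utils(vs_dirty_share_vseq)

  uvm_sequencer #(vs_item) m0_seqr;  // assigned by the test before start()
  uvm_sequencer #(vs_item) m1_seqr;  // assigned by the test before start()
  bit [31:0] addr1 = 32'h0000_1000;  // illustrative address only

  function new(string name = "vs_dirty_share_vseq");
    super.new(name);
  endfunction

  // Run one atomic transaction on the given master's sequencer.
  task issue(uvm_sequencer #(vs_item) seqr, vs_kind_e kind);
    vs_single_seq s = vs_single_seq::type_id::create("s");
    s.kind = kind;
    s.addr = addr1;
    s.start(seqr, this);
  endtask

  virtual task body();
    if (m0_seqr == null || m1_seqr == null) begin
      `uvm_fatal(get_type_name(), "master sequencer handles not assigned")
    end
    issue(m0_seqr, VS_MAKEUNIQUE);  // M0 gains a unique copy and can write it
    issue(m0_seqr, VS_WRITECLEAN);  // M0 writes the line back to memory
    issue(m1_seqr, VS_READSHARED);  // M1 reads; the interconnect snoops M0
  endtask
endclass
```

Each call to issue() returns only after the driver completes the item, so the three transactions are strictly sequential. Overlapping variants would fork the calls instead.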
Apart from building explicit virtual sequences, the uvm_sequence_library can be used to achieve the same result by registering the required sequences with the sequence library, on a per-requirement basis, for a specified instance of the sequencer. Thus, sequences modeling functionalities such as overlapping store operations to verify the interconnect behavior for concurrent transactions, or those exercising multiple initiating masters attempting simultaneous shareable store operations to the same cache line, can easily be made part of the sequence library or collection. The end user can then readily leverage this library.
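As a rough sketch of the sequence library approach, the code below defines a library, adds two sequences to one instance of it and runs it on a sequencer. The item, sequence and task names are hypothetical, and the two store sequences are empty placeholders for real scenarios. The selection mode and count limits are arbitrary choices for illustration.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type.
class lib_item extends uvm_sequence_item;
  rand bit [31:0] addr;
  `uvm_object_utils(lib_item)
  function new(string name = "lib_item");
    super.new(name);
  endfunction
endclass

// Placeholder for a sequence modeling a single store operation.
class lib_store_seq extends uvm_sequence #(lib_item);
  `uvm_object_utils(lib_store_seq)
  function new(string name = "lib_store_seq");
    super.new(name);
  endfunction
  virtual task body();
    lib_item item = lib_item::type_id::create("item");
    start_item(item);
    if (!item.randomize()) begin
      `uvm_error(get_type_name(), "randomization failed")
    end
    finish_item(item);
  endtask
endclass

// Placeholder for a sequence modeling overlapping store operations.
class lib_overlap_store_seq extends lib_store_seq;
  `uvm_object_utils(lib_overlap_store_seq)
  function new(string name = "lib_overlap_store_seq");
    super.new(name);
  endfunction
endclass

// The library itself: a sequence that selects and runs registered sequences.
class lib_demo_seq_lib extends uvm_sequence_library #(lib_item);
  `uvm_object_utils(lib_demo_seq_lib)
  `uvm_sequence_library_utils(lib_demo_seq_lib)
  function new(string name = "lib_demo_seq_lib");
    super.new(name);
    init_sequence_library();
  endfunction
endclass

// Hypothetical helper: configure one library instance and run it on a sequencer.
task automatic run_lib_demo(uvm_sequencer #(lib_item) seqr);
  lib_demo_seq_lib lib = lib_demo_seq_lib::type_id::create("lib");
  // Per-instance registration; only this library object sees these sequences.
  lib.add_sequence(lib_store_seq::get_type());
  lib.add_sequence(lib_overlap_store_seq::get_type());
  lib.selection_mode   = UVM_SEQ_LIB_RANDC;  // cycle through before repeating
  lib.min_random_count = 4;
  lib.max_random_count = 8;
  if (!lib.randomize()) begin
    `uvm_error("LIB_DEMO", "sequence library randomization failed")
  end
  lib.start(seqr);
endtask
```

Sequences can instead be registered for every instance of a library type with the `uvm_add_to_seq_lib macro inside the sequence class; per-instance registration is used here because the text refers to a specified sequencer instance.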
Using the AXI interconnect and system level checks
The system monitor observes transactions across the ports of a single interconnect and performs checks between the transactions of these ports. It does not perform port-level checks, which are accomplished by the checkers of each master/slave agent connected to a port. In ACE, the system monitor correlates coherent transactions and the corresponding snoop transactions to perform checks. The checks in the system monitor are geared toward checking the proper working of an interconnect DUT.
The system monitor requires transaction-level inputs from the master and slave ports that are connected to the interconnect, that is, transactions created by port-level monitors as a result of signal-level activity; it does not require signal-level inputs. To obtain these inputs, the system monitor could itself instantiate port-level monitors. However, UVM makes it easy to connect components: all transactions from the port-level monitors of each of the agents can be provided to the system monitor via transaction-level modeling (TLM) connections, thereby eliminating the need to instantiate port-level monitors in the system monitor. Figure 10 describes two examples of system-level checks.

Figure 10: System checks
Thus, by leveraging the UVM capabilities together with coherency knowledge, the system check provides robustness to verification of the device under test (DUT).
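The TLM hookup between port monitors and a system monitor can be sketched as follows. Every class name is hypothetical, the port monitors are empty shells that only own analysis ports, and the single check shown is a deliberately simplistic illustration. A real interconnect may legitimately issue snoops for other reasons, so this is not one of the VIP's checks.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Two distinct analysis imps so one component can receive two streams.
`uvm_analysis_imp_decl(_coherent)
`uvm_analysis_imp_decl(_snoop)

// Hypothetical transaction published by the port monitors.
class sm_txn extends uvm_sequence_item;
  int unsigned port_id;
  bit [31:0]   addr;
  `uvm_object_utils(sm_txn)
  function new(string name = "sm_txn");
    super.new(name);
  endfunction
endclass

// Hypothetical port monitor shell: signal sampling is omitted.
class sm_port_monitor extends uvm_monitor;
  `uvm_component_utils(sm_port_monitor)
  uvm_analysis_port #(sm_txn) coherent_ap;
  uvm_analysis_port #(sm_txn) snoop_ap;
  function new(string name, uvm_component parent);
    super.new(name, parent);
    coherent_ap = new("coherent_ap", this);
    snoop_ap    = new("snoop_ap", this);
  endfunction
endclass

// Hypothetical system monitor: correlates snoops with coherent transactions.
class sm_system_monitor extends uvm_component;
  `uvm_component_utils(sm_system_monitor)
  uvm_analysis_imp_coherent #(sm_txn, sm_system_monitor) coherent_export;
  uvm_analysis_imp_snoop    #(sm_txn, sm_system_monitor) snoop_export;
  sm_txn pending[bit [31:0]];  // last coherent transaction seen per address

  function new(string name, uvm_component parent);
    super.new(name, parent);
    coherent_export = new("coherent_export", this);
    snoop_export    = new("snoop_export", this);
  endfunction

  function void write_coherent(sm_txn t);
    pending[t.addr] = t;
  endfunction

  // Illustrative check only: flag a snoop with no coherent transaction seen.
  function void write_snoop(sm_txn t);
    if (!pending.exists(t.addr)) begin
      `uvm_warning("SYSMON", $sformatf("snoop to 0x%0h on port %0d has no observed coherent transaction", t.addr, t.port_id))
    end
  endfunction
endclass

// Environment: the system monitor receives transactions only through TLM.
class sm_env extends uvm_env;
  `uvm_component_utils(sm_env)
  sm_port_monitor   port_mon[2];
  sm_system_monitor sys_mon;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    sys_mon = sm_system_monitor::type_id::create("sys_mon", this);
    foreach (port_mon[i])
      port_mon[i] = sm_port_monitor::type_id::create($sformatf("port_mon%0d", i), this);
  endfunction

  virtual function void connect_phase(uvm_phase phase);
    super.connect_phase(phase);
    foreach (port_mon[i]) begin
      port_mon[i].coherent_ap.connect(sys_mon.coherent_export);
      port_mon[i].snoop_ap.connect(sys_mon.snoop_export);
    end
  endfunction
endclass
```

Because the system monitor only exposes analysis exports, the same monitor works whether the port monitors live in active or passive agents.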
Distributed phasing
Finally, given the widespread use of such coherent systems in handheld devices, it is imperative to devise a mechanism for a power-aware verification setup. Also, as mentioned earlier, different components might support different subsets of the protocol. Some of the components might be power aware and model components in power domains. Such components need phase-aware sequences that execute in user-defined phases. Some of them might go to a powered-down phase in the middle of simulation and, on ‘waking up’, would have to catch up with the other phases. Again, the UVM hierarchical phasing schemes and configurable sequences can be leveraged to help the user model the different power state transitions for the system.
UVM allows new domains to be created and components to be grouped into different domains that execute their phases independently of each other. The default domain is named ‘uvm’ and contains the default runtime phases; see Figure 11.

Figure 11: Distributed phase synchronization
New phases can be inserted into the domains created. The components in a specific user-defined domain can be made to sync with the other domain at the end of run_phase. So, as shown in Figure 12, even if an ACE component is powered down, it alone can be made to rewind to an earlier phase, wake up and then get back in phase with the other components running the default runtime phases.

Figure 12: UVM phasing – jump-back
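The domain and jump-back mechanism can be sketched as below. The component and domain names are hypothetical, the delays stand in for real activity, and the power-down is modeled as a single jump from main_phase back to reset_phase. The sketch assumes that assigning the domain during build_phase is acceptable; the exact treatment of objections around a phase jump varies between UVM releases and is worth confirming against the version in use.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical power-aware component.
class pd_block extends uvm_component;
  `uvm_component_utils(pd_block)
  bit model_power_cycle;  // set by the environment for switchable blocks
  bit power_cycled;       // ensures the jump happens only once

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual task reset_phase(uvm_phase phase);
    phase.raise_objection(this);
    `uvm_info(get_name(), "reset_phase: initializing (or waking up)", UVM_LOW)
    #10;
    phase.drop_objection(this);
  endtask

  virtual task main_phase(uvm_phase phase);
    phase.raise_objection(this);
    #50;
    if (model_power_cycle && !power_cycled) begin
      power_cycled = 1;
      `uvm_info(get_name(), "power-down: jumping back to reset_phase", UVM_LOW)
      // Rewind only this component's domain to an earlier runtime phase.
      phase.jump(uvm_reset_phase::get());
    end
    phase.drop_objection(this);
  endtask
endclass

// Environment with one block in the default domain and one in its own domain.
class pd_env extends uvm_env;
  `uvm_component_utils(pd_env)
  pd_block   always_on;     // stays in the default 'uvm' domain
  pd_block   switchable;    // moved to a user-defined domain
  uvm_domain power_domain;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    always_on  = pd_block::type_id::create("always_on", this);
    switchable = pd_block::type_id::create("switchable", this);
    switchable.model_power_cycle = 1;

    // A separate domain gets its own copy of the runtime phase schedule,
    // so a jump in it does not rewind the components in the default domain.
    power_domain = new("power_domain");
    switchable.set_domain(power_domain);
  endfunction
endclass
```

In this arrangement the always_on block runs through its runtime phases once, while the switchable block repeats reset_phase and the phases after it, and both domains still finish within the common run_phase.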
Next: UVM enabled debug