The Simics distribution includes a number of pre-configured setups that use the MAI to simulate out-of-order processors. They are all examples of how to use the API in different ways. The scripts reside in the same directory as the ordinary Simics scripts for a specific target, and their names contain ooo (for out-of-order). A short description of each script, and of how it can be further configured, follows:
These scripts use the ooo-micro-arch module (see the source code in simics/src/extensions/ooo-micro-arch) to demonstrate how MAI works. The model can fetch, execute, and commit a configurable number of instructions per cycle. No branch speculation is performed: when an unresolved branch is encountered, fetching stalls until the outcome of the branch is determined.
If an exception occurs, the instruction tree is drained and all speculative instructions beyond the one that caused the exception are discarded.
The pipeline is modeled with a combined fetch/decode stage, an execute stage, and a commit stage.
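The stall-on-branch behavior of a three-stage pipeline can be illustrated with a small standalone sketch. This is not the ooo-micro-arch source and does not use the MAI; the names Instr and simulate are hypothetical, every stage is assumed to take one cycle, and exceptions are not modeled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    is_branch: bool = False


def simulate(program, width=2):
    """Return the number of cycles needed to commit every instruction.

    Toy model (an assumption, not the Simics implementation): up to
    `width` instructions move through each stage per cycle, and fetch
    stalls from the cycle after a branch is fetched until the cycle
    after that branch has executed.
    """
    fetched, executed, committed = deque(), deque(), []
    pc = 0
    stalled = False
    cycles = 0
    while len(committed) < len(program):
        cycles += 1
        # Fetch sees the stall state as it was at the start of the cycle.
        can_fetch = not stalled

        # Commit stage: retire instructions that executed earlier.
        for _ in range(min(width, len(executed))):
            committed.append(executed.popleft())

        # Execute stage: executing a branch resolves it.
        for _ in range(min(width, len(fetched))):
            instr = fetched.popleft()
            if instr.is_branch:
                stalled = False
            executed.append(instr)

        # Fetch/decode stage: stop right after an unresolved branch.
        fetched_now = 0
        while can_fetch and fetched_now < width and pc < len(program):
            instr = program[pc]
            pc += 1
            fetched_now += 1
            fetched.append(instr)
            if instr.is_branch:
                stalled = True
                break
    return cycles


straight = [Instr("add")] * 4
with_branch = [Instr("add"), Instr("br", is_branch=True),
               Instr("add"), Instr("add")]
print(simulate(straight), simulate(with_branch))
```

In this toy model the program containing a branch needs one more cycle than the straight-line program, which corresponds to the fetch bubble while the branch is unresolved.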
Each processor gets an object of the ooo_micro_arch class attached to it that handles the simulation. These objects have some attributes that can be changed to alter the behavior of the model:
These scripts use the sample-micro-arch module (see simics/src/extensions/sample-micro-arch). The processors modeled can fetch/decode, execute, and commit a configurable number of instructions per cycle.
The model has a simple branch predictor that uses a hash table (a Branch Target Buffer) to look up the target address from the address of the branch instruction. This allows for branch speculation. The hash table is updated for every successfully committed branch.
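A Branch Target Buffer of this kind can be sketched as a small direct-mapped table. The class below is a hypothetical illustration, not the sample-micro-arch code: the table size, the index function, and the assumption of 4-byte instructions are all chosen for the example.

```python
class BranchTargetBuffer:
    """Toy direct-mapped BTB: branch address -> predicted target."""

    def __init__(self, entries=64):
        self.entries = entries
        # Each slot holds (branch_pc, target) or None when empty.
        self.table = [None] * entries

    def _index(self, pc):
        # Assumption: instructions are 4 bytes, so drop the low 2 bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Return the predicted target, or None if there is no entry."""
        slot = self.table[self._index(pc)]
        if slot is not None and slot[0] == pc:
            return slot[1]
        return None

    def update(self, pc, target):
        """Record the target of a successfully committed branch."""
        self.table[self._index(pc)] = (pc, target)


btb = BranchTargetBuffer(entries=4)
btb.update(0x1000, 0x2000)    # branch at 0x1000 committed, target 0x2000
print(hex(btb.predict(0x1000)))
```

Calling update only at commit keeps targets from squashed speculative paths out of the table, which matches the update rule described above.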
Besides speculating on the target address, the model also speculates on the fall-through path for every branch. Two possible execution paths are therefore created for every branch, which turns the instruction tree into a binary tree. The number of instructions fetched and executed per cycle is counted per branch of the instruction tree, not for the tree as a whole.
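The effect of a per-path limit can be seen in a minimal sketch of a binary instruction tree. The PathNode class and the fetch_budget function are hypothetical names for this illustration and do not correspond to the MAI instruction tree API; the sketch assumes that only the leaf paths of the tree fetch new instructions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PathNode:
    """One straight-line path segment in a toy instruction tree."""
    instructions: List[str] = field(default_factory=list)
    taken: Optional["PathNode"] = None
    fall_through: Optional["PathNode"] = None

    def fork(self):
        """A branch ends this segment: speculate down both outcomes."""
        self.taken = PathNode()
        self.fall_through = PathNode()
        return self.taken, self.fall_through


def leaves(node):
    """Return the path segments that can still fetch (no children)."""
    children = [c for c in (node.taken, node.fall_through) if c is not None]
    if not children:
        return [node]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def fetch_budget(root, per_path_width):
    """Total instructions fetched per cycle across all active paths."""
    return per_path_width * len(leaves(root))


root = PathNode()
taken, fall_through = root.fork()   # first branch: two active paths
taken.fork()                        # second branch on the taken path
print(fetch_budget(root, per_path_width=4))
```

With three active paths and a per-path width of 4, the toy tree fetches 12 instructions per cycle in total, so the aggregate rate grows with the number of unresolved branches.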
A compile-time switch called VALUE_PREDICTION can be defined to turn on value prediction of loads. It works like a small cache that maps logical addresses to values. When a load is issued, the cache is looked up first to quickly obtain a value that later instructions can use. When the load finishes, the speculated value is checked against the real value. If they do not match, the later instructions must be squashed.
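The predict-then-verify cycle for loads can be sketched as follows. The LoadValuePredictor class is a hypothetical illustration rather than the code enabled by VALUE_PREDICTION; for brevity it uses an unbounded dictionary where the real model uses a small cache, and it assumes the cache is refreshed with the real value whenever a load finishes.

```python
class LoadValuePredictor:
    """Toy load value predictor: logical address -> last loaded value."""

    def __init__(self):
        self.values = {}

    def predict(self, address):
        """Return a speculative value for the load, or None if unknown."""
        return self.values.get(address)

    def resolve(self, address, predicted, actual):
        """Check a finished load; return True if a squash is needed.

        A squash is needed only when a value was actually predicted
        and it differs from the value the load really returned.
        """
        self.values[address] = actual
        return predicted is not None and predicted != actual


predictor = LoadValuePredictor()
guess = predictor.predict(0x100)                       # no entry yet
predictor.resolve(0x100, guess, actual=7)              # learns the value 7
guess = predictor.predict(0x100)                       # now predicts 7
must_squash = predictor.resolve(0x100, guess, actual=9)
print(guess, must_squash)
```

The second load is predicted as 7 but returns 9, so the instructions that consumed the speculated value would have to be squashed and re-executed.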
Each processor gets an object of the sample_micro_arch class attached to it that handles the simulation. The class implements the following attributes:
The following attributes in each CPU object can also be used to further configure the models: