- So far the pipeline is Fetch, Decode, Execute, Memory, Writeback.
- However, if the decode stage finds a conditional branch, we want to make sure the next instruction we fetch is the right one.
- Without using the branch delay slot, how can we do that?
→ If we've come across this branch before, we know everything we need to know about it.
- Once we come to a branch, we have to go through the process of decoding it. Once we decode it, we can cache it. We can also cache where it jumps to.
→ This is called the branch target buffer (BTB)
→ It is a buffer of size B, filled with the last B branch PC values
→ It works as a table: look up the branch by its PC, pull out the predicted next PC value.
→ For a loop branch we then only miss when the loop ends; every other time it is a hit. We trade thousands of per-iteration stalls for one stall at the end of the loop, when the branch is not taken.
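The lookup-and-update idea above can be sketched as a toy simulation. This is only an illustration under assumptions not in the notes: the names `BranchTargetBuffer` and `count_mispredictions` are made up, instructions are assumed to be 4 bytes, the addresses are arbitrary, the oldest entry is evicted when the buffer is full, and an entry is dropped when its branch turns out not taken. Note that the very first encounter also misses, since the branch has not been seen before.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Toy BTB (hypothetical): maps a branch PC to its last taken target."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()  # branch PC -> predicted target PC

    def predict(self, pc):
        """Return the predicted target, or None if the branch is not cached."""
        return self.entries.get(pc)

    def update(self, pc, taken, target):
        """Record the real outcome once the branch has resolved."""
        if taken:
            self.entries[pc] = target
            self.entries.move_to_end(pc)
            if len(self.entries) > self.size:
                self.entries.popitem(last=False)  # evict the oldest entry
        else:
            # Assumption: a not-taken branch is removed from the buffer.
            self.entries.pop(pc, None)


def count_mispredictions(iterations, branch_pc=0x40, loop_top=0x10, size=4):
    """Run one loop-closing branch `iterations` times and count wrong fetches."""
    btb = BranchTargetBuffer(size)
    fall_through = branch_pc + 4  # assumes 4-byte instructions
    misses = 0
    for i in range(iterations):
        taken = i < iterations - 1  # the last iteration falls out of the loop
        actual_next = loop_top if taken else fall_through
        predicted = btb.predict(branch_pc)
        predicted_next = predicted if predicted is not None else fall_through
        if predicted_next != actual_next:
            misses += 1
        btb.update(branch_pc, taken, loop_top)
    return misses
```

With these assumptions, `count_mispredictions(1000)` gives 2: one miss on the first encounter and one when the loop exits.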
Example: Generic BTB
Speedup = CPU time (old) / CPU time (new)
where CPU time = IC x CPI x tclk
But IC and tclk are not changing,
therefore speedup = CPI(old) / CPI(new)
where CPI = ideal + stalls
so CPI(old) = 1 + stalls with no BTB
and CPI(new) = 1 + stalls with BTB (branch target buffer)
CPI(old) = 1 + 0.15 x 2 = 1.3
CPI(btb) = 1 + (1.5 x 3)% + (1.35 x 4)% = 1 + 0.045 + 0.054 = 1.099
speedup = 1.3/1.099 = 1.183 = 18.3% speedup, given that 15% of instructions are branches
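The arithmetic in this example can be checked with a short script. It is a sketch only: the helper name `cpi` is made up, and the stall terms are copied from the notes as (fraction of instructions, penalty cycles) pairs without assuming what each case represents.

```python
def cpi(ideal, stall_terms):
    """CPI = ideal CPI + sum of (fraction of instructions x penalty cycles)."""
    return ideal + sum(fraction * cycles for fraction, cycles in stall_terms)


# No BTB: 15% of instructions pay a 2-cycle penalty.
cpi_old = cpi(1.0, [(0.15, 2)])

# With BTB: 1.5% of instructions pay 3 cycles, 1.35% pay 4 cycles.
cpi_btb = cpi(1.0, [(0.015, 3), (0.0135, 4)])

# IC and tclk are unchanged, so speedup is just the CPI ratio.
speedup = cpi_old / cpi_btb

print(round(cpi_old, 3), round(cpi_btb, 3), round(speedup, 3))
```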