What if the cache is full and you access something with age 3?
Its age needs to become 6 (freshest)
So everything above it needs to be aged.
cache slot : age:
0 : 0
1 : 1
2 : 2 ← these are already older
3 : 3 ← needs to become 6
4 : 4 ← these are fresher
5 : 5
6 : 6
Anything above 3 needs to be reduced in freshness
0 : 0
1 : 1
2 : 2
3 : 6
4 : 3
5 : 4
6 : 5
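The aging step above can be sketched in code. This is a minimal illustration (not from the notes), assuming a 7-entry cache where age 6 is freshest and age 0 is oldest:

```python
# LRU aging on access: the touched slot becomes freshest, and every
# slot that was fresher than it ages by one step.
def touch(ages, slot):
    old_age = ages[slot]
    for s in range(len(ages)):
        if ages[s] > old_age:       # entries fresher than the touched one
            ages[s] -= 1            # get one step older
    ages[slot] = len(ages) - 1      # touched slot becomes freshest

ages = [0, 1, 2, 3, 4, 5, 6]        # slot i currently has age i
touch(ages, 3)
print(ages)                         # [0, 1, 2, 6, 3, 4, 5]
```

Note the ages below 3 are untouched, matching the table above.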
Can we combine direct mapping and associative?
What if we did:
- assume 8 entry cache
- we have 4 of them
- labelled 00, 01, 10, 11
- What if we take the last 2 bits of the instruction and use that to map to the cache blocks
- jk what if we took the 4th and 5th bit and used those as our direct mapping key?
- so it would switch based on the last 5 bits: 00xxx ← anything in the xxx bits would map to block 00
- then we could do a cheaper associative search on the 00 block
We are trying to reduce the amount of processing needed to search the entire cache.
We break it into smaller components.
We've said we're going to choose bits 3 and 4, which will tell us which cache block to use.
That means that cache block 00 takes all addresses xxxx xxxx xxx0 0xxx
This doesn't eliminate thrashing but it ideally would make it less likely.
We could also have a data cache and a code cache. What is the advantage of this?
- instruction cache doesn't need a dirty bit
→ never have to write to memory
→ could have a miss, but you don't need to write it out
→ it's always primary memory going to cache
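The dirty-bit point can be made concrete with a small eviction sketch (names here are illustrative, not from the notes):

```python
# Why a write-back data cache needs a dirty bit but an instruction
# cache does not: only modified lines must be written back on eviction.
def evict(line, memory):
    if line["dirty"]:                        # data cache: modified line
        memory[line["addr"]] = line["data"]  # must be written back first
    # An instruction cache never modifies its lines, so eviction
    # just discards the line -- no write-back path needed.

mem = {}
evict({"addr": 0x10, "data": 7, "dirty": True}, mem)
print(mem)                                   # {16: 7}
```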
Aside:
- Princeton architecture (what we've been doing) has instructions and data in the same memory
- Harvard architecture has separate instruction and data memory.
→ Instructions may be a different bit width than the data
→ we can use a full 32 bit data value and still only have 16 bit instructions
→ doing that in a Princeton architecture with 16 bit instructions means having to fetch twice to get 32 bits of data.
→ If we have a Harvard architecture, can we have an immediate mode instruction?
⇒ the data is 32 bits wide, but the instruction is only 16 bits
⇒ could we support this?
⇒ we would have to pull in the data in bits and pieces. we could have a register and fill in 4 bytes one at a time.
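The "bits and pieces" idea above is how some real ISAs build wide immediates from narrow instructions (e.g. a load-upper followed by an or-lower). A sketch for a hypothetical 16-bit-instruction machine:

```python
# Building a 32-bit immediate from two 16-bit pieces: one instruction
# sets the upper half of a register, a second ORs in the lower half.
def load_upper(imm16):
    return (imm16 & 0xFFFF) << 16      # high half, low half cleared

def or_lower(reg, imm16):
    return reg | (imm16 & 0xFFFF)      # fill in the low half

r = load_upper(0xDEAD)
r = or_lower(r, 0xBEEF)
print(hex(r))                          # 0xdeadbeef
```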
Two operand instructions:
In the 6811, there's an instruction ADDA ___ which adds a value to accumulator A
ADDA #Value ; ← ACCA gets 8 bit value
can also have
ADDA #offset, IX ← index register
we're saying ACCA gets the value at mem[IX + 8-bit offset]
In the first case, it is a one-operand instruction: add this value to the accumulator.
The second case is a two-address instruction, which occupies two words: the opcode and then the offset
[ADDA]
[OFFSET]
the operand dictates how we access
The operand determines the effective address:
ADD R1, R2
- the address of R1 is regfile[1]
- the CPU looks inside the register file
- its effective address is the register in the register file
ADD #1, R2
- The effective address of #1 is in the constant table
- the effective address of R2 is the register file
LD R0, R1
- What are the effective addresses here?
- R0: mem[R0]
- R1: register file #1
- once the effective addresses are found, the instruction can execute.
LDR R0, #_, R1
- effective address: R0 + #_
- data returned is MEM[effective address]
- R1's effective address is the register file #1
- When the instruction is executed,
- RF[EAdst] ← MEM[EAsrc]
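The LDR case above can be sketched as code. This is an illustration with a made-up register file and sparse memory, not the course's actual simulator:

```python
# Effective-address calculation for LDR R0, #off, R1:
# source EA is RF[src] + offset, destination EA is the register file,
# then execution performs RF[EAdst] <- MEM[EAsrc].
RF = [0x100, 0, 0, 0]          # register file; R0 holds a base address
MEM = {0x104: 42}              # sparse memory for illustration

def ldr(src, offset, dst):
    ea_src = RF[src] + offset  # effective address of the source
    RF[dst] = MEM[ea_src]      # execute: RF[EAdst] <- MEM[EAsrc]

ldr(0, 4, 1)
print(RF[1])                   # 42
```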
In A2, the cache should fetch words instead of bytes.