Ex. 1
a)
Spatial locality - accesses land at a small (constant) distance from a recently referenced memory location.
Following the order in which the multiplication walks the matrices, C and A have spatial locality, while B does not, because it is traversed by columns rather than by rows. Matrix C also has temporal locality, since the same element is reused throughout the inner loop.
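
To make the locality argument concrete, here is a minimal Python sketch that computes how far apart, in bytes, two consecutive inner-loop accesses land for each matrix. It assumes the usual `i, j, k` loop order with `C[i][j] += A[i][k] * B[k][j]`, row-major storage and 8-byte doubles. The loop order and the helper name `inner_loop_strides` are assumptions, not taken from the exercise code.

```python
def inner_loop_strides(n, elem_bytes=8):
    """Byte distance between consecutive inner-loop (k) accesses.

    Assumes row-major n x n matrices and the update
    C[i][j] += A[i][k] * B[k][j] with k as the innermost index.
    """
    def addr(row, col):
        # Offset of element (row, col) from the start of its matrix.
        return (row * n + col) * elem_bytes

    i, j, k = 0, 0, 0
    return {
        "A[i][k]": addr(i, k + 1) - addr(i, k),  # next element in the same row
        "B[k][j]": addr(k + 1, j) - addr(k, j),  # next element in the same column
        "C[i][j]": addr(i, j) - addr(i, j),      # same element every iteration
    }


strides = inner_loop_strides(512)
```

Under these assumptions A advances by one element (spatial locality), B jumps a whole row on every access (no spatial locality), and C stays put (temporal locality).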
b)
- cache size = 25600 KB
- cache line/alignment = 64 B
- double -> 8 B
- 64 / 8 = 8 elements per cache line
Estimation of misses:
- matrix C: n² / 8
- matrix B: n³
- matrix A: n³ / 8
[!info] Whether an estimate is divided by 8 depends on the spatial locality of that matrix: with spatial locality, one miss brings in 8 consecutive elements.
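
The estimates above can be sketched as a small Python helper. This is only an illustration of the formulas in the list. The function name `estimate_misses_base` and its parameters are hypothetical, and the model ignores any reuse of lines that happen to stay in the cache.

```python
def estimate_misses_base(n, line_bytes=64, elem_bytes=8):
    """Rough miss estimates for the untransposed multiplication."""
    per_line = line_bytes // elem_bytes  # elements fetched by one miss
    return {
        "C": n**2 // per_line,  # row-wise, one pass over the matrix
        "B": n**3,              # column-wise, every access assumed to miss
        "A": n**3 // per_line,  # row-wise, repeated for every element of C
    }


est = estimate_misses_base(512)
```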
[!info]- Commands run
`nano /proc/cpuinfo`
`srun --partition=cpar perf cat /proc/cpuinfo`
c)
- C -> n² / 8
- B -> n³ / 8 (since it has been transposed)
- A -> n³ / 8
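
As a rough illustration of why transposing B changes its access pattern, the sketch below shows both loop nests in Python and checks that they compute the same product. The function names `matmul_base` and `matmul_transp` are hypothetical stand-ins for the exercise's `base()` and `transp()`, whose actual code is not shown here. Python lists are not contiguous like C arrays, so this only demonstrates the index pattern, not real cache behaviour.

```python
def matmul_base(A, B, n):
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # B is walked down column j: stride of one full row.
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_transp(A, B, n):
    # Bt[j][k] == B[k][j], so column j of B becomes row j of Bt.
    Bt = [[B[k][j] for k in range(n)] for j in range(n)]
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Both A and Bt are now walked along a row.
                C[i][j] += A[i][k] * Bt[j][k]
    return C


n = 4
A = [[float(i + j) for j in range(n)] for i in range(n)]
B = [[float(i * j + 1) for j in range(n)] for i in range(n)]
same = matmul_base(A, B, n) == matmul_transp(A, B, n)
```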
d)
N | Version | Time | CPI | #I | L1_DMiss (estimated) | L1_DMiss | Miss/#I |
---|---|---|---|---|---|---|---|
512 | base() | 1 | 0 | 1 | 1 | 0 | |
512 | transp() | 0 | 1 | 0 | 0 | 1 | 1 |
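
The derived columns of the table (CPI and Miss/#I) follow directly from the raw counters. A minimal sketch, assuming cycles, instructions and L1 data-cache misses have been read off the `perf stat` output; the counter values below are hypothetical placeholders, not measurements.

```python
def derived_metrics(cycles, instructions, l1d_misses):
    """Compute the derived table columns from raw counter values."""
    return {
        "CPI": cycles / instructions,          # cycles per instruction
        "Miss/#I": l1d_misses / instructions,  # L1 data misses per instruction
    }


# Hypothetical counter values, for illustration only.
m = derived_metrics(cycles=2_000_000, instructions=1_000_000, l1d_misses=50_000)
```
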
- One matrix: 512 x 512 x 8 B = 2 MB
- One row: 512 x 8 B = 4 KB
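
The footprint arithmetic can be checked with a few lines of Python. This sketch assumes "KB" means 1024 bytes and compares against the 25600 KB cache size quoted in b); which cache level that figure refers to is not stated in these notes.

```python
n, elem_bytes = 512, 8

matrix_bytes = n * n * elem_bytes  # one full n x n matrix of doubles
row_bytes = n * elem_bytes         # one row of such a matrix

cache_bytes = 25600 * 1024         # assumption: KB = 1024 B
three_matrices_fit = 3 * matrix_bytes <= cache_bytes
```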
Base (n=512):
- C -> n² / 8 = 32768
- A -> n³ / 8 = 16777216
- B -> n³ = 134217728
- Total = 151027712
Transp: impact (n=512):
- C -> n² / 8 = 32768
- A -> n² / 8 = 32768
- B -> n³ / 8 = 16777216
- Total = 16842752
[!note]- Commands run
`srun --partition=cpar perf stat -e L1-dcache-load-misses -M cpi ./b.out`