my_digital_garden/4a1s/CP/PL - Aula 3.md

Ex. 1

a)

Spatial locality: memory accesses that fall close to a recently referenced address, typically at a small constant stride.

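Following the order in which matrix multiplication traverses its operands, C and A have spatial locality, while B does not: B is traversed down its columns rather than along its rows, so with row-major storage consecutive accesses are a full row apart in memory. Matrix C also has temporal locality, since the same element is reused throughout the inner loop.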
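To make the access pattern concrete, here is a minimal sketch that lists the flat row-major offsets touched by the innermost loop. It assumes the usual i-j-k loop order with `C[i][j] += A[i][k] * B[k][j]`, which these notes do not spell out; `inner_loop_offsets` and `strides` are hypothetical helper names.

```python
def inner_loop_offsets(n, i, j):
    """Flat row-major offsets touched while k runs from 0 to n-1.

    Assumption: i-j-k loop order, C[i][j] += A[i][k] * B[k][j].
    """
    a = [i * n + k for k in range(n)]  # A[i][k]: walks along row i
    b = [k * n + j for k in range(n)]  # B[k][j]: walks down column j
    c = [i * n + j] * n                # C[i][j]: same element every iteration
    return a, b, c


def strides(offsets):
    """Distinct distances between consecutive offsets."""
    return {y - x for x, y in zip(offsets, offsets[1:])}


a, b, c = inner_loop_offsets(n=4, i=1, j=2)
print(a, strides(a))  # stride 1: spatial locality
print(b, strides(b))  # stride n: no spatial locality
print(c, strides(c))  # stride 0: temporal locality
```

A stride of 1 means neighbouring elements share a cache line, a stride of n means each access lands on a different line once n is at least the line length, and a stride of 0 is reuse of the same element.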

b)

  • cache size = 25600 KB
  • cache line / alignment = 64 B
  • double -> 8 B

64 B / 8 B = 8 elements per cache line

Estimation of misses:

  • matrix C: n² / 8
  • matrix B: n³
  • matrix A: n³ / 8

> [!info] Whether an estimate is divided by 8 depends on the spatial locality of the matrix: with spatial locality, one miss brings in a line of 8 consecutive elements.
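The estimates can be sanity-checked on a small case with a deliberately crude model. This is a sketch under stated assumptions, not a model of the real machine: each matrix remembers only its most recently touched line, matrices start line-aligned, n is a multiple of 8 and larger than 8, and the loop order is i-j-k. `count_line_misses` is a hypothetical helper.

```python
def count_line_misses(n, elems_per_line=8):
    """Count cache-line changes per matrix for the i-j-k multiplication.

    Crude model (an assumption, not real hardware): each matrix keeps only
    its most recently touched line, and matrices start line-aligned.
    """
    last_line = {"A": None, "B": None, "C": None}
    misses = {"A": 0, "B": 0, "C": 0}

    def touch(name, offset):
        line = offset // elems_per_line
        if last_line[name] != line:
            misses[name] += 1
            last_line[name] = line

    for i in range(n):
        for j in range(n):
            touch("C", i * n + j)          # C[i][j]
            for k in range(n):
                touch("A", i * n + k)      # A[i][k], stride 1
                touch("B", k * n + j)      # B[k][j], stride n
    return misses


n = 16
m = count_line_misses(n)
print(m)  # C matches n**2 / 8, A matches n**3 / 8, B matches n**3
```

A real L1 cache holds many lines at once, so measured counts can come out lower than this model suggests.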

> [!info]- Commands run
> `nano /proc/cpuinfo`
> `srun --partition=cpar perf cat /proc/cpuinfo`

c)

  • C -> n² / 8
  • B -> n³ / 8 (since it has been transposed)
  • A -> n³ / 8
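As an illustration of what transposing changes, the sketch below multiplies once with B read down a column and once with a transposed copy read along a row. It assumes the i-j-k loop order; `matmul_base` and `matmul_transp` are hypothetical names and not necessarily how the course code is written. Python lists are not contiguous in memory, so this only shows the index order; the locality benefit applies to row-major arrays such as those in C.

```python
def matmul_base(a, b):
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]   # b is read down column j
    return c


def matmul_transp(a, b):
    n = len(a)
    # bt[j][k] == b[k][j]: column j of b becomes row j of bt
    bt = [[b[k][j] for k in range(n)] for j in range(n)]
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * bt[j][k]  # bt is read along row j
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul_base(a, b))
print(matmul_transp(a, b))
```

Both calls give the same product; only the order in which B's elements are visited differs, which is why B's estimate drops from n³ to n³ / 8.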

d)

| N | Version | Time | CPI | #I | L1_DMiss (estimated) | L1_DMiss | Miss/#I |
|---|---|---|---|---|---|---|---|
| 512 | base() | 1 | 0 | 1 | 1 | 0 | |
| 512 | transp() | 0 | 1 | 0 | 0 | 1 | 1 |

  • one matrix: 512 x 512 x 8 B = 2 MB
  • one row: 512 x 8 B = 4 KB
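The sizes above can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming 8-byte doubles and square n x n matrices; `footprint_bytes` is a hypothetical helper, and `CACHE_KB` is the cache size noted in b), taken here as 1024-byte units.

```python
DOUBLE_BYTES = 8   # size of a double, as noted in b)
CACHE_KB = 25600   # cache size noted in b)


def footprint_bytes(rows, cols, elem_bytes=DOUBLE_BYTES):
    """Bytes occupied by a rows x cols block of elements."""
    return rows * cols * elem_bytes


n = 512
matrix = footprint_bytes(n, n)
row = footprint_bytes(1, n)
print(matrix // 2**20, "MiB per matrix")
print(row // 2**10, "KiB per row")
print("A, B and C together fit in that cache:", 3 * matrix <= CACHE_KB * 1024)
```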

Base (n=512):

  • C -> n² / 8 = 32768
  • A -> n³ / 8 = 16777216
  • B -> n³ = 134217728
  • Total = 151027712

Transp: impact (n=512):

  • C -> n² / 8 = 32768
  • A -> n³ / 8 = 16777216
  • B -> n³ / 8 = 16777216
  • Total = 33587200
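The per-matrix estimates from b) and c) can be evaluated at n = 512 with a short script. This is a sketch under the same simple model: 8 elements per line, and A assumed to keep its n³ / 8 estimate in both versions, since transposing B does not change how A is traversed. `estimated_misses` is a hypothetical helper.

```python
def estimated_misses(n, elems_per_line=8, b_transposed=False):
    """Miss estimates per matrix under the simple model from b) and c)."""
    c = n**2 // elems_per_line
    a = n**3 // elems_per_line  # assumption: same in both versions
    b = n**3 // elems_per_line if b_transposed else n**3
    return {"C": c, "A": a, "B": b, "Total": c + a + b}


for label, transposed in (("base", False), ("transp", True)):
    print(label, estimated_misses(512, b_transposed=transposed))
```

For reference, 512² = 262144 and 512³ = 134217728, so dividing by 8 gives 32768 and 16777216 respectively.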

> [!note]- Commands run
> `srun --partition=cpar perf stat -e L1-dcache-load-misses -M cpi ./b.out`

e)

f)

Ex. 2