4 October 2023 - #CP
Ex. 2
a) Vectorization limitations
Access pattern in the inner loop:
- A -> consecutive elements in a row -> consecutive access in the vector
- C -> same element
- B -> consecutive elements in a column -> non-consecutive access

It will not be vectorizable.
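
The access pattern above can be sketched in C. This is only an illustration: the function name `mmult_ijk`, the flat row-major layout, and the accumulation into `C` are assumptions, since the notes do not show the contents of `mmult.c`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical baseline kernel (not taken from mmult.c):
   C += A * B for n x n row-major matrices, loops ordered i, j, k. */
void mmult_ijk(size_t n, const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                /* Inner loop runs over k:
                   A[i*n + k] -> stride 1 (consecutive in a row)
                   B[k*n + j] -> stride n (walks down a column)
                   C[i*n + j] -> same element on every iteration */
                C[i*n + j] += A[i*n + k] * B[k*n + j];
}
```

The stride-n walk over `B` is what prevents consecutive elements from being loaded into one vector.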
b)
Result of changing the loop order to i, k, j:
- A -> same element
- C -> consecutive elements in a row -> consecutive access in the vector
- B -> consecutive elements in a row -> consecutive access in the vector

| i | k | j |
|---|---|---|
| 0 | 0 | 1 |

It will be vectorizable.
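
A minimal sketch of the reordered loop nest, under the same assumptions as before (hypothetical function name `mmult_ikj`, flat row-major arrays, accumulation into `C`); it is not the actual code from `mmult.c`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reordered kernel (not taken from mmult.c):
   C += A * B for n x n row-major matrices, loops ordered i, k, j. */
void mmult_ikj(size_t n, const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            /* A[i*n + k] does not depend on j: same element in the inner loop */
            const double a = A[i*n + k];
            for (size_t j = 0; j < n; j++)
                /* Inner loop runs over j:
                   B[k*n + j] -> stride 1 (consecutive in a row)
                   C[i*n + j] -> stride 1 (consecutive in a row) */
                C[i*n + j] += a * B[k*n + j];
        }
}
```

Both arrays touched in the inner loop are now read and written at consecutive addresses, which is the pattern a vector load or store needs.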
128-bit vector register, 8 B elements -> 64 bits each -> 2 elements per vector
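
The element count can be checked with a short calculation, and the effect of a 2-wide vector can be imitated in plain C. Both helpers below (`lanes_per_vector`, `row_update_2wide`) are hypothetical names introduced for illustration; the second is a hand-written scalar imitation of a 2-element vector step, not what the compiler necessarily emits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of elements that fit in one vector register. */
size_t lanes_per_vector(size_t register_bits, size_t element_bytes)
{
    assert(element_bytes > 0);
    return register_bits / (element_bytes * CHAR_BIT);
}

/* Row update c_row[j] += a * b_row[j], processed two doubles at a time
   to mirror what one 128-bit operation would cover, followed by a
   scalar remainder loop for odd n. */
void row_update_2wide(size_t n, double a, const double *b_row, double *c_row)
{
    assert(b_row && c_row);
    size_t j = 0;
    for (; j + 2 <= n; j += 2) {
        c_row[j]     += a * b_row[j];
        c_row[j + 1] += a * b_row[j + 1];
    }
    for (; j < n; j++)
        c_row[j] += a * b_row[j];
}
```

On a platform with 8-byte doubles, `lanes_per_vector(128, sizeof(double))` gives 2, matching the note above.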
c)
| N | Version | Time (s) | CPI | #I |
|---|---|---|---|---|
| 512 | base_v() | 0.492484818 | 0.91 | 1113554887 |
| 512 | vect() | 0.081604350 | 2.88 | 578275097 |
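
The measurements can be compared with a simple ratio. The helper name `ratio` is hypothetical and only wraps a division; the inputs are the values from the table.

```c
#include <assert.h>

/* Ratio of a baseline measurement to an improved one
   (values above 1 mean the improved version is lower). */
double ratio(double baseline, double improved)
{
    assert(improved > 0.0);
    return baseline / improved;
}
```

With the table values, `ratio(0.492484818, 0.081604350)` is about 6.03 (speedup in time) and `ratio(1113554887.0, 578275097.0)` is about 1.93 (reduction in instruction count), which is close to the 2 elements per vector noted in b).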

> [!note]- Commands run
> `module load gcc/9.3.0`
> `gcc -O2 -ftree-vectorize -msse4 mmult.c`
> `srun --partition=cpar perf stat -e cycles,instructions ./a.out`