diff --git a/4a1s/CP/PL - Aula 4.md b/4a1s/CP/PL - Aula 4.md
index fc5505c..299bce2 100644
--- a/4a1s/CP/PL - Aula 4.md
+++ b/4a1s/CP/PL - Aula 4.md
@@ -51,7 +51,7 @@ Ganhos de 4 vezes mais.
 
 
 
-
+----
 ## Ex. 3
 #### a) Peak Performance
 2 operações em FP
@@ -60,6 +60,15 @@ Ganhos de 4 vezes mais.
 
 conclusion: 20 GFlop/s
 
+>[!note]- Redoing the math
+>AVX -> 256b -> 4 doubles
+>machine is superscalar with 2 FOP units
+>4x2= 8 double operations
+>
+>freq = 2.5 GHz
+>8x2.5= 20 GFlop/s
+> ^ cpu limitation
+
 #### b)
 peak with vectorization: continuous 20 GFlop/s
 peak without vectorization: continuous 5 GFlop/s
@@ -67,8 +76,8 @@ memory bandwith limitation: ***see alinea d)***
 real achievable performance:***see alinea c)***
 measured performance:
 
-#### d)
-memory bandwith limitation
+#### d) Memory bandwidth limitation
+1 FOP -> 2B
 
 | GFlop/s | Flop/Byte |
 | ------- | --------- |
@@ -88,6 +97,4 @@ memory bandwith limitation
 | ------- | --------- |
 | 0.125   | 2.5       |
 
-
-#### d)
-AVX -> 256b -> 4 doubles
+#### e)
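
The peak-performance arithmetic in the added "Redoing the math" note can be checked with a short sketch. The function name `peak_gflops` and its parameters are hypothetical, introduced only for illustration. The inputs (256-bit AVX, 64-bit doubles, 2 floating-point units, 2.5 GHz) are the values stated in the note, and the sketch assumes each unit retires one vector operation per cycle.

```python
def peak_gflops(vector_bits: int, element_bits: int, fp_units: int, freq_ghz: float) -> float:
    """Theoretical peak in GFlop/s (hypothetical helper, not from the notes)."""
    # Elements processed by one vector instruction (AVX: 256 / 64 = 4 doubles).
    lanes = vector_bits // element_bits
    # Assumption: each FP unit completes one vector operation per cycle.
    flops_per_cycle = lanes * fp_units
    # Cycles per second in GHz times flops per cycle gives GFlop/s.
    return flops_per_cycle * freq_ghz


# With vectorization: 4 doubles x 2 units x 2.5 GHz
vectorized = peak_gflops(256, 64, 2, 2.5)
# Without vectorization: treat each operation as a single 64-bit lane
scalar = peak_gflops(64, 64, 2, 2.5)

print(vectorized, scalar, vectorized / scalar)
```

Under these assumptions the two results match the 20 GFlop/s and 5 GFlop/s figures in part b), and their ratio matches the 4x gain mentioned earlier in the notes.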