From fc96b3d6bc9aa3942d99944d541061965e9fc623 Mon Sep 17 00:00:00 2001
From: Alice
Date: Wed, 4 Oct 2023 12:43:31 +0100
Subject: [PATCH] vault backup: 2023-10-04 12:43:31

---
 4a1s/CP/PL - Aula 4.md | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/4a1s/CP/PL - Aula 4.md b/4a1s/CP/PL - Aula 4.md
index fc5505c..299bce2 100644
--- a/4a1s/CP/PL - Aula 4.md
+++ b/4a1s/CP/PL - Aula 4.md
@@ -51,7 +51,7 @@ Ganhos de 4 vezes mais.
 
 
 
-
+----
 ## Ex. 3
 #### a) Peak Performance
 2 operações em FP
@@ -60,6 +60,15 @@ Ganhos de 4 vezes mais.
 
 conclusion: 20 GFlop/s
 
+>[!note]- Redoing the math
+>AVX -> 256b -> 4 doubles
+>machine is superscalar with 2 FOP units
+>4x2= 8 double operations
+>
+>freq = 2.5 GHz
+>8x2.5= 20 GFlop/s
+> ^ cpu limitation
+
 #### b)
 peak with vectorization: continuous 20 GFlop/s
 peak without vectorization: continuous 5 GFlop/s
@@ -67,8 +76,8 @@ memory bandwith limitation: ***see alinea d)***
 real achievable performance:***see alinea c)***
 measured performance:
 
-#### d)
-memory bandwith limitation
+#### d) Memory bandwidth limitation
+1 FOP -> 2B
 
 | GFlop/s | Flop/Byte |
 | ------- | --------- |
@@ -88,6 +97,4 @@ memory bandwith limitation
 | ------- | --------- |
 | 0.125 | 2.5 |
 
-
-#### d)
-AVX -> 256b -> 4 doubles
+#### e)