vault backup: 2023-10-04 12:43:31
parent ce64639d15
commit fc96b3d6bc
1 changed file with 13 additions and 6 deletions
@@ -51,7 +51,7 @@ Gains of 4x.
 
 
 
-
+----
 ## Ex. 3
 #### a) Peak Performance
 2 FP operations
@@ -60,6 +60,15 @@ Gains of 4x.
 
 conclusion: 20 GFlop/s
 
+>[!note]- Redoing the math
+>AVX -> 256b -> 4 doubles
+>machine is superscalar with 2 FOP units
+>4x2 = 8 double operations per cycle
+>
+>freq = 2.5 GHz
+>8x2.5 = 20 GFlop/s
+> ^ CPU limitation
+
 #### b)
 peak with vectorization: continuous 20 GFlop/s
 peak without vectorization: continuous 5 GFlop/s
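
The arithmetic in the "Redoing the math" note and in item b) can be checked with a short Python sketch. The function name `peak_gflops` and its parameters are hypothetical, introduced only for illustration. The scalar case assumes each of the 2 FP units retires one double per cycle, which is consistent with the 5 GFlop/s figure in the note.

```python
def peak_gflops(vector_bits: int, element_bits: int, fp_units: int, freq_ghz: float) -> float:
    """Peak FP throughput in GFlop/s: lanes per unit x number of units x clock (GHz)."""
    lanes = vector_bits // element_bits  # e.g. 256-bit AVX / 64-bit double = 4 lanes
    return lanes * fp_units * freq_ghz


# Values taken from the note: AVX (256 bits), doubles (64 bits), 2 FP units, 2.5 GHz.
vector_peak = peak_gflops(vector_bits=256, element_bits=64, fp_units=2, freq_ghz=2.5)

# Assumption: without vectorization each unit handles a single 64-bit double per cycle.
scalar_peak = peak_gflops(vector_bits=64, element_bits=64, fp_units=2, freq_ghz=2.5)

print(vector_peak, scalar_peak)
```

With these inputs the vectorized peak is 4 times the scalar peak, matching the 20 GFlop/s and 5 GFlop/s ceilings listed in item b).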
@@ -67,8 +76,8 @@ memory bandwidth limitation: ***see item d)***
 real achievable performance: ***see item c)***
 measured performance:
 
-#### d)
-memory bandwidth limitation
+#### d) Memory bandwidth limitation
+1 FOP -> 2B
 
 | GFlop/s | Flop/Byte |
 | ------- | --------- |
@@ -88,6 +97,4 @@ memory bandwidth limitation
 | ------- | --------- |
 | 0.125 | 2.5 |
 
-#### d)
-AVX -> 256b -> 4 doubles
-
+#### e)
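
The memory bandwidth limitation in item d) is commonly expressed as a roofline bound: attainable performance is the smaller of the compute peak and bandwidth times arithmetic intensity. The sketch below is only an illustration under stated assumptions. The function name `attainable_gflops` is hypothetical, and the bandwidth value is a placeholder, since the note does not state the machine's memory bandwidth. The intensity comes from the note's "1 FOP -> 2B", that is, 0.5 Flop/Byte.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, flop_per_byte: float) -> float:
    """Roofline bound: limited either by the CPU peak or by memory traffic."""
    return min(peak_gflops, bandwidth_gbs * flop_per_byte)


# From the note: 1 floating-point operation per 2 bytes moved.
intensity = 1 / 2  # Flop/Byte

# ASSUMPTION: placeholder bandwidth, NOT taken from the note. Substitute the real value.
ASSUMED_BANDWIDTH_GBS = 10.0

# 20 GFlop/s is the vectorized peak from item a).
bound = attainable_gflops(20.0, ASSUMED_BANDWIDTH_GBS, intensity)
print(bound)
```

When bandwidth times intensity falls below the 20 GFlop/s peak, the kernel is memory bound. Raising the Flop/Byte ratio moves it toward the CPU limit.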