vault backup: 2023-10-25 12:31:34
parent
e212392f9d
commit
b1fc8a7ed7
3 changed files with 24 additions and 8 deletions
@ -557,7 +557,12 @@ Run \#1 (--cpus-per-task=2):
```
Dot is 1.2020569029595982
```
Run \#n (--cpus-per-task=2):
```
Dot is 0.0000000001499965
```
With the same number of CPUs per task (cores), the result can still vary from run to run, depending on how each thread happens to "pick up" the iterations of the ```for``` loop.

Run \#2 (--cpus-per-task=4):
```
@ -569,6 +574,18 @@ Run \#3 (--cpus-per-task=6):
Dot is 0.0000000013500135
```
Run \#4 (--cpus-per-task=8):
```
Dot is 0.0000000000719980
```
The result also varies with the number of CPUs per task (cores): the number of threads determines how the iterations, and therefore the executions of ```dot += a[i]*b[i];```, are distributed among the threads. Each thread works through a different set of values of ```i```, so the shared memory location keeps alternating between the versions written by different threads.
### c)
We can change the code to:

And the results will always be:
### d)
We can use _reduction_ to do that.