Here is a more idiomatic version of your code:
Code:
#include <stdio.h>
#include <omp.h>

int main()
{
    int const SIZE = 1000000, SAMPLE = 20;
    int i, A[SIZE], B[SIZE], C[SIZE];   // three large arrays on the stack
    double start, end, seq, par;

    // fill the inputs
    for (i = 0; i < SIZE; i++) {
        A[i] = i;
        B[i] = i;
    }

    // sequential test
    start = omp_get_wtime();
    for (i = 0; i < SIZE; i++)
        C[i] = A[i] + B[i];
    end = omp_get_wtime();
    seq = (end - start);

    // parallel test
    start = omp_get_wtime();
    #pragma omp parallel for
    for (i = 0; i < SIZE; i++) {
        C[i] = A[i] + B[i];
    }
    end = omp_get_wtime();
    par = (end - start);

    printf("sequential time execution is = %lf\n", seq);
    printf("parallel time execution is = %lf\n", par);

    // print only the first SAMPLE elements and the last one
    printf(" the elements of array A are:\n");
    for (i = 0; i < SAMPLE; i++)
        printf("%5d", A[i]);
    printf(" ... %5d\n\n", A[SIZE-1]);

    printf(" the elements of array B are:\n");
    for (i = 0; i < SAMPLE; i++)
        printf("%5d", B[i]);
    printf(" ... %5d\n\n", B[SIZE-1]);

    printf(" the elements of array C are:\n");
    for (i = 0; i < SAMPLE; i++)
        printf("%5d", C[i]);
    printf(" ... %5d\n\n", C[SIZE-1]);

    printf(" time taken for sequential execution = %lf secs\n", seq);
    printf(" time taken for parallel execution = %lf secs\n", par);
    return 0;
}
You need to compile it with OpenMP enabled:
gcc -fopenmp -o foo foo.c
Then there is the issue of the crash:
Quote:
[stevea@hypoxylon tmp]$ ./foo
Segmentation fault (core dumped)
Each of the 3 arrays uses 4MB of stack space (1,000,000 ints at 4 bytes each), so together they need about 12MB. That's not huge by modern standards, BUT the default stack limit (ulimit -s) is probably too small:
Code:
[stevea@hypoxylon tmp]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 61815
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
So obviously 8MB of stack won't hold about 12MB of arrays; we need to set that limit higher.
Code:
[stevea@hypoxylon tmp]$ ulimit -s unlimited # remove stack limit
[stevea@hypoxylon tmp]$ ./foo
sequential time execution is = 0.007456
parallel time execution is = 0.011880
the elements of array A are:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ... 999999
the elements of array B are:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ... 999999
the elements of array C are:
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 ... 1999998
time taken for sequential execution = 0.007456 secs
time taken for parallel execution = 0.011880 secs
[stevea@hypoxylon tmp]$ OMP_NUM_THREADS=50 ./foo
sequential time execution is = 0.007107
parallel time execution is = 0.005135
the elements of array A are:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ... 999999
the elements of array B are:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ... 999999
the elements of array C are:
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 ... 1999998
time taken for sequential execution = 0.007107 secs
time taken for parallel execution = 0.005135 secs
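You may need
ulimit -u 40000
to try really large numbers of threads, since on Linux each thread counts against the "max user processes" limit (1024 in the listing above).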
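If you'd rather not touch the stack limit at all, another option is to put the arrays on the heap, where ulimit -s doesn't apply. Here is a minimal sketch of that idea. The helper name add_on_heap() is my own invention for illustration and is not part of the program above; treat it as one possible approach, not the only fix.

```c
#include <stdlib.h>

/* Hypothetical helper (not in the original program): does the same
   A + B = C work, but with all three arrays allocated by malloc()
   instead of living on the stack. Returns C, which the caller must
   free(), or NULL if any allocation fails. */
int *add_on_heap(int n)
{
    int *A = malloc((size_t)n * sizeof *A);
    int *B = malloc((size_t)n * sizeof *B);
    int *C = malloc((size_t)n * sizeof *C);

    if (A == NULL || B == NULL || C == NULL) {
        free(A);            /* free(NULL) is a harmless no-op */
        free(B);
        free(C);
        return NULL;
    }

    for (int i = 0; i < n; i++) {
        A[i] = i;
        B[i] = i;
    }

    /* Same loop as before; the pragma is ignored without -fopenmp. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        C[i] = A[i] + B[i];

    free(A);
    free(B);
    return C;
}
```

Call it from main() as int *C = add_on_heap(1000000); check the result for NULL, and free(C) when you're done. With this layout the program should run under the default 8MB stack limit, because only a few pointers live on the stack.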
---------- Post added at 04:02 PM ---------- Previous post was at 03:58 PM ----------
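PS - you really need more calculation in the loop to take good advantage of parallelism. Otherwise the thread overhead costs more than it saves, as in the first run above, where the parallel loop took 0.011880 secs against 0.007456 secs for the sequential one.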
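To make that concrete, here is a rough sketch of a loop body with more work per element. The function heavy() is a made-up stand-in for a real calculation (a few hundred rounds of integer mixing), and fill_serial()/fill_parallel() are hypothetical names; none of them come from the original program.

```c
/* Hypothetical stand-in for real per-element work: 500 rounds of
   integer mixing, so each iteration costs far more than one addition.
   Unsigned arithmetic wraps on overflow, which is well defined in C. */
static unsigned heavy(unsigned x)
{
    for (int k = 0; k < 500; k++)
        x = x * 1664525u + 1013904223u;
    return x;
}

/* Sequential version. */
void fill_serial(unsigned *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = heavy((unsigned)i);
}

/* Parallel version: every iteration is independent of the others,
   so a plain "parallel for" is enough. */
void fill_parallel(unsigned *out, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        out[i] = heavy((unsigned)i);
}
```

Time the two calls with omp_get_wtime() the same way as in the program above, and build with gcc -fopenmp (without that flag the pragma is ignored and both versions run sequentially). I haven't measured this sketch, so treat any speedup as something to check on your own machine.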
---------- Post added at 04:06 PM ---------- Previous post was at 04:02 PM ----------
P.P.S. Is it true they charge programmers 10 euro-cents for each extra space and a full euro for extra newlines in Belgium?