# HW 1: Parallel Programming with MPI

CSCE 689-600

Compile and execute the program in the file compute_pi_mpi.c, which computes an
estimate of π using the parallel algorithm discussed in class. The program is available on the
shared Google Drive for this class. It should be compiled and executed on either
Load the Intel software stack prior to compiling and executing the code.
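The algorithm discussed in class is not reproduced in this handout. As a rough illustration only, the sketch below assumes a common approach: approximating π as the integral of 4/(1+x^2) over [0, 1] with the midpoint rule, dealing the n intervals out cyclically to p processes, and summing the partial results (the role a reduction such as MPI_Reduce would typically play in an MPI program). It is serial Python that simulates the decomposition; the function names are hypothetical and compute_pi_mpi.c may work differently.

```python
import math


def partial_sum(rank, p, n):
    """Midpoint-rule contribution of one simulated process.

    Assumption: process `rank` handles intervals rank, rank+p, rank+2p, ...
    """
    h = 1.0 / n  # width of each interval
    total = 0.0
    for i in range(rank, n, p):
        x = (i + 0.5) * h  # midpoint of interval i
        total += 4.0 / (1.0 + x * x)
    return total * h


def estimate_pi(n, p):
    """Sum the partial results of all p simulated processes."""
    return sum(partial_sum(rank, p, n) for rank in range(p))


if __name__ == "__main__":
    n, p = 100_000, 4
    pi_est = estimate_pi(n, p)
    rel_err = abs(pi_est - math.pi) / math.pi
    print(f"n = {n}, p = {p}, pi = {pi_est:.16f}, relative error = {rel_err:.2e}")
```

Because every interval is assigned to exactly one process, the estimate depends on p only through floating-point summation order.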
To compile, use the command:
mpiicc -o compute_pi_mpi.exe compute_pi_mpi.c
To execute the program, use
mpirun -np <p> ./compute_pi_mpi.exe <n>
where <n> represents the number of intervals and <p> represents the number of processes.
The output of a sample run is shown below.
mpirun -np 4 compute_pi_mpi.exe 100000000
n = 100000000, p = 4, pi = 3.1415926535897749, relative error = 5.80e-15, time (sec) = 0.0608
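When collecting results from many runs, it can help to parse each output line into named fields. The sketch below is one possible way to do that, assuming the output always follows the format of the sample run above; the pattern and the parse_run helper are hypothetical. It also recomputes the relative error under the assumption that it is defined as |estimate − π| / π, which is consistent with the sample values.

```python
import math
import re

# Sample output line copied from the handout (joined onto one line).
LINE = ("n = 100000000, p = 4, pi = 3.1415926535897749, "
        "relative error = 5.80e-15, time (sec) = 0.0608")

# Assumed output format; \s* tolerates a line wrap after "relative error =".
PATTERN = re.compile(
    r"n = (?P<n>\d+), p = (?P<p>\d+), pi = (?P<pi>[\d.]+), "
    r"relative error =\s*(?P<err>[\d.eE+-]+), time \(sec\) = (?P<time>[\d.]+)"
)


def parse_run(line):
    """Return the fields of one output line as a dict."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("line does not match the expected output format")
    return {
        "n": int(m.group("n")),
        "p": int(m.group("p")),
        "pi": float(m.group("pi")),
        "rel_err": float(m.group("err")),
        "time": float(m.group("time")),
    }


run = parse_run(LINE)
# Assumed definition of relative error: |estimate - pi| / pi.
recomputed = abs(run["pi"] - math.pi) / math.pi
print(run)
print(f"recomputed relative error = {recomputed:.2e}")
```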
The run time of the code should be measured when it is executed in dedicated mode. Use
the batch file compute_pi_mpi.job to execute the code in dedicated mode using the command:
bsub < compute_pi_mpi.job
On Terra, you will need to use compute_pi.terra_job, and the corresponding
command is:
sbatch compute_pi.terra_job
Execute the code for n = 10^8 with p chosen to be 2^k, for k = 0, 1, …, 6. Specify ptile=4 in the
job file. Using the experimental data obtained from these experiments, answer the following
questions.
1. (10 points) Plot execution time versus p to demonstrate how time varies with the
number of processes. Use a logarithmic scale for the x-axis.
2. (10 points) Plot speedup versus p to demonstrate the change in speedup with p.
3. (5 points) Using the definition: efficiency = speedup/p, plot efficiency versus p to
demonstrate how efficiency changes as the number of processes is increased.
4. (5 points) What value of p minimizes the parallel runtime?
5. (10 points) With n = 10^9 and p=64, determine the value of ptile that minimizes the
total_time. Plot time versus ptile to illustrate your experimental results for this
question.
6. (10 points) Repeat the experiments with p=64 for n = 10^2, 10^4, 10^6, and 10^8.
a. Plot the speedup observed w.r.t. p=1 versus n.
b. Plot the relative error versus n to illustrate the accuracy of the algorithm as a
function of n.
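The derived quantities in the questions above can be computed from a table of measured run times. The sketch below takes speedup to be T(1)/T(p), matching the "w.r.t. p=1" wording, and efficiency to be speedup/p as defined in question 3. The helper names are hypothetical, and the timings are produced by a made-up cost model purely so the script runs; they are not measurements and should be replaced with your own data.

```python
def speedup(times):
    """Map each p to T(1) / T(p); requires a measurement for p = 1."""
    t1 = times[1]
    return {p: t1 / t for p, t in times.items()}


def efficiency(times):
    """Map each p to speedup / p."""
    return {p: s / p for p, s in speedup(times).items()}


def best_p(times):
    """Return the process count with the smallest run time."""
    return min(times, key=times.get)


# PLACEHOLDER data from an invented model T(p) = 1/p + 0.01*p.
# These are NOT measured values; substitute the times from your runs.
times = {2**k: 1.0 / 2**k + 0.01 * 2**k for k in range(7)}

s = speedup(times)
e = efficiency(times)
print(f"{'p':>4} {'time':>8} {'speedup':>8} {'efficiency':>10}")
for p in sorted(times):
    print(f"{p:>4} {times[p]:>8.4f} {s[p]:>8.3f} {e[p]:>10.3f}")
print("p with minimum run time:", best_p(times))
```

The resulting dictionaries can be passed to any plotting tool; for the time-versus-p plot, remember that the x-axis should be logarithmic.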