Assignment 1: Heat diffusion in OpenMP
You are given a C program that approximates a simple 2D heat diffusion scheme using a five-point stencil.
It iteratively computes matrices of size n ∗ n from an initial n ∗ n matrix. That matrix has all values set to 0.0 except for two values in the first row, at indices [0, n/2] and [0, (n ∗ 3)/4]. These two values are initialised with heat values 100.0 and 1000.0, respectively. The iteration step recomputes all non-boundary elements as a weighted sum of the current values of the element itself and its four immediate neighbours. The number of iterations is provided as a parameter of the program.
This assignment explores how this algorithm can be executed on a multicore system using OpenMP.
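As a rough illustration of the scheme described above, the initial matrix and one iteration step might look like the following sketch. The function names `heat_init` and `heat_step` and the two weights are assumptions made for this example only; the assignment text does not state the weights, and the provided program may be organised differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical weights: the assignment text does not specify them. */
#define W_CENTRE    0.5
#define W_NEIGHBOUR 0.125

/* Allocate an n*n matrix of zeros and place the two heat sources in row 0. */
double *heat_init(size_t n)
{
    assert(n >= 3);
    double *m = calloc(n * n, sizeof *m);
    if (m != NULL) {
        m[n / 2] = 100.0;         /* element [0, n/2] */
        m[(n * 3) / 4] = 1000.0;  /* element [0, (n*3)/4] */
    }
    return m;
}

/* One iteration: recompute every non-boundary element of `out` from `in`.
   Boundary elements of `out` are left untouched. */
void heat_step(const double *in, double *out, size_t n)
{
    for (size_t i = 1; i + 1 < n; i++) {
        for (size_t j = 1; j + 1 < n; j++) {
            out[i * n + j] = W_CENTRE * in[i * n + j]
                + W_NEIGHBOUR * (in[(i - 1) * n + j] + in[(i + 1) * n + j]
                               + in[i * n + j - 1] + in[i * n + j + 1]);
        }
    }
}
```

In this sketch both buffers are created with `heat_init`, so that the boundary row holding the heat sources is present in each, and the two pointers are swapped after every step.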
This assignment should be done by teams of 2-3 students. How you distribute the work within the team is up to you. However, you need to declare who did which part. All performance measurements need to be done on a single system with at least 4 cores. You need to make sure you provide all relevant technical details of that system. Make sure that you show the variation in your measurements (error bars).
Your tasks are to:
Evaluate the performance of the sequential code. Use the highest level of compiler optimisation on your machine. Typically, this is -O3, but you should consult the man pages of your compiler and look for other flags such as -march=native. Present the wall-clock time of the work-loop as a function of the matrix size n ∗ n. Present the absolute performance in GFLOP/s as a function of the matrix size n ∗ n. Vary the parameter n so that the sizes of your matrices range from a few kB to as much as 5 GB. Note that one double requires 8 bytes. You may have to adjust the iter parameter for your machine to obtain a reasonable range of runtimes.
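The arithmetic behind these measurements can be sketched as follows. The helper names are hypothetical, and the number of floating-point operations per stencil point (`flops_per_point`) is an assumption that depends on how the weighted sum is written in the provided code. `clock_gettime` with `CLOCK_MONOTONIC` is a POSIX call, so this sketch assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Bytes needed for one n*n matrix of doubles (8 bytes each). */
size_t matrix_bytes(size_t n)
{
    return n * n * sizeof(double);
}

/* Wall-clock time in seconds, read from a monotonic clock. */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* GFLOP/s for `iters` sweeps over the (n-2)*(n-2) non-boundary points.
   flops_per_point is an assumption, not a value given by the assignment. */
double gflops(size_t n, size_t iters, double flops_per_point, double seconds)
{
    assert(n >= 3 && seconds > 0.0);
    double points = (double)(n - 2) * (double)(n - 2);
    return points * (double)iters * flops_per_point / seconds * 1e-9;
}
```

For orientation, n = 1024 gives 8 MiB per matrix, and n = 25000 gives 5 × 10^9 bytes per matrix. Calling `wall_seconds()` immediately before and after the work-loop and subtracting the two values yields the time to pass to `gflops`.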
Analyse any anomalies that the performance as a function of the matrix size may show. You may want to use tools such as gprof, valgrind, or gperftools to find out what is going on. Summarise your findings in a few paragraphs.
Add OpenMP pragmas and library calls to your sequential C version. Repeat the evaluations from Task 1 for the OpenMP version with different numbers of threads. Make sure that you instruct Slurm to provide you with sufficient cores and sufficient memory! Run several experiments to find out what impact the different scheduling strategies have, and present all of these in a single graph. Analyse the performance and try to explain your findings.
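One possible way to parallelise the iteration step is sketched below. It is not the expected solution, only an illustration: the function name `heat_step_omp` and the weights are assumptions. The `schedule(runtime)` clause defers the choice of scheduling strategy to the `OMP_SCHEDULE` environment variable, so different strategies can be compared without recompiling.

```c
#include <assert.h>
#include <stddef.h>

/* One sweep over the non-boundary elements; rows are shared among threads.
   The weights are hypothetical; the assignment text does not state them. */
void heat_step_omp(const double *in, double *out, size_t n)
{
    const double w_centre = 0.5;
    const double w_neighbour = 0.125;

    assert(n >= 3);
    /* Each thread writes distinct rows of `out` and only reads `in`,
       so the loop iterations are independent. */
    #pragma omp parallel for schedule(runtime)
    for (size_t i = 1; i < n - 1; i++) {
        for (size_t j = 1; j < n - 1; j++) {
            out[i * n + j] = w_centre * in[i * n + j]
                + w_neighbour * (in[(i - 1) * n + j] + in[(i + 1) * n + j]
                               + in[i * n + j - 1] + in[i * n + j + 1]);
        }
    }
}
```

With GCC or Clang the pragma takes effect only when compiling with -fopenmp; otherwise it is ignored and the loop runs sequentially. The thread count and schedule can then be set through the standard environment variables, for example OMP_NUM_THREADS=4 and OMP_SCHEDULE="dynamic,16". Under Slurm, the sbatch options --cpus-per-task and --mem are the usual way to request cores and memory, though the exact values depend on your cluster.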
Provide a discussion of your overall findings. This should include figures reporting speedups, scaling (strong and weak), and efficiency. Can you explain what limits your speedups?
Furthermore, you should discuss and compare the effort that was required to achieve these figures (programming effort and debugging effort).
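The speedup and efficiency figures requested for the discussion follow the conventional definitions, sketched here with hypothetical helper names. Strong scaling keeps the problem size fixed while the thread count grows; weak scaling grows the problem size in proportion to the thread count.

```c
#include <assert.h>

/* Speedup: sequential time divided by the time with several threads,
   both measured for the same problem size. */
double speedup(double t_seq, double t_par)
{
    assert(t_par > 0.0);
    return t_seq / t_par;
}

/* Strong-scaling efficiency: speedup divided by the number of threads. */
double efficiency(double t_seq, double t_par, int threads)
{
    assert(threads > 0);
    return speedup(t_seq, t_par) / threads;
}

/* Weak-scaling efficiency: time on one thread for the base problem size,
   divided by the time on p threads for a problem p times as large. */
double weak_efficiency(double t_one, double t_scaled)
{
    assert(t_scaled > 0.0);
    return t_one / t_scaled;
}
```

For example, a run that takes 8.0 s sequentially and 2.0 s on 4 threads has a speedup of 4.0 and an efficiency of 1.0.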
Provide a short description of how you divided up the work, i.e., who did what. Attribute percentages to your overall contributions.