CS 484 - Parallel Programming
Introduction
Learning Goals
You will learn the following:
• Communication patterns of a parallel 2D Van der Waals gas simulator
• Using MPI and AMPI for distributed-memory execution
Please read the entire document before beginning any part of the assignment, as some parts are
interdependent.
Assignment Tasks
The basic workflow of this assignment is as follows (there are more details in the relevant sections):
• Clone your turnin repo to VMFarm [1]
• You may need to iterate:
– Implement the algorithms / code for the various programming tasks.
– Build and test on VMFarm.
– Check in and push your complete, correct code.
– Benchmark on Campus Cluster (scripts/batch_script.run)
• Check in any benchmarking results that you wish to and final versions of code files.
• Write and check in your writeup.
1 Part I: MPI
You will implement the communication functions in part1/solution.h and part1/solution.cpp. You may modify the
classes in these files to add member variables/functions or even create other helper classes, etc. Although you are allowed
to create or alter other files and use them for debugging/testing purposes, they will be discarded at grading time.
1.1 Implementation
The MPI code revolves around a class named MPISimulationBlock defined in part1/solution.h, which inherits from
SimulationBlock defined in common/simblock.h. The parent class SimulationBlock contains most of the program
parameters as well as particle data, which can be accessed from MPISimulationBlock’s member methods. Each MPI
rank contains one MPISimulationBlock that represents a block of the entire simulation grid. Remember that with MPI,
the decomposition (i.e., the number of blocks/ranks used for the simulation) depends on the number of CPU cores.
You may add code to the constructor and destructor bodies if you want (not required). You should implement the following
functions:
[1] You are free to use the Docker container uiuccs484parallelprog/cs484_student, but we will provide no support for Docker itself.
• MPISimulationBlock::exchange_particles()
When this function is called, any particle outside the current block’s bounds must be moved to the appropriate
adjacent block. That is, it must be removed from this block’s SimulationBlock::all_particles array (using
the provided SimulationBlock::remove_particle(int) function) and added to one and only one appropriate
recipient’s all_particles array (either via SimulationBlock::add_particle(phys_particle_t) or by directly
placing it in the all_particles array and updating N_particles).
You can determine which direction a particle needs to move by calling check_migrant_direction(particle). The
return value of this function is an int, which is one of the following:
SimulationBlock::DIR_SELF, SimulationBlock::DIR_N, SimulationBlock::DIR_S,
SimulationBlock::DIR_E, SimulationBlock::DIR_W, SimulationBlock::DIR_NE,
SimulationBlock::DIR_NW, SimulationBlock::DIR_SE, SimulationBlock::DIR_SW
SELF means the particle does not need to move to another block, N means north, S means south, NW means northwest,
and so on. You may use the provided macro DIR_EQ (defined in common/simblock.h) to check the direction:
e.g. DIR_EQ(SimulationBlock::DIR_N, direction) will return true (i.e. 1) if the particle should migrate to the
north neighbor.
For the particle exchange, you will need to use MPI communication calls (MPI_Send, MPI_Recv, etc.) and
proper synchronization (if necessary). Note that the receiving rank needs to know how many particles it is going
to receive before posting a receive call, since you want to avoid sending particles one by one over the network.
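The packing side of this counts-then-payload pattern can be sketched as follows. This is only an illustration, not the assignment’s API: Particle, pack_migrants, and the classify callback are hypothetical stand-ins for phys_particle_t, your exchange logic, and check_migrant_direction(), and the MPI phase is indicated in comments rather than implemented.

```cpp
#include <array>
#include <vector>

// Hypothetical stand-ins -- the real definitions live in common/simblock.h.
struct Particle { double x, y; };
enum Dir { DIR_SELF = 0, DIR_N, DIR_S, DIR_E, DIR_W, DIR_NE, DIR_NW, DIR_SE, DIR_SW };
constexpr int NUM_DIRS = 9;   // DIR_SELF plus eight neighbors

// Pack every out-of-bounds particle into a per-neighbor send buffer.
// `classify` plays the role of check_migrant_direction().
std::array<std::vector<Particle>, NUM_DIRS>
pack_migrants(std::vector<Particle>& particles, Dir (*classify)(const Particle&)) {
    std::array<std::vector<Particle>, NUM_DIRS> out;
    for (std::size_t i = 0; i < particles.size(); ) {
        Dir d = classify(particles[i]);
        if (d == DIR_SELF) { ++i; continue; }   // stays in this block
        out[d].push_back(particles[i]);         // queue for neighbor d
        particles[i] = particles.back();        // remove_particle-style swap-erase
        particles.pop_back();                   // (do not advance i: a new
    }                                           //  particle now sits at slot i)
    // The MPI phase would then run one message pair per neighbor:
    //   1. exchange the buffer size (an int) so the receiver can allocate, then
    //   2. exchange the packed particles as MPI_BYTE or a derived datatype
    //      (e.g. via MPI_Sendrecv to pair sends and receives deadlock-free).
    // Particles received from each neighbor are appended via add_particle().
    return out;
}
```

The swap-erase mirrors the remove-by-index semantics hinted at by remove_particle(int); the important property is that each migrant lands in exactly one outgoing buffer.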
• MPISimulationBlock::communicate_ghosts()
Any time a particle from SimulationBlock::all_particles is within a certain distance of the current block’s
edge, it must be communicated to the appropriate adjacent or corner-touching SimulationBlock and placed in
that block’s SimulationBlock::all_ghosts array. SimulationBlock::N_ghosts must be set to the total number
of ghost particles received in this iteration. Communicating ghost particles is very similar to exchanging particles,
except that a single particle may be sent as a ghost to several adjacent blocks, and particles that are sent are
not removed from the local all_particles. You can determine which direction(s) a particle needs to be sent
by calling check_ghost_direction(particle). Because, unlike in exchange_particles(), a particle may be sent
to multiple neighbors, you should use the provided DIR_HAS macro:
e.g. DIR_HAS(SimulationBlock::DIR_N, direction) will return true if north is one of the directions that the
particle needs to be sent to.
You should use MPI communication calls to send and receive the ghosts, similar to exchange_particles().
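The DIR_HAS interface suggests that directions are combined into a single value, e.g. as bit flags. The sketch below assumes such a bit-flag encoding to show how one particle can be flagged for several neighbors at once; the real constants, DIR_HAS, and check_ghost_direction() live in common/simblock.h, and ghost_dirs() here is only a simplified stand-in (unit-square block, single cutoff distance).

```cpp
// Hypothetical bit-flag encoding; the real constants are in common/simblock.h.
enum : unsigned {
    DIR_N  = 1u << 0, DIR_S  = 1u << 1, DIR_E  = 1u << 2, DIR_W  = 1u << 3,
    DIR_NE = 1u << 4, DIR_NW = 1u << 5, DIR_SE = 1u << 6, DIR_SW = 1u << 7,
};
#define DIR_HAS(dir, mask) (((mask) & (dir)) != 0)

struct Particle { double x, y; };

// Stand-in for check_ghost_direction(): flag every edge the particle is
// within `cutoff` of, assuming this block covers the unit square.
unsigned ghost_dirs(const Particle& p, double cutoff) {
    unsigned mask = 0;
    if (p.y > 1.0 - cutoff) mask |= DIR_N;
    if (p.y < cutoff)       mask |= DIR_S;
    if (p.x > 1.0 - cutoff) mask |= DIR_E;
    if (p.x < cutoff)       mask |= DIR_W;
    // A particle near a corner touches two edges, so the diagonal
    // (corner-touching) neighbor must receive a copy as well.
    if (DIR_HAS(DIR_N, mask) && DIR_HAS(DIR_E, mask)) mask |= DIR_NE;
    if (DIR_HAS(DIR_N, mask) && DIR_HAS(DIR_W, mask)) mask |= DIR_NW;
    if (DIR_HAS(DIR_S, mask) && DIR_HAS(DIR_E, mask)) mask |= DIR_SE;
    if (DIR_HAS(DIR_S, mask) && DIR_HAS(DIR_W, mask)) mask |= DIR_SW;
    return mask;
}
```

In communicate_ghosts() you would copy (not move) the particle into the send buffer of every flagged neighbor, exchange the buffers exactly as in the particle exchange, and append everything received to all_ghosts while updating N_ghosts.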
• MPISimulationBlock::init_communication()
• MPISimulationBlock::finalize_communication()
These functions will be called by the main program before and after the simulation runs. You may use them to do
whatever setup and teardown you wish, but do not call MPI_Init() or MPI_Finalize(); our main program does that.
Mostly this is for setting up/tearing down whatever variables you need for communication. Try not to
leak any memory; we may test your code under valgrind or another memory debugger.
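init_communication() is a natural place to precompute which rank owns each of the eight neighboring blocks. A minimal sketch, assuming ranks are laid out row-major over a rows × cols grid of blocks; the actual decomposition is defined by the assignment’s common code, so treat neighbor_rank() and its direction convention as illustrative only.

```cpp
// Row-major 2D layout assumption: rank = row * cols + col.
// Returns -1 when the neighbor falls off the grid (a boundary block).
int neighbor_rank(int rank, int rows, int cols, int drow, int dcol) {
    int row = rank / cols + drow;   // move drow block-rows
    int col = rank % cols + dcol;   // move dcol block-columns
    if (row < 0 || row >= rows || col < 0 || col >= cols) return -1;
    return row * cols + col;
}
```

init_communication() could fill an eight-entry neighbor table from this (using MPI_PROC_NULL instead of -1 makes boundary sends/receives no-ops in MPI), and finalize_communication() would free any buffers allocated alongside it.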
1.2 Compilation and Testing
1. Create a new build directory.
2. In build, run cmake .. (note the two dots, pointing at the project root). This will inspect the system configuration and generate a Makefile.
3. Run make. This will compile binaries for all parts of the assignment (bin/part1, bin/part2, bin/part3).
This compilation process is identical for all parts of the assignment.
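The steps above, as a shell transcript (assuming the usual out-of-source CMake layout, run from the repository root):

```shell
mkdir -p build && cd build   # step 1: fresh build directory
cmake ..                     # step 2: configure; generates a Makefile
make                         # step 3: builds bin/part1, bin/part2, bin/part3
```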
To test whether your MPI program works, run bin/part1 -N 100 -i 1. This will run the simulation for 100 iterations,
printing the iteration number at every iteration.
2 Part II: AMPI
2.1 Implementation
For this section, you do not need to modify/add any code, but please do read the code in part1/main.cpp that is wrapped
with #ifdef AMPI. These code blocks demonstrate how load balancing can be invoked with AMPI, and will be included
in the AMPI version of the program.
You do need to benchmark this code, as explained in the last section.
2.2 Testing
Run bin/part2 +vp 4 -N 100 -i 1 +balancer GreedyRefine +isomalloc_sync. This will run the AMPI program
with 4 virtual ranks and the GreedyRefine load-balancing strategy.
3 Benchmarking
We have provided you with a batch script that runs both parts of the assignment (MPI and AMPI; Charm++ is
excluded this term). As always, you should run this on the Campus Cluster. It will vary the number of utilized CPU
cores from 1 to 36, running on at most 2 physical nodes (each node has 20 CPU cores) with a fixed decomposition. The
results will be stored in writeup/benchmark_*.txt, where * is one of mpi and ampi. You should plot these results
and evaluate the performance in your writeup. More specifically, explain how the performance of each version of the
simulation code scales with the number of CPU cores.
4 Submission
Your writeup must be a single PDF file that covers the tasks outlined above.
Please save the file as ./writeup/mp3.pdf, commit it to git, and push it to the main branch of your
turnin repo before the submission deadline.