Computing clusters
Version 1.01
Updates to the project, including any corrections and clarifications, will be posted on the
course website. Make sure that you check the course website regularly for updates.
Change log
• Version 1.01 (27 March 2024). There was a mistake in the denominators of the two probability
density functions in Section 5.1.1. For g0(t), it should be t raised to the power of η0+1; the
+1 was missing. A similar error appeared in g1(t), where it should be t raised to the power of
η1+1.
• Version 1.00. Issued on 19 March 2024.
1 Introduction and learning objectives
You have learnt in Week 4A’s lecture that a high variability of inter-arrival times or service times
can cause a high response time. Measurements from real computer clusters have found that the
service times in these clusters have very high variability [1]. The reference paper [1] also has a
number of suggestions to deal with this issue. One suggestion is to separate the jobs according
to their service time requirements, and have one set of servers processing jobs with short service
times and another set of servers for jobs with long service times. This arrangement is the same
as supermarkets having express checkouts for customers buying not more than a certain number
of items and other checkouts that do not have a limit on the number of items. You saw this
theory in action in Week 4A’s revision Problem 1. We also highly recommend that you read the
paper [1].
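To make the link between variability and response time concrete, here is a minimal sketch based on the Pollaczek-Khinchine formula for a single-server M/G/1 queue. The M/G/1 model, the function name mg1_mean_response_time and the parameter values are assumptions chosen for illustration only. The system in this project has multiple servers, so the formula does not apply to it directly.

```python
def mg1_mean_response_time(arrival_rate, mean_service, scv_service):
    """Mean response time of an M/G/1 first-come-first-served queue.

    Uses the Pollaczek-Khinchine formula. scv_service is the squared
    coefficient of variation of the service time (variance / mean**2).
    """
    rho = arrival_rate * mean_service  # server utilisation
    if not 0 <= rho < 1:
        raise ValueError("utilisation must be in [0, 1) for a stable queue")
    # Mean waiting time in the queue grows linearly with the variability term.
    mean_wait = rho * mean_service * (1 + scv_service) / (2 * (1 - rho))
    return mean_service + mean_wait


# Same utilisation (0.8) and mean service time, increasing variability.
for scv in (0.0, 1.0, 10.0):
    print(scv, mg1_mean_response_time(0.8, 1.0, scv))
```

With these illustrative values, the mean response time rises from about 3 (deterministic service) to about 5 (exponential service) and about 23 (highly variable service), even though the load on the server is unchanged.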
In this project, you will use simulation to study how to reduce the response time of a server
farm that uses different servers to process jobs with different service time requirements.
In this project, you will learn:
1. To use discrete event simulation to simulate a computer system
2. To use simulation to solve a design problem
3. To use statistically sound methods to analyse simulation outputs
We mentioned a number of times in the lectures that simulation is not simply about writing
simulation programs. While it is important to get your simulation code correct, it is also important
that you use statistically sound methods to analyse simulation outputs. Therefore, roughly half of
the marks for this project are allocated to the simulation program, and the other half to statistical
analysis; see Section 7.2.
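As an example of what a statistically sound analysis can look like, the sketch below computes a confidence interval for the mean response time from several independent replications using Student's t distribution. This is one common approach, not necessarily the method required by this project. The function name confidence_interval and the replication values are hypothetical, and the critical value 2.776 is the 97.5% quantile of the t distribution with 4 degrees of freedom, which suits 5 replications only.

```python
import math
import statistics


def confidence_interval(replication_means, t_critical):
    """Two-sided confidence interval for the mean of independent replications.

    t_critical is the Student's t quantile for len(replication_means) - 1
    degrees of freedom at the desired confidence level.
    """
    n = len(replication_means)
    if n < 2:
        raise ValueError("need at least two replications")
    mean = statistics.mean(replication_means)
    std = statistics.stdev(replication_means)  # sample standard deviation (n - 1)
    half_width = t_critical * std / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical mean response times from 5 independent replications.
sample = [2.1, 1.9, 2.4, 2.0, 2.2]
T_975_DF4 = 2.776  # 97.5% quantile of t with 4 degrees of freedom
low, high = confidence_interval(sample, T_975_DF4)
print(f"95% confidence interval: [{low:.3f}, {high:.3f}]")
```

A narrower interval needs more replications or longer runs. When comparing two designs, computing the interval on the paired differences between their replication means is a common choice.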
[Figure 1 diagram: new jobs submitted by users arrive at the dispatcher, which feeds Queue 0 (served by Group 0, Server 0 to Server n0 - 1) and Queue 1 (served by Group 1, Server n0 to Server n - 1). Jobs that have completed their processing depart the system permanently. Jobs killed by servers in Group 0 are sent back to the dispatcher.]
Figure 1: The multi-server system for this project.
2 Support provided and computing resources
If you have problems doing this project, you can post your question on the course forum. We
strongly encourage you to do this, as asking questions and trying to answer them is a
great way to learn. Do not be afraid that your question may appear silly; other
students may very well have the same question! Please note that if your forum post
shows part of your solution or code, you must mark that forum post as private.
Another way to get help is to attend a consultation (see the Timetable section of the course
website for dates and times).
If you need computing resources to run your simulation program, you can use the VLAB
remote computing facility provided by the School. Information on VLAB is available here:
https://taggi.cse.unsw.edu.au/Vlab/