cs3121
Module Learning Outcomes 2
Identify the key features of an algorithm
Solve problems by creatively applying standard data structures
and searching and sorting algorithms
Communicate algorithmic ideas at different abstraction levels
Evaluate the efficiency of algorithms and justify their
correctness
Apply the LaTeX typesetting system to produce high-quality
technical documents
Table of Contents 3
1. Time Complexity
2. Revision of Data Structures and Algorithms
3. Proofs
4. Puzzle
Fast and slow algorithms 4
You should be familiar with true-ish statements such as:
Heap sort is faster than bubble sort.
Linear search is slower than binary search.
We would like to make such statements more precise.
We also want to understand when they are wrong, and why it
matters.
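As a rough illustration (a sketch of our own, not part of the course material; the function names are ours), one can count the comparisons each search makes on the same sorted input:

def linear_search_comparisons(a, target):
    # Scan left to right, counting equality tests until the target is found.
    comparisons = 0
    for x in a:
        comparisons += 1
        if x == target:
            break
    return comparisons

def binary_search_comparisons(a, target):
    # Repeatedly halve the search range, counting one comparison per step.
    lo, hi, comparisons = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if a[mid] == target:
            break
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return comparisons

a = list(range(1_000_000))
print(linear_search_comparisons(a, 999_999))   # 1,000,000 comparisons
print(binary_search_comparisons(a, 999_999))   # about 20 comparisons

On this input binary search is dramatically faster, but the next slides explain why such statements still need care.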
Best and worst case performance 5
The statements on the previous slide are most commonly
made in comparing the worst case performance of these
algorithms.
They are actually false in the best case!
In most problems in this course, we are concerned with worst
case performance, so as to be robust to maliciously created
(or simply unlucky) instances.
We will hardly ever discuss the best case.
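For instance, the common early-exit variant of bubble sort (sketched below; this particular variant is an assumption, not the only possible implementation) sorts an already-sorted array in a single linear pass, while heap sort still does about n log n work, so "heap sort is faster than bubble sort" fails in the best case.

def bubble_sort(a):
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:
            # Best case: the input was already sorted, so we stop after
            # one pass of n - 1 comparisons, i.e. linear time.
            break
    return a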
Average case performance 6
In some circumstances, one might accept occasional poor
performance as a trade-off for good average performance over
all possible inputs (or a large sample of them).
Analysing the average case or expected performance of an
algorithm requires probabilistic methods and is beyond the
scope of this course, except where we rely purely on results of
this type from prior courses (e.g. hash table operations,
quicksort).
Rates of growth 7
We need a way to compare two functions, in this case
representing the worst case runtime of each algorithm.
A natural starting point is to compare function values directly.
Question
If f (100) > g(100), then does f represent a greater running time,
i.e. a slower algorithm?
Answer
Not necessarily! n = 100 might be an outlier, or too small to
appreciate the efficiencies of algorithm f . We care more about
which algorithm scales better.
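A toy illustration (the running-time functions here are invented): take f(n) = 200n and g(n) = n². Then f(100) = 20,000 > 10,000 = g(100), yet g overtakes f for every n > 200 and is soon vastly larger.

f = lambda n: 200 * n   # hypothetical linear running time
g = lambda n: n ** 2    # hypothetical quadratic running time

for n in [100, 200, 1_000, 10_000]:
    print(n, f(n), g(n))
# n = 100:     f = 20,000     g = 10,000       (f looks slower)
# n = 200:     f = 40,000     g = 40,000       (break-even)
# n = 1,000:   f = 200,000    g = 1,000,000
# n = 10,000:  f = 2,000,000  g = 100,000,000  (f scales far better)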
Rates of growth 8
We prefer to talk in terms of asymptotics, i.e. long-run
behaviour.
For example, if the size of the input doubles, the function
value could (approximately) double, quadruple, etc.
A function which quadruples will eventually exceed a function
which doubles, regardless of the values for small inputs.
We’d like to categorise functions (i.e. runtimes, and therefore
algorithms) by their asymptotic rate of growth.
We’ll primarily talk about the worst-case performance of
algorithms, but the same method of analysis could be applied
to the best case or average case performance of an algorithm.
Big-Oh notation 9
Definition
We say f (n) = O(g(n)) if for large enough n, f (n) is at most a
constant multiple of g(n).
g(n) is said to be an asymptotic upper bound for f (n).
This means that the rate of growth of function f is no greater
than that of function g .
An algorithm whose running time is f (n) scales at least as
well as one whose running time is g(n).
It’s true-ish to say that the former algorithm is ‘at least as
fast as’ the latter.
Big-Oh notation 10
Useful to (over-)estimate the complexity of a particular
algorithm.
For uncomplicated functions, such as those that typically arise
as the running time of an algorithm, we usually only have to
consider the dominant term.
Big-Oh notation: Examples 11
Example 1
Let f (n) = 100n. Then f (n) = O(n), because f (n) is at most 100
times n for large n.
Example 2
Let f (n) = 2n + 7. Then f (n) = O(n²), because f (n) is at most 1
times n² for large n. Note that f (n) = O(n) is also true in this
case.
Example 3
Let f (n) = 0.001n³. Then f (n) ≠ O(n²), because for any constant
multiple of n², f (n) will eventually exceed it.
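To see why no constant works in Example 3 (a one-line calculation of our own): for any constant c > 0,

0.001 n³ > c · n²   if and only if   n > 1000c,

so beyond n = 1000c the function f (n) exceeds c · n², whichever constant c was chosen.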
Big-Oh notation: Examples 12
Recall from prior courses that:
inserting into a binary search tree takes O(h) time in the
worst case, where h is the height of the tree, and
insertion sort runs in O(n²) time in the worst case, but O(n)
in the best case.
Note that these statements are true regardless of:
the details of how the algorithm is implemented, and
the hardware used to execute the algorithm.
Both of these issues change the actual running time, but the
dominant term of the running time will still be a constant times h,
n² or n respectively.
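As a reminder, here is one standard implementation sketch of insertion sort (it may differ in details from the version seen in prior courses); both bounds are visible directly in the code.

def insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift elements larger than key one position to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

# Worst case (reverse-sorted input): the inner loop runs i times for each i,
# roughly n²/2 steps in total, i.e. a constant times n².
# Best case (already-sorted input): the inner loop exits immediately every
# time, so the total work is a constant times n.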
Big-Oh notation 13
Question
The definition states that we only care about the function values
for “large enough” n. How large is “large enough”?
Put differently, how small is “small enough” that it can be safely
ignored in assessing whether the definition is satisfied?
Answer
It doesn’t matter how f (1) compares to g(1), or even how
f (1,000,000) compares to g(1,000,000). We only care that f (n)
is bounded above by a multiple of g(n) eventually.
Big-Oh notation 14
Disclaimer
Of course, how f (1,000,000) compares to g(1,000,000) is also
important.
When choosing algorithms for a particular application with inputs
of size at most 1,000,000, the behaviour beyond this size doesn't
matter at all!
Asymptotic analysis is one tool by which to compare algorithms. It
is not the only consideration; the actual input sizes used and the
constant factors hidden by the O(·) should also be taken into
account.
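For example (the constants below are invented purely for illustration), a running time of 100 n log₂ n is asymptotically better than 2n², but for inputs of a few hundred elements the "worse" quadratic algorithm actually does less work.

import math

t1 = lambda n: 100 * n * math.log2(n)   # O(n log n), but with a large constant
t2 = lambda n: 2 * n * n                # O(n²), with a small constant

for n in [10, 100, 400, 1_000, 100_000]:
    print(n, round(t1(n)), t2(n))
# For n up to a few hundred, t2 < t1: the quadratic algorithm is cheaper.
# From roughly n ≈ 450 onwards, the n log n algorithm wins by a growing margin.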
Big-Omega notation 15
Definition
We say f (n) = Ω(g(n)) if for large enough n, f (n) is at least a
constant multiple of g(n).
g(n) is said to be an asymptotic lower bound for f (n).
This means that the rate of growth of function f is no less
than that of function g .
An algorithm whose running time is f (n) scales at least as
badly as one whose running time is g(n).
It’s true-ish to say that the former algorithm is ‘no faster
than’ the latter.
Big-Omega notation 16
Useful to (under-)estimate the complexity of any algorithm
solving a particular problem.
For example, finding the maximum element of an unsorted
array takes Ω(n) time, because you must consider every
element.
Once again, for every function that we care about, only the
dominant term will be relevant.
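The sketch below (our own code, not from the slides) makes the lower-bound argument concrete: any element the algorithm never examines could have been the maximum, so every correct algorithm must look at all n elements.

def find_max(a):
    # Assumes a is non-empty.
    best = a[0]
    for x in a[1:]:
        # Every element must be examined: if some element were skipped,
        # it might have been the maximum, and the answer could be wrong.
        if x > best:
            best = x
    return best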
Challenge
We didn’t present the formal definitions of O(·) and Ω(·), but they
can be found in either textbook.
Using these formal definitions, prove that if f (n) = O(g(n)) then
g(n) = Ω(f (n)).
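For reference, the standard definitions run roughly as follows (our phrasing; consult either textbook for the precise statements):

f (n) = O(g(n)) means there exist constants c > 0 and n₀ such that f (n) ≤ c · g(n) for all n ≥ n₀.
f (n) = Ω(g(n)) means there exist constants c > 0 and n₀ such that f (n) ≥ c · g(n) for all n ≥ n₀.

Sketch: if f (n) ≤ c · g(n) for all n ≥ n₀, then g(n) ≥ (1/c) · f (n) for all n ≥ n₀, so g(n) = Ω(f (n)) with constant 1/c and the same threshold n₀.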
Big-Theta notation 17
Definition
We say f (n) = Θ(g(n)) if
f (n) = O(g(n)) and f (n) = Ω(g(n)).
f (n) and g(n) are said to have the same asymptotic growth
rate.
An algorithm whose running time is f (n) scales as well as one
whose running time is g(n).
It’s true-ish to say that the former algorithm is ‘as fast as’ the
latter.
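A concrete example (ours, not from the slides): let f (n) = 3n² + 5n. Then

n² ≤ 3n² + 5n ≤ 8n²   for all n ≥ 1,

so f (n) = Ω(n²) and f (n) = O(n²), and hence f (n) = Θ(n²).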
Big-Theta notation 18
Question
You’ve previously seen statements such as
bubble sort runs in O(n²) time in the worst case.
Should these statements be written using Θ(·) instead of O(·)?
Answer
They can, but they don’t have to be. The statements
bubble sort runs in O(n²) time in the worst case
and
bubble sort runs in Θ(n²) time in the worst case
are both true: they claim that the worst case running time is at
most quadratic and exactly quadratic respectively.
Big-Theta notation 19
Answer (continued)
The Θ(·) statement conveys slightly more information than the
O(·) statement. However in most situations we just want to be
sure that the running time hasn’t been underestimated, so O(·) is
the important part.
Sum property 20
Fact
If f1 = O(g1) and f2 = O(g2), then f1 + f2 = O(g1 + g2).
The last term O(g1 + g2) is often rewritten as
O(max(g1, g2)), since g1 + g2 ≤ 2 max(g1, g2).
The same property applies if O is replaced by Ω or Θ.
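To see why the fact holds (a short derivation of our own, not from the slides): if f1(n) ≤ c1 · g1(n) and f2(n) ≤ c2 · g2(n) for all large enough n, then

f1(n) + f2(n) ≤ c1 · g1(n) + c2 · g2(n) ≤ max(c1, c2) · (g1(n) + g2(n)) ≤ 2 max(c1, c2) · max(g1(n), g2(n)),

so f1 + f2 = O(g1 + g2) = O(max(g1, g2)).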
Sum property 21
This property justifies ignoring non-dominant terms: if f2 grows
no faster than f1 (i.e. f2 = O(f1)), then any bound on f1 also
applies to f1 + f2.
For example, if f2 is linear but f1 is quadratic, then f1 + f2 is
also quadratic.
This is useful for analysing algorithms that have two or more
stages executed sequentially.
If f1 is the running time of the first stage and f2 of the second
stage, then we can bound each stage and add the bounds, or
simply take the most ‘expensive’ stage.
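A small example of this pattern (the problem and code are ours, chosen only to illustrate the point): to find the largest gap between consecutive values of an array, first sort, then scan.

def largest_gap(a):
    # Assumes a has at least two elements.
    b = sorted(a)                        # stage 1: sorting, O(n log n)
    return max(b[i + 1] - b[i]           # stage 2: one linear scan, O(n)
               for i in range(len(b) - 1))

# Total: O(n log n) + O(n) = O(n log n), i.e. the bound of the most
# expensive stage dominates.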