DS4023 Machine Learning
SVM Exercise 1
Q1. What linear function is used by an SVM for classification? How is an input vector $\mathbf{x}_i$ (an instance) assigned to the positive or negative class?
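For concreteness, here is a minimal sketch of how such a decision rule can be evaluated; the weight vector, bias, and test point below are made-up values, not part of the exercise.

```python
import numpy as np

# Hypothetical, already-trained parameters of a linear SVM (toy values).
w = np.array([2.0, -1.0])   # weight vector
b = -0.5                    # bias term

def svm_predict(x):
    """Classify x using the linear function f(x) = w^T x + b."""
    score = np.dot(w, x) + b
    return +1 if score >= 0 else -1   # the sign of the score picks the class

print(svm_predict(np.array([1.0, 0.5])))   # -> 1, since the score is 1.0
```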
Q2. If the training examples are linearly separable, how many decision
boundaries can separate positive from negative data points? Which
decision boundary does the SVM algorithm calculate? Why?
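As an illustration of why the choice of boundary matters, the sketch below compares the geometric margins of two separating hyperplanes on a toy data set; the points and candidate hyperplanes are assumptions made for this example only.

```python
import numpy as np

# Toy, linearly separable data: two positive and two negative points.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([+1, +1, -1, -1])

def geometric_margin(w, b):
    """Smallest signed distance from any training point to the hyperplane w^T x + b = 0."""
    return np.min(y * (X @ w + b) / np.linalg.norm(w))

# Both candidate hyperplanes separate the data, but with different margins.
print(geometric_margin(np.array([1.0, 1.0]), 0.0))   # ~2.83
print(geometric_margin(np.array([1.0, 0.0]), 0.0))   # 2.0
```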
Q3. Use the Lagrange multiplier method to answer the following questions.
3.1 Consider the entropy definition:
– If we are given a probability distribution $P = (p_1, p_2, \dots, p_n)$, then the information conveyed by this distribution, also called the entropy of $P$, is
\[
I(P) = -\,(p_1 \log p_1 + p_2 \log p_2 + \dots + p_n \log p_n)
\]
(the base of the logarithm is 2).
What is the range of the entropy of $P$, and which distribution gives the maximum entropy? Show the details of your answer.
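As a sketch of one possible setup (assuming the single constraint that the probabilities sum to 1, with $p_i \ge 0$ left implicit and $\lambda$ denoting the multiplier):
\[
\max_{p_1,\dots,p_n} \ -\sum_{i=1}^{n} p_i \log_2 p_i
\quad \text{s.t.} \quad \sum_{i=1}^{n} p_i = 1,
\qquad
\mathcal{L}(p_1,\dots,p_n,\lambda)
 = -\sum_{i=1}^{n} p_i \log_2 p_i + \lambda\Bigl(\sum_{i=1}^{n} p_i - 1\Bigr).
\]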
3.2 Given the probability distribution $P = (p_1, p_2, \dots, p_n)$, the Gini index is another way to measure the uncertainty:
\[
Gini(P) = 1 - \sum_{i=1}^{n} p_i^2 .
\]
What is the range of the Gini index, and which distribution gives the maximum Gini index value? Show the details of your answer.
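The same style of setup can be sketched for the Gini index (again assuming only the sum-to-one constraint and a multiplier $\lambda$):
\[
\mathcal{L}(p_1,\dots,p_n,\lambda)
 = 1 - \sum_{i=1}^{n} p_i^2 + \lambda\Bigl(\sum_{i=1}^{n} p_i - 1\Bigr).
\]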
Q4. Given the SVM optimization problem:
\[
\min_{\mathbf{w},\, b} \ \frac{1}{2}\|\mathbf{w}\|^2
\quad \text{s.t.} \quad y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1, \quad i = 1, 2, \dots, m
\]
Derive the dual optimization problem and show the detailed steps.
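A common starting point for this derivation (a sketch only; the multipliers $\alpha_i \ge 0$, one per constraint, are the standard assumption) is the Lagrangian of the primal problem:
\[
L(\mathbf{w}, b, \boldsymbol{\alpha})
 = \frac{1}{2}\|\mathbf{w}\|^2
   - \sum_{i=1}^{m} \alpha_i \bigl( y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \bigr),
\qquad \alpha_i \ge 0 .
\]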
Q5. For the dual problem, if there exists an $\alpha_j^*$ such that $\alpha_j^* > 0$, then the solution to the primal problem is
\[
\mathbf{w}^* = \sum_{i=1}^{m} \alpha_i^* y_i \mathbf{x}_i,
\qquad
b^* = y_j - \sum_{i=1}^{m} \alpha_i^* y_i \mathbf{x}_i^T \mathbf{x}_j .
\]
Show the detailed steps for deriving this solution given $\boldsymbol{\alpha}^*$.
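A sketch of the condition typically invoked here (the KKT complementary-slackness condition, assuming the hard-margin setup of Q4):
\[
\alpha_j^* \bigl( y_j(\mathbf{w}^{*T} \mathbf{x}_j + b^*) - 1 \bigr) = 0
\quad\Longrightarrow\quad
y_j(\mathbf{w}^{*T} \mathbf{x}_j + b^*) = 1 \ \text{ whenever } \alpha_j^* > 0 .
\]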