Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
EECE5644 Assignment
Please submit in Canvas a single PDF file showing all results, and include code as appendix
or as an external repository link in the PDF file. Cite all appropriate sources you benefit from.
Directly exchanging writing/code/results between classmates are not acceptable, but you may talk
to each other in classes, office periods, benefit from each others’ questions and answers for those.
Question 1 (60%)
In this exercise, you will train many multilayer perceptrons (MLP) to approximate the class
label posteriors, using maximum likelihood parameter estimation (equivalently, with minimum
average cross-entropy loss) to train the MLP. Then, you will use the trained models to approximate
a MAP classification rule in an attempt to achieve minimum probability of error (i.e. to minimize
expected loss with 0-1 loss assignments to correct-incorrect decisions).
Data Distribution: ForC= 4 classes with uniform priors, specify Gaussian class-conditional
pdfs for a 3-dimensional real-valued random vector x (pick your own mean vectors and covariance
matrices for each class). Try to adjust the parameters of the data distribution so that the MAP
classifier that uses the true data pdf achieves between 10%−20% probability of error.
MLP Structure: Use a 2-layer MLP (one hidden layer of perceptrons) that has P perceptrons
in the first (hidden) layer with smooth-ramp style activation functions (e.g., ISRU, Smooth-ReLU,
ELU, etc). At the second/output layer use a softmax function to ensure all outputs are positive
and add up to 1. The best number of perceptrons for your custom problem will be selected using
cross-validation.
Generate Data: Using your specified data distribution, generate multiple datasets: Training
datasets with 100,200,500,1000,2000,5000 samples and a test dataset with 100000 samples. You
will use the test dataset only for performance evaluation.
Theoretically Optimal Classifier: Using the knowledge of your true data pdf, construct the
minimum-probability-of-error classification rule, apply it on the test dataset, and empirically esti-
mate the probability of error for this theoretically optimal classifier. This provides the aspirational
performance level for the MLP classfier.
Model Order Selection: For each of the training sets with different number of samples,
perform 10-fold cross-validation, using minimum classification error probability as the objective
function, to select the best number of perceptrons (that is justified by available training data).
Model Training: For each training set, having identified the best number of perceptrons using
cross-validation, using maximum likelihood parameter estimation (minimum cross-entropy loss)
train an MLP using each training set with as many perceptrons as you have identified as optimal
for that training set. These are your final trained MLP models for class posteriors (possibly each
with different number of perceptrons and different weights). Make sure to mitigate the chances
of getting stuck at a local optimum by randomly reinitializing each MLP training routine multiple
times and getting the highest training-data log-likelihood solution you encounter.
Performance Assessment: Using each trained MLP as a model for class posteriors, and using
the MAP decision rule (aiming to minimize the probability of error) classify the samples in the test
set and for each trained MLP empirically estimate the probability of error.
Report Process and Results: Describe your process of developing the solution; numerically
and visually report the test set empirical probability of error estimates for the theoretically opti-
mal and multiple trained MLP classifiers. For instance show a plot of the empirically estimated
1
test P(error) for each trained MLP versus number of training samples used in optimizing it (with
semilog-x axis), as well as a horizontal line that runs across the plot indicating the empirically
estimated test P(error) for the theoretically optimal classifier.
Note: You may use software packages for all aspects of your implementation. Make sure you
use tools correctly. Explain in your report how you ensured the software tools do exactly what you
need them to do.
Question 2 (40%)
Conduct the following model order selection exercise using 10-fold cross-validation procedure
and report your procedure and results in a comprehensive, convincing, and rigorous fashion:
1. Select a Gaussian Mixture Model as the true probability density function for 2-dimensional
real-valued data synthesis. This GMM will have 4 components with different mean vectors,
different covariance matrices, and different probability for each Gaussian to be selected as
the generator for each sample. Specify the true GMM that generates data.
2. Generate multiple data sets with independent identically distributed samples using this true
GMM; these datasets will have respectively 10, 100, 1000, 10000 samples.
3. For each data set, using maximum likelihood parameter estimation principle (e.g. with the
EM algorithm), within the framework of K-fold (e.g., 10-fold) cross-validation, evaluate
GMMs with different model orders; specifically evaluate candidate GMMs with 1, 2, 3,
4, 5, 6 Gaussian components. Note that both model parameter estimation and validation
performance measures to be used is log-likelihood of data.
4. Repeat the experiment multiple times (e.g., at least 30 times) and report your results, indi-
cating the rate at which each of the the six GMM orders get selected for each of the datasets
you produced. Develop a good way to describe and summarize your experiment results in
the form of tables/figures.