CSE514 Programming Assignment 1
This assignment is meant to enhance your understanding of objective functions, regression models, and the gradient descent algorithm for optimization. It consists of a programming assignment (with optional extensions for bonus points) and a report. This project is individual work; please do not share code, but you may post bug questions to Piazza for help.
Topic
Design and implement a gradient descent algorithm (or algorithms) for regression.
Programming work
A) Data pre-processing
Pre-process the attribute values of your data by normalizing or standardizing each variable. Make sure to keep a copy that was not pre-processed, so you can analyze the effect that pre-processing the data has on the optimization.
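As a starting point, the two standard pre-processing options could be sketched as follows. This is a minimal NumPy sketch under my own naming; the assignment only requires that you apply one of the two per variable and keep an unprocessed copy.

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each column of X to the [0, 1] range."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

def standardize(X):
    """Shift each column of X to zero mean and unit variance (z-scores)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)
```

Either transform is fit column-by-column; remember to compute the scaling constants on the training split only and reuse them on the test split.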
B) Univariate linear regression
In lecture, we discussed univariate linear regression y = f(x) = mx+b, where there is only a single independent variable x, using MSE as the loss function.
Your program must specify the objective function of mean squared error and be able to apply the gradient descent algorithm for optimizing a univariate linear regression model.
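A minimal sketch of what such an optimizer could look like, assuming a fixed learning rate and iteration count (both hyperparameters you will need to choose yourself):

```python
import numpy as np

def univariate_gd(x, y, lr=0.01, n_iters=1000):
    """Fit y ≈ m*x + b by gradient descent on MSE.

    MSE      = (1/n) * sum((m*x_i + b - y_i)^2)
    dMSE/dm  = (2/n) * sum((m*x_i + b - y_i) * x_i)
    dMSE/db  = (2/n) * sum( m*x_i + b - y_i)
    """
    m, b = 0.0, 0.0
    n = len(x)
    for _ in range(n_iters):
        err = m * x + b - y                      # residuals
        m -= lr * (2.0 / n) * (err * x).sum()    # step along -gradient
        b -= lr * (2.0 / n) * err.sum()
    return m, b
```

Note that the update to m and b uses residuals from the same iteration; updating b with already-updated residuals would be a subtle bug.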
C) Multivariate linear regression
In practice, we typically have multi-dimensional (i.e., multivariate) data: the input x is a vector of features with length p. Assigning a parameter to each of these features, plus the b parameter, results in p+1 model parameters. Multivariate linear models can be succinctly represented as:
y = f(x) = (m · x) (i.e., dot product between m and x),
where m = (m0, m1, ..., mp)T and x = (1, x1, ..., xp)T, with m0 in place of b in the model.
Your program must be able to apply the gradient descent algorithm for optimizing a multivariate linear regression model using the mean squared error objective function.
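The multivariate case is the univariate sketch above written in matrix form. A minimal version, assuming the same fixed learning rate and iteration count, with a column of ones prepended so that m[0] plays the role of b:

```python
import numpy as np

def multivariate_gd(X, y, lr=0.01, n_iters=1000):
    """Fit y ≈ X_aug @ m by gradient descent on MSE.

    With x = (1, x1, ..., xp), the gradient of MSE w.r.t. m is
        (2/n) * X_aug.T @ (X_aug @ m - y).
    """
    n = X.shape[0]
    X_aug = np.hstack([np.ones((n, 1)), X])  # prepend the constant feature
    m = np.zeros(X_aug.shape[1])
    for _ in range(n_iters):
        err = X_aug @ m - y
        m -= lr * (2.0 / n) * (X_aug.T @ err)
    return m
```

The matrix-math calls (`hstack`, `@`) are exactly the kind of "basic functions" the assignment permits; the descent loop itself is still your own.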
D) Optional extension 1 – Mean Absolute Error as the loss function
For bonus points, include the option of optimizing for the MAE instead of MSE. Calculating MAE as your error is insufficient! You must define a new gradient calculation to be used for the gradient descent optimization process.
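One way the new gradient could be derived: since d|e|/de = sign(e) (taking 0 at e = 0, where the absolute value is not differentiable), the MAE gradient replaces the residual in the MSE gradient with its sign. A sketch, assuming the same augmented-matrix convention as above:

```python
import numpy as np

def mae_gradient(X_aug, y, m):
    """Gradient of MAE = (1/n) * sum|x_i·m - y_i| with respect to m.

    Uses the subgradient sign(err), which is 0 where err == 0.
    """
    n = X_aug.shape[0]
    err = X_aug @ m - y
    return (1.0 / n) * (X_aug.T @ np.sign(err))
```

This gradient is constant in magnitude per sample, so MAE descent often needs a different learning-rate schedule than MSE descent to converge cleanly.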
E) Optional extension 2 – Ridge Regression
For bonus points, include the option of optimizing an l2 penalty as part of your loss function. Calculating MSE + l2 as your error is insufficient! You must define a new gradient calculation to be used for the gradient descent optimization process. You must tune the λ hyperparameter value for minimizing test error.
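The ridge gradient is the MSE gradient plus the derivative of the penalty term. A sketch, assuming the loss MSE + λ‖m‖² and the common convention of leaving the intercept m[0] unpenalized (check whether your course expects this):

```python
import numpy as np

def ridge_gradient(X_aug, y, m, lam):
    """Gradient of MSE + lam * sum(m_j^2) with respect to m.

    The l2 penalty contributes 2 * lam * m_j to each weight's
    gradient; the intercept m[0] is left unpenalized here.
    """
    n = X_aug.shape[0]
    err = X_aug @ m - y
    grad = (2.0 / n) * (X_aug.T @ err)
    penalty = 2.0 * lam * m
    penalty[0] = 0.0                 # don't shrink the intercept
    return grad + penalty
```

Tuning λ then means running the full descent for several candidate values and comparing test (or validation) error.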
IMPORTANT: Regression is basic, so there are many implementations available, but you MUST implement your method yourself. This means that you cannot use an embedded function for regression or gradient descent from a software package. You may use other basic functions like matrix math, but the gradient descent and regression algorithm must be implemented by yourself.
Data to be used
We will use the Concrete Compressive Strength dataset in the UCI repository at
UCI Machine Learning Repository: Concrete Compressive Strength Data Set
Note that the last column of the dataset is the response variable (i.e., y). There are 1030 instances in this dataset.
Use 900 instances for training and 130 instances for testing, randomly selected. This means that you should learn parameter values for your regression models using the training data, and then use the trained models to predict the testing data’s response values without ever training on the testing dataset.
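The random split could look like the following sketch (a fixed seed is my own addition, useful so your reported results are reproducible):

```python
import numpy as np

def train_test_split(X, y, n_train=900, seed=0):
    """Randomly assign n_train rows to training and the rest to testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))        # shuffle row indices
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]
```

With the 1030 Concrete instances this yields the required 900/130 split, and no test row is ever seen during training.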
What to submit – follow the instructions here to earn full points
(80 pts total + 17 bonus points) The report as a pdf
o Introduction (15 pts + 5 bonus points)
(4 pts) Your description/formulation of the problem (what’s the data and what practical application could there be for your work with it, beyond just “this is my homework” or “I want to optimize this equation”),
(3 pts) a description of how you normalized or standardized your data. Include some figures that illustrate how the distribution of feature values changed because of your pre-processing