Financial Statistics
ST326 Assessed Coursework
1 Questions
This project is on the analysis of a bundle of stocks that are constituents
of the S&P500. You are free to choose 10 stocks from the top 100
constituents by weight. The weights change every day, but as long as the 10
stocks you have chosen were among the top 100 constituents on a particular
day and have been traded over the past 5 years, that is fine.
1. Download the daily closing prices of the 10 stocks and the S&P500
index price for the past 5 years. Do not include two or more stocks from
the same company that differ only in share class. Plot their log-prices
on the same plot.
You can deal with potential missing values using R code similar to
Chapter 3 of your lecture notes, or any other method, but you need
to justify your choice.
If you are downloading data using quantmod package, you may want to
export the data to a text file first using
library(quantmod)
getSymbols('F')
F = as.data.frame(F)
F = cbind(as.numeric(as.Date(rownames(F))), F)
write.table(F, "F.txt", row.names=FALSE)
getSymbols('^GSPC')
GSPC = as.data.frame(GSPC)
GSPC = cbind(as.numeric(as.Date(rownames(GSPC))), GSPC)
write.table(GSPC, "GSPC.txt", row.names=FALSE)
Then the corresponding lines in read.bossa.data inside a for loop
should be changed to
filename <- paste("project/", vec.names[i], ".txt", sep="")
### If you store your .txt files in a folder called "project"
tmp <- scan(filename, list(date=numeric(), NULL, NULL, NULL,
close=numeric(), NULL, NULL), skip=1, sep="")
Then you can read
ind = read.bossa.data(c("F", "GSPC"))
Are there any similar trends or not?
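As a rough illustration of the plotting step, the sketch below draws several log-price series on one set of axes. It is only a sketch under stated assumptions: the prices are simulated, the ticker names are hypothetical, and carrying the last observation forward is just one possible treatment of missing values (not necessarily the one in the lecture notes), which would still need to be justified.

```r
# Sketch only: simulated prices stand in for downloaded closing prices.
set.seed(1)
n.days  <- 250
tickers <- c("AAA", "BBB", "CCC")   # hypothetical ticker names

# Geometric random walks as placeholder closing prices (one column per ticker)
prices <- sapply(tickers, function(s) 100 * exp(cumsum(rnorm(n.days, 0, 0.01))))

# Introduce one missing value, then carry the last observation forward
prices[10, "BBB"] <- NA
fill.forward <- function(x) {
  for (i in seq_along(x)[-1]) if (is.na(x[i])) x[i] <- x[i - 1]
  x
}
prices <- apply(prices, 2, fill.forward)

# Plot all log-price series on the same axes
log.prices <- log(prices)
matplot(log.prices, type = "l", lty = 1, col = seq_along(tickers),
        xlab = "Trading day", ylab = "Log-price")
legend("topleft", legend = tickers, col = seq_along(tickers), lty = 1)
```

With real data, the columns of `prices` would be the closing prices of the 10 chosen stocks and the index, aligned on common dates.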
2. Our aim in this part is to predict the next day S&P500 return using q
lags of S&P500 as well as the most up-to-date returns of the 10 stocks
you have chosen.
Split the data set into 50% training, 25% validation and 25% test sets.
If you are following the steps in part 1, remember to change shift.indices
in pred.footsie.prepare to appropriate values, and make any other changes
needed if you want to use the function in its entirety (remember that we
are using the past 5 years of data only).
For the 11 daily return series, write an R programme to use exponential
smoothing to estimate their daily volatilities over the horizon of the
training data. Individual λ for each series should be estimated by
MLE. In doing so, you should write down the assumed model for each
time series.
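The split and the estimation of λ can be sketched as follows. This is a minimal sketch, not the lecture-notes implementation. It assumes the model r_t = σ_t ε_t with ε_t i.i.d. N(0, 1) and σ²_t = λ σ²_{t-1} + (1 − λ) r²_{t-1}, and it starts the recursion at the sample variance. The function names `ewma.var` and `neg.loglik` are hypothetical, and the returns are simulated.

```r
# Sketch only: simulated returns stand in for one real daily return series.
set.seed(42)
r <- rnorm(1000, mean = 0, sd = 0.01)

# Chronological 50% / 25% / 25% split into training, validation and test sets
n <- length(r)
train.idx <- 1:floor(0.5 * n)
valid.idx <- (floor(0.5 * n) + 1):floor(0.75 * n)
test.idx  <- (floor(0.75 * n) + 1):n

# Exponentially smoothed variance:
# sigma2[t] = lambda * sigma2[t-1] + (1 - lambda) * r[t-1]^2
ewma.var <- function(r, lambda) {
  sigma2 <- numeric(length(r))
  sigma2[1] <- var(r)   # assumed starting value for the recursion
  for (t in 2:length(r)) {
    sigma2[t] <- lambda * sigma2[t - 1] + (1 - lambda) * r[t - 1]^2
  }
  sigma2
}

# Negative Gaussian log-likelihood (additive constants dropped)
neg.loglik <- function(lambda, r) {
  s2 <- ewma.var(r, lambda)
  0.5 * sum(log(s2) + r^2 / s2)
}

# One-dimensional MLE of lambda on the training data only
fit <- optimize(neg.loglik, interval = c(0.01, 0.999), r = r[train.idx])
lambda.hat <- fit$minimum
vol.train  <- sqrt(ewma.var(r[train.idx], lambda.hat))
```

The same steps would be repeated for each of the 11 return series, giving one estimated λ per series.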
3. Write down a modified prediction algorithm similar to the one in
Section 3.4 of your lecture notes (define all notations involved), so that:
i. It takes in a warmup time t0, a window length D, and the
appropriately normalised (using the same λ’s found in part 2) 10+q
return series as input.
ii. It uses ordinary least squares for linear regression over a rolling
window of length D as a way to estimate the next day normalised
return for the S&P500.
iii. The investment strategy is to invest 1 unit of money into S&P500
if the next day return is predicted to be positive, and -1 unit of
money (i.e., short-selling) if the next day return is predicted to be
negative.
iv. The annualised Sharpe ratio is calculated in the end, using daily
true return (i.e., true day-(t + 1) return for your investment at
time t for S&P500), but ignoring all transaction costs.
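One possible way to write items ii to iv in symbols is given below. The notation is an assumption made for illustration and is not taken from the lecture notes: $x_s$ is the vector of the 10+q normalised covariates (plus an intercept) available at time $s$, $y_{s+1}$ is the normalised S&P500 return on day $s+1$, $r_{t+1}$ is the true S&P500 return on day $t+1$, and $T$ is the last day of the data set considered. The factor $\sqrt{250}$ assumes 250 trading days per year.

```latex
\[
  \hat{\beta}_t = \arg\min_{\beta} \sum_{s=t-D}^{t-1} \bigl( y_{s+1} - x_s^{\top}\beta \bigr)^2,
  \qquad
  \hat{y}_{t+1} = x_t^{\top}\hat{\beta}_t,
\]
\[
  R_{t+1} = \operatorname{sign}(\hat{y}_{t+1})\, r_{t+1},
  \qquad
  \mathrm{SR} = \sqrt{250}\;\frac{\bar{R}}{s_R},
\]
where $\bar{R}$ and $s_R$ are the sample mean and sample standard deviation of
$R_{t+1}$ over $t = \max(t_0, D+1), \dots, T-1$.
```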
4. Code the above algorithm in R, for training, validation and test sets.
For the validation and test sets, the same λ found in part 2 can be
used. The output should be Sharpe ratios for different values of window
lengths.
Run the algorithm with q = 0 and q = 1. In both cases, comment
on the appropriateness of using ordinary least squares over the
training, validation and test sets, justifying your arguments (include
corresponding graphs if possible).
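A rolling-window version of the strategy could look roughly like the sketch below. It runs on simulated data and rests on several assumptions: the inputs are already normalised, 250 trading days per year are used for annualisation, a predicted return of exactly zero is treated as negative, and the function name `rolling.ols.sharpe` is hypothetical.

```r
# Sketch only: simulated, already-normalised data stand in for the real series.
set.seed(7)
n <- 600; p <- 3                  # p covariates (10 + q in the coursework)
X <- matrix(rnorm(n * p), n, p)   # row t: covariates known at time t
y <- rnorm(n)                     # y[t]: normalised index return on day t
r.true <- 0.01 * y                # hypothetical true (unnormalised) returns

rolling.ols.sharpe <- function(y, X, r.true, t0, D, days.per.year = 250) {
  n <- length(y)
  times <- max(t0, D + 1):(n - 1)   # need D past pairs and a next-day return
  strat.ret <- numeric(length(times))
  for (k in seq_along(times)) {
    t <- times[k]
    rows <- (t - D):(t - 1)         # covariates at s are paired with the return at s + 1
    beta <- lm.fit(cbind(1, X[rows, , drop = FALSE]), y[rows + 1])$coefficients
    beta[is.na(beta)] <- 0          # guard against rank-deficient windows
    pred <- sum(c(1, X[t, ]) * beta)      # predicted day-(t + 1) normalised return
    position <- ifelse(pred > 0, 1, -1)   # long 1 unit, otherwise short 1 unit
    strat.ret[k] <- position * r.true[t + 1]
  }
  sqrt(days.per.year) * mean(strat.ret) / sd(strat.ret)   # annualised Sharpe ratio
}

# Sharpe ratios for several window lengths
window.lengths <- c(50, 100, 200)
sharpe.by.D <- sapply(window.lengths, function(D)
  rolling.ols.sharpe(y, X, r.true, t0 = 250, D = D))
names(sharpe.by.D) <- paste0("D=", window.lengths)
print(sharpe.by.D)
```

Running the function separately on the training, validation and test portions gives one Sharpe ratio per window length and per data set.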
5. As a way to improve upon ordinary least squares, the one-day-ahead
S&P500 return is to be predicted using the factors from the 10 stocks
you have chosen as covariates. Instead of determining the number of
factors using a scree plot for each window, treat the number of factors
as another tuning parameter, on top of the window length. To simplify
your task, consider at most 2 factors. (This technique is called
principal component regression.)
Hence in each window, perform a multi-factor analysis, and use the
estimated factor series as the covariates, still using a linear model for
predicting the one-day-ahead S&P500 return. The output Sharpe ratios
for our trading strategy should then be dependent on window length as
well as number of factors considered in each window (you can assume
the number of factors used in each window is a constant).
Is this method better than just using ordinary least squares? Describe
your findings, with supporting arguments and outputs.
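The factor-based variant can be sketched in the same way. The sketch below estimates the factors in each window by principal component analysis via `prcomp`, which is one possible choice consistent with principal component regression. The data are simulated, the function name `pcr.sharpe` is hypothetical, and 250 trading days per year are assumed.

```r
# Sketch only: simulated, already-normalised data stand in for the real series.
set.seed(11)
n <- 600; p <- 10
X <- matrix(rnorm(n * p), n, p)   # normalised returns of the 10 stocks
y <- rnorm(n)                     # normalised index returns
r.true <- 0.01 * y                # hypothetical true (unnormalised) returns

pcr.sharpe <- function(y, X, r.true, t0, D, n.factors, days.per.year = 250) {
  n <- length(y)
  times <- max(t0, D + 1):(n - 1)
  strat.ret <- numeric(length(times))
  for (k in seq_along(times)) {
    t <- times[k]
    rows <- (t - D):(t - 1)
    pc <- prcomp(X[rows, , drop = FALSE], center = TRUE, scale. = FALSE)
    loadings <- pc$rotation[, 1:n.factors, drop = FALSE]
    f.win <- pc$x[, 1:n.factors, drop = FALSE]   # estimated factor series in the window
    beta <- lm.fit(cbind(1, f.win), y[rows + 1])$coefficients
    f.new <- (X[t, ] - pc$center) %*% loadings   # project today's returns onto the loadings
    pred <- sum(c(1, f.new) * beta)              # predicted day-(t + 1) normalised return
    strat.ret[k] <- ifelse(pred > 0, 1, -1) * r.true[t + 1]
  }
  sqrt(days.per.year) * mean(strat.ret) / sd(strat.ret)   # annualised Sharpe ratio
}

# Sharpe ratio for each combination of window length and number of factors
grid <- expand.grid(D = c(50, 100, 200), n.factors = 1:2)
grid$sharpe <- mapply(function(D, m) pcr.sharpe(y, X, r.true, 250, D, m),
                      grid$D, grid$n.factors)
print(grid)
```

The resulting table can be compared with the ordinary least squares Sharpe ratios for the same window lengths.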
2 Submission
• Submit your work anonymously under your candidate number in
LSE For You (NOT your ID number starting with 20XX). Write
your candidate number on a cover page within the pdf file as
well.
• Plagiarism will be checked, and students who are found to have plagiarised
will not only be penalised, but may also face disciplinary action from
the school.
• Upload a single pdf file to the corresponding coursework upload link
on Moodle.
• The single pdf file should contain your presented answers including
graphs and tables. All R code used should be included in an appendix
at the end.
• The upload link will stop working after the deadline indicated on the
link. You can still submit then by sending the file directly to me.
Late submission will result in penalties: 5 marks (out of maximum
100) will be deducted for every half-day (12 hours). This will result in a
maximum penalty of 10 marks for the first 24 hours. A further 5 marks
will be deducted per 24-hour period thereafter (including weekends).
• Extensions to deadlines for coursework will only be given in fully
documented serious extenuating circumstances.