BMAN60422 DATA ANALYTICS FOR BUSINESS DECISION MAKING
Use your Student ID as the file name of your submission
Answer ALL questions
You have 48 hours to complete this examination, but it is expected that you spend
no more than three times the duration of the original examination completing it.
You must not exceed the specified word limit for short-answer questions. This is a
maximum word limit (+10% does not apply). Markers will be instructed that they do
not have to mark anything beyond the word limit.
You will not be penalised for answers that are shorter than the limit; answers will be
given credit for being comprehensive, rather than for being a certain length.
Electronic calculators may be used in accordance with the University regulations.
SECTION A. Multiple-choice questions - WITHHELD
SECTION B. Short-answer questions (your answer to each question in this section should
be no more than 150 words in length)
4. Introduce five different ways of reducing the number of variables in predictive
modelling, and discuss briefly their main strengths.
(10 marks)
5. The following table gives a sample churn-management data set for a telecom
company, where the target variable Churn_flag indicates whether the customer has
churned (Churn_flag=1) or not (Churn_flag=0), and the last two columns provide the
classification results from a decision tree model. Comment on the performance of the
decision tree model by constructing a confusion matrix, and plot the
corresponding ROC curve.
             Actual data    Decision tree model
Customer ID  Churn_flag     Predicted:         Predicted:
                            Churn_flag = 1     Churn_flag
1            1              0.95               1
2            1              0.90               1
3            1              0.75               1
4            0              0.65               1
5            0              0.55               1
6            1              0.50               0
7            1              0.45               0
8            0              0.30               0
9            0              0.25               0
10           0              0.10               0
(10 marks)
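The quantities asked for in Question 5 can be sketched in a few lines of Python. This is a minimal illustration, not a model answer. It assumes that the "Predicted: Churn_flag = 1" column is the model's predicted probability of churn and that the predicted class is taken as given in the last column. The names `actual`, `score`, `predicted`, `confusion_matrix`, `roc_points` and `auc` are illustrative and not part of the paper.

```python
# Data transcribed from the table in Question 5.
actual = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]            # Churn_flag (actual)
score = [0.95, 0.90, 0.75, 0.65, 0.55,
         0.50, 0.45, 0.30, 0.25, 0.10]             # assumed: predicted P(Churn_flag = 1)
predicted = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]         # predicted Churn_flag


def confusion_matrix(actual, predicted):
    """Return (TP, FP, FN, TN), treating Churn_flag = 1 as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def roc_points(actual, score):
    """Return ROC points (FPR, TPR), lowering the threshold one case at a time.

    Assumes there are no tied scores, which holds for this table.
    """
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for a, _ in sorted(zip(actual, score), key=lambda t: -t[1]):
        if a == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


tp, fp, fn, tn = confusion_matrix(actual, predicted)
accuracy = (tp + tn) / len(actual)
sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
curve = roc_points(actual, score)

print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("accuracy, sensitivity, specificity:", accuracy, sensitivity, specificity)
print("ROC points:", curve)
print("AUC:", round(auc(curve), 2))
```

Plotting the pairs in `curve` with the false positive rate on the horizontal axis and the true positive rate on the vertical axis gives the ROC curve; the diagonal from (0, 0) to (1, 1) is the usual reference for a random classifier.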
6. For the data shown below, briefly discuss whether a regression model or a
neural network would be the better modelling choice.
(10 marks)
7. Use two appropriate sketches to explain how the Silhouette Width can be used to
support the following tasks:
- Determination of the number of clusters.
- Assessment of the quality of a given partition, with consideration of the
assignment of individual data points.
(10 marks)
8. At a high level, explain the concept of topic modelling, as implemented in the SAS
software. What does topic modelling tell us about a document collection and about
each document individually? What does a topic represent? How can we validate the
results of topic modelling?
(10 marks)
9. Describe the four V’s of big data, and use examples to explain the challenges each
of these poses for traditional data analytics.
(10 marks)
SECTION C. Short-essay question (your answer should be no more than 300 words in
length)
10. Reflect on the key steps you have taken to complete the group coursework project
for this course unit, in line with the Cross-Industry Standard Process for Data Mining
(CRISP-DM) framework, and discuss briefly the limitations of the data analytics
techniques you have applied.