COMM1190 Data, Insights and Decisions
Assessment 1: Initial report
Customer churn project
The Head of Management Services of Freshland needs to make a presentation to the Senior
Executive Group who have requested an update on customers who belong to their rewards card
program. The Head commissioned a pilot study that was undertaken by a summer intern using a
sample of administrative data that contains information on rewards card customers and their
store purchases.
The intern's report written in response to these instructions is contained in a memo to the
Head, a copy of which is included in Appendix A. The data dictionary in Appendix B refers to the
data you will use. This data set will contain, as a subset, the initial pilot data analysed by the
intern. See Moodle in the Assessment section for further information on your personalized data.
Appendix A: Analysis of customer pilot data
MEMORANDUM
RE: Research project using customer pilot data
Introduction
This report documents an initial statistical analysis of customer data collected from the rewards
program database. All information was extracted in the first week of January 2024. The initial
data set contains observations on 5,847 customers who were all members of the rewards
program and includes information on customer demographics and spending at their last
purchase over the preceding 12 months. The analysis is divided into 4 sections covering general
customer characteristics, spending, level of satisfaction with our services at that last purchase,
and a concluding section with recommendations.
Customer characteristics
Figure 1 and Table 1 provide some key characteristics of customers in the rewards program.
Figure 1: Customer characteristics
Most customers are female (79.4752%) and live in metropolitan areas (63.3145%). The average
age is 48.017059 years, and they have been members of the rewards program for an average of
2.636224 years. One way to define a loyal customer is to use their length of membership as an
indicator. Define a new variable ltmem=1 if member>=3 and zero otherwise. According to this
definition, 58.235% of customers are loyal.
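The loyalty indicator described above can be sketched in a few lines of Python. The membership values below are made up for illustration and are not taken from the rewards program data; the helper name `loyalty_flag` is also an assumption, and only the rule itself (ltmem=1 if member>=3) comes from the memo.

```python
# Hypothetical membership lengths in years (illustrative only, not the real data).
member_years = [0.5, 1.2, 3.0, 4.0, 2.9, 3.5]

def loyalty_flag(member, threshold=3):
    """Return 1 if membership length is at least the threshold, else 0."""
    return 1 if member >= threshold else 0

# Build the ltmem indicator and the share of customers it marks as loyal.
ltmem = [loyalty_flag(m) for m in member_years]
share_loyal = sum(ltmem) / len(ltmem)

print(ltmem)
print(f"Share loyal: {share_loyal:.1%}")
```

Because ltmem is a 0/1 variable, its mean is the proportion of customers classified as loyal.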
Table 1: Customer characteristics
According to Figure 2, the age distribution of customers is bell-shaped, and there are no outliers.
Figure 2: Age
Customer spending
The spending variable that was made available represents the amount spent on the last
shopping occasion during 2023. Table 2 provides some summary statistics for last and cash.
There was missing data because some customers had no transactions in the last year, leaving a
sample size of 5330 for the analysis. On average, customers spent $79.87305. Because the mean
($79.87305) and median ($64.17379) are not the same, we know that this distribution is not
symmetric. The proportion of customers using cash to conduct this transaction was 0.208818.
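A minimal sketch of the steps described here (dropping customers with no transaction, then comparing mean and median) might look as follows. The amounts are hypothetical, and representing a missing transaction as `None` is an assumption about how the data might be stored. A gap between mean and median is only a rough indicator of asymmetry in a sample.

```python
from statistics import mean, median

# Hypothetical last-transaction amounts in dollars; None marks a customer
# with no transaction in the period (an assumed encoding of missing data).
last = [25.0, 40.0, None, 60.0, 75.0, 300.0, None]

# Keep only customers with an observed transaction.
observed = [x for x in last if x is not None]
n_missing = len(last) - len(observed)

avg = mean(observed)
med = median(observed)

print(f"n = {len(observed)}, missing = {n_missing}")
print(f"mean = {avg:.2f}, median = {med:.2f}")
```

In this toy sample the single large purchase pulls the mean above the median, which is the pattern usually associated with a right-skewed distribution.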
Table 2: Customer spending
A correlation analysis was conducted, and the results are presented in Table 3. Older customers
spent less as did customers belonging to the rewards program for longer. Female customers and
those in metro areas spent more but none of the correlations were very large. There were
stronger correlations between customer characteristics and cash. Customers who were older
and belonged to the rewards program for longer were more likely to use cash. The strong
negative correlation between cash and last indicates that cash transactions tended to be for
smaller amounts.
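For reference, each entry of a correlation matrix like Table 3 is a pairwise Pearson correlation coefficient. The sketch below computes one such coefficient by hand; the `pearson` helper and the toy values for cash and last are illustrative assumptions, not the memo's data or method of calculation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values: cash indicator and last-transaction amount.
cash = [1, 0, 0, 1, 0, 1]
last = [20.0, 90.0, 70.0, 30.0, 110.0, 25.0]

r = pearson(cash, last)
print(f"corr(cash, last) = {r:.3f}")
```

When one variable is a 0/1 indicator such as cash, a negative coefficient means the group coded 1 has the lower average on the other variable.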
Table 3: Correlation matrix
Customer satisfaction
When asked about whether they were satisfied with the last shopping experience, the average
rating of customers was 2.147092, see Table 4.
Table 4: Satisfaction ratings
The underlying distribution of the satisfaction rating is very skewed, with nearly 50% of
customers providing the lowest rating, see Figure 3.
Figure 3: Satisfaction rating
          age        female     metro      member     last       cash       sat
age       1
female    -0.00746   1
metro     0.010747   0.012926   1
member    -0.01398   0.007044   -0.26843   1
last      -0.01398   0.005112   0.027406   -0.0707    1
cash      0.426928   -0.01273   -0.06302   0.074617   -0.26851   1
sat       0.003437   0.007428   -0.20347   0.621292   -0.08399   0.075732   1
sat
Mean                  2.147092
Standard Error        0.018225
Median                2
Mode                  1
Standard Deviation    1.330579
Sample Variance       1.770441
Kurtosis              -0.59481
Skewness              0.809454
Range                 4
Minimum               1
Maximum               5
Sum                   11444
Count                 5330
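Several of the statistics listed for sat are linked by standard formulas: the mean is Sum divided by Count, the standard error is the standard deviation divided by the square root of Count, and the sample variance is the square of the standard deviation. The sketch below rechecks those relationships using only figures printed in the table; the variable names are illustrative.

```python
from math import sqrt, isclose

# Figures copied from the sat summary statistics.
count = 5330
total = 11444
std_dev = 1.330579

mean_sat = total / count           # Mean = Sum / Count
std_error = std_dev / sqrt(count)  # Standard Error = SD / sqrt(n)
variance = std_dev ** 2            # Sample Variance = SD squared

print(f"mean = {mean_sat:.6f}")
print(f"standard error = {std_error:.6f}")
print(f"variance = {variance:.6f}")

# Compare with the reported values, allowing for rounding in the table.
print(isclose(mean_sat, 2.147092, abs_tol=1e-5))
print(isclose(std_error, 0.018225, abs_tol=1e-5))
print(isclose(variance, 1.770441, abs_tol=1e-5))
```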
Referring to the correlation matrix in Table 3 reveals there is little correlation between sat and
age or female. However, higher ratings are positively associated with longer term members and
negatively associated with being a metropolitan customer. Those who used cash and spent less
were more satisfied.
The strong positive association between length of membership and satisfaction is highlighted in
Figure 4 where the distribution of ratings has been separated by the previously defined
variable, ltmem. The low satisfaction rating is concentrated amongst the newer members.
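Splitting the distribution of ratings by ltmem amounts to computing, within each group, the share of customers giving each rating from 1 to 5. A minimal sketch under assumed data follows; the (ltmem, sat) pairs and the `rating_shares` helper are hypothetical.

```python
from collections import Counter

# Hypothetical (ltmem, sat) pairs; not drawn from the customer data.
records = [(0, 1), (0, 1), (0, 2), (0, 4), (1, 3), (1, 5), (1, 4), (1, 1)]

def rating_shares(records, group):
    """Share of each rating 1..5 among customers in the given ltmem group."""
    ratings = [sat for g, sat in records if g == group]
    counts = Counter(ratings)
    return {rating: counts.get(rating, 0) / len(ratings) for rating in range(1, 6)}

newer = rating_shares(records, 0)   # ltmem = 0
longer = rating_shares(records, 1)  # ltmem = 1

for rating in range(1, 6):
    print(rating, f"{newer[rating]:.2f}", f"{longer[rating]:.2f}")
```

Comparing shares rather than raw counts keeps the two groups comparable when they contain different numbers of customers.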
Figure 4: Satisfaction rating
Conclusion and recommendation
Several conclusions can be drawn from the analysis that has been conducted. The typical
rewards program customer is female, about 48 years old, lives in a metropolitan area and has
been a member for about 2 years.
Cash is still being used for transactions and not surprisingly is more prevalent amongst older
customers who have belonged to the rewards program for longer.
Satisfaction levels are low and need to be improved. Because the correlation between higher
ratings and length of membership was high, more attention should be paid to newer members
and the reasons why they are dissatisfied. This is an important finding in understanding
customer loyalty but there needs to be further analysis to better inform retention strategies.
Recommendation: The current data set is not overly large, and obtaining more customer
observations is essential to facilitate extra analyses and to find more significant results.
Appendix B: Data dictionary
The data set on customers from the rewards program database includes the following
variables:
age         Age of the customer in years
female      =1 if customer is female; =0 otherwise
member      Number of years as a member of loyalty club (top coded at 4)
metro       =1 if customer is located in a metropolitan area; =0 otherwise
location    Customer location: 1=metropolitan; 2=regional; 3=all other regions
ID          Unique customer identifier
last        Amount ($) of the last transaction in the previous 12 months
cash        =1 if the last transaction was paid in cash; =0 otherwise
sat         Satisfaction rating of last transaction; 1 (highest=Excellent) to 5 (lowest=Poor)
pilot       =1 if initial data collected for pilot study; =0 otherwise
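The codings in this dictionary can be turned into simple range checks before any analysis. The sketch below is one possible approach and makes several assumptions: each customer is a Python dict keyed by variable name, missing values are stored as `None`, and last, cash and sat may be missing for customers with no transaction. The function name `check_record` is hypothetical.

```python
def check_record(row):
    """Return a list of coding problems found in one customer record."""
    problems = []

    # Binary indicators that should always be present.
    for name in ("female", "metro", "pilot"):
        if row.get(name) not in (0, 1):
            problems.append(f"{name} should be 0 or 1")

    # Transaction variables may be missing (assumed to be stored as None).
    if row.get("cash") not in (0, 1, None):
        problems.append("cash should be 0, 1 or missing")
    if row.get("sat") not in (1, 2, 3, 4, 5, None):
        problems.append("sat should be 1 to 5 or missing")

    if row.get("location") not in (1, 2, 3):
        problems.append("location should be 1, 2 or 3")

    # member is top coded at 4, so values above 4 are not expected.
    member = row.get("member")
    if member is None or not 0 <= member <= 4:
        problems.append("member should be between 0 and 4")

    return problems

# Hypothetical records for illustration.
good = {"female": 1, "metro": 1, "pilot": 0, "cash": 0, "sat": 2,
        "location": 1, "member": 3.5}
bad = {"female": 1, "metro": 0, "pilot": 1, "cash": None, "sat": 6,
       "location": 2, "member": 5}

print(check_record(good))
print(check_record(bad))
```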