Marketing Analytics
Assignment
Submission Format: PDF or DOCX file
Context
Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance using the data that they already have. Store sales are influenced by many factors, including promotions, competition, school and state holidays, and seasonality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied. Reliable sales forecasts enable store managers to create effective staff schedules that increase productivity and motivation. By helping Rossmann create a robust prediction model, you will help store managers stay focused on what is most important to them: their customers and their teams!
Question 1: Fundamentals of Data Analysis (2 points)
a. How do you think each of the following variables influences sales: promotions, competition, school and state holidays, and seasonality? For example: I think promotions increase sales because consumers are more likely to purchase products when they are on promotion. Make sure to provide your reason(s). (1pt)
b. What are some other factors that you think also influence sales? Name at least two and give your reasons. For example: I think weather influences sales because consumers don’t like going shopping when it’s raining or snowing. For one of the factors that you listed, explain how you would collect the data in practice. Be as specific as possible. For example: I can collect weather data from the website http://www.wunderground.com/ given the locations of the stores. (1pt)
Question 2: JMP (6 points)
Use the table A1.xlsx, which has two sheets: sales and store. Use JMP. The data fields are defined as follows.
Store - a unique ID for each store
Day of Week - numerical representation of weekdays (1 for Monday through 7 for Sunday)
Date - format in MM/DD/YYYY
Sales - the turnover for any given day
Customers - the number of customers on a given day
Open - an indicator for whether the store was open: 0 = closed, 1 = open
StateHoliday - indicates a state holiday. Normally all stores, with few exceptions, are closed on state holidays. Note that all schools are closed on public holidays and weekends. a = public holiday, b = Easter holiday, c = Christmas, 0 = none
SchoolHoliday - indicates whether the (Store, Date) was affected by the closure of public schools: 1 = school holiday, 0 = no school holiday
StoreType - differentiates between 4 different store models: a, b, c, d
Assortment - describes an assortment level: a = basic, b = extra, c = extended, d = unknown
CompetitionDistance - distance in meters to the nearest competitor store
CompetitionOpenSince[Month/Year] - gives the approximate year and month of the time the nearest competitor was opened
Promo - indicates whether a store is running a promo on that day
Promo2 - Promo2 is a continuing and consecutive promotion for some stores: 0 = store is not participating, 1 = store is participating
Promo2Since[Year/Week] - describes the year and calendar week when the store started participating in Promo2
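The two sheets share the Store field, so each daily row in the sales sheet can be matched to exactly one row of store attributes. The assignment itself is done in JMP; the Python sketch below only illustrates that relationship. The rows are made up for illustration and are not taken from A1.xlsx, and `join_on_store` is a hypothetical helper name.

```python
# Illustrative rows only: these values are invented, not read from A1.xlsx.
sales_rows = [
    {"Store": 1, "Date": "01/02/2015", "Sales": 5000, "Customers": 500, "Open": 1, "Promo": 1},
    {"Store": 2, "Date": "01/02/2015", "Sales": 6000, "Customers": 600, "Open": 1, "Promo": 0},
    {"Store": 1, "Date": "01/03/2015", "Sales": 4500, "Customers": 450, "Open": 1, "Promo": 0},
]
store_rows = [
    {"Store": 1, "StoreType": "c", "Assortment": "a", "CompetitionDistance": 1200},
    {"Store": 2, "StoreType": "a", "Assortment": "b", "CompetitionDistance": 600},
]


def join_on_store(sales, stores):
    """Attach store-level attributes to every daily sales row via the Store ID."""
    by_id = {s["Store"]: s for s in stores}
    return [{**row, **by_id[row["Store"]]} for row in sales if row["Store"] in by_id]


merged = join_on_store(sales_rows, store_rows)
for row in merged:
    print(row["Store"], row["Date"], row["Sales"], row["StoreType"])
```

The sales sheet has one row per (Store, Date), while the store sheet has one row per Store, so the join repeats each store's attributes on every one of its daily rows.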
a. What are the data types? Specify the data types separately for each data spreadsheet.
Is the ‘sales’ datasheet: primary vs. secondary, experimental vs. non-experimental, stated vs. revealed preference, cross-sectional vs. longitudinal, and/or individual vs. aggregate? Is the ‘store’ datasheet: primary vs. secondary, experimental vs. non-experimental, and/or cross-sectional vs. longitudinal? (1pt)
b. Now, what are the variable types? Are they correctly categorized? If not, indicate the correct variable types (continuous vs. nominal vs. ordinal). (1pt)
c. Summarize all variables. Briefly explain your findings for each variable by answering all of the following: (2pts)
- How many stores are there? What are the start and end dates of the data?
- What’s the average number of customers per store per day?
- What’s the median sales per store per day?
- How many types of state holidays are there (i.e., # of possible values for this variable)?
- What’s the average competition distance for each store?
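As a rough cross-check on the JMP summaries, the same quantities can be computed by hand. The sketch below uses made-up rows, not A1.xlsx, and assumes that "per store per day" means taking the statistic over all (Store, Date) rows. Whether closed days should be included is a judgment call you should state in your answer.

```python
from datetime import datetime
from statistics import mean, median

# Illustrative rows only: invented values, one row per (Store, Date).
rows = [
    {"Store": 1, "Date": "01/01/2015", "Sales": 0, "Customers": 0},
    {"Store": 1, "Date": "01/02/2015", "Sales": 4000, "Customers": 400},
    {"Store": 2, "Date": "01/01/2015", "Sales": 0, "Customers": 0},
    {"Store": 2, "Date": "01/02/2015", "Sales": 6000, "Customers": 500},
]

# Number of distinct stores.
n_stores = len({r["Store"] for r in rows})

# Start and end dates, parsed from the MM/DD/YYYY format.
dates = [datetime.strptime(r["Date"], "%m/%d/%Y") for r in rows]
start, end = min(dates), max(dates)

# Mean customers and median sales over all store-day rows.
avg_customers = mean(r["Customers"] for r in rows)
median_sales = median(r["Sales"] for r in rows)

print(n_stores, start.date(), end.date(), avg_customers, median_sales)
```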
d. Are sales significantly higher when there is a promotion (i.e., when the “Promo” variable is 1)? Indicate the statistical test you use and answer the question based on the test results. Make sure to provide relevant JMP output as well. (2pts)
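One possible test for comparing mean sales between two groups is Welch's two-sample t-test, which does not assume equal variances. It is shown here only as an illustration, not as the required answer. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom on made-up numbers; the p-value itself should come from the JMP output. The function name `welch_t` and the sample values are assumptions for this example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb            # squared standard error of the mean difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Illustrative daily sales only: invented values, not taken from A1.xlsx.
promo_sales = [6, 8, 10]
no_promo_sales = [2, 4, 6]

t, df = welch_t(promo_sales, no_promo_sales)
print(round(t, 4), round(df, 2))
```

A positive t statistic means the first group's mean is higher. Because the question asks whether sales are higher with a promotion, a one-sided alternative may be the more natural reading.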
Question 3: Tableau (4 points)
a. Plot and show the time series of sum of sales at different levels: year, month, and day of the week. Any interesting stories? Explain the patterns you detect and make sure to provide all relevant visualizations. (1.25pts)
b. Plot and show the time series of average sales at different levels: year, month, and day of the week. Do they look different from those in Question 3a? Why? Explain the patterns you detect and make sure to provide all relevant visualizations. (1.25pts)
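Sums and averages can tell different stories whenever the groups being compared contain different numbers of records. The sketch below uses made-up (day of week, sales) pairs to show one way this can happen. Whether the same thing occurs in A1.xlsx is for you to check in Tableau.

```python
from collections import defaultdict
from statistics import mean

# Illustrative (DayOfWeek, Sales) pairs only: invented values.
# Day 1 has three records; day 7 has a single, larger one.
rows = [(1, 100), (1, 100), (1, 100), (7, 200)]

by_day = defaultdict(list)
for day, sales in rows:
    by_day[day].append(sales)

totals = {day: sum(values) for day, values in by_day.items()}
averages = {day: mean(values) for day, values in by_day.items()}

print(totals)    # day 1 has the larger total
print(averages)  # day 7 has the larger average
```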
c. Create and provide plots of sales against at least 3 other variables (e.g., Customers, SchoolHoliday, Promo, Promo2, etc.). If the other variable is a continuous variable, do a scatter plot and add a trend line. If the other variable is a nominal variable, do a bar plot with mean values. Which variable do you think impacts sales? Compare with your answer in Question 1a. Make sure to explain the pattern you detect and provide your insights for each plot. (1pt)
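A linear trend line on a scatter plot is typically an ordinary least-squares fit. The sketch below computes the slope and intercept by hand on made-up values, to show what such a line summarizes. The name `fit_line` and the data are assumptions for illustration, and the line Tableau draws depends on the trend model you select.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative values only: invented, not taken from A1.xlsx.
customers = [100, 200, 300, 400]
sales = [1100, 1900, 3100, 3900]

slope, intercept = fit_line(customers, sales)
print(slope, intercept)
```

Here the slope is the estimated change in sales per additional customer, which is the quantity to interpret when explaining the plot.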