MKTG6010 Assignment
Data Case Analysis (1,000 words)

With the proliferation of Web 2.0 technologies, product ‘experiences’ can be conveniently shared on the internet, which can reduce information uncertainty in decision making. Online comments reflect others’ product experience and therefore help in making a purchase decision. Information gathered via word-of-mouth (WOM) significantly influences product evaluations and purchase decisions. eWOM (online WOM) has shifted the power to the consumer. A staggering 92 percent of consumers around the world say they trust earned media, such as recommendations from friends and family, above all other forms of advertising. Therefore, sometimes it does not matter how effective your campaign is, because a bad review on the internet can destroy it quickly. Hence, it is important to understand the experiences of the consumer, both negative and positive, in order to capture the complete picture, adopt the right marketing strategy, and improve their experience in the future.

A fundamental objective of the motion picture industry has been to understand the overall experience of moviegoers (spectators, audience) and consequently derive better financial remuneration from theatrical exhibition. Classical film theorists conceived the spectator as a passive participant receiving the film as a mediated message. However, there has been a substantial transformation in how moviegoers’ satisfaction is understood, from mere spectators watching films to moviegoers “experiencing” the film. The ability of a film to provide a memorable experience colored with emotions, affects and fantasies depends on the pleasures the movie offers, the desires it elicits and, most importantly, the motivations behind the viewer’s watching of the movie. The key to understanding this experience is therefore understanding the nature of the film spectator’s response to the film.
Consumers, including moviegoers, want experiences that provide a novel and creative escape from everyday life. Film provides such an opportunity through the vicarious experiences it offers, making an indelible impression on viewers’ memories and intersecting with their lives in significant ways.

Suppose you are a movie producer and want to learn what consumers have been sharing online and how it will influence box office revenues. A data set from IMDB has been collected for this purpose. You are only required to analyse and interpret the data provided. This assessment will test your knowledge of, and ability in, analysing quantitative data using a variety of methods learned in class. You will be required to use various machine learning techniques to address the following questions by analysing the data provided.

Q1: Use topic modelling to figure out what users are talking about on IMDB. Are the topics for action the same as the ones for comedy? (40 marks)

Q2: Use sentiment analysis to estimate ratings of the top 2 topics for action and comedy movies and interpret the results. (30 marks)
(Hint: You can apply sentiment analysis separately to the reviews that are highly relevant to each of the topics, for example, the top 30% of the reviews most relevant to a topic.)

Q3: Use regression analysis to figure out how the sentiment scores (obtained in Q2) for the top 2 topics for action and comedy movies influence box office revenue. What are the differences in results for action and comedy movies? What are the managerial implications of the results? (30 marks)

You need to familiarise yourself with the data. Please go through the data description carefully. What information has been collected in the data? What does each variable (column) represent? You may need to clean up the data before the actual analysis. You’re free to use other necessary techniques learned in or outside this course (e.g., descriptive statistics, tabulation).
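The hint in Q2 (keep only the reviews most relevant to a topic) can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes a topic model has already produced one weight per topic for every review, and the function name `top_share_by_topic`, the `doc_topic_weights` layout, and the sample weights are all hypothetical.

```python
import math


def top_share_by_topic(doc_topic_weights, topic, share=0.30):
    """Return the indices of the reviews most relevant to `topic`.

    doc_topic_weights: list of per-review lists; entry [i][k] is the weight
    of topic k in review i (assumed to come from a fitted topic model).
    share: fraction of reviews to keep; 0.30 follows the hint in Q2.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    n_keep = max(1, math.ceil(share * len(doc_topic_weights)))
    # Rank review indices by their weight on the chosen topic, highest first.
    ranked = sorted(
        range(len(doc_topic_weights)),
        key=lambda i: doc_topic_weights[i][topic],
        reverse=True,
    )
    return ranked[:n_keep]


# Made-up weights for 10 reviews and 2 topics (illustration only).
weights = [
    [0.90, 0.10], [0.20, 0.80], [0.60, 0.40], [0.10, 0.90], [0.70, 0.30],
    [0.50, 0.50], [0.30, 0.70], [0.80, 0.20], [0.40, 0.60], [0.05, 0.95],
]
top_topic0 = top_share_by_topic(weights, topic=0)  # [0, 7, 4]
top_topic1 = top_share_by_topic(weights, topic=1)  # [9, 3, 1]
```

The returned indices can then be used to pull the matching review texts for each topic before scoring their sentiment. The 30% cut-off is only the example given in the hint; a different share may suit a particular dataset better.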
For each question, you will need to specify which test(s) you used and which information from the data (e.g., variables) you used.

Data Description

The data file is Exam_data.csv. It contains:

movie: movie name
imdbid: unique IMDB ID
review_post_date: the date that a review was posted on IMDB
review: review on IMDB
rating: rating on IMDB (10-point scale; max: 10, min: 1)
user_name: the name of the user who posted this review
num_helpful: the number of yes votes for helpfulness on IMDB
num_helpful: the total number of votes for helpfulness on IMDB (i.e., yes + no votes)
box_office_revenue: the total sales ($) of the movie
movie_distributor: movie distributor
budget: movie budget ($)
release_date: movie release date
close_date: movie close date
mpaa: movie rating by the Motion Picture Association, i.e.,
    G: General audiences – All ages admitted
    PG: Parental guidance suggested – Some material may not be suitable for children
    PG-13: Parents strongly cautioned – Some material may be inappropriate for children under 13
    R: Restricted – Under 17 requires accompanying parent or adult guardian
genre: the movie genre (i.e., Action, Comedy, Drama, Fantasy, and Horror)
max_screens: the maximum number of screens on which the movie was shown

Submission Instructions:

1. Your answers are to be typed, with appropriate outputs shown, in a Word document.
2. Word limit: 1,000 words (excluding appendices). This is a soft limit, meaning you can go somewhat beyond it without a grading penalty. But remember: the more you write, the more contribution and insight the grader will expect to see.
3. You need to submit your answers in a Word document (or PDF) as well as the Python code and output (.ipynb file).

Marking Criteria:

Marks awarded will be based on the following:

1. Good understanding of the issues (questions) you want to address.
2. Appropriate application of ML techniques using the “right” measures/variables.
3. Provision of appropriate analysis, citing relevant Python outputs as evidence for your findings.
4. Ability to apply and communicate results in the context of the research questions/issues highlighted.
5. Overall professional presentation of written work, e.g., layout, grammar, integration of results and findings, clarity of recommendations.

Note that you will be provided with a different dataset depending on your SID. Please check the last digit of your SID, then download and use the appropriate dataset. Do not collaborate with other students: this is an individual assignment involving different datasets.
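As a companion to the Data Description, the clean-up step mentioned in the brief could start with a check like the one below. It is only a sketch under stated assumptions: the column names come from the Data Description, but the sample rows are made up, the helper name `load_clean_rows` is hypothetical, and the real file may store revenue or ratings in a different format.

```python
import csv
import io

# Made-up rows that reuse column names from the Data Description.
# The real Exam_data.csv has more columns and may format values differently.
SAMPLE = """movie,genre,rating,box_office_revenue
Example Movie A,Action,8,1000000
Example Movie B,Comedy,11,250000
Example Movie C,Action,,500000
Example Movie D,Drama,7,not available
"""


def load_clean_rows(handle, genres=("Action", "Comedy")):
    """Keep rows of the requested genres that have a valid rating and revenue."""
    clean = []
    for row in csv.DictReader(handle):
        if row["genre"] not in genres:
            continue  # Q1-Q3 only compare action and comedy movies
        try:
            rating = int(row["rating"])
            revenue = float(row["box_office_revenue"])
        except ValueError:
            continue  # missing or non-numeric value
        if not 1 <= rating <= 10:
            continue  # outside the 1-10 scale stated in the Data Description
        clean.append({**row, "rating": rating, "box_office_revenue": revenue})
    return clean


rows = load_clean_rows(io.StringIO(SAMPLE))
```

With the sample above, only the first row survives: the second has a rating outside the scale, the third has no rating, and the fourth is neither action nor comedy. For the real file, the same function can be given an open file handle, for example `open("Exam_data.csv", newline="", encoding="utf-8")`, where the encoding is an assumption.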