Although the term 'Generative AI' (GenAI) is widely recognized, its practical application in daily workflows is not yet well understood. This exercise introduces students to GenAI tools, demonstrating how they can be integrated into professional work practices to co-invent, analyze data, generate images, and summarize text, among other tasks. The exercise guides students through developing a fictional snack company, showcasing the versatility of GenAI in tasks such as market analysis, brand development, and the formulation of marketing strategies. The exercise culminates with students creating a comprehensive design document and presentation that can be used to pitch to investors. Key takeaways include understanding how GenAI works, learning how to incorporate AI into developing and launching a new product, and recognizing AI's potential and limitations in the modern workplace.
This case study explores the opportunities and challenges of the digital transformation journey of French wine and spirits company Pernod Ricard. As part of the transformation, the company launched four key digital programs (KDPs) aimed at using data and artificial intelligence to automate processes and support data-driven decision-making. The case focuses primarily on two of these: D-STAR, a sales recommendation system, and Matrix, a tool that optimized the allocation of advertising spend across brands. The company's future direction with the KDPs depended on addressing resistance, providing effective training and support, aligning with strategic goals, and overcoming logistical and data-related hurdles. The company needed to find a way to expand the KDPs into new markets while reinforcing adoption where they had already been launched, and the decisions made would shape the path forward.
This note provides an overview of causal inference for an introductory data science course. First, the note discusses observational studies and confounding variables. Next, the note describes how randomized experiments can be used to account for the effect of confounding variables. Then it walks through the steps of designing an experiment, including a discussion of how to calculate the power of a test.
This note provides an overview of linear regression for an introductory data science course. It begins with a discussion of correlation and explains why correlation does not necessarily imply causation. The note then describes the method of least squares and how to interpret the R-squared and model coefficient values of a simple linear regression model. Next, the note describes how the interpretation of a model coefficient changes when there are multiple independent variables in the model. Finally, the note explains how to interpret the coefficients on dummy variables in a regression model. The appendix includes R code for implementing all of these topics.
This note provides an introduction to machine learning for an introductory data science course. The note begins with a description of supervised, unsupervised, and reinforcement learning. Then, the note provides a brief explanation of the difference between traditional statistical modeling and machine learning. Next, the note covers two models used for classification: logistic regression and decision trees. After introducing these two models, the note explains how training, validation, and holdout sets (and k-fold cross-validation) are used to tune and evaluate different models. Finally, the note concludes with a discussion of different performance metrics (ROC curves, confusion matrices, log loss) that are used to evaluate classification models.
This note provides an overview of statistical inference for an introductory data science course. First, the note discusses samples and populations. Next, the note describes how to calculate confidence intervals for means and proportions. Then it walks through the logic of hypothesis testing and the interpretation of p-values (in the context of two-sample hypothesis testing for means and proportions). The appendix of the note contains R code for all of these topics.
This module note provides an overview of exploratory data analysis for an introductory data science course. It begins by defining the term "data" and then describes the different types of data that companies work with (structured vs. unstructured, categorical vs. numeric, etc.). Next, the note describes the basic summary statistics that firms use to track key business outcomes. Finally, the note provides an overview of different visualizations. An appendix includes the R code for creating all of the figures and visualizations shown in the note.
The case explores the development and early growth of a data science team at the Golden State Warriors, an NBA team based in San Francisco. The case begins by explaining the initial rationale for investing in data science, then covers a debate on the appropriate team structure, the initial hires, and the question of which projects the team should prioritize. The rest of the case describes the first major project the team worked on: a model to predict when customers would purchase tickets. Along the way, the team had to make a number of important decisions about what outcome to model and what model to use. Unfortunately, just as the model was completed, the team faced a setback when the NBA season was suspended due to COVID-19, pausing the project for a year. The case then picks up before the start of the 2021-22 season, with the team needing to determine whether to use the season to run an experiment and evaluate the model on real customers or to fully launch the model in the hope that it increases ticket sales. The case comes with supplementary data and code (HBS Supplement no. 624-712) that allow students to perform the analysis described in the case.
Describes a marketing director about to launch a new process for demand forecasting. Provides data that allow students to do a multivariable regression analysis. A rewritten version of an earlier case.
Orchadio, a direct-to-consumer grocery business, needs to conduct its first two A/B tests: one to evaluate the effectiveness and functioning of its newly redesigned website, and one to market-test four versions of a new banner for the website. To do so, it will rely on a technology management platform designed by Split Software, whose feature flags allow Orchadio engineers to (1) turn specific software features on and off and (2) limit access to those features to specific groups of website visitors. These capabilities in turn enable A/B feature testing. Split also offers data analytics that allow Orchadio to assess how its tests affect a wide range of Orchadio's business, software, and operating metrics. Orchadio managers now need to decide how to design their experiments for maximum impact, including whether and how to sequence them.
Goldman Sachs runs an annual internship for over 3,000 participants, spread across dozens of the firm's global offices. In 2020, the team brought all its resources to bear to transform the internship program into a fully virtual format in just a few short weeks. The new all-virtual internship faced challenges, but also benefited from unexpected opportunities. As the program ended, the team reflected on what worked, what they would change, and what the future of the internship program at Goldman Sachs could look like in 2021 and beyond.
Over the last decade, experimentation has become integral to the research and development processes of technology companies, including Yelp, for understanding customer preferences and mitigating innovation risks. The case describes Yelp's journey with experimentation, from running a few experiments across various teams to building a centralized experimentation platform that standardized and improved the experimentation process. Concurrently, the case describes a pivotal experiment to evaluate the impact of geographically constrained adverts on user experience and revenue. The results of the experiment are not included in the case but are part of supplementary data (courseware no. 621-703); instructors can either provide the data to the students or give the results in a table as part of the assignment questions. Based on the results, the protagonist must determine an appropriate course of action, carefully managing the various trade-offs.