St. Camillus is a fictional non-profit hospital in rural Maine facing a serious budget deficit. As Director of Marketing, Victoria Stern is building a team to modernize the hospital's fundraising efforts. An interview with a promising candidate who is also a digital native forces her to confront shifting attitudes toward issues of data and privacy across generations against the backdrop of continued cybercrime and data breaches within the healthcare industry. Students grapple with the tension between data as "useful" or private within an organization. In this context they explore the ethical issues they will face around data security and data governance in their roles as managers.
This case builds directly on the case Predicting Purchasing Behavior at PriceMart (A), in which VP of Marketing, Jill Wehunt, and analyst Mark Morse build a logistic regression model to predict whether a customer household is expecting a baby. In this case, Wehunt and Morse are concerned about the logistic regression model overfitting to the training data, so they explore two methods for reducing the sensitivity of the model to the data by regularizing the coefficients of the logistic regression. Wehunt and Morse then compare the models and select the model most effective at correctly classifying households as expecting. Students explore the relationship between the model's confusion matrix, which organizes the model's correct and incorrect classifications, the cutoff point on the ROC curve that balances true positives and true negatives, and the payoff matrix Wehunt and Morse construct. Students can then follow the link directly from their model to their marketing strategy. Technical topics covered: Ridge logistic regression (or L2 regularization) as a modelling technique; Lasso logistic regression (or L1 regularization) as a modelling technique; Comparing models, thinking about coefficients, and selecting a model for deployment; Evaluating model output; ROC curve, cutoff point, confusion matrix; payoff matrix as a framework for utilizing the model to carry out marketing strategy.
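As a rough illustration of how a confusion matrix, a cutoff point, and a payoff matrix relate, the sketch below counts classifications at a given cutoff and weights them by a payoff per outcome. It is not taken from the case: the scores, labels, payoff values, and function names are all hypothetical.

```python
# Illustrative sketch only: the scores, labels, and payoffs are invented.

def confusion_matrix(scores, labels, cutoff):
    """Count outcomes when predicting positive for every score >= cutoff."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if predicted_positive and label == 1:
            counts["tp"] += 1
        elif predicted_positive and label == 0:
            counts["fp"] += 1
        elif not predicted_positive and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def total_payoff(counts, payoff):
    """Weight each confusion-matrix cell by its payoff and sum the result."""
    return sum(counts[cell] * payoff[cell] for cell in counts)


def best_cutoff(scores, labels, payoff, cutoffs):
    """Return the candidate cutoff with the highest total payoff."""
    return max(
        cutoffs,
        key=lambda c: total_payoff(confusion_matrix(scores, labels, c), payoff),
    )


# Hypothetical model scores and true labels (1 = household is expecting).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

# Hypothetical payoffs: a correct offer earns 20, a wasted offer costs 5.
payoff = {"tp": 20, "fp": -5, "tn": 0, "fn": 0}

chosen = best_cutoff(scores, labels, payoff, cutoffs=[0.2, 0.5, 0.7])
print(chosen, confusion_matrix(scores, labels, chosen))
```

Under these invented payoffs a false positive is cheap relative to a true positive, so a low cutoff wins; different payoffs would shift the preferred cutoff.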
This case follows VP of Marketing, Jill Wehunt, and analyst Mark Morse as they tackle a predictive analytics project to increase sales in the Mom & Baby unit of a nationally recognized retailer, PriceMart. Wehunt observed that in the midst of the chaos that surrounded a new baby, parents' shopping habits became quickly ingrained. She hypothesized that if she could get households expecting a new baby to make PriceMart a part of their routines before becoming parents, she might keep them as customers for the next several years, winning significant additional revenue. Technical topics covered: Collecting data and preparing a dataset; constructing training, validation, and holdout sets; cross-validation; Linear regression as a modelling technique; statistical tests; Logistic regression as a modelling technique for estimating predictions between 0 and 1; maximum likelihood estimation; log likelihood; Comparing model outputs.
LendingClub was founded in 2006 as an alternative, peer-to-peer lending model to connect individual borrowers to individual investor-lenders through an online platform. Since 2014 the company has worked with institutional investors at scale. While the company assigns grades and sub-grades to each application using its own risk evaluation model, it also makes detailed data on each loan application available to both kinds of investors for their own analyses. The case follows MBA graduate Emily Figel as she researches LendingClub as a potential investment vehicle for the small wealth management firm she will join in the fall. Using LendingClub's historical data, she learns the fundamentals of predictive analytics to see whether she can build models to predict whether a borrower will repay or, ultimately, default on the obligation. This first case (A) presents students with relevant, detailed data about how the LendingClub model works. This includes LendingClub's business model, the grading of loans, and the unique opportunities and risks. It also follows Figel as she dives into the data to use it to build a model. In the B and C cases, Figel explores several specific techniques for training models. Technical topics include: understanding the data, data preparation, balanced and unbalanced data sets, constructing training-validation-holdout sets, cross-validation, predictions and target leakage.
This case builds directly on Chateau Winery (A). In this case Bill Booth, marketing manager of a regional wine distributor, shifts to supervised learning techniques to try to predict which deals he should offer to customers based on the purchasing behavior of those customers closest to them. Topics include: Supervised learning; collaborative filtering; K-nearest neighbor as a modeling technique; collaborative filtering with cosine similarity of customers; collaborative filtering with cosine similarity of products; comparison of different prediction models.
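A minimal sketch of customer-based collaborative filtering with cosine similarity, assuming each customer is represented by a 0/1 vector over deals. The customer names, purchase vectors, and the `recommend` helper are hypothetical and are not drawn from the case data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a customer with no purchases is similar to no one
    return dot / (norm_a * norm_b)


def recommend(purchases, customer, k=2):
    """Rank deals the customer has not bought by how many of the
    k most similar customers (the nearest neighbors) bought them."""
    target = purchases[customer]
    neighbors = sorted(
        (name for name in purchases if name != customer),
        key=lambda name: cosine_similarity(target, purchases[name]),
        reverse=True,
    )[:k]
    if not neighbors:
        return []
    scores = {
        deal: sum(purchases[n][deal] for n in neighbors) / len(neighbors)
        for deal in range(len(target))
        if target[deal] == 0
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical purchase history: one row per customer, one column per deal.
purchases = {
    "ann": [1, 1, 0, 0],
    "bob": [1, 1, 1, 0],
    "cam": [0, 0, 1, 1],
    "dee": [1, 0, 1, 0],
}

print(recommend(purchases, "ann", k=2))
```

Transposing the matrix so that each row is a deal gives the product-based variant: similarity is then computed between deals rather than between customers.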
This case follows Bill Booth, marketing manager of a regional wine distributor, as he applies unsupervised learning on data about his customers' purchases to better understand their preferences. Specifically, he uses the K-means clustering technique to identify groups of customers who have purchased any number of 32 specific "deals" Booth offered over the year, differentiated by the wine varietal as well as its country of origin and a minimum number of bottles to purchase. Insights from this analysis may help him understand themes across the deals that can inform construction of new deals in the future. Topics include: Unsupervised learning; similarity and proximity; K-means clustering, with measures of Euclidean distance and cosine similarity; Gaussian mixture models; interpreting clusters.
This case builds directly on the cases LendingClub (A) and (B). In this case students follow Emily Figel as she builds an even more sophisticated model using the gradient boosted tree method to predict, with some probability, whether a borrower would repay or default on the loan. Having now built three models, Figel compares them to determine which model is most effective at classifying borrowers correctly, then uses that model to determine how to invest in a portfolio of loans. Students explore the relationship between her model's confusion matrix, which organizes the model's correct and incorrect classifications, the cutoff point on the ROC curve that balances true positives and true negatives, and the payoff matrix Figel constructs. Students can then follow the link directly from the model Figel builds to her specific investment decisions. Technical topics include: (1) Gradient boosted trees as a modelling technique; hyperparameters and learning rate; model validation; and (2) Evaluating model output; ROC curve, cutoff point, confusion matrix; payoff matrix as a framework for utilizing the model to compare investment opportunities.
This case builds directly on the case LendingClub (A). In this case students follow Emily Figel as she builds two tree-based models using historical LendingClub data to predict, with some probability, whether a borrower will repay or default on the loan. Technical topics include: (1) Decision trees as a modelling technique, overfitting and inductive bias, model validation; (2) Random forest as an ensemble-style modelling technique, bootstrapping, random feature selection; and (3) Log loss as a metric for evaluating and comparing models, feature impact.
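For readers unfamiliar with log loss, the sketch below computes it for two sets of predicted probabilities. The labels and probabilities are invented for illustration and do not come from LendingClub data; `tree_probs` and `forest_probs` are hypothetical stand-ins for the outputs of two models.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Average negative log likelihood of binary labels (0 or 1)
    under the predicted probabilities of the positive class."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


# Hypothetical outcomes (1 = repaid) and predicted repayment probabilities.
labels = [1, 0, 1, 1]
tree_probs = [0.9, 0.2, 0.6, 0.7]
forest_probs = [0.8, 0.1, 0.7, 0.9]

print("tree:  ", round(log_loss(labels, tree_probs), 3))
print("forest:", round(log_loss(labels, forest_probs), 3))
```

Lower is better: the metric rewards confident predictions that turn out right and penalizes confident predictions that turn out wrong, which is why it suits comparing models that output probabilities.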
Insigne Health is a fictional for-profit, integrated health insurer/health care provider whose leadership believes that by shifting members' focus from "sickness" to "well-being" it could increase the overall health of its insured population and decrease the resources it spent each year on delivering care. The case puts students in the role of design researcher charged with understanding the member segment about which Insigne Health leadership is most concerned: The "silent middle." This cohort represents 70% of membership. They are "neither sick nor well," and may, without changes in a range of behaviors, be quietly developing conditions that will evolve into costly chronic diseases. From interviews included in the case, students uncover insights into member behavior and, based on these insights, generate and develop concepts to help members change behaviors and lead healthier lives.
Paritosh Desai joined Target.com in 2013 as VP of Business Intelligence, Analytics & Testing to explore how the retailer could use its relatively small but thriving e-commerce arm to drive sales and win customers. The case explores the technological and organizational challenges Desai faced and the trade-offs he considered in his four-year journey to develop the larger retail business into a data science organization.
This case was written for the EC course "Managing with Data Science." The course provides MBA students with no programming experience an introduction to the field of data science and its applications in business. Students learn to (1) carefully articulate the business ask, (2) reason carefully from the ask, through metrics and models, and outputs; and (3) evaluate outputs from models to (4) develop a plan for action. In this case students explore the challenges of using sentiment analysis to monitor and understand public perception around brands. Technical topics include building a filtering classifier using naive Bayes and sentiment analysis.
This case was written for the EC course "Managing with Data Science." The course provides MBA students with no programming experience an introduction to the field of data science and its applications in business. Students learn to (1) carefully articulate the business ask, (2) reason carefully from the ask, through metrics and models, and outputs; and (3) evaluate outputs from models to (4) develop a plan for action. The case features the work of LP Maurice (HBS '08) as he decides to launch an online business to unify the scheduling data of a fragmented bus industry, only to find that the data he needs simply does not exist. The case follows him and his team as they take on challenges as the start-up evolves and develops. Topics include acquiring data, web-based prototyping to understand customer preferences, and customer acquisition and retention metrics (SEO & SEM).
Bhagwan Mahaveer Viklang Sahayata Samiti (BMVSS) is an Indian not-for-profit organization engaged in assisting differently-abled persons by providing them with the legendary low-cost prosthesis, the Jaipur Foot, and other mobility-assisting devices, free of cost. Known for its patient-centric culture, its focus on innovation, and for developing the $20 Stanford-Jaipur knee, BMVSS has assisted over a million people in its lifetime of 44 years. As the founder, Mr. D.R. Mehta, thinks about the financial sustainability of BMVSS, he must devise a strategy that will sustain its human impact well into the future.