Multivariate Datasets: Data Cleaning and Preparation, and Model Development with Python and Machine Learning
Abstract
Data cleaning, data preparation, and model development are crucial steps in data analytics. The first two steps improve data quality, which in turn supports higher accuracy, productivity, and efficiency in modelling and in obtaining results. The last step, model development, seeks to improve the accuracy of prediction, especially in predictive modelling. In this technical note, we use a sample dataset to illustrate how to work with a multivariate dataset in Python. Because this dataset has a very large number of variables, it calls for particular approaches to data cleaning, preparation, and model development, such as data normalization and dimension reduction.
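As a rough illustration of the kind of workflow the abstract describes, the sketch below cleans, normalizes, and reduces a wide dataset with NumPy. It is not taken from the note itself. The function name `clean_and_reduce`, the synthetic data, median imputation, z-score scaling, and principal component analysis (PCA) are all assumptions chosen for illustration; the note may use different techniques or libraries.

```python
import numpy as np


def clean_and_reduce(X, n_components=2):
    """Impute, standardize, and project X onto its top principal components.

    Hypothetical helper for illustration; not the note's own method.
    """
    X = np.asarray(X, dtype=float)

    # Cleaning: replace missing values (NaN) with the column median.
    medians = np.nanmedian(X, axis=0)
    X = np.where(np.isnan(X), medians, X)

    # Normalization: z-score each column; constant columns stay at zero.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - mean) / std

    # Dimension reduction: PCA via singular value decomposition.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Synthetic data (assumed): 200 rows, 50 correlated variables
# driven by 3 latent factors, with about 2% of values missing.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 50)) + 0.1 * rng.normal(size=(200, 50))
X[rng.random(X.shape) < 0.02] = np.nan

scores, explained = clean_and_reduce(X, n_components=3)
print(scores.shape)     # shape of the reduced data
print(explained.sum())  # share of variance kept by the 3 components
```

Standardizing before PCA keeps variables measured on large scales from dominating the components, which is one reason normalization and dimension reduction are often paired on datasets with many variables.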