Chapter 10 Random Forest and Gradient Boosting Models: An Application in Finance
Description
The financial industry was among the first to begin its digital transformation. The need to manage internal data, to contextualize it with external data, and to consider new data sources has accelerated this transformation. In this session, we discuss ensemble models, a technique that combines many models into a single classifier that typically outperforms its individual members. We will use a case study of a U.S. financial firm that aims to build predictive models to make its decision making more efficient. We will also begin to address the ethical issues and statistical biases that arise in data collection and in the algorithmic methods themselves.
Concepts discussed:
1 digital transformation
2 new data sources
3 random forest and gradient boosting methods
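
The idea that combining many models can yield a better classifier can be illustrated with an idealized calculation. The sketch below assumes a set of classifiers that err independently and are each correct with the same probability, and it computes how often their majority vote is correct. The function name `majority_vote_accuracy` and the accuracy value 0.6 are illustrative assumptions, not taken from the case study. Trees in a real random forest are correlated, so the gain in practice is smaller than this idealization suggests, and gradient boosting, which builds its models sequentially, is not covered by this calculation.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent classifiers is correct.

    Assumes every classifier is correct with probability p_correct and that
    errors are independent across classifiers (an idealization).
    """
    if n_models < 1 or n_models % 2 == 0:
        raise ValueError("use a positive, odd number of models to avoid ties")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")

    # Smallest number of correct votes that forms a strict majority.
    k_min = n_models // 2 + 1

    # Sum the binomial probabilities of k_min, ..., n_models correct votes.
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )


if __name__ == "__main__":
    # Hypothetical individual accuracy of 0.6 for every model.
    for n in (1, 5, 25, 101):
        print(f"{n:>3} models: {majority_vote_accuracy(n, 0.6):.3f}")
```

Under these assumptions, the accuracy of the vote rises with the number of models whenever each model is better than chance, and stays at 0.5 when each model is no better than a coin flip.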
Pre-Session Activities/Resources
- Datar, Srikant M., and Caitlin N. Bowler. 2018a. “LendingClub (A): Data Analytic Thinking (Abridged).” Harvard Business Publishing Education. https://hbsp.harvard.edu/product/119020-PDF-ENG?itemFindingMethod=Other.
Session Activities/Resources
- Datar, Srikant M., and Caitlin N. Bowler. 2018b. “LendingClub (B): Decision Trees & Random Forests.” Harvard Business Publishing Education. https://hbsp.harvard.edu/product/119021-PDF-ENG?itemFindingMethod=Other.
- Datar, Srikant M., and Caitlin N. Bowler. 2018c. “LendingClub (C): Gradient Boosting & Payoff Matrix.” Harvard Business Publishing Education. https://hbsp.harvard.edu/product/119022-PDF-ENG?itemFindingMethod=Other.
Post-Session Activities/Resources
- Sanders, Nathan. 2019. “A Balanced Perspective on Prediction and Inference for Data Science in Industry.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.644ef4a4.
General Resources