DataPalooza 2022

Internal Use Only

Assumptions

• Model is built on historical quarterly call report data starting from 2006
• Economic data from Moody’s Analytics is quarterly
• Data from state and national community banks were included in the model training and test data
• Banks missing necessary data were excluded from the model
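
The exclusion of banks with missing data can be pictured as a complete-case filter. The sketch below is only an illustration: the field names (`total_assets`, `liquidity_ratio`), the record layout, and the sample values are all made up, and the fields the model actually requires are not listed here.

```python
# Hypothetical required fields; the real list is not given in this deck.
REQUIRED_FIELDS = ("total_assets", "liquidity_ratio")


def has_required_data(record, required=REQUIRED_FIELDS):
    """Return True when every required field is present and not None."""
    return all(record.get(field) is not None for field in required)


# Made-up bank-quarter records, for illustration only.
records = [
    {"bank_id": "A", "quarter": "2006Q1", "total_assets": 120.0, "liquidity_ratio": 0.31},
    {"bank_id": "B", "quarter": "2006Q1", "total_assets": None, "liquidity_ratio": 0.27},
    {"bank_id": "C", "quarter": "2006Q1", "total_assets": 95.5},  # field absent
]

# Keep only records with all required fields populated.
kept = [r for r in records if has_required_data(r)]
```

Whether exclusion should apply to a single bank-quarter or to the bank's whole history is a design choice the slide does not specify; the filter above works per record.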

Model Development

• Identified input variables to help predict high risk banks using a random forest algorithm
• Examined variables that contributed the most by their relationship to the risk level
• Narrowed the list of input variables from 280+ to around 40
• With the more selective list of variables, likelihood of high risk is modeled using logistic regression
• Tested additional variables for significance based on regulator input
• A model using the Liquidity Ratio as the target variable was developed but the predictive accuracy was not satisfactory
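
The logistic regression step can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the model described above: the two synthetic inputs (`signal`, `noise`), the generated data, and the gradient-descent fitting routine are all assumptions made for the example. The random forest screening step is not shown.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """Modeled probability of the high-risk class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit intercept and coefficients by batch gradient descent on log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic data (assumption): one informative input and one pure-noise input.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    signal = rng.gauss(0, 1)
    noise = rng.gauss(0, 1)
    p_true = sigmoid(-1.0 + 2.0 * signal)  # high risk depends on signal only
    X.append([signal, noise])
    y.append(1 if rng.random() < p_true else 0)

weights = fit_logistic(X, y)  # [intercept, coef_signal, coef_noise]
print("fitted weights:", [round(v, 2) for v in weights])
```

In this toy setup the coefficient on the informative input should dominate the one on the noise input, which is the kind of signal a significance test on additional variables would look for. In practice a statistics package that reports standard errors and p-values would be a more natural choice than hand-written gradient descent.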
