Session preview: When ML doesn't work as expected, what went wrong and how can you recover?

Given the intrinsically complex and dynamic nature of machine learning (ML), the possibility of failure should not come as a surprise.

There are many reasons why this can happen. One is bias in the training data or in the methodology (e.g. sampling, data preparation). Another is that the ultimate scope of the ML model is not well defined or transparent. Further issues are linked to ML techniques that cannot tell us when the input information is ambiguous, or that cannot effectively learn from the data. Finally, ML models use a large number of hyperparameters (e.g. how many trees to include in an ensemble method such as a random forest). These hyperparameters are set by the developer and cannot be derived from the data.
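To make the hyperparameter point concrete, here is a minimal sketch assuming scikit-learn as the modelling library; the synthetic dataset, parameter names, and values are illustrative, not taken from any particular firm's model. The number of trees and the tree depth are choices made by the developer, and changing them changes the model's behaviour even when the data is identical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for the training set.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# n_estimators (number of trees) and max_depth are hyperparameters:
# the developer chooses them, often via a search; they are not learned
# from the data the way the split rules inside each tree are.
for n_trees in (10, 100, 500):
    model = RandomForestClassifier(n_estimators=n_trees, max_depth=5, random_state=0)
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"n_estimators={n_trees}: mean CV accuracy = {score:.3f}")
```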

Of course, perfection always starts with mistakes. So how can we make ML better? The natural starting point is the data.

First of all, it is important that the data is accurate, complete, and sufficient to extract statistically significant insights. Data inputs must be interpretable, consistent with the firm's internal policies, and supported by a business rationale. In addition, we need a robust approach to pre-processing the data so that the learning process is not corrupted by bad inputs.
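One simple way to operationalise these checks is to run a quality report before any training happens. The sketch below is a minimal illustration using pandas; the function name, the specific checks, and the toy data are assumptions for illustration, not a complete validation framework.

```python
import pandas as pd

def basic_quality_checks(df: pd.DataFrame, required_cols: list[str]) -> dict:
    """Simple pre-training checks: schema, completeness, duplicates, size."""
    missing = [c for c in required_cols if c not in df.columns]
    present = [c for c in required_cols if c in df.columns]
    return {
        "missing_columns": missing,                                    # schema completeness
        "null_fraction": df[present].isna().mean().round(3).to_dict(), # accuracy / completeness
        "duplicate_rows": int(df.duplicated().sum()),                  # preparation hygiene
        "n_rows": len(df),                                             # sufficiency for significance
    }

# Illustrative use: a tiny frame with a missing value and an absent column.
df = pd.DataFrame({"income": [50_000, None, 42_000], "age": [31, 45, 45]})
print(basic_quality_checks(df, ["income", "age", "region"]))
```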

Another important point is calibration. As we know, this is a crucial part of traditional models, and it is even more important for ML given the number of parameters, the volume of data, and the frequency with which they are updated. Here we can establish specific controls to assess whether the calibration is appropriate, and develop a monitoring framework with thresholds and triggers that inform us whether the model is working as expected.

Of course, the above requires some changes in the way we review model risk for ML. First, we need to revise the model risk policy to reflect the ML features discussed above. Secondly, validators must be sure they are equipped with the right tools to deal with the big data and computational complexity that ML exploits.
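As one possible shape for such a monitoring framework, the sketch below computes a population stability index (PSI), a commonly used drift metric, and maps it to thresholds and triggers. The choice of metric and the cut-off values are assumptions for illustration, not prescriptions from this session or from any regulator.

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between the calibration-time distribution and live data."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # Clip both samples into the reference range so every value falls in a bin.
    e_pct = np.histogram(np.clip(expected, edges[0], edges[-1]), edges)[0] / len(expected)
    a_pct = np.histogram(np.clip(actual, edges[0], edges[-1]), edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # avoid log(0) in sparse bins
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 10_000)  # model scores at calibration time
live = rng.normal(0.3, 1.1, 10_000)       # model scores in production

psi = population_stability_index(reference, live)
# Thresholds and triggers below are illustrative assumptions.
status = "stable" if psi < 0.10 else "investigate" if psi < 0.25 else "trigger recalibration review"
print(f"PSI = {psi:.3f} -> {status}")
```

The point of the example is the structure, not the specific metric: the monitoring framework fixes a reference distribution at calibration time, measures deviation on live data at a set frequency, and maps the result to pre-agreed actions.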