From Exploring to Building Accurate Interpretable Machine Learning Models for Decision-Making: Think Simple, not Complex

By Health Canada

Despite an increasing number of examples where both simple and complex prediction models have been used for decision-making, accurate prediction remains pertinent for both. The added element is that the more complex a model, the less uptake it may see from novice users who may not be familiar with machine learning (ML). Complex prediction models can arise from attempts to maximize predictive accuracy without regard to how difficult it would be for an individual to anticipate the predictions from the input data. However, even with a method considered as simple as linear regression, complexity increases as more variables and their interactions are added. At the other extreme, when numerous non-linear functions are used for prediction, as with neural nets, the results can be too complex to understand. Such models are usually called black box prediction models. Accurate interpretable models also vary: from accurate decision trees and rule lists so concise they can be fully described in a sentence or two for tabular data, through modern generalized additive models (e.g., for more challenging medical records), to disentangled neural nets for unstructured data such as pixels. A recent notable addition is the use of Bayesian soft complexity-constrained unsupervised learning of deep layers of latent structure, which is then used to construct a concise rule list with high accuracy (Gu and Dunson, 2021).

An early example, over 20 years ago, of a simple method predicting as accurately as more complex models is the 1998 study by Ennis et al., which applied various ML methods to the GUSTO-I database; none of the methods could outperform a relatively simple logistic regression model. More recent accounts of complex methods being used even when simple ones could suffice are noted in the 2019 article by Rudin and Radin. The often-suggested remedy for this unmanageable complexity is simply to find ways to explain these black box models; however, those explanations can miss key information. Rather than being directly connected with what is going on in the black box model, they amount to "stories" for getting concordant predictions. Given that concordance is not perfect, they can produce very misleading outcomes in many situations.

Perhaps what is needed is a wider awareness of the increasing number of techniques to build simple interpretable models from scratch that achieve high accuracy. The techniques are not simple refinements of linear or logistic regression (such as rounding their coefficients to integers, which loses accuracy), but involve discernment of appropriate domain-based constraints and newer methods of constrained optimization. The result is a spectrum of ease of interpretability of prediction across different applications.

Understanding where and when to be simple!

While we need to accept what we cannot understand, we should never overlook the advantages of what we can understand. For example, we may never fully understand the physical world, nor how people think, interact, create, or decide. In ML, Geoffrey Hinton's 2018 YouTube video drew attention to the fact that people are unable to explain exactly how they decide whether something is the digit 2 or not. This fact was originally pointed out some time ago by Herbert Simon and has not been seriously disputed (Ericsson and Simon, 1980). However, prediction models are just abstractions, and we can understand the abstractions created to represent a reality that is complex and often beyond our direct access. So not being able to understand people is not a valid reason to dismiss the desire to understand prediction models.

In essence, abstractions are diagrams or symbols that can be manipulated, in error-free ways, to discern their implications. Usually referred to as models or assumptions, they are deductive and hence can be understood in and of themselves for simply what they imply. That is, until they become too complex. For instance, triangles on the plane are understood by most, while triangles on the sphere are understood by fewer. Reality may always be too complex, but models that adequately represent reality for some purpose need not be. Triangles on the plane serve navigation over short distances; triangles on the sphere, over long distances. Emphatically, it is the abstract model that is understood, not necessarily the reality it attempts to represent.

However, for some reason, a persistent misconception has arisen in ML that models usually need to be complex to predict accurately. To build upon the previous examples, there remain some application areas where simple models have yet to achieve accuracy comparable to black box models. In many others, however, simple models continue to predict as accurately as any state-of-the-art black box model, and thus the question, as noted in the 2019 article by Rudin and Radin, is: "Why Are We Using Black Box Models in AI When We Don't Need To?"

In application areas where simple models can be as accurate, not using them has unnecessarily led to recommendations that can affect society, health, freedom, and safety. The often-discussed hypothetical choice between an accurate machine-learning-based robotic surgeon and a less-accurate human surgeon is moot once someone builds an interpretable robotic surgeon that is as accurate as any other robot. Again, it is the prediction model that is understandable, not necessarily the prediction task itself.

Simple and interpretable models?

The number of application areas where accurate, simple prediction models can be built to be understood has been increasing over time. Arguably, these models should be labeled "interpretable" ML, as they are designed from scratch to be interpretable. They are purposely constrained so that their reasoning processes are more understandable to most if not all human users. This not only makes the connection between input data and predictions almost obvious, but also makes the models easier to troubleshoot and modify as needed. Interpretability is in the eye of the domain, and interpretability constraints can include the following:

  • Sparsity of the model
  • Monotonicity with respect to a variable
  • Decomposability into sub-models
  • An ability to perform case-based reasoning
  • Disentanglement of certain types of information within the model's reasoning process
  • Generative constraints (e.g. biological processes)
  • Preferences among the choice of variables
  • Any other type of constraint that is relevant to the domain.
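As a minimal sketch (the function, variable names, and thresholds here are all hypothetical illustrations, not a method from this article), such constraints can be expressed as explicit checks applied to candidate models during model search, for example sparsity and monotonicity for a simple linear model:

```python
def satisfies_constraints(coefs, max_nonzero=3, monotone_up=()):
    """Check a linear model's coefficients against two example
    interpretability constraints: sparsity (few nonzero weights) and
    monotonicity (selected variables must have non-negative weights,
    so the prediction never decreases as those variables increase)."""
    nonzero = [name for name, w in coefs.items() if w != 0]
    if len(nonzero) > max_nonzero:  # sparsity constraint violated
        return False
    return all(coefs.get(v, 0) >= 0 for v in monotone_up)

# A sparse hypothetical model where "age" is required to act monotonically:
model = {"age": 2, "prior_events": 1, "bmi": 0}
print(satisfies_constraints(model, max_nonzero=2, monotone_up=["age"]))  # True
```

Constrained-optimization methods build such requirements into the fitting procedure itself rather than filtering afterwards, but the checks make clear what is being demanded of the model.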

Some notable examples of interpretable models include sparse logical models (such as decision trees, decision lists, and decision sets) and scoring systems, which are linear classification models that require users to add, subtract, and multiply only a few small numbers to make a prediction. These models can be much easier to understand than multiple regression or logistic regression, which can be difficult to interpret. The intuitive simplification of these regression models, by restricting the number of predictors and rounding the coefficients, does not provide optimal accuracy; it is just a post hoc adjustment. It is better to build in interpretability from the very start.
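To illustrate the flavour of a scoring system (a made-up sketch with invented conditions and points, not a published clinical score): the user adds a few small integers and compares the total to a threshold, so the prediction can be computed by hand:

```python
# Hypothetical scoring system: each condition contributes small integer
# points; the prediction is positive when the total crosses a threshold.
RULES = [
    ("age >= 60",          lambda p: p["age"] >= 60,          2),
    ("prior event",        lambda p: p["prior_event"],        3),
    ("systolic bp >= 140", lambda p: p["systolic_bp"] >= 140, 1),
]
THRESHOLD = 4

def predict(patient):
    score = sum(points for _, cond, points in RULES if cond(patient))
    return score, score >= THRESHOLD

score, high_risk = predict({"age": 67, "prior_event": True, "systolic_bp": 120})
print(score, high_risk)  # 5 True
```

The point of the form is that the connection between inputs and prediction is visible at a glance: 2 + 3 = 5, which meets the threshold of 4.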

There is increasing understanding based on considering the numerous possible prediction models in a given prediction task. The not-too-unusual observation of simple models performing well for tabular data (a collection of variables, each of which has meaning on its own) was noted over 20 years ago and labeled the "Rashomon effect" (Breiman, 2001). Breiman posited the possibility of a large Rashomon set in many applications; that is, a multitude of models with approximately the same minimum error rate. A simple check for this is to fit a number of different ML models to the same data set. If many of them are as accurate as the most accurate (within the margin of error), then many other untried models might be as well. A recent study (Semenova et al., 2019) supports running a set of different (mostly black box) ML models to determine their relative accuracy on a given data set to predict the existence of a simple, accurate, interpretable model; that is, a way to quickly identify applications where it is a good bet that an accurate interpretable prediction model can be developed.
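The simple check described above can be sketched in a few lines (the model names and accuracies below are made-up placeholders standing in for cross-validated results, and the margin is a rough stand-in for the margin of error):

```python
def rashomon_candidates(accuracies, margin):
    """Return the models whose accuracy falls within `margin` of the
    best model, a rough proxy for a large Rashomon set, which suggests
    a simple accurate model may also exist for this data set."""
    best = max(accuracies.values())
    return sorted(name for name, acc in accuracies.items()
                  if best - acc <= margin)

# Hypothetical cross-validated accuracies for several model types:
accs = {"boosted_trees": 0.84, "random_forest": 0.83,
        "neural_net": 0.84, "logistic_reg": 0.82, "svm": 0.78}
print(rashomon_candidates(accs, margin=0.025))
# ['boosted_trees', 'logistic_reg', 'neural_net', 'random_forest']
```

When many of the candidates cluster near the best accuracy, as here, the heuristic suggests it is worth investing in a constrained, interpretable model for the same task.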

What's the impact on ML from the full data science life cycle?

The trade-off between accuracy and interpretability observed with the first fixed data set in an application area may not hold over time. In fact, it is expected to change as more data accumulate, the application area becomes better understood, data collection is refined, new variables are added or defined, or the application area itself changes. Even in the first data set, the full data science process calls for critically assessing and interpreting the results and tuning the processing of the data, the loss function, the evaluation metric, or anything else that is relevant. This more effectively turns data into increasing knowledge about the prediction task, which can then be leveraged to increase both accuracy and the likelihood of generalization. Any possible trade-off between accuracy and interpretability should therefore be evaluated in the full data science process and life cycle of ML.

The full data science and life-cycle process likely differs when using interpretable models. More input is needed from domain experts to produce an interpretable model that makes sense to them. This should be seen as an advantage. For instance, it is not unusual at a given stage to find numerous equally interpretable and accurate models. To the data scientist, there may seem little to guide the choice among them. But when shown to domain experts, they may easily discern opportunities to improve constraints, as well as indications of which models are less likely to generalize well. All equally interpretable and accurate models are not equal in the eyes of domain experts.

Interpretable models are far more trustworthy in that it can be more readily discerned where, when, and in what ways they should be trusted. After all, how can one make that judgment without understanding how the model works? This is especially important in cases where the underlying distribution of data changes and it is critical to troubleshoot and modify the model without delay, as noted in the 2020 article by Hamamoto et al. It is arguably much more difficult to remain successful in the full ML life cycle with black box models than with interpretable models. Even for applications where interpretable models are not currently accurate enough, they can be used as a tool to help debug black box models.

Misunderstanding explanations

There is now a vast and confusing literature that conflates interpretability and explainability. In this brief blog, the degree of interpretability is taken simply as how easily the user can grasp the connection between the input data and what the ML model would predict. Erasmus et al. (2020) provide a more general and philosophical view. Rudin et al. (2021) avoid attempting an exhaustive definition and instead provide general guiding principles to help readers avoid common but problematic ways of thinking about interpretability. The term "explainability", on the other hand, often refers to post hoc attempts to explain a black box by using simpler 'understudy' models that predict the black box's predictions. However, as noted in the Government of Canada's (GoC's) Guideline on Service and Digital, prediction is not explanation, and when such models are proffered as explanations they can seriously mislead (GoC, 2021). This literature often assumes that one would just explain a black box without considering whether there is an interpretable model of the same accuracy, perhaps having uncritically bought into the misconception that only models too complex to understand can achieve acceptable accuracy.

The increasing awareness of the dangers of these "explanations" has led one group of researchers to investigate how misunderstanding can actually be purposefully designed in, something regulators may increasingly need to worry about (Lakkaraju and Bastani, 2019). It is also not uncommon for those who routinely do black box modeling to offer explanations of these models as an alternative, or even a reason, to forego learning about and developing interpretable models.

Keeping it simple

Interpretable ML models are simple and can be relied upon when ML tools inform decision-making. On the other hand, even interpretability is probably not needed for decisions that humans can verify or modify afterwards (e.g., suggesting options). Notwithstanding the desire for simple and accurate models, it is important to note that interpretable ML currently cannot match the accuracy of black box models in all application areas. For applications involving raw data (pixels, sound waves, etc.), black box neural networks have a current advantage over other approaches. In addition, black box models allow users to delegate responsibility for grasping the implications of adopting the model. Although a necessary trade-off between accuracy and interpretability does remain in some application areas, its ubiquity is exaggerated, and the prevalence of the trade-off may continue to decrease in the future. This has created a situation in ML where opportunities to understand models and reap the benefits are often overlooked. Therefore, the advantages of newer interpretable modelling techniques should be fully considered in any ML application, at a minimum to determine whether adequate accuracy is achievable. In the end, it may boil down to this: if simple works, why make things more complex?

Team members: Keith O'Rourke (Pest Management Regulatory Agency), Yadvinder Bhuller (Pest Management Regulatory Agency).

Keep on machine learning...

Breiman, L. (2001). Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). Statist. Sci. 16(3): 199-231. DOI: 10.1214/ss/1009213726

Ennis, M., Hinton, G., Naylor, D., Revow, M., and Tibshirani, R. (1998). A Comparison of Statistical Learning Methods on the GUSTO Database. Statist. Med. 17, 2501-2508.

Erasmus, A., Brunet, T.D.P., and Fisher, E. (2020). What is Interpretability? Philosophy & Technology.

Ericsson, K.A., and Simon, H.A. (1980). Verbal reports as data. Psychological Review, 87(3), 215–251.

Government of Canada. (2021). Guideline on Service and Digital. [Accessed: May 13, 2021].

Gu, Y., and Dunson, D.B. (2021). Identifying Interpretable Discrete Latent Structures from Discrete Data. arXiv:2101.10373 [stat.ME]

Hinton, G. (2018). Why Is a Two a Two? With Geoffrey Hinton and David Naylor [video]. [Accessed: May 13, 2021].

Hamamoto, R., Suvarna, K., Yamada, M., Kobayashi, K., Shinkai, N., Miyake, M., Takahashi, M., Jinnai, S., Shimoyama, R., Sakai, A., Takasawa, K., Bolatkan, A., Shozu, K., Dozen, A., Machino, H., Takahashi, S., Asada, K., Komatsu, M., Sese, J., and Kaneko, S. (2020). Application of Artificial Intelligence Technology in Oncology: Towards the Establishment of Precision Medicine. Cancers. 12(12), 3532.

Lakkaraju, H., and Bastani, O. (2019). "How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations. arXiv:1911.06473 [cs.AI]

Rudin, C., Chen, C., Chen, Z., Huang, H., Semenova, L., and Zhong, C. (2021). Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges. arXiv:2103.11251 [cs.LG]

Rudin, C., and Radin, J. (2019). Why Are We Using Black Box Models in AI When We Don't Need To? A Lesson From An Explainable AI Competition. Harvard Data Science Review, 1(2).

Semenova, L., Rudin, C., and Parr, R. (2019). A study in Rashomon curves and volumes: A new perspective on generalization and model simplicity in machine learning. arXiv:1908.01755 [cs.LG]
