61 D. P. Green and H. L. Kern, “Modeling heterogeneous treatment effects in large‐scale experiments using Bayesian additive regression trees,” in Proc. Annu. Summer Meeting Soc. Political Methodol., 2010, pp. 1–40.
62 Chipman, H.A., George, E.I., and McCulloch, R.E. (2010). BART: Bayesian additive regression trees. Ann. Appl. Statist. 4 (1): 266–298.
63 Elith, J., Leathwick, J., and Hastie, T. (2008). A working guide to boosted regression trees. J. Anim. Ecol. 77 (4): 802–813.
64 S. H. Welling, H. H. F. Refsgaard, P. B. Brockhoff, and L. H. Clemmensen. (2016). “Forest floor visualizations of random forests.” [Online]. Available: https://arxiv.org/abs/1605.09196
65 Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. (2015). Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation. J. Comput. Graph. Stat. 24 (1): 44–65. https://doi.org/10.1080/10618600.2014.907095.
66 G. Casalicchio, C. Molnar, and B. Bischl. (2018). “Visualizing the feature importance for black box models.” [Online]. Available: https://arxiv.org/abs/1804.06620
67 U. Johansson, R. König, and I. Niklasson, “The truth is in there—Rule extraction from opaque models using genetic programming,” in Proc. FLAIRS Conf., 2004, pp. 658–663.
68 M. H. Aung, P. Lisboa, T. Etchells, et al., “Comparing analytical decision support models through Boolean rule extraction: A case study of ovarian tumour malignancy,” in Proc. Int. Symp. Neural Netw. Berlin, Germany: Springer, 2007, pp. 1177–1186.
69 T. Hailesilassie. (2017). “Rule extraction algorithm for deep neural networks: A review.” [Online]. Available: https://arxiv.org/abs/1610.05267
70 Andrews, R., Diederich, J., and Tickle, A.B. (1995). Survey and critique of techniques for extracting rules from trained artificial neural networks. Knowl.‐Based Syst. 8 (6): 373–389.
71 GopiKrishna, T. (2014). Evaluation of rule extraction algorithms. Int. J. Data Mining Knowl. Manage. Process 4 (3): 9–19.
72 Etchells, T.A. and Lisboa, P.J.G. (Mar. 2006). Orthogonal search‐based rule extraction (OSRE) for trained neural networks: a practical and efficient approach. IEEE Trans. Neural Netw. 17 (2): 374–384.
73 Barakat, N. and Diederich, J. (2005). Eclectic rule‐extraction from support vector machines. Int. J. Comput. Intell. 2 (1): 59–62.
74 P. Sadowski, J. Collado, D. Whiteson, and P. Baldi, “Deep learning, dark knowledge, and dark matter,” in Proc. NIPS Workshop High‐Energy Phys. Mach. Learn. (PMLR), vol. 42, 2015, pp. 81–87.
75 G. Hinton, O. Vinyals, and J. Dean. (2015). “Distilling the knowledge in a neural network.” [Online]. Available: arXiv:1503.02531v1 [stat.ML]
76 Z. Che, S. Purushotham, R. Khemani, and Y. Liu. (2015). “Distilling knowledge from deep networks with applications to healthcare domain.” [Online]. Available: arXiv:1512.03542v1 [stat.ML]
77 K. Xu, D. H. Park, D. H. Yi, and C. Sutton. (2018). “Interpreting deep classifier by visual distillation of dark knowledge.” [Online]. Available: https://arxiv.org/abs/1803.04042
78 S. Tan, “Interpretable approaches to detect bias in black‐box models,” in Proc. AAAI/ACM Conf. AI Ethics Soc., 2017, pp. 1–2.
79 S. Tan, R. Caruana, G. Hooker, and Y. Lou. (2018). “Auditing blackbox models using transparent model distillation with side information.” [Online]. Available: arXiv:1710.06169v4 [stat.ML]
80 S. Tan, R. Caruana, G. Hooker, and A. Gordo. (2018). “Transparent model distillation.” [Online]. Available: https://arxiv.org/abs/1801.08640
81 Y. Zhang and B. Wallace. (2016). “A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification.” [Online]. Available: https://arxiv.org/abs/1510.03820
82 Cortez, P. and Embrechts, M.J. (2013). Using sensitivity analysis and visualization techniques to open black box data mining models. Inform. Sci. 225: 1–17.
83 P. Cortez and M. J. Embrechts, “Opening black box data mining models using sensitivity analysis,” in Proc. IEEE Symp. Comput. Intell. Data Mining (CIDM), Apr. 2011, pp. 341–348.
84 Bach, S., Binder, A., Montavon, G. et al. (2015). On pixel‐wise explanations for non‐linear classifier decisions by layer‐wise relevance propagation. PLoS One 10 (7): e0130140.
85 A. Fisher, C. Rudin, and F. Dominici. (2018). “Model class reliance: Variable importance measures for any machine learning model class, from the ‘Rashomon’ perspective.” [Online]. Available: https://arxiv.org/abs/1801.01489
86 Bien, J. and Tibshirani, R. (2011). Prototype selection for interpretable classification. Ann. Appl. Statist. 5 (4): 2403–2424.
87 B. Kim, C. Rudin, and J. A. Shah, “The Bayesian case model: A generative approach for case‐based reasoning and prototype classification,” in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 1952–1960.
88 K. S. Gurumoorthy, A. Dhurandhar, and G. Cecchi. (2017). “ProtoDash: Fast interpretable prototype selection.” [Online]. Available: https://arxiv.org/abs/1707.01212
89 B. Kim, R. Khanna, and O. O. Koyejo, “Examples are not enough, learn to criticize! Criticism for interpretability,” in Proc. 29th Conf. Neural Inf. Process. Syst. (NIPS), 2016, pp. 2280–2288.
90 S. Wachter, B. Mittelstadt, and C. Russell. (2017). “Counterfactual explanations without opening the black box: Automated decisions and the GDPR.” [Online]. Available: https://arxiv.org/abs/1711.00399
91 X. Yuan, P. He, Q. Zhu, and X. Li. (2017). “Adversarial examples: Attacks and defenses for deep learning.” [Online]. Available: https://arxiv.org/abs/1712.07107
92 G. Montavon, S. Bach, A. Binder, W. Samek, and K.‐R. Müller, “Explaining nonlinear classification decisions with deep Taylor decomposition,” arXiv:1512.02479v1 [cs.LG], Dec. 2015; also in Pattern Recognit., vol. 65, May 2017, pp. 211–222.
93 W. J. Murdoch and A. Szlam, “Automatic rule extraction from long short term memory networks,” in Proc. ICLR, 2017.
94 R. Babuska, “Fuzzy systems, modeling and identification.” [Online]. Available: https://www.researchgate.net/profile/Robert_Babuska/publication/228769192_Fuzzy_Systems_Modeling_and_Identification/links/02e7e5223310e79d19000000/Fuzzy‐Systems‐Modeling‐and‐Identification.pdf
95 Glisic, S. (2016). Advanced Wireless Networks: Technology and Business Models. Wiley.
96 B. E. Boser, I. Guyon, and V. N. Vapnik, “A training algorithm for optimal margin classifiers,” in Proc. Annu. Conf. Comput. Learn. Theory, D. Haussler, Ed. Pittsburgh, PA: ACM Press, 1992, pp. 144–152.
97 Chan, W.C. et al. (2001). On the modeling of nonlinear