Towards the first principles of explaining DNNs: interactions explain the learning dynamics
Huilin ZHOU, Qihan REN, Junpeng ZHANG, Quanshi ZHANG
Front. Inform. Technol. Electron. Eng., 2025, Vol. 26, Issue 7: 1017-1026.
Most explanation methods are designed in an empirical manner, so exploring whether there exists a first-principles explanation of a deep neural network (DNN) has become the next core scientific problem in explainable artificial intelligence (XAI). Although this remains an open problem, in this paper we discuss whether the interaction-based explanation can serve as the first-principles explanation of a DNN. The strong explanatory power of interaction theory comes from the following aspects: (1) it establishes a new axiomatic system that quantifies the decision-making logic of a DNN as a set of symbolic interaction concepts; (2) it simultaneously explains various deep learning phenomena, such as generalization power, adversarial sensitivity, the representation bottleneck, and learning dynamics; (3) it provides mathematical tools that uniformly explain the mechanisms of various empirical attribution methods and empirical adversarial-transferability-boosting methods; (4) it explains the extremely complex learning dynamics of a DNN by analyzing the two-phase dynamics of interaction complexity, which further reveals the internal mechanism of why and how the generalization power and adversarial sensitivity of a DNN change during the learning process.
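The symbolic interaction concepts mentioned above are, in this line of work, typically formalized as Harsanyi interactions: for a set of input variables S, the interaction effect is I(S) = Σ_{T⊆S} (−1)^{|S|−|T|} v(T), where v(T) is the model output when only the variables in T are present. As a hedged illustration (not the paper's exact implementation; `v` here is an assumed black-box scoring function supplied by the user), the quantity can be sketched as:

```python
from itertools import combinations

def harsanyi_interaction(v, S):
    """Compute the Harsanyi interaction I(S) = sum_{T subseteq S} (-1)^{|S|-|T|} v(T).

    v : callable mapping a frozenset of variable indices to the model output
        with only those variables present (an assumed interface).
    S : iterable of variable indices defining the concept.
    """
    S = tuple(S)
    total = 0.0
    for k in range(len(S) + 1):  # enumerate all subsets T of S by size
        for T in combinations(S, k):
            total += (-1) ** (len(S) - len(T)) * v(frozenset(T))
    return total

# Toy two-variable model: outputs chosen so the variables interact positively.
vals = {frozenset(): 0.0, frozenset({0}): 1.0,
        frozenset({1}): 2.0, frozenset({0, 1}): 5.0}
v = lambda T: vals[T]

print(harsanyi_interaction(v, {0, 1}))  # 5 - 2 - 1 + 0 = 2.0
```

A key property making this a faithful decomposition is efficiency: summing I(S) over all subsets S of the full variable set recovers the model output v(N) exactly, so the interactions partition the DNN's inference into additive symbolic effects.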
First-principles explanation / Theory of equivalent interactions / Two-phase dynamics of interactions / Learning dynamics
Zhejiang University Press