Batch Normalization Is Blind to the First and Second Derivatives of the Loss
Published in AAAI, 2024
Recommended citation: Zhou, Z., Shen, W., Chen, H., Tang, L., Chen, Y., & Zhang, Q. Batch Normalization Is Blind to the First and Second Derivatives of the Loss. In AAAI 2024. https://ojs.aaai.org/index.php/AAAI/article/download/29978/31715
Abstract. This paper proves that, when the loss is expanded as a Taylor series, batch normalization blocks the influence of the first- and second-order terms of the loss on earlier layers.
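As a rough illustration only, and not the paper's proof, the sketch below checks a well-known special case of this kind of blindness using plain NumPy. It assumes a batch-normalization layer without the learnable scale and shift and with epsilon set to zero. The function names `standardize` and `backward_through_standardize`, and the three toy losses, are hypothetical choices made for this example. The backward pass of standardization removes the components of the incoming gradient that lie along the all-ones direction and along the normalized output itself, so a purely linear loss with a uniform gradient and a purely quadratic loss `sum(y**2)` both pass zero gradient to the layer's input.

```python
import numpy as np

def standardize(x):
    """Per-feature batch standardization (BN without scale, shift, or epsilon)."""
    mean = x.mean(axis=0, keepdims=True)
    std = np.sqrt(x.var(axis=0, keepdims=True))
    return (x - mean) / std

def backward_through_standardize(x, grad_y):
    """Gradient w.r.t. x, given the gradient w.r.t. y = standardize(x)."""
    y = standardize(x)
    std = np.sqrt(x.var(axis=0, keepdims=True))
    # Remove the component of grad_y along the all-ones direction ...
    centered = grad_y - grad_y.mean(axis=0, keepdims=True)
    # ... and the component along y itself, then rescale by 1 / std.
    projected = centered - y * (grad_y * y).mean(axis=0, keepdims=True)
    return projected / std

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))  # toy batch: 8 samples, 3 features
y = standardize(x)

# Toy loss 1: 2.5 * sum(y), whose gradient w.r.t. y is constant.
dx_linear = backward_through_standardize(x, np.full_like(y, 2.5))

# Toy loss 2: sum(y ** 2), whose gradient w.r.t. y is 2 * y.
dx_quadratic = backward_through_standardize(x, 2.0 * y)

# Toy loss 3: sum(y ** 4) / 4, whose gradient w.r.t. y is y ** 3.
dx_quartic = backward_through_standardize(x, y ** 3)

print(np.abs(dx_linear).max(), np.abs(dx_quadratic).max(), np.abs(dx_quartic).max())
```

In this toy setting only the quartic loss sends a nonzero signal to the input; the paper's result concerns the Taylor expansion of a general loss, which this sketch does not attempt to reproduce.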
