Notes on Training Neural Networks for Consensus

This paper presents PEAR (Post hoc Explainer Agreement Regularizer), the first framework to deliberately train neural networks for both accuracy and agreement between feature attribution techniques. In addition to the conventional task loss, PEAR adds a correlation-based consensus loss that combines Pearson and Spearman correlation measures, promoting alignment across explainers such as Grad and Integrated Gradients. Because ranking is not differentiable, the Spearman term uses a soft ranking approximation, which keeps the whole loss trainable by backpropagation. Tested on three OpenML tabular datasets, multilayer perceptrons trained with PEAR surpass linear baselines in both accuracy and explanation consensus, and in some cases compete with XGBoost. The findings support reliable and interpretable AI by showing that consensus-aware training reduces explanation disagreement while maintaining predictive performance.
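To make the shape of such a loss concrete, here is a minimal sketch of a correlation-based consensus term. It is an illustration under stated assumptions, not the paper's implementation. The function names, the pairwise-sigmoid soft rank, the mixing weight `mu`, the trade-off weight `lam`, and the temperature are all hypothetical choices. It computes the forward value only, in plain Python; an actual training loop would express the same operations in an autodiff framework so that gradients can flow through them.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences of floats."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / (den + 1e-12)  # small epsilon guards against constant inputs

def soft_rank(v, temperature=0.1):
    """Smooth stand-in for ranking (assumed form, one of several possible).

    rank_i = 0.5 + sum_j sigmoid((v_i - v_j) / temperature).
    As temperature -> 0 this approaches the hard ranks 1..n for distinct values.
    """
    def sigmoid(x):
        return 0.5 * (1.0 + math.tanh(0.5 * x))  # overflow-safe sigmoid
    return [0.5 + sum(sigmoid((vi - vj) / temperature) for vj in v) for vi in v]

def consensus_loss(attr_a, attr_b, mu=0.5, temperature=0.1):
    """1 minus a weighted mix of Pearson and soft-Spearman correlation.

    attr_a, attr_b: per-feature attributions from two explainers for one input.
    mu: hypothetical weight between the Pearson and the rank-based term.
    """
    p = pearson(attr_a, attr_b)
    s = pearson(soft_rank(attr_a, temperature), soft_rank(attr_b, temperature))
    return 1.0 - (mu * p + (1.0 - mu) * s)

def total_loss(task_loss, attr_a, attr_b, lam=0.25):
    """Assumed combination: trade off the task loss against explainer disagreement."""
    return (1.0 - lam) * task_loss + lam * consensus_loss(attr_a, attr_b)

# Two explainers that order the features identically versus in reverse.
agree = consensus_loss([0.1, 0.5, 0.2, 0.9], [0.1, 0.5, 0.2, 0.9])
disagree = consensus_loss([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
```

In this sketch the consensus term is close to 0 when the two attribution vectors agree and close to 2 when they are perfectly anti-correlated. The temperature controls how closely the soft ranks track the true ranks: smaller values are more faithful but give less informative gradients once the computation is moved into an autodiff framework.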

Source: HackerNoon

Related News

What Quantum Machine Learning Means for the Future of AI

The Geek’s Guide to ML Experimentation

Can PEAR Make Deep Learning Easier to Trust?

Consensus Loss Proves AI Can Be Both Accurate and Transparent

The Trade-Off Between Accuracy and Agreement in AI Models
