On Orthogonality Constraints for Transformers
TL;DR: Orthogonality constraints that encourage numerical stability improve model performance on NLP tasks.
Abstract: …