Differential Transformer could improve large language models
Microsoft and Tsinghua University have developed a new AI architecture called “Differential Transformer” that improves the performance of large language models. Furu Wei of Microsoft Research told VentureBeat that the new method amplifies attention to relevant context while filtering out noise. This is designed to reduce problems such as the “lost-in-the-middle” phenomenon and hallucinations in …
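The core idea, as publicly described, is to compute two separate softmax attention maps and subtract one from the other so that attention noise common to both cancels out, sharpening focus on relevant context. A minimal NumPy sketch of that subtraction is below; the weight shapes, the fixed `lam` scalar, and the function name `diff_attention` are illustrative assumptions, not the authors' implementation (which uses learnable λ and multi-head projections).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(X, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Sketch of differential attention: the difference of two softmax
    attention maps cancels common-mode noise. Hypothetical single-head
    version; `lam` is a fixed scalar here, learnable in the real method."""
    d = Wq1.shape[1]
    A1 = softmax((X @ Wq1) @ (X @ Wk1).T / np.sqrt(d))  # first attention map
    A2 = softmax((X @ Wq2) @ (X @ Wk2).T / np.sqrt(d))  # second attention map
    return (A1 - lam * A2) @ (X @ Wv)                   # noise-cancelled output

rng = np.random.default_rng(0)
n, d_model = 4, 8                       # 4 tokens, model width 8
X = rng.standard_normal((n, d_model))
Ws = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(5)]
out = diff_attention(X, *Ws)
print(out.shape)  # (4, 8): one output vector per token
```

Because each softmax row sums to 1, the rows of `A1 - lam * A2` sum to `1 - lam`, so uniformly spread "background" attention shrinks while context positions that only the first map emphasizes stand out.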