A new study from Apple finds that large language models (LLMs) do not reason logically but instead rely on pattern recognition. The study, conducted by six AI researchers at Apple, challenges the common assumption that LLMs perform genuine reasoning. The researchers found that even small changes, such as swapping the names in a math word problem, can shift a model’s results by about 10%. Gary Marcus, author of “The Algebraic Mind,” highlights the significance of this study in a recent post.
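To make the kind of perturbation concrete, here is a minimal sketch of a name-swap test: the same problem is posed with only the proper name changed, and the model’s answers are compared. This is purely illustrative, not the Apple team’s evaluation harness; the `ask_model` function is a hypothetical stand-in for whatever LLM client is being tested.

```python
# Illustrative name-swap perturbation test (not the Apple study's actual code).
# `ask_model` is a hypothetical placeholder for the LLM under evaluation.

TEMPLATE = (
    "{name} picks {n1} apples in the morning and {n2} apples in the "
    "afternoon. How many apples does {name} have in total?"
)

def make_variants(names, n1=17, n2=26):
    """Generate the same problem with only the proper name changed."""
    return {name: TEMPLATE.format(name=name, n1=n1, n2=n2) for name in names}

def ask_model(prompt: str) -> str:
    """Hypothetical call to the model under test."""
    raise NotImplementedError("plug in your LLM client here")

def run_perturbation_test(names):
    expected = "43"  # 17 + 26; the correct answer does not depend on the name
    results = {}
    for name, prompt in make_variants(names).items():
        answer = ask_model(prompt)
        results[name] = expected in answer
    return results
```

A model that reasoned abstractly would score identically across variants; the study reports that such superficial swaps alone can move results by roughly 10%.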
Similar results were reported in a 2017 study by Robin Jia and Percy Liang at Stanford. Furthermore, LLM performance often degrades as tasks grow more complex, as shown in an analysis of GPT-4 by Subbarao Kambhampati’s team. LLMs also show weaknesses in tasks such as integer arithmetic and chess, suggesting a lack of abstract, formal reasoning. According to Marcus, this confirms a thesis he has argued since 1998: that symbolic manipulation is necessary for reliable AI. He advocates a neuro-symbolic approach that combines neural networks with symbolic processing.
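As a rough sketch of what that division of labor could look like, the neural model might only translate a word problem into an arithmetic expression, while deterministic symbolic code computes the result exactly. This is an assumption-laden illustration of the general idea, not a published system; `translate_to_expression` is a hypothetical placeholder for the neural step.

```python
# Sketch of a neuro-symbolic split (illustrative only): the LLM maps language
# to symbols, and a deterministic evaluator handles the arithmetic exactly.

import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.FloorDiv: operator.floordiv}

def evaluate(expr: str) -> int:
    """Symbolically evaluate a simple integer expression like '17 + 26'."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {ast.dump(node)}")
    return walk(ast.parse(expr, mode="eval").body)

def translate_to_expression(problem: str) -> str:
    """Hypothetical neural step: an LLM maps the problem to an expression."""
    raise NotImplementedError("plug in your LLM client here")

def solve(problem: str) -> int:
    expr = translate_to_expression(problem)   # neural: language -> symbols
    return evaluate(expr)                     # symbolic: exact arithmetic

print(evaluate("17 + 26"))  # 43, computed rather than pattern-matched
```

The point of the design is that the arithmetic can no longer drift with surface changes to the prompt, because it is never left to the network’s pattern matching.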