Scale AI publishes AI rankings

Scale AI has published rankings of large language models for the first time, evaluating their performance in four specific application areas: coding, instruction following, mathematics, and multilingualism. OpenAI’s GPT models ranked first in three of the four areas (coding, multilingual, and instruction following), while Anthropic’s Claude 3 Opus ranked first in the remaining one (math).

Five chatbots compared

Journalists Dalvin Brown, Kara Dapena, and Joanna Stern tested ChatGPT, Claude, Copilot, Gemini, and Perplexity in everyday situations. Each chatbot was asked questions formulated by Wall Street Journal editors and columnists. An independent panel of judges evaluated the responses for accuracy, usefulness, and overall quality. The health category included questions about pregnancy, …

Ranking of most secure LLMs

Enkrypt has published a ranking of the most secure large language models (LLMs) to help companies choose the most suitable models. OpenAI’s GPT-4-Turbo tops the list with the lowest risk score, while models such as Saul Instruct-V1 and Phi3-Mini-4K are at the bottom of the list.

Generating music and sound with AI – three examples

AI can generate not only text, images, and video, but also sound and music, and the progress in quality is striking. Let’s look at three prominent examples. Udio: Launched a week ago as part of a public beta, Udio has already caused quite a stir. The website contains numerous examples of songs created with this tool. …
