Safety - ✦ Smart Content Report

Vectorview evaluates performance and security

February 5, 2025 | April 5, 2024

Vectorview helps evaluate the performance and security of language models. Its targeted testing with real-world scenarios is meant to detect and prevent unintended behavior that generic benchmarks often miss. Sources: TechCrunch, Y Combinator

Jailbreak with ASCII trick

February 5, 2025 | March 22, 2024

Researchers from Washington and Chicago have developed “ArtPrompt”, a new method for bypassing security measures in language models. With this method, ASCII art prompts can trick chatbots such as GPT-3.5, GPT-4, Gemini, Claude, and Llama2 into responding to requests they are supposed to reject. This includes advice on how to make bombs and …