Cohere releases Embed 4 model that handles 200-page documents

Cohere has launched Embed 4, an upgraded multimodal embedding model with a 128,000-token context window that can process documents up to 200 pages long. According to Emilia David at VentureBeat, the new model improves enterprise retrieval-augmented generation (RAG) capabilities. Embed 4 handles unstructured data in over 100 languages and is designed to work with complex business materials without extensive pre-processing. The company states that the model performs well with imperfect enterprise data, including scanned documents and handwriting. Cohere claims the model is particularly effective in regulated industries such as finance, healthcare, and manufacturing. Organizations can deploy Embed 4 on virtual private clouds or on-premises technology stacks for enhanced security. Cohere positions the model as an optimal search engine for enterprise AI agents and says it produces compressed embeddings that reduce storage costs.
