
Elastic announced the availability of jina-embeddings-v5-text, a family of two small, Elasticsearch-native multilingual embedding models at 0.2B and 0.6B parameters that deliver state-of-the-art performance across key search and semantic tasks.
Despite their compact size, they outperform significantly larger models with 7B to 14B parameters and achieve best-in-class results on the MMTEB (Multilingual MTEB) benchmark among models of comparable size and purpose. Their small footprint enables strong hybrid search at lower infrastructure cost, faster query response, and new deployment scenarios where memory and compute budgets are tight, including edge devices and other resource-constrained environments.
The jina-embeddings-v5-text models are available through multiple channels: as open-weight models on Hugging Face for self-hosted deployment via vLLM, llama.cpp, or MLX, and on Elastic Inference Service (EIS), a GPU-accelerated inference-as-a-service offering that makes it easy to run fast, high-quality inference without complex setup. With the Jina v5 family on EIS, users get a complete data platform that consolidates state-of-the-art multilingual embedding models, a high-performance vector database, and more into one unified enterprise stack across cloud and on-premises.
“Vector search, RAG, and AI agents depend on high-quality retrieval,” said Steve Kearns, general manager, Search, Elastic. “With the addition of the Jina v5’s multilingual embeddings, Elasticsearch continues to be the platform of choice for end-to-end context engineering.”
The family includes two models, jina-embeddings-v5-text-nano (239M parameters) and jina-embeddings-v5-text-small (677M parameters). Both models are optimized for four common tasks in search and agentic applications:
- Retrieval: Allowing users to query with natural language and find the most relevant documents
- Text Matching: Allowing users to find duplicates in their data, and align paraphrases or translations
- Classification: Allowing users to categorize documents, detect sentiments, and find anomalies
- Clustering: Allowing users to group documents by topic, subject, or meaning
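As a rough illustration of the retrieval task listed above, the following minimal Python sketch ranks documents by cosine similarity to a query vector. The three-dimensional vectors and the document IDs are hypothetical stand-ins for real embeddings, and the helper names `cosine_similarity` and `rank_documents` are invented for this example. It does not call any Jina or Elastic API, and it is not necessarily how Elasticsearch scores vectors internally.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Sort (doc_id, score) pairs from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors; a real embedding model would produce
# vectors with hundreds of dimensions.
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.0],
    "doc_c": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.0, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The same similarity score can serve the other tasks in a similar way: for example, text matching can flag pairs of documents whose score exceeds a chosen threshold as likely duplicates.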
The Jina v5 models are now available through the Elastic Inference Service (EIS) on Elastic Cloud Serverless and Elastic Cloud Hosted. All Elastic Cloud Trials include access to EIS.
The models are also available through an online API and for local hosting via vLLM, llama.cpp, and MLX. Detailed instructions can be found on Hugging Face.
