
Sovereign AI Reranking,
built for Europe.

German HQ, 100% EU data residency

Re-score and rank documents with cross-encoder precision on European GPUs. Boost your RAG pipeline accuracy without your data leaving the EU.

Create free account (100K tokens/month free)
// models + pricing

Reranker Models

We run the Qwen3 Reranker family: instruction-aware cross-encoder models that score query-document relevance with high precision. 100+ languages, 32K context. Perfect as the second stage after embedding search.

All models run on modern Blackwell or newer chips for ideal performance. Pricing per million tokens. Free tier included on all models.

Qwen3-Reranker-0.6B
Fast, lightweight reranking. Ideal for high-throughput RAG pipelines.
Input: 0,02 € / 1M tokens (coming soon)
Parameters: 0.6B
Context: 32K tokens
Languages: 100+
Scoring: yes/no logits
Score: 65.8
Score comparison: Qwen3-Reranker-0.6B 65.8, Cohere Rerank v3 67.1, bge-reranker-v2 62.4, Jina Reranker v2 63.8
Qwen3-Reranker-4B
Best balance of speed and accuracy. Production-ready for RAG.
Input: 0,06 € / 1M tokens (coming soon)
Parameters: 4B
Context: 32K tokens
Languages: 100+
Scoring: yes/no logits
Score: 72.1
Score comparison: Qwen3-Reranker-4B 72.1, Cohere Rerank v3 67.1, bge-reranker-v2 62.4, Jina Reranker v2 63.8
Qwen3-Reranker-8B
Maximum reranking quality for critical retrieval workloads.
Input: 0,10 € / 1M tokens (coming soon)
Parameters: 8B
Context: 32K tokens
Languages: 100+
Scoring: yes/no logits
Score: 72.9
Score comparison: Qwen3-Reranker-8B 72.9, Cohere Rerank v3 67.1, bge-reranker-v2 62.4, Jina Reranker v2 63.8
Free tier
100K tokens/month, all models, 10 req/min, no credit card
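
Each model above lists "yes/no logits" as its scoring method. The page does not say how those logits become a relevance score, so the following is only a minimal sketch of one common approach for this kind of model: a softmax over the two logits, keeping the probability assigned to "yes". The function name is hypothetical, and whether the hosted service computes scores exactly this way is an assumption.

```python
import math


def relevance_from_logits(yes_logit: float, no_logit: float) -> float:
    """Return the softmax probability of "yes" given the two logits.

    Assumption: the reranker exposes one logit for "yes" and one for "no";
    this helper is illustrative and not part of any documented API.
    """
    # Subtract the larger logit before exponentiating for numerical stability.
    m = max(yes_logit, no_logit)
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)
```

Equal logits give a score of 0.5, and the score approaches 1 as the "yes" logit grows relative to the "no" logit, which is consistent with a 0-1 relevance range.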
// what you can build

Use Cases

Reranking is the precision layer in modern retrieval systems. Add a reranker after embedding search to dramatically improve relevance.

RAG Pipeline Optimization
Retrieve 20 candidates with embeddings, rerank to the top 5. Your LLM gets only the most relevant context, producing better answers with less noise.
Enterprise Search
Boost search accuracy for internal knowledge bases, legal documents, and support portals. Cross-encoder scoring understands nuance that keyword and vector search miss.
Multilingual Retrieval
Rerank documents across languages without translation. Query in German, match English documents, score by relevance. Ideal for European multilingual workloads.
E-Commerce & Recommendations
Re-score product search results by true relevance to the query. Improve conversion by surfacing the right products, not just similar ones.
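
The "retrieve many, rerank to a few" pattern from the RAG use case can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: `rerank_top_k` and `overlap_score` are hypothetical names, and `overlap_score` is a toy word-overlap scorer standing in for a real reranker call so the example runs offline.

```python
from typing import Callable, List, Sequence, Tuple

ScoreFn = Callable[[str, Sequence[str]], List[float]]


def rerank_top_k(
    query: str,
    candidates: Sequence[str],
    score_fn: ScoreFn,
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Score first-stage candidates and keep the k most relevant ones."""
    scores = score_fn(query, candidates)
    if len(scores) != len(candidates):
        raise ValueError("score_fn must return one score per candidate")
    # Sort by score, highest first; ties keep their first-stage order.
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def overlap_score(query: str, docs: Sequence[str]) -> List[float]:
    """Toy stand-in scorer: fraction of query words found in each document."""
    q_words = set(query.lower().split())
    return [
        len(q_words & set(doc.lower().split())) / max(len(q_words), 1)
        for doc in docs
    ]
```

In a real pipeline, `candidates` would be the 20 or so results from embedding search, and `score_fn` would wrap a call to the reranking endpoint; only the top results would then be passed to the LLM as context.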
// for teams that need more
Need more? The Business Plan covers all Nodion.ai products: Inference, Embeddings, and more. 500 €/month, 50M tokens, dedicated GPU capacity, 99.5% SLA.
View Business Plan →
// getting started

API Documentation

The Reranking API uses a simple scoring endpoint. Send a query and a list of documents, get back relevance scores.

# Base URL
https://api.nodion.ai/v1
# Example: curl
curl https://api.nodion.ai/v1/rerank \
  -H "Authorization: Bearer $NODION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-reranker-0.6b",
    "query": "How do I cancel my subscription?",
    "documents": [
      "To cancel, go to Settings > Billing > Cancel Plan.",
      "Our pricing starts at 10 EUR per month.",
      "You can upgrade your plan at any time."
    ],
    "top_n": 2
  }'

Returns a relevance score between 0 and 1 for each document. Instruction-aware reranking is supported via the instruction parameter.
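
The same request can be sketched in Python using only the standard library. The endpoint, model name, and the `query`, `documents`, `top_n`, and `instruction` fields come from the text above; the helper names are hypothetical, and because the response schema is not documented here, the sketch returns the parsed JSON as-is rather than assuming field names.

```python
import json
import os
import urllib.request
from typing import List, Optional

BASE_URL = "https://api.nodion.ai/v1"


def build_rerank_request(
    query: str,
    documents: List[str],
    model: str = "qwen/qwen3-reranker-0.6b",
    top_n: Optional[int] = None,
    instruction: Optional[str] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /rerank endpoint."""
    payload = {"model": model, "query": query, "documents": documents}
    if top_n is not None:
        payload["top_n"] = top_n
    if instruction is not None:
        # Optional task instruction for instruction-aware reranking.
        payload["instruction"] = instruction
    key = api_key or os.environ.get("NODION_API_KEY", "")
    return urllib.request.Request(
        f"{BASE_URL}/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rerank(**kwargs) -> dict:
    """Send the request and return the decoded JSON body unchanged."""
    request = build_rerank_request(**kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might look like `rerank(query="How do I cancel my subscription?", documents=docs, top_n=2)`, with `NODION_API_KEY` set in the environment. Separating request construction from sending keeps the payload easy to inspect and test without network access.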

// why this matters
GDPR-native. Not a policy checkbox: it is how the infrastructure is built. No data leaves the EU. No transatlantic transfers. No adequacy decision risks.
Nordic green energy. GPU clusters in Sweden and Finland run on renewable energy. Cold climate means natural cooling, lower energy waste, smaller footprint.
No US dependency. German company. EU servers. Open-source models. Full stack sovereignty without hyperscaler lock-in.
Open-source only. Every model we serve is fully open. You can inspect the weights, understand the architecture, audit the outputs.
OpenAI-compatible API. Drop-in replacement. Change your base URL and you're running on sovereign European infrastructure.

Ready to start?

100K free tokens per month. No credit card required. All models included.
