# How-To: Tune Search Ranking
Goal: Improve search result quality for your documentation tenants.
Prerequisites: Working deployment with synced tenants, familiarity with search basics.
Time: ~15-30 minutes
## When to Tune Search

You might need to tune search if:

- Relevant documents appear too low in results
- Irrelevant documents rank too high
- Certain document types (tutorials vs. reference) should rank differently
- Search feels "off" for your documentation style
Note: Default BM25 parameters work well for most documentation. Only tune if you're seeing specific issues.
## Understanding BM25 Parameters

docs-mcp-server uses BM25 with these tunable parameters:

| Parameter | Default | Effect |
|---|---|---|
| `bm25_k1` | 1.2 | Term saturation (0.5-3.0). Higher = more weight on term frequency |
| `bm25_b` | 0.75 | Length normalization (0.1-1.0). Higher = shorter docs favored |
### k1: Term Frequency Saturation
- Low k1 (0.5-1.0): Terms appearing once vs many times matter less
- High k1 (1.5-3.0): Documents with many keyword occurrences rank higher
- Use lower k1 for short docs (README, quick reference)
- Use higher k1 for long-form content (tutorials, guides)
### b: Length Normalization
- Low b (0.1-0.4): Long and short documents treated more equally
- High b (0.7-1.0): Short documents boosted over long ones
- Use lower b if your best content is in long documents
- Use higher b if short, focused docs should rank first
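The two parameters interact in the standard BM25 term score. The sketch below shows the textbook formula (not the server's actual implementation) so you can see exactly where `k1` and `b` bite:

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term against one document using classic BM25.

    tf: term frequency in the document
    doc_len / avg_doc_len: document length vs. corpus average
    n_docs / doc_freq: corpus size and number of docs containing the term
    """
    # Rarer terms get higher IDF weight
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b controls how strongly document length is penalized
    norm = 1 - b + b * (doc_len / avg_doc_len)
    # k1 controls how quickly repeated occurrences saturate
    return idf * (tf * (k1 + 1)) / (tf + k1 * norm)

# Higher b penalizes a long document (4x the average length) more heavily:
long_doc_low_b = bm25_term_score(3, 2000, 500, 100, 10, b=0.1)
long_doc_high_b = bm25_term_score(3, 2000, 500, 100, 10, b=0.9)
```

With `b=0.9` the 2000-character document scores noticeably lower than with `b=0.1`, which is why raising `b` favors short, focused docs.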
## Step-by-Step Tuning

### 1. Identify the Problem
Run test queries using the debug script:
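For example, to run the default test queries for the `drf` tenant:

```bash
uv run python debug_multi_tenant.py --tenant drf --test search
```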
Review the results:

- Which documents rank 1-3?
- What are the scores?
- Are the top results actually the most relevant?
### 2. Adjust Parameters in deployment.json
Add or modify the search configuration for your tenant:
```json
{
  "source_type": "online",
  "codename": "drf",
  "docs_name": "Django REST Framework Docs",
  "search": {
    "ranking": {
      "bm25_k1": 1.5,
      "bm25_b": 0.6
    }
  }
}
```
#### Optional ranking flags (performance trade-offs)
Phrase proximity and fuzzy matching are disabled by default to minimize CPU and latency. Enable them explicitly if relevance quality requires it:
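As a sketch only (the flag names below are hypothetical; check the deployment.json schema reference for the actual keys), enabling them might look like:

```json
{
  "search": {
    "ranking": {
      "enable_phrase_proximity": true,
      "enable_fuzzy_matching": false
    }
  }
}
```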
Tip: Enable one at a time and re-run the search tests to confirm the quality/latency trade-off.
### 3. Redeploy and Rebuild Index

```bash
# Redeploy with new config
uv run python deploy_multi_tenant.py --mode online

# Rebuild search index
uv run python trigger_all_indexing.py --tenants drf
```
### 4. Test Again
Compare results to Step 1. Iterate as needed.
## Field Boosts

Control which parts of documents matter most:

```json
{
  "search": {
    "boosts": {
      "title": 2.5,
      "headings_h1": 2.5,
      "headings_h2": 2.0,
      "headings": 1.5,
      "body": 1.0,
      "code": 1.2,
      "path": 1.5,
      "url": 1.5
    }
  }
}
```
| Field | Default | When to Increase |
|---|---|---|
| `title` | 2.5 | Titles should dominate ranking |
| `headings_h1` | 2.5 | H1s are highly relevant |
| `headings_h2` | 2.0 | H2s contain key concepts |
| `code` | 1.2 | Code snippets are primary content |
| `path` | 1.5 | URL structure is meaningful |
Example: Boost code snippets for API docs:
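A minimal override raising the `code` boost above its 1.2 default (2.0 is an illustrative value):

```json
{
  "search": {
    "boosts": {
      "code": 2.0
    }
  }
}
```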
## Snippet Configuration

Adjust how search snippets are generated:

```json
{
  "search": {
    "snippet": {
      "style": "plain",
      "fragment_char_limit": 240,
      "max_fragments": 2,
      "surrounding_context_chars": 120
    }
  }
}
```
| Setting | Default | Description |
|---|---|---|
| `style` | `"plain"` | `"plain"` (brackets) or `"html"` (`<mark>` tags) |
| `fragment_char_limit` | 240 | Max characters per snippet |
| `max_fragments` | 2 | Number of snippets per result |
| `surrounding_context_chars` | 120 | Context around matched terms |
## Analyzer Profiles

Choose a tokenization strategy:

| Profile | Best For |
|---|---|
| `default` | General documentation |
| `aggressive-stem` | Documents with many word variants |
| `code-friendly` | Technical docs with code identifiers |
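What "code-friendly" means in practice varies by implementation, but the sketch below illustrates typical code-aware tokenization (this is illustrative, not the server's actual analyzer): identifiers are kept whole and also split into sub-tokens so a query for `user` can match `getUserById`:

```python
import re

def code_friendly_tokens(text):
    """Sketch of code-aware tokenization: extract identifier-like words,
    keep each whole identifier as a token, and additionally split
    snake_case and camelCase identifiers into their parts."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9_]+", text):
        tokens.append(word.lower())  # whole identifier
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+",
                           word.replace("_", " "))
        if len(parts) > 1:
            tokens.extend(p.lower() for p in parts)  # sub-tokens
    return tokens
```

With this scheme, `code_friendly_tokens("getUserById")` yields both the full identifier and the sub-tokens `get`, `user`, `by`, `id`.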
## Testing with debug_multi_tenant.py

Use the debug script for systematic testing:

```bash
# Test search with default queries from deployment.json
uv run python debug_multi_tenant.py --tenant drf --test search

# Test specific queries
uv run python debug_multi_tenant.py --tenant drf --test search --query "nested serializer"
```
### Inspect match trace diagnostics
Search responses ship lean by default: `match_stage`, `match_reason`, ripgrep flags, and performance stats are omitted from payloads.
To enable full diagnostics, set `search_include_stats` to `true` in the `infrastructure` section of your deployment.json:
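Based on the setting name and section given above (surrounding keys depend on your deployment.json), this looks like:

```json
{
  "infrastructure": {
    "search_include_stats": true
  }
}
```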
This single switch enables both stats and match-trace metadata globally, across all tenants. Clients cannot toggle diagnostics per request; only infrastructure owners control this setting.
## Troubleshooting

### All scores are very low

Cause: Small corpus, or common terms dominating.

Fix: The IDF floor is applied automatically, but check whether:

- The tenant has too few documents (< 20)
- Query terms appear in every document
### Long documents always rank first

Cause: The `bm25_b` parameter is too low.

Fix: Increase `bm25_b` to favor shorter documents:
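For example (0.9 is an illustrative value within the documented 0.1-1.0 range):

```json
{
  "search": {
    "ranking": {
      "bm25_b": 0.9
    }
  }
}
```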
### Exact phrase matches don't rank high

Cause: BM25 is term-based, not phrase-based.

Fix: Enable phrase proximity (see "Optional ranking flags" above; it is disabled by default). If results still feel weak, try more specific query terms or confirm the content is indexed.
### Code examples not matching

Cause: Code tokens are not indexed properly.

Fix: Use the `code-friendly` analyzer and boost the `code` field:
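A sketch, assuming the profile is selected via an `analyzer` key (that key name is an assumption; check the deployment.json schema reference for the exact name):

```json
{
  "search": {
    "analyzer": "code-friendly",
    "boosts": {
      "code": 2.0
    }
  }
}
```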
## A/B Testing Approach

To compare configurations:

1. Deploy tenant with config A
2. Run test queries, record scores
3. Deploy tenant with config B
4. Run same queries, compare
5. Keep the better configuration
```bash
# Record baseline using debug script
uv run python debug_multi_tenant.py --host localhost --port 42042 --tenant drf --test search > baseline.txt

# Change config, redeploy, rebuild index

# Compare results
uv run python debug_multi_tenant.py --host localhost --port 42042 --tenant drf --test search > modified.txt
diff baseline.txt modified.txt
```
## Related
- Tutorial: Custom Search Configuration — Deep dive into search tuning
- Reference: deployment.json Schema — All search options
- Explanation: Search Ranking (BM25) — Why BM25 works this way