The RAG Debugging Decision Tree

Read this as Was the correct answer in the retrieved context?

Failure Trap: Prompt-tuning a retrieval bug, or rebuilding retrieval for a generation bug.
Decision Rule: Inspect the retrieved chunks first; then choose generation, retrieval, or corpus fixes.

1 / ?

Start with the symptom

The model gave a wrong answer, but that fact alone does not tell you which part of the RAG system broke. RAG has at least two moving halves: retrieval finds context, then generation turns that context into an answer.

Do not start by rewriting the prompt.
Keep the user query, retrieved chunks, and final answer together.
The first job is to locate the fault boundary.

Ask the load-bearing question

The diagnostic question is: is the correct answer in the retrieved documents? Check the actual chunks that reached the prompt, not the whole corpus. This one question separates generator failures from retriever failures.

If the answer span is present, retrieval did its job.
If it is absent, generation never had the needed evidence.
Use logs or eval fixtures so the check is repeatable.

If yes: generation problem

When the right evidence was retrieved but the answer is still wrong, the generator ignored, contradicted, or misused its context. Fix the generation side before changing the index.

Tighten grounding and fallback instructions.
Move the answer-bearing chunk away from the middle of a long prompt.
Score faithfulness to catch unsupported claims.

If no, but the corpus has it: retrieval problem

If the right document exists somewhere in your knowledge base but did not reach the prompt, the prompt cannot fix the root cause. The retriever failed to surface the right evidence.

Check vocabulary mismatch between user query and document wording.
Retune chunk size when the answer is split across chunks.
Try hybrid retrieval, reranking, or query rewriting.

If no, and the corpus lacks it: knowledge gap

If the answer does not exist in the corpus, the correct behavior is not a cleverer prompt. Add the missing content or teach the assistant to say it does not know.

Add the missing policy, runbook, or source document.
Keep the fallback instruction honest: no evidence means no answer.
The common misdiagnosis is prompt tuning when the real bug is missing or unretrieved knowledge.