π§ Context
The retrieval layer has no tests yet. Two small, stable modules are good targets, and both can be tested by mocking their boundaries so no Ollama or database is involved:
src/retrieval/services/embedding_service.py β two thin wrappers (embed_query, embed_document) over llm.embed. The only behavior to lock in is that each passes the right is_query flag.
src/retrieval/services/retrieval_service.py β get_relevant_chunks(question) embeds the question, opens a DB session, and asks the repository for the top-k chunks. The behavior to lock in is the wiring: the embedding it produces is the one handed to the repository, with k = settings.top_k, and the repository's result is returned unchanged.
Tests go in tests/retrieval/services/ (the package already exists). No changes to the source modules.
π Implementation Plan
-
tests/retrieval/services/test_embedding_service.py. Mock llm.embed and assert the wrappers call it correctly:
embed_query(text) calls llm.embed with is_query=True; embed_document(text) calls it with is_query=False. Assert this on the mock's call args (e.g. assert_awaited_once_with(text, is_query=True)), which also confirms the text is forwarded unchanged.
- The wrapper is a transparent passthrough: set the mock's
return_value to a known sentinel (e.g. [0.1, 0.2, 0.3]) and assert the wrapper returns that same value. This verifies the wrapper doesn't alter the result β you're not checking real embedding values (the model is mocked), only that whatever llm.embed returns comes straight back out.
llm.embed is async, so mock it with AsyncMock. Patch it at its definition: src.infrastructure.llm.embed (the wrappers call llm.embed via the imported module, so patching there takes effect). unittest.mock is enough β no new dependency.
-
tests/retrieval/services/test_retrieval_service.py. get_relevant_chunks has three boundaries; mock all three so there's no network or DB:
embedding_service.embed_query β AsyncMock returning a fake embedding. Patch src.retrieval.services.embedding_service.embed_query.
Repository.top_k_chunks β AsyncMock returning a sentinel result (see note below). Patch the method on the class, e.g. patch.object(Repository, "top_k_chunks", ...).
async_session_factory β it's used as async with async_session_factory() as session:, so it must return an async context manager. Patch it in the retrieval_service module's namespace β src.retrieval.services.retrieval_service.async_session_factory β because retrieval_service imported the name directly (from ... import async_session_factory); patching it at its original location would not affect the already-imported name. This is the standard "patch where it's looked up, not where it's defined" rule β see the unittest.mock "Where to patch" docs.
Then assert the wiring:
embed_query was called with the question.
Repository.top_k_chunks was called with the embedding returned by embed_query, and with k=settings.top_k (read from settings β don't hardcode 3; asserting the literal just duplicates the source and breaks if the default changes).
get_relevant_chunks returns exactly what top_k_chunks returned.
-
Sentinels over hand-built domain objects. get_relevant_chunks does not transform the repository's result β it returns it as-is. So top_k_chunks's mock can return a simple sentinel (e.g. ["sentinel"] or a MagicMock) and you assert identity. Don't hand-construct RetrievedChunk/Chunk for this β they're frozen + strict and require every field, which is pointless boilerplate when the function just passes the value through.
-
Match the repo's async-test convention: asyncio_mode is "auto" (see pyproject.toml), so a plain async def test_... works with no @pytest.mark.asyncio marker, exactly like the existing repository tests.
π Notes
- No Docker, Ollama, or network β everything is mocked. The full
make test still needs Docker for the existing DB tests, so run it before pushing.
- The
async with async_session_factory() as session: mock is the one part to set up carefully: the factory is called and the result is used as an async context manager, so __aenter__ must be an AsyncMock (it's awaited) while the returned session can be any placeholder (it's only passed to the already-mocked top_k_chunks). The reference section at the end shows the exact shape.
- No changes to the source modules. If you find you need to edit them to make them testable, keep it minimal and call it out in the PR.
- No new dependencies β
unittest.mock covers all of this.
β
Acceptance Criteria
tests/retrieval/services/test_embedding_service.py asserts embed_query uses is_query=True, embed_document uses is_query=False, and that each returns the mocked embedding unchanged β with llm.embed mocked (no network).
tests/retrieval/services/test_retrieval_service.py asserts get_relevant_chunks passes the embedded query into Repository.top_k_chunks with k=settings.top_k, and returns the repository's result unchanged β with embed_query, top_k_chunks, and async_session_factory all mocked (no DB, no network).
k is asserted against settings.top_k, not a hardcoded number.
- Tests run without Ollama or Docker. No new dependencies.
make test and make lint pass.
π§ Context
The retrieval layer has no tests yet. Two small, stable modules are good targets, and both can be tested by mocking their boundaries so no Ollama or database is involved:
src/retrieval/services/embedding_service.pyβ two thin wrappers (embed_query,embed_document) overllm.embed. The only behavior to lock in is that each passes the rightis_queryflag.src/retrieval/services/retrieval_service.pyβget_relevant_chunks(question)embeds the question, opens a DB session, and asks the repository for the top-k chunks. The behavior to lock in is the wiring: the embedding it produces is the one handed to the repository, withk = settings.top_k, and the repository's result is returned unchanged.Tests go in
tests/retrieval/services/(the package already exists). No changes to the source modules.π Implementation Plan
tests/retrieval/services/test_embedding_service.py. Mockllm.embedand assert the wrappers call it correctly:embed_query(text)callsllm.embedwithis_query=True;embed_document(text)calls it withis_query=False. Assert this on the mock's call args (e.g.assert_awaited_once_with(text, is_query=True)), which also confirms the text is forwarded unchanged.return_valueto a known sentinel (e.g.[0.1, 0.2, 0.3]) and assert the wrapper returns that same value. This verifies the wrapper doesn't alter the result β you're not checking real embedding values (the model is mocked), only that whateverllm.embedreturns comes straight back out.llm.embedis async, so mock it withAsyncMock. Patch it at its definition:src.infrastructure.llm.embed(the wrappers callllm.embedvia the imported module, so patching there takes effect).unittest.mockis enough β no new dependency.tests/retrieval/services/test_retrieval_service.py.get_relevant_chunkshas three boundaries; mock all three so there's no network or DB:embedding_service.embed_queryβAsyncMockreturning a fake embedding. Patchsrc.retrieval.services.embedding_service.embed_query.Repository.top_k_chunksβAsyncMockreturning a sentinel result (see note below). Patch the method on the class, e.g.patch.object(Repository, "top_k_chunks", ...).async_session_factoryβ it's used asasync with async_session_factory() as session:, so it must return an async context manager. Patch it in the retrieval_service module's namespace βsrc.retrieval.services.retrieval_service.async_session_factoryβ because retrieval_service imported the name directly (from ... import async_session_factory); patching it at its original location would not affect the already-imported name. This is the standard "patch where it's looked up, not where it's defined" rule β see the unittest.mock "Where to patch" docs.Then assert the wiring:
embed_querywas called with thequestion.Repository.top_k_chunkswas called with the embedding returned byembed_query, and withk=settings.top_k(read from settings β don't hardcode3; asserting the literal just duplicates the source and breaks if the default changes).get_relevant_chunksreturns exactly whattop_k_chunksreturned.Sentinels over hand-built domain objects.
get_relevant_chunksdoes not transform the repository's result β it returns it as-is. Sotop_k_chunks's mock can return a simple sentinel (e.g.["sentinel"]or aMagicMock) and you assert identity. Don't hand-constructRetrievedChunk/Chunkfor this β they're frozen + strict and require every field, which is pointless boilerplate when the function just passes the value through.Match the repo's async-test convention:
asyncio_modeis"auto"(seepyproject.toml), so a plainasync def test_...works with no@pytest.mark.asynciomarker, exactly like the existing repository tests.π Notes
make teststill needs Docker for the existing DB tests, so run it before pushing.async with async_session_factory() as session:mock is the one part to set up carefully: the factory is called and the result is used as an async context manager, so__aenter__must be anAsyncMock(it's awaited) while the returned session can be any placeholder (it's only passed to the already-mockedtop_k_chunks). The reference section at the end shows the exact shape.unittest.mockcovers all of this.β Acceptance Criteria
tests/retrieval/services/test_embedding_service.pyassertsembed_queryusesis_query=True,embed_documentusesis_query=False, and that each returns the mocked embedding unchanged β withllm.embedmocked (no network).tests/retrieval/services/test_retrieval_service.pyassertsget_relevant_chunkspasses the embedded query intoRepository.top_k_chunkswithk=settings.top_k, and returns the repository's result unchanged β withembed_query,top_k_chunks, andasync_session_factoryall mocked (no DB, no network).kis asserted againstsettings.top_k, not a hardcoded number.make testandmake lintpass.