feat: add health and readiness check endpoints to CLI serve API by markstur · Pull Request #1100 · generative-computing/mellea

markstur · 2026-05-19T19:17:04Z

Pull Request

Issue

Description

Adds /health and /ready endpoints to the FastAPI server for Kubernetes liveness and readiness probes.

/health: always returns 200 (liveness check)
/ready: returns 200 when ready, 503 otherwise (readiness check)

The readiness check is a basic check that run_server has run, since that is what registers the chat endpoint. In the future this could be extended to let serve modules report the readiness of their backends and similar dependencies (this needs some design). It would also need to be adapted when we add support for multiple serve modules.

Assisted-by: IBM Bob

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

Attribution

AI coding assistants used

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

Component
Requirement
Sampling Strategy
Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

Adds /health and /ready endpoints to the FastAPI server for Kubernetes liveness and readiness probes.

- /health: always returns 200 (liveness check)
- /ready: returns 200 when ready, 503 otherwise (readiness check)

The readiness check is a basic check that run_server has run, since that is what registers the chat endpoint. In the future this could be extended to let serve modules report the readiness of their backends and similar dependencies (this needs some design). It would also need to be adapted when we add support for multiple serve modules.

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Assisted-by: IBM Bob

markstur · 2026-05-19T19:19:34Z

I also did a little import cleanup (removed unused imports, moved others to the top). It is not related to the PR, but I think it is better done while touching these files than with the overhead of a separate nit PR. I'm open to splitting it out if preferred.

jakelorocco · 2026-05-20T17:10:18Z

+    if _server_ready:
+        return {"status": "ready"}
+    else:
+        raise HTTPException(status_code=503, detail="Server not ready")


Should this be an exception? I don't think anything on our side is actually causing failures. I think it should just be a response with the 503 status code and same detail / status message.

uvicorn handles this the way you want it to

jakelorocco · 2026-05-20T17:16:30Z

@@ -295,5 +334,9 @@ def run_server(
        methods=["POST"],
        response_model=ChatCompletion | OpenAIErrorResponse,
    )
+
+    # Mark server as ready after route is successfully registered
+    _server_ready = True
+
    typer.echo(f"Serving {route_path} at http://{host}:{port}")
    uvicorn.run(app, host=host, port=port)


Is it possible to even hit the readiness endpoint before the server is ready since the fastapi app isn't run with uvicorn until after the module is loaded and the route added?

Good point. I will remove /ready for now. That's what I get for sneaking in a bit extra.

FYI --

I cannot hit the not-ready state with the expected run_server() use and since I put _server_ready=True in run_server() it's not very usable any other way right now. I didn't realize that even a breakpoint() would not make this work.

Technically, however, I can run the server without using run_server, and then I get healthy but not ready:
uv run python -c 'from cli.serve.app import app; import uvicorn; uvicorn.run(app, host="0.0.0.0", port="8080")'
I don't care about that case right now.

I will add an issue to implement /ready later. It should:

1 - handle shutdown signals for k8s!
2 - be for future use (speculative, but if we want to allow module(s) to have startup or other ready dependencies)

* changed status value to "pass" because that is IETF standard. k8s doesn't care what the string is.
* removing the /ready implementation which is not ready and not part of this issue

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

markstur · 2026-05-20T20:52:53Z

NOTE: I also changed the status return JSON to {"status": "pass"} because I learned that is the IETF standard, while k8s doesn't care what the string is. pass, ok, and healthy are all common, but pass is the most standard.
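The point that the probe ignores the body can be sketched as follows. This is not the PR's FastAPI code: to stay self-contained it swaps in the standard library's `http.server`, and `HealthHandler` and `probe` are hypothetical names. It assumes the IETF reference is to the Internet-Draft "Health Check Response Format for HTTP APIs", which uses `pass`, `fail`, and `warn` as status values and the `application/health+json` media type. The `probe` function mimics a Kubernetes HTTP probe, which counts any status code from 200 up to but not including 400 as success.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Stand-in for the /health route (stdlib server, not the PR's FastAPI app)."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "pass"}).encode()
            self.send_response(200)
            # Media type taken from the health-check draft (an assumption here).
            self.send_header("Content-Type", "application/health+json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def probe(host: str, port: int, path: str = "/health", timeout: float = 1.0) -> bool:
    """Mimic an HTTP probe: only the status code matters, never the body."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return 200 <= conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

Because `probe` never reads the body, changing the string from "ok" to "pass" would not affect the liveness result; it only matters to clients that parse the JSON.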

markstur requested a review from a team as a code owner May 19, 2026 19:17

markstur requested review from jakelorocco and planetf1 May 19, 2026 19:17

github-actions Bot added the enhancement New feature or request label May 19, 2026

jakelorocco reviewed May 20, 2026


feat: cli serve health endpoint

e775d8b

markstur wants to merge 2 commits into generative-computing:main from markstur:health

2 participants
