Initial Python SDK with generated API models and codegen tooling #1
Pull request overview
This PR introduces the initial Baseten Python SDK foundation, including generated low-level API clients/models (from OpenAPI specs) plus codegen tooling, tests, and CI to ensure regeneration stays in sync with committed outputs.
Changes:
- Added synchronous/async Management and Inference clients that wrap generated API clients (.api) with configurable base URLs and lifecycle management.
- Introduced OpenAPI preprocessing + client/model generation tooling under scripts/apigen/.
- Added initial test suite, packaging config, README/CONTRIBUTING docs, and GitHub Actions CI.
Reviewed changes
Copilot reviewed 19 out of 26 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/conftest.py | Adds a small httpx MockTransport-based fake transport to capture requests and return fixed responses. |
| tests/client/test_management.py | Adds sync/async tests validating management client request construction, escaping, and error handling. |
| tests/client/test_inference.py | Adds sync/async tests for inference predict/wake endpoints and typed vs generic error behavior. |
| tests/client/test_client.py | Adds tests for option resolution, base URL selection, immutability, and context manager behavior. |
| scripts/apigen/specs/inference.json | Adds the inference OpenAPI spec used for generation. |
| scripts/apigen/preprocess.py | Adds preprocessing to hoist schemas, fix “bare object” schemas, and rename *V1 schemas for cleaner model names. |
| scripts/apigen/clientgen.py | Adds a small generator that emits an httpx-based API client with typed models and error dispatch. |
| scripts/apigen/__main__.py | Adds CLI entrypoint to (optionally) download specs, preprocess, generate models/client/init, and run ruff. |
| README.md | Adds initial usage documentation for ManagementClient and AsyncManagementClient. |
| pyproject.toml | Adds packaging metadata, runtime deps, dev dependency group, poe tasks, and pytest config. |
| CONTRIBUTING.md | Documents uv + poe workflow for generating/linting/testing. |
| baseten/client/managementapi/_client.py | Adds generated Management API client implementation. |
| baseten/client/managementapi/__init__.py | Exposes generated Management API client/models via stable imports and __all__. |
| baseten/client/managementapi/_models.py | Adds generated Management API Pydantic models. |
| baseten/client/inferenceapi/_models.py | Adds generated Inference API Pydantic models. |
| baseten/client/inferenceapi/_client.py | Adds generated Inference API client implementation with typed error dispatch. |
| baseten/client/inferenceapi/__init__.py | Exposes generated Inference API client/models and error types. |
| baseten/client/_management.py | Adds user-facing ManagementClient / AsyncManagementClient wrappers and options. |
| baseten/client/_inference.py | Adds user-facing InferenceClient / AsyncInferenceClient wrappers, options, and base URL computation. |
| baseten/client/__init__.py | Exposes the public client entrypoints and options types. |
| baseten/py.typed | Marks the package as typed for type checkers. |
| .gitignore | Adds common Python/venv/tool cache ignores. |
| .github/workflows/ci.yml | Adds CI for lint/typecheck, regeneration-diff check, and tests across OS/Python versions. |
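
To illustrate the kind of preprocessing the summary attributes to scripts/apigen/preprocess.py, here is a minimal sketch of renaming *V1 schemas and rewriting the references that point at them. The contents of preprocess.py are not shown in this review, so the function name `rename_v1_schemas` and its behavior (skip a rename when the shorter name is already taken, only touch `#/components/schemas/` references) are assumptions for illustration, not the PR's implementation.

```python
from typing import Any

SCHEMA_PREFIX = "#/components/schemas/"


def rename_v1_schemas(spec: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of an OpenAPI spec with the 'V1' suffix dropped from schema names.

    Hypothetical helper: a schema is renamed only if the shorter name is free.
    """
    schemas = spec.get("components", {}).get("schemas", {})
    renames = {
        name: name[:-2]
        for name in schemas
        if name.endswith("V1") and name[:-2] not in schemas
    }

    def rewrite(node: Any) -> Any:
        # Rebuild the tree, pointing $ref values at the new schema names.
        if isinstance(node, dict):
            out: dict[str, Any] = {}
            for key, value in node.items():
                if (
                    key == "$ref"
                    and isinstance(value, str)
                    and value.startswith(SCHEMA_PREFIX)
                ):
                    name = value[len(SCHEMA_PREFIX):]
                    out[key] = SCHEMA_PREFIX + renames.get(name, name)
                else:
                    out[key] = rewrite(value)
            return out
        if isinstance(node, list):
            return [rewrite(item) for item in node]
        return node

    result = rewrite(spec)
    new_schemas = result.get("components", {}).get("schemas", {})
    for old, new in renames.items():
        # Move each schema definition to its new key.
        new_schemas[new] = new_schemas.pop(old)
    return result


example_spec = {
    "components": {
        "schemas": {
            "ModelV1": {"type": "object"},
            "ModelListV1": {
                "type": "array",
                "items": {"$ref": "#/components/schemas/ModelV1"},
            },
        }
    }
}
renamed = rename_v1_schemas(example_spec)
```

Because the function rebuilds the tree rather than mutating it, the input spec is left unchanged, which makes it easier to compare the spec before and after preprocessing.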

    def make_sync_client(fake: FakeTransport) -> ManagementClient:
        client = ManagementClient(api_key="test-key")
        client.http_client._transport = fake.sync_transport  # type: ignore[attr-defined]
        return client


    def make_async_client(fake: FakeTransport) -> AsyncManagementClient:
        client = AsyncManagementClient(api_key="test-key")
        client.http_client._transport = fake.async_transport  # type: ignore[attr-defined]
        return client

These helpers mutate httpx.Client / AsyncClient by assigning to the private _transport attribute, which is not a stable public API and may break across httpx versions. Prefer constructing an httpx.Client(transport=..., base_url=..., headers=...) and passing it via ManagementClient(http_client=...) (and similarly for async) so tests only use supported interfaces.
If we did this, we would no longer be testing the default httpx client behavior. We accept the tradeoff that the private-attribute approach may break in a future httpx version. These are just tests, and we will rework them if/when we have to.

    def make_sync_client(fake: FakeTransport) -> InferenceClient:
        client = InferenceClient(api_key="test-key", model_id="abc123")
        client.http_client._transport = fake.sync_transport  # type: ignore[attr-defined]
        return client


    def make_async_client(fake: FakeTransport) -> AsyncInferenceClient:
        client = AsyncInferenceClient(api_key="test-key", model_id="abc123")
        client.http_client._transport = fake.async_transport  # type: ignore[attr-defined]
        return client

These helpers mutate httpx.Client / AsyncClient by assigning to the private _transport attribute, which is not a stable public API and may break across httpx versions. Prefer constructing an httpx.Client(transport=..., base_url=..., headers=...) and passing it via InferenceClient(http_client=...) (and similarly for async) so tests only use supported interfaces.
Same as other comment

    self.close_http_client_on_close = (
        True
        if close_http_client_on_close is None
        else close_http_client_on_close
    )

This pattern seems to come up a lot in this PR - let's create a helper function for null coalescing?
I'm not following which pattern has come up a lot. Ruff's formatting does make it look like more than "user-set bool or true", but it's a single, simple statement. (I saw only one other use, unless you mean combining it with the "user-set bool or false" case below, but that's also just a single Python statement.)
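
For concreteness, the helper proposed in this thread might be as small as the sketch below. The name `coalesce` is hypothetical and nothing like it is confirmed to exist in the PR. The main thing it has to get right is preserving an explicit `False`, which a plain `value or default` would not.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def coalesce(value: Optional[T], default: T) -> T:
    """Return value unless it is None, in which case return default."""
    return default if value is None else value


# An unset option falls back to the default.
unset_result = coalesce(None, True)

# An explicit False is kept, unlike with `False or True`.
explicit_false_result = coalesce(False, True)
```

With such a helper, the assignment under discussion would read `self.close_http_client_on_close = coalesce(close_http_client_on_close, True)`. Whether that is clearer than the inline conditional is the judgment call being debated.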

            return self._options

        @property
        def http_client(self) -> httpx.Client:

I feel like we can create something like SyncClient and AsyncClient as base classes for the sync/async inference/management clients to reduce redundant code.
We can, but it doesn't save much. Only two classes are expected here, and I am more concerned about the user-facing API surface than about this bit of redundancy. You would end up with either 1) SyncClient/AsyncClient user-facing ABCs that muddy the user-facing API surface and that we'd expect no user to ever use, or 2) _SyncClient/_AsyncClient, which makes grokking and discovering the API from code a bit harder, because user-facing classes would extend non-user-facing classes that contain user-facing methods. I think we can extract it if it grows beyond a few methods with a line or two each.
What changed
- baseten.client.ManagementClient/AsyncManagementClient and baseten.client.InferenceClient/AsyncInferenceClient, but no high-level methods yet
- api property that gives access to raw, auto-generated client