Add User-Agent header to outbound requests#5
Conversation
| if http_client is None: | ||
| if http_client_override is None: | ||
| request_headers: dict[str, str] = {**(headers or {})} | ||
| # Empty api_key is an advanced opt-out from sending Authorization. |
There was a problem hiding this comment.
What is the use case for this? Testing/dev only?
There was a problem hiding this comment.
To give more full freedom. Technically forcing auth headers on people with 0 workaround can be bad practice for those that have reverse proxies (that only set header on the way upstream). I just didn't want to make API key optional for this rare use case, but it does happen in the wild for API clients like these.
There was a problem hiding this comment.
Got it, so this is against the "DUMMY_KEY" value that you'd otherwise need to set.
Though this does stand in tension with fail loud, early and actionable. Setting DUMMY_KEY must be done with clear intent and understanding. Forgetting an API key could be a common mistake. If that happens, the question is how clear would it surface to users that this is the problem and what they need to do to fix it.
There was a problem hiding this comment.
so this is against the "DUMMY_KEY" value that you'd otherwise need to set.
We were actually force-overriding until this point, there was nothing you could do short of providing your own HTTP client instance
Forgetting an API key could be a common mistake
This is why I don't accept None as a type. But yeah, absent key or bad key have the same server-side auth fail upon invocation. I could accept api_key: str | IDoNotWantAuthHeaderSentinel type of thing, but not even trying to broadcast this via type system (note empty string behavior not in docstring).
| api_key: str, | ||
| headers: Mapping[str, str] | None = None, | ||
| model_id: str | None = None, | ||
| chain_id: str | None = None, |
There was a problem hiding this comment.
I don't understand why you can specify IDs of two differently typed inference targets for one client.
My reasoning is that a user must know and in advance decide whether they want to invoke a chain or a model, and if that's so, isn't it cleaner to also select the respective inference client type to init?
This hinges now on the assumption that both chains and models always have exact semantics and types for invoking.
Was the goal here more about impl/code sharing or about not bothering the user with two distinct interfaces/types? But would the latter be even good, rather than obfuscating/misleading?
There was a problem hiding this comment.
I also think the "exactly one must be provided"-semantics is both awkward for the code we have to write for checking this at runtime.
But also client side there is no linter/type check that helps enforcing this properly before runtime.
There was a problem hiding this comment.
My reasoning is that a user must know and in advance decide whether they want to invoke a chain or a model, and if that's so, isn't it cleaner to also select the respective inference client type to init?
Cleaner to whom, us and our type checker, or the user? Which looks cleaner to you: InferenceClient(api_key=my_api_key, chain_id=my_chain_id) or InferenceClient(api_key=my_api_key, ref=ChainReference(chain_id=my_chain_id)). It is very clearly the former. However, if it is of great concern to have kwarg mutual exclusivity in the type system, we can leverage @overload to prevent both together (though this can get messy as it grows).
Was the goal here more about impl/code sharing or about not bothering the user with two distinct interfaces/types? But would the latter be even good, rather than obfuscating/misleading?
It was more to match the API/spec surface. So we have https://docs.baseten.co/reference/inference-api/, not separate APIs. The only difference is in the construction. It is more misleading to have some places at Baseten share an API surface under a common concept (inference API in this case) and some split.
There was a problem hiding this comment.
Ok I understand your reasoning. InferenceClient(api_key=my_api_key, chain_id=my_chain_id) does look clean, but I was thinking about InferenceClient(api_key=my_api_key) or InferenceClient(api_key=my_api_key, chain_id=my_chain_id, model_id=...).
@overload is actually the ideal solution here, I think you should add that.
There was a problem hiding this comment.
I can do that no prob (will do on successive PR)
What changed
Add
User-Agentheader on HTTP calls to Baseten and other minor adjustments