feat: InferenceParams specific for backend (#126)

Merged: Madzionator merged 11 commits into main on Mar 19, 2026

Conversation
Introduce IProviderInferenceParams and provider-specific parameter types (LocalInferenceParams, AnthropicParams, OpenAiParams, GeminiParams, GroqCloudParams, DeepSeekParams, OllamaParams, XaiParams). Replace usages of the old InferenceParams with IProviderInferenceParams/LocalInferenceParams across the codebase: Chat now stores ProviderParams (with LocalParams and InferenceGrammar helpers), mappers updated to handle LocalInferenceParams, interfaces and AgentService adjusted to accept IProviderInferenceParams, and tests updated accordingly. LLM and OpenAI-compatible services now apply provider-specific settings (temperature, max_tokens, top_p, etc.) via ApplyProviderParams, and grammar handling was unified to use Chat.InferenceGrammar. Examples updated to pass LocalInferenceParams. These changes enable multi-backend provider support and provider-specific configuration handling.
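As a sketch of the pattern this commit describes (the concrete property names, the BackendType enum members, and the ApplyProviderParams signature are assumptions based on the summary, not taken from the diff), the contract and the null-skipping mapper might look like:

```csharp
using System.Text.Json.Nodes;

// Assumed minimal backend enum; the real one covers all providers.
public enum BackendType { Self, OpenAi, Anthropic, Gemini, GroqCloud, DeepSeek, Ollama, Xai }

// Marker contract: each provider defines its own parameter fields.
public interface IProviderInferenceParams
{
    BackendType Backend { get; }
}

// Illustrative provider-specific type; fields are nullable so that
// unset values can be omitted from the outgoing request.
public class OpenAiParams : IProviderInferenceParams
{
    public BackendType Backend => BackendType.OpenAi;
    public float? Temperature { get; set; }
    public int? MaxTokens { get; set; }
    public float? TopP { get; set; }
}

public static class OpenAiParamsMapper
{
    // ApplyProviderParams maps typed params onto the JSON request body,
    // emitting only the fields the user actually set.
    public static void ApplyProviderParams(JsonObject body, OpenAiParams p)
    {
        if (p.Temperature is not null) body["temperature"] = p.Temperature;
        if (p.MaxTokens is not null) body["max_tokens"] = p.MaxTokens;
        if (p.TopP is not null) body["top_p"] = p.TopP;
    }
}
```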
Introduce InvalidProviderParamsException for clearer bad-request errors when the provider params type does not match. Add an ExpectedParamsType contract to OpenAiCompatibleService and implement it in several providers (Gemini, GroqCloud, Ollama, OpenAi, Xai, DeepSeek). Perform runtime type checks in OpenAiCompatibleService, LLMService (local inference), and AnthropicService to throw the new exception when chat.ProviderParams is the wrong type. Misc: remove TopK from GeminiParams and stop emitting top_k in Gemini requests; preserve Chat.Backend if already set; minor cleanup (pragmas/usings/whitespace).
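A minimal sketch of the runtime check this commit adds (the class shapes and the EnsureParamsType helper name are assumptions inferred from the summary):

```csharp
public class InvalidProviderParamsException : Exception
{
    public InvalidProviderParamsException(Type expected, Type? actual)
        : base($"Expected provider params of type {expected.Name}, got {actual?.Name ?? "none"}.")
    {
    }
}

public abstract class OpenAiCompatibleService
{
    // Each provider (Gemini, GroqCloud, Ollama, OpenAi, Xai, DeepSeek)
    // implements this to declare the params type it accepts.
    protected abstract Type ExpectedParamsType { get; }

    // Called before building a request; rejects mismatched params early
    // so the caller gets a clear bad-request error instead of a failed API call.
    protected void EnsureParamsType(IProviderInferenceParams? providerParams)
    {
        if (providerParams is not null && providerParams.GetType() != ExpectedParamsType)
            throw new InvalidProviderParamsException(ExpectedParamsType, providerParams.GetType());
    }
}
```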
Add a ProviderParamsTests integration test suite that verifies provider-specific inference parameters for OpenAI, Anthropic, Gemini, DeepSeek, GroqCloud, Xai, local (Gemma2), and Ollama models. Each positive test checks that the model returns the expected answer when custom params are supplied; additional negative tests assert that InvalidProviderParamsException is thrown when the wrong provider params are used. Add the Xunit.SkippableFact package reference to enable conditional skipping based on environment (API keys, local model file, or Ollama availability).
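The skip-on-missing-environment pattern with Xunit.SkippableFact could look like this (the test names, the SendWithParams helper, and the environment variable are illustrative, not taken from the suite):

```csharp
using Xunit;

public class ProviderParamsTests
{
    [SkippableFact]
    public async Task OpenAi_CustomParams_ReturnsExpectedAnswer()
    {
        // Skip (rather than fail) when no API key is available in this environment.
        Skip.If(string.IsNullOrEmpty(Environment.GetEnvironmentVariable("OPENAI_API_KEY")),
            "OPENAI_API_KEY not set");

        // ... build a chat with OpenAiParams set and assert on the model's answer ...
    }

    [Fact]
    public async Task OpenAi_WrongParamsType_Throws()
    {
        // Negative case: passing Anthropic params to the OpenAI service must throw.
        await Assert.ThrowsAsync<InvalidProviderParamsException>(
            () => SendWithParams(new AnthropicParams())); // hypothetical helper
    }
}
```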
Introduce ProviderParamsFactory to create IProviderInferenceParams based on BackendType. Update Home.razor to set the backend and inference params on the new chat context (casting to IChatConfigurationBuilder) before preserving message history, ensuring the correct provider settings are applied when switching models. Also simplify ChatService by using the null-coalescing assignment (chat.Backend ??= settings.BackendType) to default the backend.
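A sketch of the factory and the backend default described above (the switch arms mirror the provider types already named; the exact enum members and the surrounding Chat/Settings types are assumptions):

```csharp
public static class ProviderParamsFactory
{
    // Creates default params for the given backend so callers never
    // have to special-case provider types themselves.
    public static IProviderInferenceParams Create(BackendType backend) => backend switch
    {
        BackendType.OpenAi    => new OpenAiParams(),
        BackendType.Anthropic => new AnthropicParams(),
        BackendType.Ollama    => new OllamaParams(),
        // ... remaining providers ...
        _                     => new LocalInferenceParams(),
    };
}

public class ChatService
{
    // ??= assigns the settings default only when Backend is still null,
    // preserving a backend that was already set on the chat.
    public void EnsureBackend(Chat chat, Settings settings)
        => chat.Backend ??= settings.BackendType;
}
```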
Centralize message, image, and tool handling in ChatHelper and update services to use it. Add ServiceConstants properties for ToolCalls/ToolCallId/ToolName and replace hardcoded property keys in LLMService/OpenAiCompatibleService with ServiceConstants.Properties. Move image extraction, message merging, and message-array building logic out of the Anthropic/OpenAi services into ChatHelper, make BuildAnthropicRequestBody asynchronous, and use ChatHelper.BuildMessagesArray. Remove duplicated helper methods and consolidate image MIME detection and message content construction.
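The consolidation might look like the following (the constant values and the ChatHelper call are assumptions; the actual keys come from the diff, not from this sketch):

```csharp
public static class ServiceConstants
{
    public static class Properties
    {
        // Keys previously hardcoded in LLMService / OpenAiCompatibleService;
        // the string values shown here are guesses at the wire-format names.
        public const string ToolCalls  = "tool_calls";
        public const string ToolCallId = "tool_call_id";
        public const string ToolName   = "tool_name";
    }
}

// Usage: both Anthropic and OpenAI request builders now delegate to the
// shared helper instead of each building its own message array:
//
//   var messages = await ChatHelper.BuildMessagesArray(chat);
//   node[ServiceConstants.Properties.ToolCalls] = toolCallsNode;
```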
wisedev-pstach previously approved these changes (Mar 19, 2026)
wisedev-pstach approved these changes (Mar 19, 2026)
This pull request refactors how inference parameters are handled for chat and agent contexts, moving from a generic InferenceParams class to a backend-specific interface and classes. Each cloud service now receives and validates its own typed inference parameters, which are sent in API requests only when explicitly set by the user.

Key changes

- IBackendInferenceParams interface with BackendType Backend as the only contract; each provider defines its own fields independently
- LocalInferenceParams (Self), OpenAiInferenceParams, DeepSeekInferenceParams, GeminiInferenceParams, AnthropicInferenceParams, GroqCloudInferenceParams, XaiInferenceParams, OllamaInferenceParams
- null means "don't send, let the API use its own default". This avoids errors with models that reject certain parameters (e.g. OpenAI reasoning models rejecting top_p)
- Dictionary<string, object>? AdditionalParams on all param classes: a catch-all for provider-specific parameters not covered by typed properties (e.g. max_completion_tokens for newer OpenAI models)
- BackendParamsFactory creates default params for a given BackendType
- ApplyBackendParams() template method in OpenAiCompatibleService; each service override maps its typed params to API request body fields, skipping nulls
- InvalidBackendParamsException is thrown when the wrong params type is passed (e.g. OpenAiInferenceParams to the DeepSeek service)
- Chat.BackendParams defaults to null; ChatService auto-creates the correct params via the factory when not explicitly set
- InitializeChatContext() now sets WithBackend() + WithInferenceParams() based on the selected backend

What stays unchanged
- MemoryParams: shared across all providers (RAG/kernel memory)
- LocalInferenceParams: keeps concrete default values (llama.cpp needs them); only renamed from InferenceParams
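As an illustration of the null-as-unset convention combined with the AdditionalParams catch-all described in the key changes (the property names are assumptions consistent with that list):

```csharp
var inferenceParams = new OpenAiInferenceParams
{
    // Only temperature is sent; unset (null) fields are omitted from the
    // request body, so the API falls back to its own defaults.
    Temperature = 0.2f,

    AdditionalParams = new Dictionary<string, object>
    {
        // Catch-all for fields without a typed property, e.g. the
        // max_completion_tokens parameter used by newer OpenAI models.
        ["max_completion_tokens"] = 1024,
    },
};
```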