Merged
2 changes: 1 addition & 1 deletion docs/content/advanced/advanced-usage.md
Original file line number Diff line number Diff line change
@@ -273,7 +273,7 @@ A list of the environment variable that tweaks parallelism is the following:
```
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
-### This actually controls wether a backend can process multiple requests or not.
+### This actually controls whether a backend can process multiple requests or not.
### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
6 changes: 3 additions & 3 deletions docs/content/features/image-generation.md
@@ -199,15 +199,15 @@ Pipelines types available:

##### Advanced: Additional parameters

-Additional arbitrarly parameters can be specified in the option field in key/value separated by `:`:
+Additional arbitrary parameters can be specified in the option field in key/value separated by `:`:

```yaml
name: animagine-xl
options:
- "cfg_scale:6"
```

-**Note**: There is no complete parameter list. Any parameter can be passed arbitrarly and is passed to the model directly as argument to the pipeline. Different pipelines/implementations support different parameters.
+**Note**: There is no complete parameter list. Any parameter can be passed arbitrarily and is passed to the model directly as argument to the pipeline. Different pipelines/implementations support different parameters.

The example above, will result in the following python code when generating images:

@@ -342,4 +342,4 @@ diffusers:
```bash
(echo -n '{"prompt": "spiderman surfing","size": "512x512","model":"txt2vid"}') |
curl -H "Content-Type: application/json" -X POST -d @- http://localhost:8080/v1/images/generations
```
```
4 changes: 2 additions & 2 deletions docs/content/features/text-generation.md
@@ -897,7 +897,7 @@ The backend will automatically download the required files in order to run the m
- `OVModelForCausalLM` requires OpenVINO IR [Text Generation](https://huggingface.co/models?library=openvino&pipeline_tag=text-generation) models from Hugging face
- `OVModelForFeatureExtraction` works with any Safetensors Transformer [Feature Extraction](https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers,safetensors) model from Huggingface (Embedding Model)

-Please note that streaming is currently not implemente in `AutoModelForCausalLM` for Intel GPU.
+Please note that streaming is currently not implemented in `AutoModelForCausalLM` for Intel GPU.
AMD GPU support is not implemented.
Although AMD CPU is not officially supported by OpenVINO there are reports that it works: YMMV.

@@ -1008,4 +1008,4 @@ template:

completion: |
{{.Input}}
```
```
2 changes: 1 addition & 1 deletion docs/content/whats-new.md
@@ -105,7 +105,7 @@ It is now possible for single-devices with one GPU to specify `--single-active-b

#### Resources management

-Thanks to the continous community efforts (another cool contribution from {{< github "dave-gray101" >}} ) now it's possible to shutdown a backend programmatically via the API.
+Thanks to the continuous community efforts (another cool contribution from {{< github "dave-gray101" >}} ) now it's possible to shutdown a backend programmatically via the API.
There is an ongoing effort in the community to better handling of resources. See also the [🔥Roadmap](https://localai.io/#-hot-topics--roadmap).

#### New how-to section