Mirror of https://github.com/clearml/clearml-docs (synced 2025-06-26 18:17:44 +00:00)
Add llama.cpp model deployment app (#976)
@@ -38,6 +38,7 @@ Applications for deploying user interfaces for models:
Applications for deploying machine learning models as scalable, secure services:
* [**Embedding Model Deployment**](apps_embed_model_deployment.md) - Deploy embedding models as networking services over a secure endpoint (available under ClearML Enterprise Plan)
* [**Model Deployment**](apps_model_deployment.md) - Deploy LLM models as networking services over a secure endpoint (available under ClearML Enterprise Plan)
* [**llama.cpp**](apps_llama_deployment.md) - Deploy LLM models in GGUF format using [`llama.cpp`](https://github.com/ggerganov/llama.cpp) as networking services over a secure endpoint (available under ClearML Enterprise Plan)
:::info Autoscalers
Autoscaling ([AWS Autoscaler](apps_aws_autoscaler.md) and [GCP Autoscaler](apps_gcp_autoscaler.md))