Add llama.cpp model deployment app (#976)

pollfly
2024-12-01 11:43:18 +02:00
committed by GitHub
parent 0b1a98d9e6
commit 952d8acc6b
7 changed files with 125 additions and 1 deletion

@@ -38,6 +38,7 @@ Applications for deploying user interfaces for models:
Applications for deploying machine learning models as scalable, secure services:
* [**Embedding Model Deployment**](apps_embed_model_deployment.md) - Deploy embedding models as networking services over a secure endpoint (available under ClearML Enterprise Plan)
* [**Model Deployment**](apps_model_deployment.md) - Deploy LLM models as networking services over a secure endpoint (available under ClearML Enterprise Plan)
* [**llama.cpp**](apps_llama_deployment.md) - Deploy LLM models in GGUF format using [`llama.cpp`](https://github.com/ggerganov/llama.cpp) as networking services over a secure endpoint (available under ClearML Enterprise Plan)
:::info Autoscalers
Autoscaling ([AWS Autoscaler](apps_aws_autoscaler.md) and [GCP Autoscaler](apps_gcp_autoscaler.md))