Add llama.cpp model deployment app (#976)

2025-06-26 18:17:44 +00:00 · 2024-12-01 11:43:18 +02:00
parent 0b1a98d9e6
commit 952d8acc6b
7 changed files with 125 additions and 1 deletions
--- a/docs/webapp/applications/apps_overview.md
+++ b/docs/webapp/applications/apps_overview.md
@@ -38,6 +38,7 @@ Applications for deploying user interfaces for models:
 Applications for deploying machine learning models as scalable, secure services:
 * [**Embedding Model Deployment**](apps_embed_model_deployment.md) - Deploy embedding models as networking services over a secure endpoint (available under ClearML Enterprise Plan)
 * [**Model Deployment**](apps_model_deployment.md) - Deploy LLM models as networking services over a secure endpoint (available under ClearML Enterprise Plan)
+* [**llama.cpp**](apps_llama_deployment.md) - Deploy LLM models in GGUF format using [`llama.cpp`](https://github.com/ggerganov/llama.cpp) as networking services over a secure endpoint (available under ClearML Enterprise Plan)

 :::info Autoscalers
 Autoscaling ([AWS Autoscaler](apps_aws_autoscaler.md) and [GCP Autoscaler](apps_gcp_autoscaler.md))