mirror of
https://github.com/clearml/clearml-docs
synced 2025-02-25 05:24:39 +00:00
82 lines
4.1 KiB
Markdown
---
id: overview
title: What is ClearML?
slug: /
---

# ClearML Documentation

## Overview
Welcome to the documentation for ClearML, the end-to-end platform for streamlining AI development and deployment. ClearML consists of three essential layers:

1. [**Infrastructure Control Plane**](#infrastructure-control-plane) (Cloud/On-Prem Agnostic)
2. [**AI Development Center**](#ai-development-center)
3. [**GenAI App Engine**](#genai-app-engine)

Each layer provides distinct functionality to ensure an efficient and scalable AI workflow from development to deployment.

[Image: Webapp gif]

---
## Infrastructure Control Plane

The Infrastructure Control Plane serves as the foundation of the ClearML platform. It provisions and manages compute resources, and lets administrators make that compute available through GPU-as-a-Service (GPUaaS) capabilities and no-hassle configuration.

Using the Infrastructure Control Plane, DevOps and IT teams can manage and optimize GPU resources to ensure high performance and cost efficiency.

#### Features

- **Resource Management:** Automate the allocation and management of GPU resources.
- **Workload Autoscaling:** Scale GPU resources based on workload demands.
- **Monitoring and Logging:** Monitor and log GPU utilization and performance.
- **Cost Optimization:** Consolidate cloud and on-prem compute into a seamless GPUaaS offering.
- **Deployment Flexibility:** Run your workloads on both cloud and on-premises compute.
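
To make the Workload Autoscaling idea concrete, here is a minimal Python sketch of one possible scaling rule: size the worker pool to cover the queued jobs, clamped between a minimum and a maximum. This is an illustration only, not ClearML's autoscaler. The `ScalingPolicy` class, its fields, and the `desired_workers` function are hypothetical names that do not come from the ClearML API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy; the field names are illustrative, not ClearML settings."""
    min_workers: int = 0
    max_workers: int = 8
    jobs_per_worker: int = 1  # how many queued jobs one GPU worker should absorb


def desired_workers(queued_jobs: int, policy: ScalingPolicy) -> int:
    """Return the worker count that covers the queue, clamped to the policy bounds."""
    if queued_jobs < 0:
        raise ValueError("queued_jobs must be non-negative")
    needed = math.ceil(queued_jobs / policy.jobs_per_worker)
    return max(policy.min_workers, min(policy.max_workers, needed))
```

For example, `desired_workers(5, ScalingPolicy(max_workers=4))` returns `4`, because the demand for five workers is capped at the policy maximum. An empty queue scales down to `min_workers`.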

[Image: Infrastructure control plane]

---
## AI Development Center

The AI Development Center offers a robust environment for developing, training, and testing AI models. It is designed to be cloud and on-premises agnostic, providing flexibility in deployment.

#### Features

- **Integrated Development Environment:** A comprehensive IDE for training, testing, and debugging AI models.
- **Model Training:** Scalable and distributed model training and hyperparameter optimization.
- **Data Management:** Tools for data preprocessing, management, and versioning.
- **Experiment Tracking:** Track metrics, artifacts, and logs, manage versions, and compare results.
- **Workflow Automation:** Build pipelines to formalize your workflow.
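
As a rough illustration of the Experiment Tracking feature, the sketch below reports a series of loss values and a small summary artifact through the ClearML Python SDK. It assumes the `clearml` package is installed and credentials are configured (for example with `clearml-init`). The helper names `summarize_losses` and `track_run`, the project and task names, and the metric titles are placeholders chosen for this example.

```python
def summarize_losses(losses):
    """Return (best_loss, best_iteration) for a non-empty list of per-iteration losses."""
    best_iteration = min(range(len(losses)), key=losses.__getitem__)
    return losses[best_iteration], best_iteration


def track_run(losses, project_name="examples", task_name="overview sketch"):
    """Report each loss as a scalar and upload a summary artifact to ClearML.

    The import is deferred so this module loads even without the SDK installed.
    The default project and task names are placeholders.
    """
    from clearml import Task  # requires `pip install clearml` and a configured server

    task = Task.init(project_name=project_name, task_name=task_name)
    logger = task.get_logger()
    for iteration, loss in enumerate(losses):
        logger.report_scalar(title="loss", series="train", value=loss, iteration=iteration)

    best_loss, best_iteration = summarize_losses(losses)
    task.upload_artifact(
        name="summary",
        artifact_object={"best_loss": best_loss, "best_iteration": best_iteration},
    )
    task.close()
    return best_loss, best_iteration
```

Calling `track_run([0.9, 0.4, 0.6])` in a configured environment should create a task whose scalars and artifact can then be compared against other runs in the web UI.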

[Image: AI Dev center]

---
## GenAI App Engine

The GenAI App Engine is designed to deploy large language models (LLMs) into GPU clusters and manage various AI workloads, including Retrieval-Augmented Generation (RAG) tasks. This layer also handles networking, authentication, and role-based access control (RBAC) for deployed services.

#### Features

- **LLM Deployment:** Seamlessly deploy LLMs into GPU clusters.
- **RAG Workloads:** Efficiently manage and execute RAG workloads.
- **Networking and Authentication:** Expose GenAI deployments through secure, authenticated network endpoints.
- **RBAC:** Implement RBAC to control access to deployed services.
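
The RBAC idea can be sketched in a few lines of Python: each role maps to a set of permissions, and a request is allowed if any of the caller's roles grants the permission it needs. This is a generic illustration, not ClearML's access-control implementation. The role names, the `Permission` values, and `is_allowed` are all hypothetical.

```python
from enum import Enum


class Permission(Enum):
    INVOKE = "invoke"  # call a deployed endpoint
    MANAGE = "manage"  # change or remove a deployment


# Hypothetical role table; real roles and permissions are defined by your deployment.
ROLE_PERMISSIONS = {
    "viewer": frozenset(),
    "developer": frozenset({Permission.INVOKE}),
    "admin": frozenset({Permission.INVOKE, Permission.MANAGE}),
}


def is_allowed(roles, permission):
    """Return True if any of the user's roles grants the requested permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, frozenset()) for role in roles)
```

Unknown roles grant nothing, so access is denied by default. A user holding several roles receives the union of their permissions.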

[Image: GenAI engine]

---
## Getting Started

To begin using ClearML, follow these steps:

1. **Set Up Infrastructure Control Plane:** Allocate and manage your GPU resources.
2. **Develop AI Models:** Use the AI Development Center to develop and train your models.
3. **Deploy AI Models:** Deploy your models using the GenAI App Engine.

For detailed instructions on each step, refer to the respective sections in this documentation.

---
## Support

For feature requests or bug reports, see ClearML on [GitHub](https://github.com/clearml/clearml/issues).

If you have any questions, join the discussion on the **ClearML** [Slack channel](https://joinslack.clear.ml), or post them on [Stack Overflow](https://stackoverflow.com/questions/tagged/clearml) with the **clearml** tag.

Lastly, you can always find us at [support@clearml.ai](mailto:support@clearml.ai?subject=ClearML).