doc: starting with

Timothy Jaeryang Baek 2025-04-08 14:08:22 -07:00
parent 5aaa5b67d7
commit 56b4f5b241
5 changed files with 205 additions and 1 deletion


@ -0,0 +1,128 @@
---
sidebar_position: 3
title: "🦙 Starting with Llama.cpp"
---
## Overview
Open WebUI makes it simple and flexible to connect and manage a local Llama.cpp server to run efficient, quantized language models. Whether you've compiled Llama.cpp yourself or you're using precompiled binaries, this guide will walk you through how to:
- Set up your Llama.cpp server
- Load large models locally
- Integrate with Open WebUI for a seamless interface
Let's get you started!
---
## Step 1: Install Llama.cpp
To run models with Llama.cpp, you first need the Llama.cpp server installed locally.
You can either:
- 📦 [Download prebuilt binaries](https://github.com/ggerganov/llama.cpp/releases)
- 🛠️ Or build it from source by following the [official build instructions](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md)
After installing, make sure `llama-server` is available in your local system path or take note of its location.
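Before moving on, you can optionally confirm the binary is reachable. Here's a minimal sketch in Python (it assumes your build supports the `--version` flag):
```python
# Optional check: is llama-server on the PATH?
import shutil
import subprocess

path = shutil.which("llama-server")
if path:
    print(f"Found llama-server at: {path}")
    subprocess.run([path, "--version"])  # assumes the build supports --version
else:
    print("llama-server not on PATH; note its full location for Step 3.")
```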
---
## Step 2: Download a Supported Model
You can load and run various GGUF-format quantized LLMs using Llama.cpp. One impressive example is the DeepSeek-R1 1.58-bit model optimized by UnslothAI. To download this version:
1. Visit the [Unsloth DeepSeek-R1 repository on Hugging Face](https://huggingface.co/unsloth/DeepSeek-R1-GGUF)
2. Download the 1.58-bit quantized version (around 131 GB).
Alternatively, use Python to download programmatically:
```python
# pip install huggingface_hub hf_transfer
# Optionally set HF_HUB_ENABLE_HF_TRANSFER=1 to speed up the download.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",   # source repo on Hugging Face
    local_dir="DeepSeek-R1-GGUF",         # local target directory
    allow_patterns=["*UD-IQ1_S*"],        # download only the 1.58-bit variant
)
```
This will download the model files into a directory like:
```
DeepSeek-R1-GGUF/
└── DeepSeek-R1-UD-IQ1_S/
├── DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf
├── DeepSeek-R1-UD-IQ1_S-00002-of-00003.gguf
└── DeepSeek-R1-UD-IQ1_S-00003-of-00003.gguf
```
📍 Keep track of the full path to the first GGUF file — you'll need it in Step 3.
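If you'd rather not copy the path by hand, a small helper like this (assuming the directory layout shown above) can print it for you:
```python
from pathlib import Path

# Point llama-server at the first shard; it loads the remaining shards
# of a split GGUF automatically.
model_dir = Path("DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S")
first_shard = sorted(model_dir.glob("*.gguf"))[0]
print(first_shard.resolve())  # use this full path with --model in Step 3
```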
---
## Step 3: Serve the Model with Llama.cpp
Start the model server using the `llama-server` binary. Navigate to your llama.cpp folder (e.g., `build/bin`) and run:
```bash
./llama-server \
--model /your/full/path/to/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
--port 10000 \
--ctx-size 1024 \
--n-gpu-layers 40
```
🛠️ Tweak the parameters to suit your machine:
- `--model`: path to your `.gguf` model file
- `--port`: `10000` (or choose another open port)
- `--ctx-size`: token context length (increase it if RAM allows)
- `--n-gpu-layers`: number of layers offloaded to the GPU for faster inference
Once the server is running, it exposes a local OpenAI-compatible API at:
```
http://127.0.0.1:10000
```
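To smoke-test the endpoint, you can ask it which models it is serving. Here's a minimal sketch using the `requests` package (assuming the server above is running on port 10000):
```python
import requests

# List the models the local llama-server reports on its OpenAI-compatible API.
resp = requests.get("http://127.0.0.1:10000/v1/models", timeout=10)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])
```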
---
## Step 4: Connect Llama.cpp to Open WebUI
To control and query your locally running model directly from Open WebUI:
1. Open Open WebUI in your browser
2. Go to ⚙️ Admin Settings → Connections → OpenAI Connections
3. Click Add Connection and enter:
- URL: `http://127.0.0.1:10000/v1`
(Or use `http://host.docker.internal:10000/v1` if running WebUI inside Docker)
- API Key: `none` (or simply leave it blank)
💡 Once saved, Open WebUI will begin using your local Llama.cpp server as a backend!
![Llama.cpp Connection in Open WebUI](/images/tutorials/deepseek/connection.png)
---
## Quick Tip: Try Out the Model via Chat Interface
Once connected, select the model from the Open WebUI chat menu and start interacting!
![Model Chat Preview](/images/tutorials/deepseek/response.png)
---
## You're Ready to Go!
Once configured, Open WebUI makes it easy to:
- Manage and switch between local models served by Llama.cpp
- Use the OpenAI-compatible API with no key needed (see the sketch after this list)
- Experiment with massive models like DeepSeek-R1 — right from your machine!
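As an illustration of the no-key API, here is a minimal sketch using the official `openai` Python client (`pip install openai`); the model name below is a placeholder, so substitute whatever your server reports under `/v1/models`:
```python
from openai import OpenAI

# The api_key value is arbitrary: llama-server does not check it by default.
client = OpenAI(base_url="http://127.0.0.1:10000/v1", api_key="none")

reply = client.chat.completions.create(
    model="DeepSeek-R1-UD-IQ1_S",  # placeholder; use the id from /v1/models
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(reply.choices[0].message.content)
```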
---
🚀 Have fun experimenting and building!


@ -1,6 +1,6 @@
---
sidebar_position: 1
title: "🦙 Starting With Ollama"
title: "👉 Starting With Ollama"
---
## Overview


@ -0,0 +1,76 @@
---
sidebar_position: 2
title: "🤖 Starting With OpenAI"
---
## Overview
Open WebUI makes it easy to connect and use OpenAI and other OpenAI-compatible APIs. This guide will walk you through adding your API key, setting the correct endpoint, and selecting models — so you can start chatting right away.
---
## Step 1: Get Your OpenAI API Key
To use OpenAI models (such as GPT-4 or GPT-3.5), you need an API key from a supported provider.
You can use:
- OpenAI directly (https://platform.openai.com/account/api-keys)
- Azure OpenAI
- An OpenAI-compatible service (e.g., LocalAI, FastChat, etc.)
👉 Once you have the key, copy it and keep it handy.
For most OpenAI usage, the default API base URL is:
`https://api.openai.com/v1`
Other providers may use different URLs; check your provider's documentation.
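If you want to sanity-check the key and base URL before adding them in Open WebUI, a minimal sketch with the official `openai` Python client (`pip install openai`) might look like this:
```python
from openai import OpenAI

# Replace the placeholder with your real key before running.
client = OpenAI(api_key="sk-...", base_url="https://api.openai.com/v1")

# Listing models is a cheap way to confirm the key and endpoint work.
for model in client.models.list():
    print(model.id)
```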
---
## Step 2: Add the API Connection in Open WebUI
Once Open WebUI is running:
1. Go to the ⚙️ **Admin Settings**.
2. Navigate to **Connections > OpenAI > Manage** (look for the wrench icon).
3. Click **Add New Connection**.
4. Fill in the following:
   - API URL: `https://api.openai.com/v1`
   - API Key: Paste your key here
5. Click **Save** ✅.
This securely stores your credentials and sets up the connection.
Here's what it looks like:
![OpenAI Connection Screen](/images/getting-started/quick-start/manage-openai.png)
---
## Step 3: Start Using Models
Once your connection is saved, you can start using models right inside Open WebUI.
🧠 You don't need to download any models — just select one from the Model Selector and start chatting. If a model is supported by your provider, you'll be able to use it instantly via their API.
Here's what model selection looks like:
![OpenAI Model Selector](/images/getting-started/quick-start/selector-openai.png)
Simply choose GPT-4, GPT-3.5, or any compatible model offered by your provider.
---
## All Set!
That's it! Your OpenAI-compatible API connection is ready to use.
With Open WebUI and OpenAI, you get powerful language models, an intuitive interface, and instant access to chat capabilities — no setup headaches.
If you run into issues or need additional support, visit our [help section](/troubleshooting).
Happy prompting! 🎉

Binary file added: image, 76 KiB (not shown)

Binary file added: image, 65 KiB (not shown)