Merge branch 'open-webui:main' into main
.github/workflows/gh-pages.yml (new file, +55 lines)
@@ -0,0 +1,55 @@
---
name: Deploy site to Pages

on:
  # Runs on pushes targeting the default branch
  push:
    branches: ["main"]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: read
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  # Build job
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version-file: ".node-version"
          cache: npm
      - name: Install dependencies
        run: npm ci
      - name: Build
        run: npm run build
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: ./build

  # Deployment job
  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
docs/tutorials/evaluation/index.mdx (new file, +126 lines)
@@ -0,0 +1,126 @@
---
sidebar_position: 2
title: "📝 Evaluation"
---

import { TopBanners } from "@site/src/components/TopBanners";

<TopBanners />

## Why Should I Evaluate Models?

Meet **Alex**, a machine learning engineer at a mid-sized company. Alex knows there are numerous AI models out there—GPTs, LLaMA, and many more—but which one works best for the job at hand? They all sound impressive on paper, but Alex can’t just rely on public leaderboards. These models perform differently depending on the context, and some models may have been trained on the evaluation dataset (sneaky!). Plus, the way these models write can sometimes feel… off.

That's where Open WebUI comes in. It gives Alex and their team an easy way to evaluate models based on their actual needs. No convoluted math. No heavy lifting. Just thumbs up or thumbs down while interacting with the models.

### TL;DR

- **Why evaluations matter**: There are too many models, not all of them fit your specific needs, and general public leaderboards can't always be trusted.
- **How to solve it**: Open WebUI offers a built-in evaluation system. Use a thumbs up/down to rate model responses.
- **What happens behind the scenes**: Ratings adjust your personalized leaderboard, and snapshots from rated chats will be used for future model fine-tuning!
- **Evaluation options**:
  - **Arena Model**: Randomly selects models for you to compare.
  - **Normal Interaction**: Just chat like usual and rate the responses.

---

### Why Is Public Evaluation Not Enough?

- Public leaderboards aren’t tailored to **your** specific use case.
- Some models are trained on evaluation datasets, affecting the fairness of the results.
- A model may perform well overall, but its communication style or responses just don’t fit the “vibe” you want.

### The Solution: Personalized Evaluation with Open WebUI

Open WebUI has a built-in evaluation feature that lets you and your team discover the model best suited for your particular needs—all while interacting with the models.

How does it work? Simple!

- **During chats**, leave a thumbs up if you like a response, or a thumbs down if you don’t. If the message has a **sibling message** (like a regenerated response or part of a side-by-side model comparison), you’re contributing to your **personal leaderboard**.
- **Leaderboards** are easily accessible in the Admin section, helping you track which models are performing best according to your team.

One cool feature? **Whenever you rate a response**, the system captures a **snapshot of that conversation**, which will later be used to refine models or even power future model training. (Note that this is still being developed!)

---

### Two Ways to Evaluate an AI Model

Open WebUI provides two straightforward approaches for evaluating AI models.

### **1. Arena Model**

The **Arena Model** randomly selects from a pool of available models, keeping the evaluation fair and unbiased. This addresses a flaw of manual comparison and keeps the test **ecologically valid**: because you don’t know which model is responding, you can’t knowingly or unknowingly favor one of them.

How to use it:
- Select a model from the Arena Model selector.
- Use it like you normally would, but now you’re in “arena mode.”

For your feedback to affect the leaderboard, you need what’s called a **sibling message**. What's a sibling message? It’s simply any alternative response generated for the same query (think of message regenerations, or multiple models answering side by side). This way, you’re comparing responses **head-to-head**.

- **Scoring tip**: When you give one response a thumbs up, the other automatically gets a thumbs down. So be mindful and only upvote the message you believe is genuinely the best!
- Once you rate the responses, you can check out the leaderboard to see how models are stacking up.
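
To make the head-to-head idea concrete, here is a minimal sketch of how a rated sibling pair could be represented, with the reciprocal thumbs-down applied automatically. The type and function names are illustrative assumptions for this tutorial, not Open WebUI's actual data model.

```ts
// Hypothetical shape of one head-to-head comparison between sibling messages.
// Names are illustrative only; they do not mirror Open WebUI's internal schema.
interface SiblingComparison {
  prompt: string;          // the shared user query
  responses: {
    model: string;         // which model produced this response
    content: string;
    rating: number;        // 1 = thumbs up, -1 = thumbs down, 0 = unrated
  }[];
}

// Upvoting one sibling implicitly downvotes the other(s).
function rateSibling(comparison: SiblingComparison, winnerIndex: number): SiblingComparison {
  return {
    ...comparison,
    responses: comparison.responses.map((response, i) => ({
      ...response,
      rating: i === winnerIndex ? 1 : -1,
    })),
  };
}
```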

Here’s a sneak peek at how the Arena Model interface works:

![arena eval](/img/evaluation/arena.png)

Need more depth? You can even replicate a [**Chatbot Arena**](https://lmarena.ai/)-style setup!

![arena eval](/img/evaluation/arena-many.png)

### **2. Normal Interaction**

No need to switch to “arena mode” if you don't want to. You can use Open WebUI normally and rate the AI model responses as you would in everyday operations. Just give the model responses a thumbs up or down whenever you feel like it. However, **if you want your feedback to be used for ranking on the leaderboard**, you'll need to **swap out the model and interact with a different one**. This ensures there's a **sibling response** to compare against – only comparisons between two different models will influence the rankings.

For instance, this is how you can rate during a normal interaction:

![rate interaction](/img/evaluation/normal.png)

And here's an example of setting up a multi-model comparison, similar to an arena:

![multi-model comparison](/img/evaluation/normal-many.png)

---

## Leaderboard

After rating, check out the **Leaderboard** under the Admin Panel. This is where you’ll visually see how models are performing, ranked using an **Elo rating system** (think chess rankings!). You’ll get a real view of which models are truly standing out during the evaluations.
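
For intuition, here is a minimal sketch of how a single head-to-head result can move two Elo ratings. The K-factor and starting rating below are common defaults chosen for illustration; they are not necessarily the values Open WebUI uses internally.

```ts
// Minimal Elo update for one comparison: the upvoted response "wins",
// the sibling that received the implicit thumbs down "loses".
const K = 32;               // how strongly a single result moves the ratings
const START_RATING = 1000;  // rating assumed for a model with no feedback yet

function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

function updateElo(winner: number, loser: number): [number, number] {
  const expectedWin = expectedScore(winner, loser);
  return [
    winner + K * (1 - expectedWin),       // winner gains rating
    loser + K * (0 - (1 - expectedWin)),  // loser gives up the same amount
  ];
}

// Example: two models start equal; a single upvote nudges them apart.
const [a, b] = updateElo(START_RATING, START_RATING); // ≈ [1016, 984]
console.log(a, b);
```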

This is a sample leaderboard layout:

![Leaderboard](/img/evaluation/leaderboard.png)

### Topic-Based Reranking

When you rate chats, you can **tag them by topic** for more granular insights. This is especially useful if you’re working in different domains like **customer service, creative writing, technical support**, etc.

#### Automatic Tagging

Open WebUI tries to **automatically tag chats** based on the conversation topic. However, depending on the model you're using, the automatic tagging feature might **sometimes fail** or misinterpret the conversation. When this happens, it’s best practice to **manually tag your chats** to ensure the feedback is accurate.

- **How to manually tag**: When you rate a response, you'll have the option to add your own tags based on the conversation's context.

Don't skip this! Tagging is super powerful because it allows you to **re-rank models based on specific topics**. For instance, you might want to see which model performs best for answering technical support questions versus general customer inquiries.
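
Conceptually, topic-based re-ranking just recomputes the ratings from the subset of comparisons that carry a given tag. Here is a rough sketch that reuses the hypothetical `SiblingComparison` and `updateElo` helpers from the earlier snippets (again, illustrative only):

```ts
// Recompute ratings using only the comparisons tagged with a given topic.
// Builds on the illustrative SiblingComparison / updateElo sketches above.
interface TaggedComparison extends SiblingComparison {
  tags: string[];  // e.g. ["technical support"]
}

function rerankByTopic(history: TaggedComparison[], topic: string): Map<string, number> {
  const ratings = new Map<string, number>();
  const ratingOf = (model: string) => ratings.get(model) ?? START_RATING;

  for (const comparison of history) {
    if (!comparison.tags.includes(topic)) continue;
    const winner = comparison.responses.find((r) => r.rating === 1);
    const loser = comparison.responses.find((r) => r.rating === -1);
    if (!winner || !loser) continue;

    const [w, l] = updateElo(ratingOf(winner.model), ratingOf(loser.model));
    ratings.set(winner.model, w);
    ratings.set(loser.model, l);
  }
  return ratings;
}
```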

Here’s an example of how re-ranking looks:

![Reranked Leaderboard](/img/evaluation/leaderboard-reranked.png)

---

### Side Note: Chat Snapshots for Model Fine-Tuning

Whenever you rate a model’s response, Open WebUI *captures a snapshot of that chat*. These snapshots can eventually be used to **fine-tune your own models**—so your evaluations feed into the continuous improvement of the AI.

*(Stay tuned for more updates on this feature; it's actively being developed!)*

---

## Summary

**In a nutshell**, Open WebUI’s evaluation system has two clear goals:
1. Help you **easily compare models**.
2. Ultimately, find the model that meshes best with your individual needs.

At its heart, the system is all about making AI model evaluation **simple, transparent, and customizable** for every user. Whether it's through the Arena Model or Normal Chat Interaction, **you’re in full control of determining which AI model works best for your specific use case**!

**As always**, all of your data stays securely on **your instance**, and nothing is shared unless you specifically **opt in to community sharing**. Your privacy and data autonomy are always prioritized.
docs/tutorials/tips/contributing-tutorial.md (new file, +84 lines)
@@ -0,0 +1,84 @@
---
sidebar_position: 2
title: "Contributing Tutorials"
---

:::warning
This tutorial is a community contribution and is not supported by the OpenWebUI team. It serves only as a demonstration on how to customize OpenWebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
:::

# Contributing Tutorials

We appreciate your interest in contributing tutorials to the Open WebUI documentation. Follow the steps below to set up your environment and submit your tutorial.

## Steps

1. **Fork the `open-webui/docs` GitHub Repository**

   - Navigate to the [Open WebUI Docs Repository](https://github.com/open-webui/docs) on GitHub.
   - Click the **Fork** button at the top-right corner to create a copy under your GitHub account.

2. **Configure GitHub Environment Variables**

   - In your forked repository, go to **Settings** > **Secrets and variables** > **Actions** > **Variables**.
   - Add the following environment variables:
     - `BASE_URL` set to `/docs` (or your chosen base URL for the fork).
     - `SITE_URL` set to `https://<your-github-username>.github.io/`.
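   - A sketch of how these variables can be consumed by the site build is shown after the steps below.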

3. **Enable GitHub Actions**

   - In your forked repository, navigate to the **Actions** tab.
   - If prompted, enable GitHub Actions by following the on-screen instructions.

4. **Enable GitHub Pages**

   - Go to **Settings** > **Pages** in your forked repository.
   - Under **Source**, select the branch you want to deploy (e.g., `main`) and the folder (e.g., `/docs`).
   - Click **Save** to enable GitHub Pages.

5. **Run the `gh-pages` GitHub Workflow**

   - In the **Actions** tab, locate the `gh-pages` workflow.
   - Trigger the workflow manually if necessary, or it may run automatically based on your setup.

6. **Browse to Your Forked Copy**

   - Visit `https://<your-github-username>.github.io/docs` to view your forked documentation.

7. **Draft Your Changes**

   - In your forked repository, navigate to the appropriate directory (e.g., `docs/tutorials/`).
   - Create a new markdown file for your tutorial or edit existing ones.
   - Ensure that your tutorial includes the unsupported warning banner.

8. **Submit a Pull Request**

   - Once your tutorial is ready, commit your changes to your forked repository.
   - Navigate to the original `open-webui/docs` repository.
   - Click **New Pull Request** and select your fork and branch as the source.
   - Provide a descriptive title and description for your PR.
   - Submit the pull request for review.
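
The `BASE_URL` and `SITE_URL` variables from step 2 only take effect if the build actually reads them. The sketch below shows one common way a Docusaurus site consumes such values; whether this repository uses these exact environment variables and config fields is an assumption, so check its actual `docusaurus.config` and workflow before relying on it.

```ts
// docusaurus.config.ts (sketch): pick up the Pages URL and base path from
// environment variables. For the gh-pages workflow to supply them, its build
// step would need to expose the repository variables, e.g.:
//   env:
//     BASE_URL: ${{ vars.BASE_URL }}
//     SITE_URL: ${{ vars.SITE_URL }}
import type { Config } from "@docusaurus/types";

const config: Config = {
  title: "Open WebUI Docs",
  url: process.env.SITE_URL ?? "https://docs.openwebui.com",
  baseUrl: process.env.BASE_URL ?? "/",
  // ...the rest of the site configuration
};

export default config;
```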

## Important

Community-contributed tutorials must include the following:
```
:::warning
This tutorial is a community contribution and is not supported by the OpenWebUI team. It serves only as a demonstration on how to customize OpenWebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
:::
```

---

:::tip How to Test Docusaurus Locally
You can test your Docusaurus site locally with the following commands:

```bash
npm install # Install dependencies
npm run build # Build the site for production
```

This will help you catch any issues before deploying.
:::

---
BIN  static/img/evaluation/arena-many.png  (new file, 678 KiB)
BIN  static/img/evaluation/arena.png  (new file, 345 KiB)
BIN  static/img/evaluation/leaderboard-reranked.png  (new file, 348 KiB)
BIN  static/img/evaluation/leaderboard.png  (new file, 349 KiB)
BIN  static/img/evaluation/normal-many.png  (new file, 408 KiB)
BIN  static/img/evaluation/normal.png  (new file, 437 KiB)
BIN  static/img/evaluation/rate.png  (new file, 279 KiB)