A simple guide to your own AI assistant
Introduction
I am an exchange student from Finland. I came to Tralios IT GmbH in Germany for my on-the-job training. My main task at Tralios was to learn about different AI models and how to utilize them locally for the company. This blog post is my small guide for people who want to try out AI models locally, for their company or for personal use.
Here are a few examples of what you can do with the models:
Model used: Deepseek-r1:8b
Prompt given: What is 123 multiplied by 456?
Expected answer: 56 088
[Screenshot of the model's answer]
Model used: Gemma3:4b
Prompt given: Can you give me a few examples for an email to a customer, the email should be based on these 4 requirements. 1. The email should be formal but still friendly. 2. The email should be under 10 sentences. 3. The email is about a power outage coming next Tuesday. 4. The power outage lasts from 8 am until 11 am.
Expected answer: 2 to 3 example emails with different tones, such as friendly and formal.
[Screenshot of the model's answer]
Model used: Llama3.2:3b
Prompt given: Jack earns $18 per hour and works 30 hours a week. Layla earns $30 per hour and works 17 hours a week. Who earns more money in one week?
Expected answer: Jack ($18 × 30 = $540 versus $30 × 17 = $510)
[Screenshot of the model's answer]
What You’ll Learn
- How to set up a basic AI environment on your computer.
- How to utilize AI models like Llama and Mistral running on your computer.
- Why this is useful for personal and business purposes.
How does it work?
Imagine a package that has everything you need to work with inside it. That’s what we are trying to achieve here with AI. We are taking different AI models and putting them into a package that’s ready to use, without needing complicated setups.
The Tools We’re Using
- Ollama: Used to download and run AI models on your machine.
- OpenWebUI: A web interface for the AI models that runs in your browser. There you can chat with the models directly and use multiple models at the same time.
- NVIDIA Container Toolkit: This tool lets the containers use your graphics card (GPU) for AI processing, making the models run much faster.
- Docker: Runs in the background and manages the containers that package all of these tools.
Why should you try this?
- Privacy: You’re in control of your data. You don’t need to share information with third-party AI companies.
- Customization: You can try out different AI models and customize them to suit your specific needs.
- Cost-Effective: You can run AI models on your own hardware instead of paying for cloud services.
- Future of AI: AI will be a part of our future, and knowing how it works might prove useful.
Important Notes Before You Start
- Operating System: You’ll need Ubuntu 24.04 installed.
- Graphics card: You’ll need an NVIDIA graphics card. We recommend at least 4-8 GB of VRAM (VRAM is your graphics card’s memory). You can find a list of supported GPUs here
- Compatibility: This guide was made using an NVIDIA graphics card, so it might not work with graphics cards from other brands.
- Updates: AI technology advances quickly so this guide might not be 100% up to date.
Initial Steps: Go to Install Docker for Ubuntu and follow the three instructions for setting up the apt repository.
This is the foundation – ensure you fully complete this step. If you encounter any errors during the apt commands (like ‘permission denied’), double-check your user’s sudo privileges.
Installation successful: Once successful, you should see a message from the hello-world container starting with ‘Hello from Docker!’ – this confirms Docker is installed correctly. If you don’t, review the output for error messages and consult the Docker troubleshooting guide.
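If the install guide didn’t already run the test for you, you can start the verification container yourself. This is Docker’s standard smoke test and safe to repeat:
# Runs Docker's official test image; it prints a greeting and exits
sudo docker run hello-world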
After installation: Open a new terminal window. Type
docker --version
This command checks if Docker is accessible.
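The rest of this guide relies on the Compose plugin as well, so it is worth verifying it the same way. This assumes you installed the docker-compose-plugin package as part of Docker’s apt instructions:
# Verifies the Docker Compose v2 plugin is available
docker compose version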
Context is Key: Before starting, ensure you have the latest NVIDIA drivers installed for your GPU. This is critical for the toolkit to function correctly.
Detailed Instructions: Go to NVIDIA container toolkit install guide and follow the Ubuntu/Debian instructions 1 through 3. Then scroll down to Configuring Docker and follow the first two steps.
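For orientation, at the time of writing those two ‘Configuring Docker’ steps boil down to the following commands; run them only after the toolkit itself is installed:
# Registers the NVIDIA runtime in Docker's configuration
sudo nvidia-ctk runtime configure --runtime=docker
# Restarts Docker so it picks up the new runtime
sudo systemctl restart docker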
Verification: After configuring Docker, try this command in terminal:
docker run -it --rm --gpus all ubuntu nvidia-smi
This command attempts to run a basic Ubuntu container and access your GPU. If this fails, go through the instructions again. The most common issue is a missing or incorrect NVIDIA driver.
Troubleshooting GPU Access: If you get an error related to GPU access, double-check that the NVIDIA drivers are properly installed and that the nvidia-smi command works on the host (nvidia-smi should display information about your GPU).
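A quick way to narrow the problem down is to test the driver and the Docker integration separately:
# On the host: should print a table with your GPU, driver version and memory
nvidia-smi
# Should mention 'nvidia' among Docker's runtimes once the toolkit is configured
docker info | grep -i nvidia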
Note: For this step we will use the default editor of Ubuntu called nano.
File Structure: Create a folder with a name of your choice, for example openwebui-ollama, somewhere convenient. For example
sudo mkdir /opt/openwebui-ollama
Go inside that folder with
cd /opt/openwebui-ollama
Create a file named docker-compose.yml with the command sudo nano docker-compose.yml
File content: You can copy the content that needs to be inserted inside docker-compose.yml from the Docker-compose.yml tab below.
Secret Key: The WEBUI_SECRET_KEY inside the file is a security setting. Change ‘MySecretKey’ to a stronger password.
Saving the file: After making the necessary changes to the file inside the nano editor save and exit the editor.
Save is Ctrl+O and exit is Ctrl+X.
Running Compose: Navigate to the folder containing your docker-compose.yml and run these commands:
docker compose pull
This downloads the container images defined in docker-compose.yml.
docker compose up --detach
This runs the containers in the background.
You can copy this and then paste it inside your docker-compose.yml file
services:
  open-webui:
    ports:
      - 8080:8080
    volumes:
      - open-webui:/app/backend/data
    container_name: open-webui
    restart: unless-stopped
    image: ghcr.io/open-webui/open-webui
    environment:
      - WEBUI_SECRET_KEY=MySecretKey
      - OLLAMA_BASE_URL=http://ollama:11434
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
    volumes:
      - ollama:/root/.ollama
    ports:
      - 11434:11434
    container_name: ollama
    image: ollama/ollama
    restart: unless-stopped

volumes:
  ollama: {}
  open-webui: {}
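Once both commands have finished, you can check that the two containers are actually running; the container names below come from the compose file above:
# Lists the services from docker-compose.yml and their state
docker compose ps
# Shows the most recent log lines from the OpenWebUI container
docker logs open-webui --tail 20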
Accessing OpenWebUI: Once the containers are running, open your web browser and go to:
http://[server-IP-address]:8080
Replace [server-IP-address] with the IP address of the machine running your containers.
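If you don’t know the machine’s IP address, you can look it up on the server itself (if everything runs on your own desktop, http://localhost:8080 works too):
# Prints the IP addresses assigned to this machine
hostname -I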
Admin Panel Access: Click your profile picture (top right corner) and select ‘Admin Panel’. This opens the interface for managing your OpenWebUI instance.
Settings & Connections: Navigate to ‘Settings’ -> ‘Connections’ -> ‘Manage Ollama API Connections’. Here you'll configure the connection to your Ollama instance.
URL Configuration: Enter http://ollama:11434. The hostname ollama works here because both containers run on the same Docker Compose network. Make sure that your Ollama server is running and accessible from the machine running OpenWebUI.
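You can also confirm from the host that Ollama is responding. Since the compose file publishes port 11434, this assumes you kept that port mapping:
# Asks the Ollama API for its version; it should return a small JSON object
curl http://localhost:11434/api/version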
How to get and use different models in OpenWebUI
Here we learn how to download models from Ollama into your OpenWebUI site. Here is a link to ollama models. Try to learn a bit about what the different models can do by browsing them.
When you click a model name on the site, you can see what they are designed for and what parameter sizes are available.
The b in the parameter count stands for billions of parameters, so an 8b model has roughly 8 billion parameters. Bigger models have generally been trained on more data and can handle more demanding requests, but they also demand more from your hardware; if your hardware is not sufficient for the model, it will run significantly slower.
After finding your desired model, go back to your OpenWebUI page.
In the upper left corner there is a model selector with an arrow pointing down; click it and type the desired model name into the search field.
Note: The name has to be precise, like Gemma3:4b or llama3.2:3b. Click the option to pull the model from Ollama.com to initiate the download. The first time you pull a model, it will take a significant amount of time and disk space. After the download you can select the model and ask it something.
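If you prefer the terminal, you can also pull models directly through the Ollama container; this assumes the container name ollama from the compose file:
# Downloads a model into the ollama volume
docker exec -it ollama ollama pull llama3.2:3b
# Lists the models that are now available locally
docker exec -it ollama ollama list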
Tips for the future
- Stopping/Restarting Services: To stop or restart the services, navigate to the folder containing docker-compose.yml.
In order to stop the server run:
cd /opt/openwebui-ollama
docker compose down
In order to restart the server run:
cd /opt/openwebui-ollama
docker compose restart
- Updating services: You might need to update the services from time to time. There should be a blue text notification about updating in the lower right corner of your WebUI site. To update the containers, navigate to the containing folder, download the updated images, then shut the services down and start them again:
cd /opt/openwebui-ollama
docker compose pull
docker compose down
docker compose up --detach
Summary about what you’ve done
Now you’ve learned how to install the necessary tools for creating your own AI environment.
Now you can use what you’ve learned for yourself or your company and try out different AI models for various situations.
If you need help setting it up, or just don’t want to do it alone, contact our team and we will do our best to help you.