host ALL your AI locally

3 May 2024 · 24:19

TLDR: The video walks through building a local AI server named Terry, which the creator built for his own use and also intends to share with his daughters. The server offers a GUI with a chat interface, multiple AI models, and optional Stable Diffusion integration for image generation. The setup is demonstrated on a laptop, suggesting that the viewer's own computer could suffice. The creator praises the server for its speed, customization, and privacy, and shows how to restrict what a model will do so it can be used for schoolwork without the risk of misuse. The video guides viewers through the setup process, including installing software such as Ollama and Docker and handling the prerequisites for Stable Diffusion. It also demonstrates generating images, chatting with models, and further customization options.


  • 🤖 The video is about building a personal AI server named Terry, which can run AI models locally with a GUI and chat interface.
  • 🚀 The server is highly customizable, fast, and private, which is important for the creator's intention to let his daughters use it for school without the risk of misuse.
  • 💻 A high-powered computer is not strictly necessary; a simple laptop can suffice for running a local AI server.
  • 🧠 The AI server uses a powerful AMD Ryzen 9 7950X processor and 128GB of DDR5 memory for optimal performance.
  • 🎛️ The server includes features like chat histories, multiple models, and the ability to add Stable Diffusion for image generation.
  • 📚 The creator demonstrates integrating the AI server with a notes application called Obsidian, allowing for an AI chat interface within the app.
  • 📡 The server can be controlled and customized to restrict certain actions or content, ensuring that it's used appropriately.
  • 🔧 The video provides a step-by-step guide on setting up the AI server, including installing necessary software like Ollama and Docker.
  • 🖼️ Stable Diffusion is showcased for its ability to generate images locally, offering a fast and interactive experience.
  • 🔗 Other users on the network can reach the AI server through its IP address, and admin controls are in place for user management.
  • 📝 The Open Web UI is highlighted as a user-friendly interface for interacting with the AI server, allowing for chat, file uploads, and model switching.
  • ⚙️ The video concludes with additional features like document integration and the ability to create custom models with specific restrictions.

Q & A

  • What was the primary motivation behind building the AI server?

    -The primary motivation was to run all AI locally for personal use, with a focus on having a private and customizable system that could later be introduced to the builder's daughters for school assistance without the risk of misuse.

  • What are the key features of the AI server's interface?

    -The AI server features a GUI with a chat interface, support for chat histories, multiple models, and the ability to integrate with applications like Obsidian. It also allows for the addition of Stable Diffusion for image generation.

  • What is the name of the AI server mentioned in the transcript?

    -The AI server is named Terry.

  • What kind of case was used for the AI server Terry?

    -A Lian Li O11 Dynamic EVO XL full tower EATX case was used for Terry.

  • What is the processor used in the AI server Terry?

    -The processor used is the AMD Ryzen 9 7950X, which has 16 cores and a frequency of 4.2 gigahertz.

  • How much memory does the AI server Terry have?

    -Terry has 128 gigabytes of G.Skill Trident Z5 Neo DDR5-6000 memory.

  • What is the significance of having a local AI server for the builder's daughters?

    -The local AI server allows the builder's daughters to use AI to assist with schoolwork in a controlled environment. The builder can restrict the capabilities of the AI to prevent cheating or other misuse.

  • What is the name of the operating system that was successfully installed on the AI server after struggling with Ubuntu?

    -The operating system successfully installed was Pop!_OS by System76.

  • What is the minimum requirement to build a similar local AI server?

    -The minimum requirement is a computer running Windows, Mac, or Linux. Having a GPU would enhance the performance.

  • How can one restrict the AI's capabilities for specific users?

    -One can create custom model files with specific system prompts that outline what the AI can and cannot do, effectively acting as guardrails for the AI's capabilities.

  • What is the name of the web UI used for the local AI server?

    -The web UI used is called Open Web UI.

  • How can the local AI server be accessed remotely?

    -The local AI server can be accessed remotely by using the host's IP address, assuming the proper ports are open and accessible.
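Whether the server is reachable from another machine comes down to whether its port accepts connections. The following is a minimal sketch of such a check, not something shown in the video; the function name `is_port_open` is hypothetical, and the address and port in the usage note are placeholders.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("192.168.1.50", 3000)` would report whether a web UI published on port 3000 of that (placeholder) address is reachable. The actual port depends on how the container was started.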



🚀 Building a Personal AI Server for Enhanced Control and Privacy

The video begins with the creator discussing his motivation for building a personal AI server named Terry, initially for his own use to run AI models locally. He emphasizes the importance of control, privacy, and customization, which led him to integrate a GUI and chat interface into his notes application, Obsidian. The server's hardware is showcased, including a full tower EATX case, a powerful AMD Ryzen processor, 128GB of DDR5 memory, and two MSI liquid-cooled GPUs. Despite initial difficulties with Ubuntu, the system successfully runs on Pop!_OS by System76. The video promises a tutorial on setting up a similar AI server, highlighting the need for a capable computer and a GPU for optimal performance.


🛠️ Installing and Configuring the AI Server with Ollama

The video continues with a step-by-step guide on setting up the AI server using Ollama as the foundation. It covers the installation process for different operating systems, including a quick setup for Windows users via WSL (Windows Subsystem for Linux). The presenter also discusses updating system packages, installing Ollama with a single command, and verifying the installation by opening the API service in a web browser. Additional steps include pulling AI models into Ollama, such as Llama 2 and CodeGemma, and testing that they respond. The video also touches on the performance of the AI server with multiple GPUs and how to monitor GPU usage from the terminal.
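The API service that the browser check confirms can also be called from a script. The sketch below assumes Ollama is running on the same machine at its default address (`http://localhost:11434`) and that a model such as `llama2` has already been pulled; the helper names `build_generate_request` and `ask` are hypothetical and not from the video.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming request to Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False the server replies with a single JSON object
        # whose "response" field holds the full completion.
        return json.loads(response.read())["response"]
```

With the server running, `ask("llama2", "Why is the sky blue?")` would return the model's answer as a string. Nothing is sent until `ask` is called.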


🖥️ Integrating Open Web UI and Customizing AI Models for Specific Uses

The presenter moves on to the web UI component, selecting Open Web UI as the preferred option for interacting with Ollama. He explains how to run Open Web UI inside a Docker container, which requires Docker to be installed. The video demonstrates the process of setting up Docker, deploying the Open Web UI container, and accessing the web interface. It also covers the creation of an admin account and the customization options available, such as restricting user sign-ups, whitelisting models, and creating custom model files with specific constraints. The presenter shows how to create a restricted model for his daughter so that the AI guides her schoolwork instead of doing it for her.
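A custom model file of the kind described here is a short text file that names a base model and a system prompt. The sketch below renders one using Ollama's Modelfile directives (`FROM`, `PARAMETER`, `SYSTEM`); the function name `render_modelfile` and the example prompt are hypothetical illustrations, not the presenter's actual file.

```python
def render_modelfile(base_model: str, system_prompt: str, temperature: float = 0.7) -> str:
    """Render Modelfile text that wraps a base model in a fixed system prompt."""
    if '"""' in system_prompt:
        # Triple quotes delimit the SYSTEM block, so they cannot appear inside it.
        raise ValueError('system_prompt must not contain """')
    return (
        f"FROM {base_model}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """{system_prompt}"""\n'
    )


# Hypothetical guardrail prompt for a homework helper.
tutor_modelfile = render_modelfile(
    "llama2",
    "You are a patient tutor. Guide the student toward the answer "
    "with hints and questions, but never write the assignment for them.",
)
```

The resulting text can be pasted into the web UI's model file editor or saved to a file and registered with `ollama create`. A system prompt is a soft guardrail, so it is worth testing how the model behaves when asked to ignore it.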


🖼️ Setting Up Stable Diffusion for Image Generation and Integrating it with Open Web UI

The video introduces the setup process for Stable Diffusion, an AI model for image generation, which is installed through the AUTOMATIC1111 web UI. The presenter guides viewers through installing prerequisites, managing Python versions with pyenv, and executing a script to set up the environment for Stable Diffusion. He demonstrates the image generation process and its speed, comparing it to other AI models. The video also shows how to integrate Stable Diffusion with Open Web UI, allowing users to generate images directly from the web interface based on text prompts.
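The integration works because the AUTOMATIC1111 web UI can expose an HTTP API that other tools call. The sketch below assumes the web UI was started with its `--api` flag and is listening on its default port 7860 on the same machine; the helper names are hypothetical, and the step count is an arbitrary example value.

```python
import base64
import json
import urllib.request

# Assumption: AUTOMATIC1111 web UI running locally with --api on the default port.
SD_URL = "http://localhost:7860"


def build_txt2img_request(prompt: str, steps: int = 20) -> urllib.request.Request:
    """Build (but do not send) a text-to-image request for /sdapi/v1/txt2img."""
    payload = {"prompt": prompt, "steps": steps}
    return urllib.request.Request(
        f"{SD_URL}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body: dict) -> list[bytes]:
    """Decode the base64-encoded images in a txt2img response into raw bytes."""
    return [base64.b64decode(image) for image in response_body.get("images", [])]


def generate(prompt: str, timeout: float = 300.0) -> list[bytes]:
    """Send the prompt to a running web UI and return the generated image bytes."""
    with urllib.request.urlopen(build_txt2img_request(prompt), timeout=timeout) as response:
        return decode_images(json.loads(response.read()))
```

Each element returned by `generate` can be written to disk with `open(path, "wb")`. Open Web UI performs an equivalent call once the AUTOMATIC1111 base URL is entered in its image settings.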


📚 Exploring Additional Features: Document Integration and AI Chatbot in Obsidian

The final part of the video explores additional features of the AI server. The presenter discusses the document section in Open Web UI, which allows users to upload and utilize documents within the AI's responses. He also shares his enthusiasm for Obsidian, a note-taking application, and demonstrates how to integrate a local AI chatbot into it using a community plugin. The chatbot can reference and interact with the current note, providing a dynamic and interactive note-taking experience. The video concludes with an invitation for viewers to join the creator's Discord community for further discussion and to share their own AI projects.



💡AI Server

An AI server is a computer system specifically designed to run artificial intelligence applications. In the context of the video, the creator built an AI server named Terry to host AI models locally, which allows for faster and more private interactions with AI, without relying on cloud-based services. Terry is equipped with high-end components to handle the computational demands of AI processing.

💡Local AI

Local AI refers to artificial intelligence applications that run directly on a user's own hardware, such as a personal computer or a server, rather than on remote servers. The video emphasizes the benefits of running AI locally, including increased speed, privacy, and control over the AI's capabilities.

💡GUI (Graphical User Interface)

A GUI is a type of user interface that allows users to interact with a system using graphical icons and visual indicators. In the video, the creator mentions a 'beautiful chat interface' for interacting with the AI, making it more user-friendly and accessible than a command-line interface.


💡Llama

Llama, in this context, refers to a family of AI language models that the creator runs on his local server through Ollama. The script mentions 'Llama two' (Llama 2), a specific version of the model that the creator interacts with to perform tasks like generating text or summarizing content.

💡Stable Diffusion

Stable Diffusion is an AI model that generates images from textual descriptions, a process known as image synthesis. It's an example of how AI can be used for creative tasks, and the video demonstrates how it can be integrated into the local AI server for fast and private image generation.


💡Obsidian

Obsidian is a note-taking and knowledge management software that the creator uses in the video. It's highlighted as an application where the local AI can be integrated, allowing the creator to have an AI chat interface within his notes for enhanced productivity and information processing.


💡Docker

Docker is a platform that enables users to develop, ship, and run applications in software containers. In the video, Docker is used to deploy Open Web UI, a web interface for interacting with the local AI server, simplifying the process of setting up and managing the AI services.


💡Ubuntu

Ubuntu is a popular open-source operating system based on Linux. The video mentions the installation of Ubuntu as part of setting up the local AI server, highlighting its compatibility and use in the process of installing necessary software and drivers.

💡Pop!_OS by System76

Pop!_OS is a Linux distribution developed by System76, which is mentioned in the video as an alternative operating system that the creator used for his AI server. It is noted for its ease of installation, including built-in Nvidia drivers, which are beneficial for running AI workloads.

💡WSL (Windows Subsystem for Linux)

WSL is a compatibility layer for running Linux binary executables natively on Windows. The video demonstrates the use of WSL to install and run Ollama on a Windows system, showcasing how it allows Windows users to utilize Linux tools and applications.

💡Multimodal Models

Multimodal models in AI refer to systems that can process and understand information from multiple data types, such as text, images, and audio. In the video, the creator discusses using a multimodal model named LLaVA that can analyze images and generate responses about them, in addition to handling text.


The author built a local AI server named Terry for personal use and to assist his daughters with school.

The AI server includes a GUI with a chat interface, chat histories, and multiple models.

Stable Diffusion can be added to the AI server for image generation.

The AI server was integrated into the notes application Obsidian for a seamless user experience.

A simple laptop can be used to set up a local AI server, demonstrating its accessibility.

The AI server is customizable, fast, and private, allowing the author to control its usage.

The author can restrict the AI's capabilities with special model files to prevent misuse.

Terry, the AI server, is equipped with high-end components for powerful performance.

The author encountered difficulties installing Ubuntu but successfully installed Pop!_OS by System76.

Ollama serves as the foundation for running AI models on the local server.

WSL (Windows Subsystem for Linux) can be used to install and run Ollama on Windows.

ITPro by ACI Learning is recommended for learning Linux and IT skills.

Open Web UI is a user-friendly interface for interacting with the local AI server.

Docker is used to deploy the Open Web UI container, which connects to Ollama.

The AI server can generate images using Stable Diffusion, showcasing its multimodal capabilities.

AUTOMATIC1111 is a web interface for Stable Diffusion that Open Web UI can connect to for image generation.

Obsidian, a note-taking application, can be enhanced with a local chatbot connected to the AI server.

The author emphasizes the importance of privacy and control when running AI locally.