Run Hugging Face Spaces Demos on Your Own Colab GPU or Locally

28 Oct 2022 · 09:29

TLDR: The video tutorial guides viewers on how to run popular Hugging Face Spaces demos on Google Colab to avoid queues and utilize a Tesla T4 GPU. It explains creating a Google Colab notebook, checking for GPU availability, cloning the Hugging Face Spaces repo, installing the required libraries from requirements.txt, and launching the demo with `share=True` to get a shareable external URL. The tutorial aims to help users utilize compute resources efficiently and avoid waiting times.


  • 🚀 **Popular Hugging Face Spaces and Queue Times**: The script discusses the popularity of certain Hugging Face Spaces demos, which can lead to long queue times due to high demand and shared compute resources.
  • 💡 **Running Demos Locally with Google Colab**: It suggests using Google Colab to run popular Hugging Face Spaces demos locally to avoid waiting in queues and to potentially access faster compute like Tesla T4 GPUs.
  • 📚 **Setting Up Google Colab**: The video provides a step-by-step guide on how to set up a new Google Colab notebook, including selecting GPU hardware acceleration if available.
  • 🔍 **Checking GPU Availability**: It explains how to verify that a GPU is available in the Google Colab environment using `nvidia-smi` or by checking whether PyTorch reports CUDA as available.
  • 📂 **Cloning Hugging Face Spaces Repositories**: The script details the process of cloning the relevant Hugging Face Spaces repository into the Google Colab notebook using the `!git clone` command.
  • 📖 **Navigating Directories and Checking Contents**: It instructs on how to navigate into the cloned repository's directory and check the contents using `%cd` and `%ls` commands in Google Colab.
  • 💻 **Installing Required Libraries**: The importance of installing the necessary libraries using `pip install -r requirements.txt` is highlighted, with additional instructions for installing `gradio` if it is not listed in the requirements.
  • 🔑 **Handling Hugging Face Tokens**: The script addresses the potential need for a Hugging Face token if the demo requires one, and how to perform a notebook login using `from huggingface_hub import notebook_login`.
  • 🌐 **Launching the Demo with an External URL**: It explains how to modify the `demo.launch` function to include a `share=True` parameter to get an external URL that can be shared on the internet.
  • 📋 **Editing the `app.py` File**: The process of copying the content of the original `app.py` into the Google Colab notebook and making the necessary changes is outlined.
  • 🚫 **Avoiding Repeated Model Downloads**: The script suggests separating the model download step from the application run step to prevent repeated downloads and improve efficiency.
  • 🎉 **Accessing the Running Application**: Finally, it describes how to run the application with `!python app.py`, access the local and public URLs, and interact with the demo by uploading an image and selecting different options.

Q & A

  • What is the main topic of the video?

    -The main topic is running popular Hugging Face Spaces demos on Google Colab to avoid waiting in queues and to use GPU resources more efficiently.

  • Why might running the demo on Google Colab be beneficial?

    -Running the demo on Google Colab can be beneficial because it allows users to skip queues and potentially use a Tesla T4 GPU, which can significantly speed up the process and provide a better experience.

  • How can you check if a GPU is available in Google Colab?

    -To check if a GPU is available in Google Colab, you can click on 'Runtime', then 'Change runtime type', and select 'GPU' under 'Hardware accelerator'. Alternatively, you can run 'import torch; print(torch.cuda.is_available())' to see if CUDA is available, which indicates a GPU is accessible.
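    The two checks mentioned above can be wrapped in a single defensive cell. The sketch below is my own (the helper name `gpu_status` is not from the video); it calls `nvidia-smi` only if the binary exists, so the same cell also runs cleanly on a CPU-only runtime:

    ```python
    import shutil
    import subprocess

    def gpu_status() -> str:
        """Return nvidia-smi output if a GPU driver is present, else a note."""
        if shutil.which("nvidia-smi") is None:
            return "No NVIDIA GPU driver found - running on CPU"
        result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
        return result.stdout

    print(gpu_status())
    ```

    On a Colab T4 runtime this prints the usual `nvidia-smi` table; the PyTorch check (`torch.cuda.is_available()`) remains a good secondary confirmation that CUDA is actually usable from Python.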

  • What is the first step in setting up the demo on Google Colab?

    -The first step is to create a new notebook by clicking 'File' and then 'New notebook' in Google Colab.

  • How do you clone the Hugging Face Spaces repository in Google Colab?

    -To clone the Hugging Face Spaces repository, you can use the '!git clone [repository URL]' command in a cell in your Google Colab notebook.

  • What is the purpose of the 'requirements.txt' file in the repository?

    -The 'requirements.txt' file lists all the necessary libraries and dependencies required to run the Hugging Face Spaces demo. It is used with 'pip install -r requirements.txt' to install these libraries.
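    One reason `gradio` may need a separate install: Spaces provides it on the platform side, so it is often absent from the repo's `requirements.txt`. The file contents below are hypothetical, purely to illustrate the check:

    ```python
    from pathlib import Path

    # Hypothetical requirements.txt, mirroring a typical Space that relies
    # on the platform to provide gradio (so it is absent from the file).
    Path("requirements.txt").write_text("torch\ndiffusers==0.6.0\ntransformers\n")

    # Parse package names, ignoring version pins and blank lines.
    packages = [
        line.split("==")[0].strip()
        for line in Path("requirements.txt").read_text().splitlines()
        if line.strip()
    ]
    needs_gradio = "gradio" not in packages
    print(packages)
    print("install gradio separately:", needs_gradio)
    ```

    In Colab the actual install is simply `!pip install -r requirements.txt` followed by `!pip install gradio` when the check above comes out true.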

  • What additional step is needed if the demo requires a Hugging Face token?

    -If the demo requires a Hugging Face token, you need to run 'from huggingface_hub import notebook_login; notebook_login()' to authenticate and use the token.
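    A hedged sketch of the authentication step: `notebook_login()` shows an interactive widget in Colab, while scripts commonly fall back to an `HF_TOKEN` environment variable. The helper below is my own, not part of `huggingface_hub`:

    ```python
    import os
    from typing import Optional

    def get_hf_token() -> Optional[str]:
        """Return a Hugging Face token from the environment, if one is set."""
        return os.environ.get("HF_TOKEN")

    token = get_hf_token()
    if token is None:
        # In a notebook you would instead run:
        #   from huggingface_hub import notebook_login
        #   notebook_login()
        print("No HF_TOKEN set - run notebook_login() in the notebook")
    else:
        print("Token found")
    ```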

  • How can you ensure that the demo application runs with an external URL in Google Colab?

    -To run the demo application with an external URL, modify the `app.py` file and pass `share=True` to `demo.launch`. This makes the application accessible via a public internet URL.
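    The edit itself is one line. As a sketch, here is the change applied programmatically to a minimal, made-up `app.py` (the real Space's file will differ; only the `launch` call matters):

    ```python
    from pathlib import Path

    # Hypothetical minimal app.py, standing in for the cloned Space's file.
    app = Path("app.py")
    app.write_text(
        "import gradio as gr\n"
        "\n"
        'demo = gr.Interface(fn=lambda x: x, inputs="text", outputs="text")\n'
        "demo.launch()\n"
    )

    # Change demo.launch() to demo.launch(share=True) so Gradio prints a
    # temporary public URL in addition to the local one.
    app.write_text(app.read_text().replace("demo.launch()", "demo.launch(share=True)"))
    print("share=True" in app.read_text())
    ```

    In practice you would make the same edit by hand, or overwrite the file from a Colab cell with the `%%writefile app.py` magic.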

  • What is the advantage of separating the model download process from the main application code?

    -Separating the model download process from the main application code allows you to avoid downloading the model multiple times, which can save time and resources, especially when rerunning the notebook.
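    The idea can be sketched as a small cache check. Paths and names here are illustrative; the real Hugging Face libraries already cache downloads under `~/.cache/huggingface`, so the point is simply to keep the slow download out of the code path you rerun:

    ```python
    from pathlib import Path

    CACHE_DIR = Path("model_cache")  # illustrative local cache location
    download_count = 0

    def ensure_model(name: str) -> Path:
        """Fetch the model only if it is not already cached on disk."""
        global download_count
        target = CACHE_DIR / name
        if target.exists():
            return target  # cache hit: skip the slow download
        CACHE_DIR.mkdir(exist_ok=True)
        download_count += 1
        target.write_text("weights placeholder")  # stands in for the real download
        return target

    ensure_model("fine-tuned-diffusion")  # first run: downloads
    ensure_model("fine-tuned-diffusion")  # rerun: uses the cached copy
    print("downloads performed:", download_count)
    ```

    In the notebook this translates to putting the download calls in one cell and the `demo.launch(...)` call in another, so restarting the app never re-triggers the download.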

  • How do you run the demo application in Google Colab?

    -To run the demo application, execute `!python app.py` in the Google Colab notebook; the first run downloads the necessary models and then starts the application.

  • What is the outcome of running the demo as described in the video?

    -The outcome is that the demo application runs on Google Colab using the GPU without waiting in a queue, and the user can interact with it by uploading images and selecting different options to see the model's output.



🚀 Running Hugging Face Demos on Google Colab

This paragraph discusses the process of running popular Hugging Face demos on Google Colab to avoid long queues and utilize the available GPU resources. It emphasizes the benefits of running these models in your own Google Colab environment, such as skipping the queue and getting the same speed as if you were first in line. The speaker guides the audience on how to create a new Google Colab notebook, select GPU hardware acceleration, and clone the Hugging Face Spaces repository. It also covers checking for GPU availability using `nvidia-smi` or PyTorch, and the importance of installing the required libraries from the 'requirements.txt' file.


📚 Customizing and Launching the Gradio Application

This paragraph delves into the specifics of customizing and launching the Gradio application on Google Colab. It explains the need to modify the `app.py` file to enable sharing the application via an external URL, rather than only a local one. The speaker shows how to overwrite `app.py` and add the `share=True` parameter for easy access. The paragraph also touches on handling Hugging Face tokens if required, and on separating the model download from the application run for efficiency. The speaker demonstrates the application by uploading an image and selecting different styles for the diffusion model, highlighting the advantage of running on a GPU without waiting in a queue.



💡Hugging Face Spaces

Hugging Face Spaces is a platform for hosting and sharing machine learning demo applications, typically built with Gradio or Streamlit. In the context of the video, it is where the demo being discussed is hosted, and its popularity leads to queues for the shared compute resources.

💡Google Colab

Google Colab is a cloud-based platform for machine learning and programming that allows users to run Python code in a Jupyter notebook environment. It is highlighted in the video as an alternative to running popular Hugging Face Spaces demos, where users can take the code and run it on their own Google Colab notebooks, potentially with access to a GPU and avoiding queues.

💡Diffusion Models

Diffusion models are a class of machine learning models used in generative tasks, such as image generation. They work by gradually transforming a random noise distribution into a meaningful sample. In the video, the focus is on fine-tuned diffusion models that are being demoed on Hugging Face Spaces, which are of interest to many users.

💡GPU (Graphics Processing Unit)

A Graphics Processing Unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. In the context of the video, having access to a GPU on Google Colab can significantly speed up the running of the diffusion models, allowing users to skip queues and get faster processing times.


💡Clone

In the context of version control and software development, to 'clone' refers to the action of creating a complete copy of a repository, including its entire history. In the video, cloning the Hugging Face Spaces repository is a necessary step to replicate the demo in a Google Colab notebook.


💡requirements.txt

A 'requirements.txt' file is a common way to list dependencies for Python projects. It contains a list of libraries and the specific versions needed for a project to run properly. In the video, the script mentions the importance of this file when setting up the Google Colab environment to ensure all necessary libraries for the demo are installed.

💡Notebook Login

In the context of using Hugging Face Spaces, 'notebook login' refers to the process of authenticating a user's access to certain resources or models that require a personal access token. This is necessary when using models that are not publicly available or require a subscription.

💡Share Parameter

The `share` parameter of Gradio's `launch` method, when set to `True`, makes the app accessible through a temporary public URL in addition to the local one. This is useful for demos where the creator wants users to access the application over the internet without setting up manual tunneling.

💡Model Download

Model download refers to the process of obtaining the necessary machine learning models and their associated files from a remote server to a local environment. In the video, it is mentioned that the first time the application is run, it will download all the required models, which can take a considerable amount of time.


💡Streamlit

Streamlit is an open-source Python library used to create and share custom web applications quickly. It is particularly useful for data scientists and machine learning engineers to present their work through interactive web apps without the need for extensive web development skills. In the video, if the demo were a Streamlit app, the viewers would be instructed to install Streamlit separately.

💡Gradio Application

A Gradio application, in the context of this video, refers to the app created with the Gradio library, an open-source Python framework for building machine learning demos as shareable web applications. The video's main theme revolves around running such an application from a Hugging Face Spaces demo.


Learning how to take a demo from Hugging Face Spaces and run it on your own Google Colab.

Popular Hugging Face Spaces demo can have long queues due to high demand.

Running the demo on Google Colab can help you skip the queue and potentially get a T4 GPU.

Creating a new Google Colab notebook is the first step to running the demo.

Ensure your Google Colab notebook has GPU support for optimal performance.

Use `nvidia-smi` to check if the GPU is available in your Google Colab environment.

Clone the Hugging Face Spaces repository to access the demo's code.

Navigate to the specific folder within the repository that contains the demo.

Install required libraries using the provided requirements.txt file.

If the demo requires a Hugging Face token, use the notebook login feature.

Modify the `app.py` file to enable sharing of the application with an external URL.

Run `app.py` in Google Colab to start the demo and download the necessary models.

Access the local and public URLs to interact with the demo and use its features.

Avoid re-downloading the model every time by separating the download and application run parts.

This tutorial helps users free up GPU resources and reduce wait times for popular demos.

The process is applicable for both local machine runs and Google Colab notebook environments.

The tutorial provides a practical guide to efficiently utilize Hugging Face Spaces and Google Colab.