How to Make AI Avatars - D-ID Tutorial

17 Jul 2023 · 11:48

TLDR: The video tutorial introduces D-ID, an AI company that offers a tool called Creative Reality Studio, which is used to create impressive AI avatars. The video explains the process of accessing the platform, the different pricing plans available, and how to create videos using AI avatars. It also demonstrates how to upload personal pictures for avatar creation, choose from various voices and styles, and use generative AI tools to transform pictures or videos into unique experiences. The tutorial further explores the option to use one's own voice, the use of translation tools for different languages, and the creation of animated AI presenters from scratch using prompts. The presenter also discusses the Pro Plan's benefits, including access to a better AI voice generator and more presenters, and mentions the availability of additional AI tools and courses on a related platform.


  • 🌟 D-ID is an AI company with a tool called Creative Reality Studio that creates impressive AI avatars.
  • 🚀 Users can transform pictures or videos into extraordinary experiences with D-ID's generative AI tools.
  • 🌐 The technology is utilized by creators, marketing agencies, production companies, and social media platforms globally.
  • 📚 To access D-ID, one logs in and is then taken to the studio website, where video creation takes place.
  • 💰 D-ID offers a free trial with limited features, including a watermark on creations.
  • 📈 For more features, like presenter options and better AI voice generators, the Pro Plan is recommended.
  • 🎭 In Creative Reality Studio, users can choose from various presenters or upload their own pictures for a personalized avatar.
  • 🗣️ Voices are separate from the avatars, allowing users to match different voices to their chosen characters.
  • 🔄 Users can adjust the tone of the voice, add pauses, and even generate scripts using AI technology.
  • 🌐 Language selection is available, but for non-English scripts, translation is necessary before input.
  • 📹 The generated videos can be downloaded as MP4 files for use in other platforms or presentations.
  • 📂 Completed creations are stored in a video library for easy access, but cannot be edited; new videos must be created.
  • 📝 Users can also generate AI presenters from scratch by inputting a prompt into the system.
  • 🖼️ An option to add custom pictures for face-swapping is available, enhancing personalization.
  • 🎧 Recording and uploading one's own voice for the avatar is possible, adding an extra layer of authenticity.

Q & A

  • What is D-ID and what does it specialize in?

    -D-ID is an AI company that specializes in creating AI avatars using a tool called Creative Reality Studio. It also offers generative AI tools that transform pictures or videos into extraordinary experiences.

  • What is the purpose of the Creative Reality Studio tool?

    -The Creative Reality Studio tool is designed to enable users to create videos with AI avatars. It allows for the customization of avatars, script, language, and voice, making it a comprehensive platform for video production using AI.

  • How can one access the D-ID platform?

    -To access D-ID, one logs in and is then redirected to the studio website, which is where the video creation process takes place.

  • What are the limitations of the free plan offered by D-ID?

    -The free plan on D-ID offers up to five minutes of creation but includes a D-ID watermark. It is quite limited in terms of the number of AI avatars and voice options available.

  • What does the Pro Plan provide that the free plan does not?

    -The Pro Plan removes the D-ID watermark, provides more AI avatars to choose from, and offers better AI voice generators. It is recommended for users who want to use the platform for practical reasons.

  • How does one upload their own picture to animate in D-ID?

    -The script mentions the ability to upload one's own picture for animation, but it does not provide specific steps. It suggests that there is an option within the platform to upload personal images for use with the AI avatars.

  • What additional features does the Pro Plan offer compared to the five-dollar plan mentioned?

    -The Pro Plan includes more AI avatars, better AI voice generators, and the removal of the D-ID watermark. It also likely offers additional features not explicitly mentioned in the script.

  • How does the language selection feature work in D-ID's platform?

    -Language selection allows the user to choose the accent for the AI voice. However, for actual language translation, the user needs to translate the script themselves or use a translation tool like DeepL before inputting it into D-ID.

  • What is the process for generating an AI presenter in D-ID?

    -To generate an AI presenter, one can choose from available options or use a prompt to generate a presenter from scratch using the platform's generative AI tools.

  • How can users add their own voice to the AI avatars?

    -Users can upload their own recorded audio to be used with the AI avatars. This can be done through the platform's interface, and it's suggested that using one's own voice is best when using a personal picture for the avatar.

  • What is the file format in which D-ID allows users to download their created videos?

    -D-ID allows users to download their created videos in MP4 format, which can be used on various platforms or integrated into presentations.

  • How does the video library in D-ID work?

    -The video library in D-ID is where all created videos are stored. Users cannot edit these videos directly and must create a new video if they want to make changes. It's important to name the videos to easily find and search for them within the library.
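Because library videos cannot be edited and every change means a new video, a consistent naming scheme makes versions easier to find by search. The sketch below is one possible convention, not a D-ID feature: the function `video_name` and the `date_topic_language_version` pattern are hypothetical and only illustrate the idea of descriptive, searchable names.

```python
import re
from datetime import date


def video_name(topic: str, language: str, version: int, day: date) -> str:
    """Build a searchable name such as '2023-07-17_product-demo_en_v2'.

    The naming pattern is an assumption for illustration only.
    """
    # Lowercase the topic and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"{day.isoformat()}_{slug}_{language.lower()}_v{version}"


if __name__ == "__main__":
    # A re-generated video gets a new version number instead of an edit.
    print(video_name("Product Demo!", "EN", 2, date(2023, 7, 17)))
```

Bumping the version number for each regenerated video keeps older takes distinguishable in the library.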



🤖 Introduction to D-ID's AI Tools and Tutorial Overview

D-ID is an AI company that offers a product named Creative Reality Studio, which enables the creation of AI avatars. These avatars can perform video presentations explaining D-ID's services. The platform allows users to transform any picture or video into extraordinary multimedia experiences and is widely used by creators, marketing agencies, and production companies. The video script guides users on how to access and use the studio website, including a tutorial on creating videos, understanding the pricing plans, and choosing different features like AI avatars, voices, and styles.


🎬 Generating and Customizing AI-Driven Videos

The script continues with a demonstration of generating a video on D-ID's platform, showing the steps to input scripts, select languages, and choose voices with different accents. It also covers the credit system used for generating videos, and the process of downloading the completed videos. Additionally, the tutorial explains how to use external translation tools for non-English scripts, and introduces the concept of generating AI presenters with custom prompts using Stable Diffusion technology. The section also provides insights into incorporating AI-generated avatars into videos and the potential integration with other platforms.


📸 Advanced Customization and Integration Options

This section delves deeper into the customization options available on D-ID, focusing on creating personalized videos using one's own photos and voice. It demonstrates uploading and animating personal images, choosing appropriate voices, or using one's own recordings for a more authentic presentation. The script highlights the importance of expression in photos for better animation results and discusses the integration of high-quality AI voice generation into the platform's Pro Plan. Finally, it mentions educational resources available for learning more about generative AI tools and the benefits of upgrading to access enhanced features.



💡AI Avatars

AI Avatars are computer-generated characters that can be controlled by a person or an artificial intelligence system. In the context of the video, AI Avatars are created using D-ID's Creative Reality Studio, which allows users to generate realistic-looking avatars that can be used in various applications such as marketing, social media, and entertainment.

💡Creative Reality Studio

Creative Reality Studio is a tool developed by D-ID that enables users to create and customize AI avatars. It is mentioned in the video as the platform where users can access and utilize the technology to make their AI avatars, which can be used for a variety of purposes including video production.

💡Generative AI Tools

Generative AI tools refer to artificial intelligence systems that can create new content, such as images, videos, or text, based on existing data or user input. In the video, D-ID's generative AI tools are used to transform pictures or videos into unique experiences, allowing for the creation of AI avatars and personalized content.

💡Video Production

Video production is the process of creating video content, which involves planning, filming, and editing. The video discusses D-ID's mission to enable full video production using just AI, which implies the creation of video content without the need for traditional human actors or extensive manual editing.

💡Deepfake Technology

Deepfake technology involves using AI to create realistic but fake videos or images of people. While the term isn't explicitly mentioned in the video, the concept is central to D-ID's offerings, as it allows users to generate AI avatars that can mimic human speech and appearance.

💡AI Voice Generators

AI voice generators are tools that use artificial intelligence to create synthesized human speech. In the context of the video, D-ID's AI voice generators are used to give the AI avatars a voice, allowing them to speak in various languages and accents, enhancing the realism and versatility of the avatars.


💡Watermark

A watermark is a visible or invisible marker embedded in a video or image to indicate copyright or ownership. The video script discusses different pricing plans offered by D-ID, where the free trial includes a watermark, while the Pro Plan removes the D-ID watermark, allowing for professional use of the generated content.


💡Script

A script is a written text that serves as the dialogue or plan for a video, film, or theater production. In the video, the script is used to program the AI avatars to speak specific lines, making the video production process more streamlined and automated.


💡Translation

Translation is the process of converting text or speech from one language to another. The video mentions the use of translation tools, such as DeepL, to translate the script into different languages, allowing the AI avatars to speak in various accents or languages.


💡Midjourney

Midjourney is an AI image-generation tool mentioned in the video that can be used in conjunction with D-ID's technology to create AI avatars. The video suggests a process where avatars are initially created in Midjourney and then imported into D-ID for further development and use.

💡Stable Diffusion

Stable Diffusion is a type of generative AI technology used for creating images from textual descriptions. The video script describes using Stable Diffusion in the background to generate new AI presenter avatars from user-provided prompts, showcasing the advanced capabilities of generative AI.

💡Pro Plan

The Pro Plan is a subscription tier offered by D-ID that provides users with advanced features and capabilities. It is mentioned in the video as the recommended plan for those who want to use D-ID's tools for practical purposes, as it offers more features, such as a wider selection of AI avatars and voices, and the removal of the D-ID watermark.


D-ID offers a tool called Creative Reality Studio for creating AI avatars.

The platform allows transformation of pictures or videos into extraordinary experiences.

D-ID supports video production entirely through AI technology.

Users log in and then access the studio website to create videos.

There is a free trial available that includes up to five minutes of video creation.

The free version comes with a watermark and limited AI avatar options.

Higher plans offer more minutes and a wider selection of AI voices and avatars.

The Pro Plan removes the D-ID watermark and enhances available features.

Users can upload their own picture and voice for a personalized avatar.

The platform offers AI tools to generate scripts using ChatGPT technology.

A variety of languages and accents are available for voice synthesis.

Users can create AI-generated presenters and choose from animated avatars.

The studio allows users to animate avatars with their own voice recordings.

Created videos can be downloaded as MP4 files and used on various platforms.

Each video creation uses credits, which are tracked on the user's account.
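
Since the free trial covers up to five minutes of video, it can help to estimate how long a script will run before spending credits on it. The following is a minimal sketch, not D-ID's own accounting: the speaking rate of 150 words per minute is an assumed typical narration pace, and the function names are hypothetical. Actual duration depends on the chosen voice and any pauses added.

```python
from typing import Iterable

FREE_TRIAL_SECONDS = 5 * 60  # free plan: up to five minutes (from the tutorial)
WORDS_PER_MINUTE = 150       # ASSUMPTION: typical narration pace, not a D-ID figure


def estimate_seconds(script: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Rough spoken duration of a script, based on word count alone."""
    words = len(script.split())
    return words * 60 / wpm


def fits_free_trial(scripts: Iterable[str], wpm: int = WORDS_PER_MINUTE) -> bool:
    """True if the combined estimated duration stays within five minutes."""
    total = sum(estimate_seconds(s, wpm) for s in scripts)
    return total <= FREE_TRIAL_SECONDS


if __name__ == "__main__":
    intro = "Welcome to the channel. Today we look at AI avatars."
    print(f"Estimated length: {estimate_seconds(intro):.1f} s")
    print("Fits free trial:", fits_free_trial([intro]))
```

The estimate is deliberately crude; it is only meant to flag scripts that are clearly too long before a video is generated.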