How to Use D-ID Agents (Navigating Success Pt. 2) | D-ID Academy

D-ID AI Video Platform
26 May 2024 (48:55)

TLDR

Welcome to the second D-ID Academy webinar, where we explore the potential of D-ID Agents for businesses. Hosted by Ron Fredman and featuring experts Simon Caraka and Jim McCormick, the session guides users through creating and customizing AI agents for marketing, customer service, and more. Learn about the technology behind agents, best practices for integration, and upcoming features. Discover how D-ID's AI video generation is revolutionizing business interactions with realistic, interactive digital representatives.


  • 😀 The webinar is the second in a series by D-ID Academy, focusing on D-ID Agents, which are AI video generation tools for businesses.
  • 🔍 D-ID's core capability is in AI video generation, shifting focus to generative AI three years ago to create new video frames of people's faces from images or videos.
  • 📈 D-ID has experienced significant growth, with over 150 million videos made using their platform, 270,000 API key holders, and a million monthly site visitors.
  • 🏆 The company has received recognition from industry leaders like Nvidia's CEO and has been featured in various industry landscape maps and top product lists.
  • 🛠️ D-ID offers two primary solutions: AI videos for creating content with avatars and real-time AI agents for interactive digital engagement.
  • 🤖 D-ID Agents enable face-to-face conversations through avatars, enhancing marketing, sales, customer experience, and personalized engagement.
  • 🛑 The webinar covers the step-by-step process of creating an agent, including best practices, tips, and the reasoning behind suggestions.
  • 📚 Knowledge sources for agents can be grounded, hybrid, or ungrounded, depending on whether they rely solely on uploaded documents or also on general knowledge.
  • 🔑 The use of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) technologies is crucial for finding the best responses to user queries within the knowledge base.
  • 🔄 The importance of testing agents before embedding them on websites is emphasized to ensure a good user experience and accurate responses.
  • 💻 The webinar demonstrates how to embed an agent into a website's HTML, showcasing the simplicity of integrating D-ID's technology into existing online platforms.

Q & A

  • What is the main focus of the D-ID Academy webinar series?

    -The D-ID Academy webinar series focuses on helping businesses unlock the potential of D-ID's most recent product, D-ID Agents, and demonstrates how it can help them expand and grow their business.

  • Who are the presenters in the webinar and what are their roles?

    -The presenters are Ron Fredman, the head of content and creative marketing at D-ID, Simon Caraka, an API tech support engineer, and Jim McCormick, a machine learning engineer who is an early adopter of D-ID agents. Ron leads the session, Simon assists with technical integrations, and Jim shares his experiences as an early user.

  • What is the difference between off-the-shelf agents and those created using the D-ID API?

    -Off-the-shelf agents are available through the D-ID Creative Reality Studio and are ready to use immediately. In contrast, developers using the D-ID API have more customization options and can create agents tailored to specific business needs.

  • How does D-ID use AI to generate new video frames of people's faces?

    -D-ID uses AI to generate new video frames of people's faces based on images or video. This technology is applied to create AI videos and real-time AI agents, enhancing marketing, customer experience, and personalized engagement.

  • What are the two primary solutions offered by D-ID?

    -The two primary solutions offered by D-ID are AI videos, which generate videos using avatars from a single image or video, and real-time AI agents, which are interactive digital people that can engage in conversations and represent a business.

  • What is the significance of using a large language model (LLM) and retrieval-augmented generation (RAG) in D-ID agents?

    -The large language model (LLM) and retrieval-augmented generation (RAG) work together: the user's message is matched against the knowledge base to find the best question-and-answer pair, which helps the D-ID agent give relevant and accurate responses.

  • What are the different knowledge types supported by D-ID agents?

    -D-ID agents support three knowledge types: grounded, which uses only responses from the knowledge base; hybrid, which draws on both the uploaded reference material and the language model's general training knowledge; and ungrounded, which operates like a chatbot without a reference file.

  • How can businesses create an agent using the D-ID Studio?

    -Businesses can create an agent using the D-ID Studio by selecting an image, naming the agent, choosing a language and voice, setting agent instructions, uploading knowledge sources, and configuring chat settings. The process is designed to be simple and straightforward.

  • What is the importance of testing D-ID agents before embedding them on a website?

    -Testing D-ID agents ensures that the image, voice, and responses are aligned, coherent, and consistent. It helps identify any issues with the agent's performance and allows for adjustments to be made before the agent is made available to the public.

  • How can businesses customize the appearance and behavior of their D-ID agents?

    -Businesses can customize the appearance by selecting a high-quality image that aligns with their brand. They can also customize behavior by setting agent instructions, which act as prompts for the LLM, defining the agent's personality and how it should use reference documents.

  • What are the benefits of using D-ID agents for businesses?

    -D-ID agents offer businesses the ability to create interactive, branded digital representatives that can enhance marketing and sales activities, elevate customer experience, and provide personalized engagement with users in various use cases, including learning and development.



📅 Welcome and Introduction to Webinar

The script opens with a greeting and introduction to the second D-ID Academy webinar for enterprises, focusing on the potential of D-ID Agents for business growth. Ron Fredman, Head of Content and Creative Marketing at D-ID, hosts the webinar alongside Simon Caraka, an API Tech Support Engineer, and Jim McCormick, a Machine Learning Engineer. The session aims to be hands-on, covering agent creation, best practices, and addressing questions. It is part of a series, with a recording of the previous session available and an upcoming webinar scheduled for July 9th.


🚀 Overview of D-ID and Its Solutions

The script provides an overview of D-ID, a company specializing in AI video generation, founded in 2017 with a pivot to generative AI three years ago. D-ID has experienced significant growth, raising funding and expanding globally. The company offers two primary solutions: AI videos for content creation and real-time AI agents for interactive digital engagement. The webinar will delve into the technologies behind these agents, including large language models (LLM) and retrieval-augmented generation (RAG) for conversational AI.


🛠️ Creating and Customizing AI Agents

The script outlines the process of creating and customizing AI agents using D-ID Studio, starting with selecting an image that aligns with the brand and meets quality standards. It discusses the importance of choosing the right voice and language for the agent, as well as crafting instructions for the agent's behavior and responses. The process involves selecting knowledge sources, setting chat preferences, and testing the agent to ensure it meets expectations before embedding it into a website.
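
The creation steps listed above can be tracked as a simple checklist. The sketch below is illustrative only: `AgentDraft`, its field names, and `missing_steps` are hypothetical and do not reflect D-ID's Studio or API schema. It only records the choices the webinar walks through and reports which ones are still open.

```python
from dataclasses import dataclass, field


@dataclass
class AgentDraft:
    """Hypothetical record of the choices made while creating an agent."""
    image_path: str = ""
    name: str = ""
    language: str = ""
    voice: str = ""
    instructions: str = ""
    knowledge_sources: list[str] = field(default_factory=list)
    starter_questions: list[str] = field(default_factory=list)
    tested: bool = False


def missing_steps(draft: AgentDraft) -> list[str]:
    """Return the creation steps that still need attention, in order."""
    checks = [
        ("image", bool(draft.image_path)),
        ("name", bool(draft.name)),
        ("language", bool(draft.language)),
        ("voice", bool(draft.voice)),
        ("instructions", bool(draft.instructions)),
        ("knowledge sources", bool(draft.knowledge_sources)),
        ("chat settings", bool(draft.starter_questions)),
        ("testing", draft.tested),
    ]
    return [step for step, done in checks if not done]
```

A draft with only an image and a name would report every later step, ending with "testing", which mirrors the webinar's advice to test before embedding.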


🔍 Best Practices for Agent Testing and Knowledge Sources

The script emphasizes the importance of testing AI agents to ensure they perform well and provide accurate responses. It suggests structuring knowledge sources into Q&A pairs for better interaction with the large language model. The session also covers the limitations of document uploads and strategies for optimizing knowledge sources, such as removing non-textual elements from documents to reduce file size.
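
One way to prepare a knowledge source along these lines is to write the content as plain-text question-and-answer pairs, which also keeps the file free of images and other non-textual elements. The sketch below is an assumption about layout, not a format D-ID prescribes: the `Q:`/`A:` labels and the `format_qa_pairs` helper are hypothetical.

```python
def format_qa_pairs(pairs: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a plain-text knowledge document."""
    blocks = []
    for question, answer in pairs:
        # Collapse runs of whitespace so each entry stays on clean lines.
        q = " ".join(question.split())
        a = " ".join(answer.split())
        if not q or not a:
            continue  # skip incomplete pairs rather than emit half an entry
        blocks.append(f"Q: {q}\nA: {a}")
    return "\n\n".join(blocks) + "\n"
```

The resulting string can be saved as a `.txt` file and uploaded as a knowledge source.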


🗣️ Enhancing Conversational Experience with Agents

The script discusses enhancing the conversational experience with AI agents by customizing chat settings, including starter questions and welcome messages. It highlights the importance of aligning the agent's appearance, voice, and responses for a coherent brand experience. The webinar demonstrates testing the agent with various questions to ensure it provides concise and helpful answers.
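
The manual testing described here could also be repeated as a small scripted check. In the sketch below, `ask` is a stand-in for however you send a question to your agent and read back its text reply; the function name, the keyword check, and the 60-word limit are all assumptions chosen for illustration.

```python
from typing import Callable


def run_checks(ask: Callable[[str], str],
               cases: list[tuple[str, list[str]]],
               max_words: int = 60) -> list[str]:
    """Ask each test question and collect human-readable failures."""
    failures = []
    for question, keywords in cases:
        reply = ask(question)
        lowered = reply.lower()
        # The reply should mention every expected keyword.
        missing = [k for k in keywords if k.lower() not in lowered]
        if missing:
            failures.append(f"{question!r}: missing {missing}")
        # The reply should stay concise.
        if len(reply.split()) > max_words:
            failures.append(f"{question!r}: reply longer than {max_words} words")
    return failures
```

An empty result means every test question produced a short reply containing the expected terms.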


🔧 Handling Unknown Questions and Embedding Agents

The script addresses how AI agents handle unknown or irrelevant questions by avoiding providing incorrect answers and staying on topic with the knowledge base. It also covers the technical process of embedding the agent into a website, including the need for a Pro plan for multiple embedded agents and the simplicity of adding the embed code to HTML.
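
For a static site, adding the embed code amounts to pasting a snippet into the page's HTML. The sketch below automates that paste under two assumptions: the snippet is placed just before the closing `</body>` tag (a common spot for third-party scripts), and `PLACEHOLDER_SNIPPET` is a made-up stand-in, not D-ID's actual embed code, which should be copied from the Studio.

```python
import re


def add_embed_snippet(html: str, snippet: str) -> str:
    """Insert `snippet` just before the last closing </body> tag."""
    closings = list(re.finditer(r"</body\s*>", html, flags=re.IGNORECASE))
    if not closings:
        # No </body> found: append the snippet at the end as a fallback.
        return html + "\n" + snippet + "\n"
    index = closings[-1].start()
    return html[:index] + snippet + "\n" + html[index:]


# Placeholder only: use the snippet copied from the Studio, not this one.
PLACEHOLDER_SNIPPET = '<script src="https://example.com/agent-embed.js"></script>'
```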


🌐 Live Demonstration of Agent Embedding and Testing

The script includes a live demonstration of embedding an AI agent into a website and testing its functionality. It shows the agent's ability to answer questions in a branded and engaging manner, providing a superior experience compared to traditional text-based chatbots. The demonstration also includes editing the agent's prompt for improved user interaction.


📈 Analyzing Agent Performance and Upcoming Features

The script discusses the development of a Sandbox for testing AI agents and the importance of analyzing chat sessions for business insights. It mentions the agent's pricing model, which is based on sessions, and the availability of a mechanism to export and analyze chat sessions. The webinar also teases upcoming features and encourages participants to share their experiences with AI agents.
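
Once chat sessions have been exported, a short script can surface what users ask most often. The export format is not described in this summary, so the structure assumed below (a list of sessions, each with a `messages` list of `role`/`content` entries) is hypothetical, as is the `summarize_sessions` helper.

```python
from collections import Counter


def summarize_sessions(sessions: list[dict]) -> dict:
    """Count sessions and the most frequent user questions."""
    questions = Counter()
    for session in sessions:
        for message in session.get("messages", []):
            if message.get("role") == "user":
                # Normalize case and whitespace so near-duplicates group together.
                text = " ".join(message.get("content", "").lower().split())
                if text:
                    questions[text] += 1
    return {
        "session_count": len(sessions),
        "top_questions": questions.most_common(3),
    }
```

Because pricing is session-based, the session count is worth tracking alongside the question frequencies.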


🤝 Closing Remarks and Community Engagement

The script concludes with closing remarks, expressing gratitude to the participants and the guest speakers. It invites the audience to join the next webinar and to share their AI agent experiences. The session highlights the importance of community engagement through channels like Discord and encourages continued learning and collaboration.



💡D-ID Agents

D-ID Agents are interactive digital personas that can engage in conversations with users, serving as the face of a business. They are designed to enhance marketing, sales, and customer experience by providing personalized engagement. In the video, D-ID Agents are showcased as a new product that businesses can integrate to expand and grow, with a focus on real-time interaction and the use of AI to generate responses.


💡Webinar

A webinar is an online seminar or workshop that allows participants to learn or get updated on a particular topic. In this context, the D-ID Academy webinar is specifically designed for enterprises to understand how to use D-ID Agents effectively. The script mentions that this is the second session in a series, indicating a structured program to educate businesses on leveraging D-ID's technology.


💡API

API stands for Application Programming Interface, which is a set of rules and protocols that allows different software applications to communicate with each other. In the script, Simon, an API tech support engineer, discusses making the integration process with D-ID's API simple and straightforward, highlighting the importance of APIs in customizing and scaling the use of D-ID Agents.

💡Machine Learning

Machine learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. Jim McCormick, a machine learning engineer in the script, represents the technical expertise involved in utilizing D-ID Agents. The agents likely incorporate machine learning algorithms to process user queries and generate responses.

💡Knowledge Base

A knowledge base is a collection of information or data that a system, such as D-ID Agents, can refer to in order to provide answers or take action. The script mentions that the agents use a knowledge base, which can include general knowledge or uploaded documents, to find the best responses to user queries.
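
The three knowledge types mentioned in this summary (grounded, hybrid, ungrounded) differ in when the agent may fall back on general knowledge. The sketch below shows that decision in isolation. It is not D-ID's implementation: `lookup` and `general` are hypothetical callables standing in for a knowledge-base search and a general-purpose LLM, and the refusal text is made up.

```python
from typing import Callable, Optional

MODES = ("grounded", "hybrid", "ungrounded")


def respond(mode: str,
            question: str,
            lookup: Callable[[str], Optional[str]],
            general: Callable[[str], str]) -> str:
    """Choose an answer source according to the knowledge type."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "ungrounded":
        # No reference file: behave like a plain chatbot.
        return general(question)
    found = lookup(question)
    if found is not None:
        return found
    if mode == "hybrid":
        # Nothing in the knowledge base: fall back to general knowledge.
        return general(question)
    # Grounded: decline rather than risk an incorrect answer.
    return "Sorry, I don't have information on that."
```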

💡Generative AI

Generative AI refers to artificial intelligence systems that can create new content, such as text, images, or videos, based on existing data. D-ID has shifted its focus to generative AI, as mentioned in the script, to generate new video frames of people's faces, which is a core capability of the D-ID Agents.


💡LLM

LLM stands for Large Language Model, which is a type of AI that can process and generate human-like text based on the input it receives. In the context of the video, the LLM is a key component of the D-ID Agents, working alongside the Retrieval-Augmented Generation (RAG) to find and formulate responses to user questions.


💡RAG

RAG is short for Retrieval-Augmented Generation, a system that combines the retrieval of relevant information with the generation of new text. The script explains that RAG works in tandem with the LLM in D-ID Agents to provide responses that are grounded in the provided knowledge base or documents.
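
To make the retrieve-then-generate idea concrete, here is a toy sketch. Production systems typically compare embedding vectors; this example swaps in plain word overlap (Jaccard similarity) so it runs without any model. The function names, the 0.2 threshold, and the prompt wording are assumptions for illustration, not how D-ID Agents work internally.

```python
import re
from typing import Optional


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str,
             pairs: list[tuple[str, str]],
             min_score: float = 0.2) -> Optional[tuple[str, str]]:
    """Return the stored Q&A pair whose question best overlaps the query."""
    q_tokens = tokenize(question)
    best, best_score = None, 0.0
    for stored_q, stored_a in pairs:
        s_tokens = tokenize(stored_q)
        union = q_tokens | s_tokens
        score = len(q_tokens & s_tokens) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = (stored_q, stored_a), score
    return best if best_score >= min_score else None


def build_prompt(question: str, pairs: list[tuple[str, str]]) -> str:
    """Combine the retrieved pair with the user question for the LLM."""
    hit = retrieve(question, pairs)
    context = f"Q: {hit[0]}\nA: {hit[1]}" if hit else "(no relevant entry found)"
    return f"Answer using only this context:\n{context}\n\nUser question: {question}"
```

The retrieval step narrows the knowledge base to one relevant entry, and the generation step (the LLM call, omitted here) would phrase the final reply from the prompt.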


💡Low-Code/No-Code

Low-code and no-code refer to software development approaches that allow users to create applications with minimal or no coding, often through graphical interfaces or pre-built modules. The script discusses the D-ID Studio version of agent creation as a low-code/no-code solution, making it accessible for users without extensive programming skills.


💡Embedding

Embedding in the context of this video refers to the process of integrating the D-ID Agent into a website's HTML code, allowing the agent to function as part of the website's interface. Jim demonstrates how to embed an agent into a website, which is a key step in making the agent accessible to users on the site.


Introduction to D-ID Academy webinar focusing on D-ID Agents for business growth.

Ron Fredman, Head of Content and Creative Marketing, welcomes participants and introduces the webinar's agenda.

Simon Caraka and Jim McCormick introduce themselves as experts in API tech support and machine learning, respectively.

Webinar's goal is to demonstrate the integration process of D-ID Agents API and its application in businesses.

D-ID's core capabilities in AI video generation and the shift to generative AI three years ago.

Milestones achieved by D-ID, including 150 million videos made and a substantial increase in size and funding.

Two primary solutions offered by D-ID: AI videos and real-time AI agents for interactive digital communication.

D-ID Agents enable face-to-face conversations between businesses and users through avatars.

Technological components of D-ID Agents, including Large Language Model (LLM) and Retrieval-Augmented Generation (RAG).

Differences between off-the-shelf agents and developer options using the API.

Step-by-step guide on the agent creation process with best practices and tips.

Importance of selecting a high-quality image for the agent that aligns with brand identity.

Customizing agent details such as name, language, and voice for a personalized experience.

Setting agent instructions to guide the LLM on using reference documents and defining the agent's personality.

Selecting and structuring knowledge sources in Q&A pairs for effective interaction with the LLM.

Chat settings allow for starter questions to guide users when interacting with the agent.

Testing the agent to ensure proper functionality and alignment with the brand and user expectations.

Embedding the agent into a website using a simple script, enhancing the user experience with direct voice and video responses.

Upcoming webinar on July 9th featuring real-life use cases of D-ID Agents in businesses.

Q&A session addressing questions on limitations, lip-sync alignment, and pricing models for D-ID Agents.

Invitation to join the D-ID community on Discord for peer support and further learning opportunities.

Final remarks and thanks to the participants, with a reminder of the next webinar and call for sharing agent experiences.