Stable Diffusion: Creating a LoRA That Produces Only Cute Faces

16 Mar 2024 13:37

TLDR: The video outlines a process for creating a custom LoRA (a small add-on model for Stable Diffusion) by gathering images of cute faces and using them for training. It discusses the importance of selecting the right images, using Dynamic Prompts with random elements, and adjusting settings for stable image generation. It provides a step-by-step guide to preparing the dataset, tagging the images, and setting the training parameters. It also mentions the possibility of creating anime-style LoRAs and encourages viewers to try it out, offering the finished LoRA as a downloadable membership bonus.


  • 🎨 The process involves gathering images of cute faces to train a model, aiming to generate images of only cute faces using a specific mix of real-life and anime styles.
  • 🖌️ The challenge is to create a LoRA trained only on cute faces, without any other styles interfering.
  • 🌟 It's important to select the right checkpoint when creating the initial images, as it will guide the final output.
  • 📸 The script suggests using a mix of dynamic prompts and randomization to create a diverse set of training images.
  • 🔄 The use of an extension tab is recommended for ease of generating varied prompts without manual changes.
  • 📂 Preparing the training dataset involves creating a folder, adding images, and using a dataset tag editor to label and organize the images.
  • 🏷️ Tags are used to specify features that should appear in the generated images, such as facial expressions and hairstyles.
  • 📈 The script provides a detailed guide on setting up the learning process, including the number of steps and parameters to use.
  • 🚀 Once the LoRA is trained, it can generate images when used with a trigger word and a suitable checkpoint.
  • 🌈 The script mentions the possibility of creating anime-style LoRAs by using a different checkpoint and different original images.
  • 💡 Generating the training images with Stable Diffusion is suggested because it yields consistent size and color tones, which can help stabilize training.
  • 📹 The video script is part of a series that explains the use of AI tools like Stable Diffusion and Voicebox in a simple manner.

Q & A

  • What is the main goal of the video script?

    -The main goal of the video is to guide the audience through creating a custom LoRA with Stable Diffusion, trained on a collection of cute facial images.

  • What type of images is the script suggesting to collect for the learning process?

    -The script suggests collecting images of cute faces, in a style that mixes real-life and anime looks, to create a LoRA with a consistently cute appearance.

  • How does the script propose to ensure a variety of cute faces in the output?

    -The script proposes using a combination of carefully selected learning images and the Dynamic Prompts extension, which allows for random insertion of keywords to create a diverse range of cute expressions and features.

  • What is the role of the 'Dynamic Prompts' extension in the process?

    -The 'Dynamic Prompts' extension helps to vary the prompts by randomly inserting selected keywords into the text, which in turn generates a diverse set of images without the need to manually change the prompts for each iteration.

  • How does the script address the challenge of creating a LoRA with a specific look?

    -The script suggests choosing a suitable checkpoint and using tags that correspond to the desired facial features and expressions, such as smile, eye direction, and hairstyle, to guide the LoRA toward the desired appearance.

  • What is the significance of the 'negative' tags in the process?

    -Negative tags are used to suppress undesired features, such as overly strong makeup, that might detract from the desired aesthetic of the LoRA's output.

  • How does the script recommend organizing the learning images?

    -The script recommends organizing the learning images into a specific folder, ensuring that the images are of high quality and variety to guide the AI learning process effectively.

  • What is the recommended number of training steps for creating the LoRA?

    -The script suggests around 1000 steps as a suitable number for training, based on the number of images and the desired output quality.

  • How does the script suggest testing the LoRA after training?

    -The script suggests loading the trained LoRA in the image-generation UI, entering the trigger word, and checking that the output matches the desired aesthetic and features.

  • What additional tips does the script provide for refining the LoRA?

    -The script suggests generating the training images with Stable Diffusion so that size and color tones are consistent, and also mentions using simple or flat backgrounds to keep the focus on the face.

  • How can the audience access the created LoRA?

    -The created LoRA is offered as a membership benefit, allowing members to download and use it.



🎨 Creating a Custom LoRA with Stable Diffusion

This paragraph discusses the process of creating a LoRA with Stable Diffusion that generates only cute faces. The speaker talks about using images in a style that mixes real-life and anime looks to train the LoRA, and the importance of selecting the right training images. They also mention using the Dynamic Prompts extension to randomize certain aspects of the face, such as expression and hair, without changing the overall style. The paragraph highlights the challenge of gathering high-quality images for training and the use of tags to refine the resulting features.


📁 Preparing and Organizing Training Images

The speaker describes the steps for preparing the training images for the LoRA. They emphasize the importance of having a variety of images with different angles and expressions. The process involves creating a folder for the images, using a dataset tag editor to label the images with relevant tags, and selecting specific features to be emphasized in the generated faces. The paragraph also touches on the use of negative prompts to avoid unwanted features and the organization of images in folders for easy access during training.
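The folder-and-tags step above can be sketched in a few lines. This is only an illustration, assuming the common convention in which each training image has a same-named `.txt` caption file holding comma-separated tags. The trigger word `kawaiiface`, the file names, and the tags are hypothetical and not taken from the video.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_captions(image_dir: Path, trigger: str, tags_by_file: dict[str, list[str]]) -> int:
    """Write <image>.txt beside each image: trigger word first, then its tags."""
    written = 0
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip captions and any other non-image files
        tags = tags_by_file.get(image.name, [])
        caption = ", ".join([trigger, *tags])
        image.with_suffix(".txt").write_text(caption + "\n", encoding="utf-8")
        written += 1
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "001.png").write_bytes(b"")  # empty placeholder standing in for a real image
        count = write_captions(folder, "kawaiiface", {"001.png": ["smile", "long hair"]})
        print(count, (folder / "001.txt").read_text(encoding="utf-8").strip())
```

Putting the trigger word first in every caption is one simple way to tie the learned look to a single keyword. A tag editor can do the same job interactively.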


🚀 Launching the Training and Testing the Results

This paragraph details the actual training of the LoRA using the prepared images and tags. The speaker explains how to set the training parameters, including the number of steps and the seed value, and how to use the advanced tab for additional settings. After training, they test the result by generating images with the trained LoRA in the UI. The paragraph concludes with the speaker's thoughts on applying the same principles to create anime-style LoRAs and on the benefit of generating training images with Stable Diffusion, namely consistent size and color tones.
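To make the step count concrete, here is a rough sketch of the arithmetic. It assumes the trainer counts steps the way kohya-style LoRA scripts commonly do (images times repeats per epoch, divided by batch size, times epochs). The video summary does not name the trainer, and the numbers below are hypothetical values chosen to land on the roughly 1000 steps mentioned.

```python
def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int = 1) -> int:
    """Return ceil(num_images * repeats / batch_size) * epochs."""
    if min(num_images, repeats, epochs, batch_size) < 1:
        raise ValueError("all arguments must be positive integers")
    samples_per_epoch = num_images * repeats
    # Integer ceiling division: a partial final batch still costs one step.
    steps_per_epoch = (samples_per_epoch + batch_size - 1) // batch_size
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # Hypothetical setup: 20 images, 10 repeats, 5 epochs, batch size 1.
    print(total_steps(20, 10, 5))  # 20 * 10 * 5 / 1 = 1000
```

With fewer images, raising the repeat count or the number of epochs keeps the total near the same target.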



💡LoRA Training

LoRA training refers to teaching a small add-on model (a LoRA, short for Low-Rank Adaptation) to steer a text-to-image generator toward a particular look, based on a set of training images. In the context of the video, it involves using a collection of cute faces so that the generated images share that look. The script mentions the challenge of building the LoRA from cute faces only, without other styles interfering.

💡Childraw Mix

Childraw Mix is a term used in the video to describe a combination of real-life and anime-style images. The goal is to create a LoRA using this mix of styles. This concept is central to the video's theme of blending different visual elements to achieve a desired outcome in AI-generated art.


💡Customization

Customization in the context of the video refers to adjusting the AI's output to match specific preferences or styles. This includes changing facial features, hairstyles, and other elements to create a distinctive look. The video emphasizes the importance of customization in achieving the desired look for the LoRA.

💡Dynamic Prompts

Dynamic Prompts are a feature that allows for randomization of certain elements within the AI's prompts without changing the overall prompt. This functionality enables the AI to generate a variety of outputs based on the same base prompt, introducing an element of randomness and creativity to the generated images.
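The randomization described above can be illustrated with a small standalone sketch. It is not the extension's own code. It mimics the variant syntax in which options are written inside curly braces and separated by vertical bars, and the example tags are hypothetical rather than taken from the video.

```python
import random
import re

# Matches an innermost {a|b|c} group (no nested braces inside it).
VARIANT = re.compile(r"\{([^{}]*)\}")


def expand_variants(template: str, rng: random.Random) -> str:
    """Replace each {a|b|c} group with one randomly chosen option."""

    def pick(match: re.Match) -> str:
        options = [opt.strip() for opt in match.group(1).split("|")]
        return rng.choice(options)

    # Repeat until nothing changes, so nested groups resolve inside-out.
    previous = None
    while previous != template:
        previous = template
        template = VARIANT.sub(pick, template)
    return template


# Hypothetical template: the fixed tags stay put, the braced parts vary.
TEMPLATE = (
    "1girl, cute face, {smile|light smile|closed mouth}, "
    "{long hair|short hair|ponytail}, looking at viewer"
)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run can be repeated
    for _ in range(4):
        print(expand_variants(TEMPLATE, rng))
```

Each call yields one concrete prompt, so a batch of generations covers different expressions and hairstyles while the base prompt stays the same.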

💡Checkpoints

In Stable Diffusion, a checkpoint is a base model file whose weights determine the overall style of the generated images. In the video, the choice of checkpoint matters twice: when generating the initial training images and when using the trained LoRA. The video emphasizes selecting the right checkpoint, since its style (real-life or anime) guides the final output.

💡Stable Diffusion

Stable Diffusion is a text-to-image AI model. In the video, the speaker uses it both to generate the training images and to run the trained LoRA, emphasizing the importance of image quality and consistency in the training process.

💡Negative Tags

Negative Tags are used in the AI generation process to suppress certain features or elements that are not desired in the output. In the video, they are used to avoid overemphasis on certain characteristics, such as makeup, that might detract from the desired aesthetic of the LoRA's output.
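As a sketch of how a trigger word, a trained LoRA, and negative tags fit together at generation time, the helper below assembles a positive and a negative prompt. It assumes the `<lora:name:weight>` prompt syntax used by the AUTOMATIC1111 web UI. The summary only says "UI", so that choice, the LoRA file name, the trigger word, and the tags are all hypothetical.

```python
def build_prompts(
    trigger: str,
    lora_name: str,
    weight: float,
    extra_tags: list[str],
    negative_tags: list[str],
) -> tuple[str, str]:
    """Return (positive_prompt, negative_prompt) as comma-separated tag strings."""
    lora_tag = f"<lora:{lora_name}:{weight:g}>"  # ":g" drops trailing zeros, e.g. 0.80 -> 0.8
    positive = ", ".join([trigger, *extra_tags, lora_tag])
    negative = ", ".join(negative_tags)
    return positive, negative


if __name__ == "__main__":
    positive, negative = build_prompts(
        trigger="kawaiiface",        # hypothetical trigger word
        lora_name="kawaii_face_v1",  # hypothetical LoRA file name
        weight=0.8,
        extra_tags=["smile", "simple background"],
        negative_tags=["heavy makeup"],
    )
    print(positive)
    print(negative)
```

Lowering the weight is a common way to soften the LoRA's influence if the trained look dominates too strongly.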

💡AI Art Generation

AI Art Generation is the process of using artificial intelligence to create visual art. The video focuses on this concept by detailing the steps to train a LoRA that generates faces with specific visual traits. The process involves selecting training images, customizing prompts, and refining the output to achieve the desired artistic style.

💡Image Collection

Image Collection is the process of gathering a set of images that will be used to train the AI model. In the video, the user collects cute faces to train the AI to generate images that match their aesthetic preferences. The quality and selection of these images are crucial to the final output of the AI.

💡Character Design

Character Design involves creating the visual appearance and personality of a character. In the video, the look the LoRA produces is shaped through the careful selection of features and styles in the training images. The process includes deciding on facial expressions, hairstyles, and other visual elements that define that look.

💡AI Training

AI Training is the process of teaching an AI model to perform a specific task, such as generating images, by providing it with data and feedback. In the video, AI Training involves using a collection of images to train a LoRA that generates faces with certain desired features. The training process is described in detail, including the choice of checkpoint and the use of Dynamic Prompts.


The process of creating a custom LoRA by gathering cute faces and training on them with Stable Diffusion so that it generates consistently cute facial features.

The challenge of creating a LoRA using only cute faces in a mixed live-action and anime style, without any non-cute faces.

The importance of selecting the right checkpoint and writing a prompt that produces the desired facial features and expressions.

Utilizing the Dynamic Prompt extension to randomize parts of the prompt without manually changing the entire prompt, allowing for variation in the generated images.

The method of enclosing the candidate keywords for the random parts in curly braces, separated by vertical bars, to achieve variation in features like facial expressions and hair direction.

The process of creating a LoRA involves gathering images, preparing a dataset, and using tags to specify desired features.

The use of tags related to facial features and hairstyles to ensure the generated faces align with the creator's vision.

The step-by-step guide on setting up the training process for the LoRA, including folder creation and parameter adjustments.

The concept of using Stable Diffusion to create a LoRA with consistent, high-quality output by selecting the right images for training.

The practical application of creating a LoRA for generating anime-style faces and the potential for using it with different types of images, such as live-action.

The suggestion to use Simple Background and Flat Background options to easily remove backgrounds or create a more stylized look.

The mention of membership benefits, including access to the newly created anime-style LoRA, for members of the channel.

The channel's commitment to providing easy-to-understand explanations of AI tools like Stable Diffusion and Voicebox.

The invitation for viewers to subscribe to the channel and ask questions or provide feedback through the comment section.