ComfyUI Workflow Build Text2Img + Latent Upscale + Model Upscale | ComfyUI Basics | Stable Diffusion

11 Jun 2024 23:38

TLDR: This tutorial video guides viewers through building a basic text-to-image workflow from scratch in ComfyUI, comparing it with the Automatic1111 Stable Diffusion web UI. It covers adding a checkpoint node and prompt nodes, then generating images with a K sampler. The video then extends the workflow with Latent Upscale and Model Upscale, showing how each raises resolution and detail. The host encourages viewers to follow along and build the full workflow step by step.


  • 😀 The tutorial video provides a step-by-step guide on building a basic text-to-image workflow from scratch using ComfyUI.
  • 🔍 It explains how to add and connect nodes like the checkpoint, prompt sections, and K sampler to create a functional workflow.
  • 📝 The video also covers how to extend the workflow with latent upscale and model upscale stages for better image quality.
  • 🔄 The process of adding LoRA (Low-Rank Adaptation) nodes to the workflow is demonstrated to improve the generation of stylized images.
  • 🔗 The tutorial includes a comparison with the Automatic1111 Stable Diffusion web UI to highlight the necessary components for a ComfyUI workflow.
  • 🖼️ Viewers are shown how to connect the positive and negative prompts to the checkpoint node to refine the image generation process.
  • 🛠️ The script describes the importance of the Generation section, including settings like sampling method, scheduler types, and image dimensions.
  • 🔍️ The video mentions the use of a 'Reroute' node to simplify the workflow, making it easier to manage and understand.
  • 📈 The tutorial demonstrates how to upscale images using both latent upscale and model upscale methods, explaining the differences in results.
  • 📝 It emphasizes the need to adjust parameters like denoising strength and scale factor to achieve the desired image quality and resolution.
  • 🔧 The final workflow includes a clean-up step, organizing the nodes into groups for better clarity and usability.

Q & A

  • What is the main topic of the tutorial video?

    -The main topic of the tutorial video is building a basic text to image workflow from scratch on ComfyUI and enhancing it with latent upscale and model upscale features.

  • What is the first step in building a workflow on ComfyUI according to the video?

    -The first step in building a workflow on ComfyUI is to get a checkpoint node, which can be done by right-clicking on the blank space and selecting 'Add node' or by double-clicking and searching for the 'checkpoint' node.

  • How can you add a prompt node in ComfyUI?

    -You can add a prompt node in ComfyUI by right-clicking on the blank space and selecting 'Add node', then 'Conditioning', and choosing 'CLIP Text Encode (Prompt)'; by double-clicking and searching for 'prompt'; or by dragging from the 'CLIP' output of the 'Load Checkpoint' node.

  • What are the two types of prompts mentioned in the video script?

    -The two types of prompts mentioned in the video script are the positive prompt and the negative prompt.

  • What is the purpose of the 'K sampler' node in the workflow?

    -The 'K sampler' node is used for generating images in the workflow. It is connected to the positive and negative prompts to influence the image generation process.

  • How can you add the 'Latent Upscale' feature to the workflow?

    -To add the 'Latent Upscale' feature, you need to add an 'Upscale Latent By' node, connect the latent output of the first K sampler to the 'samples' input of the 'Upscale Latent By' node, and then connect its output to a second 'K sampler', which re-samples the enlarged latent.

  • What is the role of the 'VAE decode' node in the workflow?

    -The 'VAE decode' node is responsible for decoding the latent information and converting it into an image format, which is the final step before saving or previewing the generated image.

  • How can you connect multiple 'LoRA' nodes in the workflow?

    -To connect multiple 'LoRA' nodes, you can duplicate an existing 'LoRA' node using 'Ctrl+C' and 'Ctrl+V' or by pressing 'Alt' and dragging the node. Then, connect the model and clip outputs from the first 'LoRA' node to the corresponding inputs of the second 'LoRA' node.

  • What is the purpose of the 'Model Upscale' node in the workflow?

    -The 'Model Upscale' node is used to upscale an image using a dedicated upscaling model, such as '4x-UltraSharp'. It takes a decoded image, for example the output of the latent upscale stage, and increases its resolution.

  • How can you simplify a complex workflow in ComfyUI?

    -You can simplify a complex workflow in ComfyUI by using 'Reroute' nodes, which tidy long connections and make the workflow easier to navigate and understand.



🛠️ Building a Basic Text-to-Image Workflow on ComfyUI

This paragraph introduces the tutorial's objective: building a basic text-to-image workflow from scratch in ComfyUI. It also mentions the comparison with the Automatic1111 Stable Diffusion web UI, which shows which components a workflow needs. The process begins with clearing the workflow space and adding a checkpoint node. Two methods for adding nodes are described: right-clicking to access the 'Add node' menu and double-clicking to search for a specific node. The tutorial emphasizes the checkpoint node as the starting point for building the workflow.


📝 Configuring Prompts and Sampler for Image Generation

The paragraph explains the setup of the positive and negative prompt nodes in ComfyUI, which guide the image generation process. It details three methods to add prompt nodes: using the 'Add node' option, double-clicking to search, or dragging from the 'Load Checkpoint' node. The tutorial then moves on to the 'K sampler' node, which generates the image, and its settings such as sampling method, scheduler, steps, and CFG scale; width and height are set on an 'Empty Latent Image' node. The paragraph concludes with connecting the prompts to the 'K sampler' and testing the workflow with a basic prompt to confirm that it works.
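The wiring described above can also be written down as data. The following is a minimal sketch in Python of the same graph in ComfyUI's API (JSON) workflow format, assuming the built-in node class names `CheckpointLoaderSimple`, `CLIPTextEncode`, `EmptyLatentImage`, `KSampler`, `VAEDecode`, and `SaveImage`. The function name `build_text2img`, the node IDs, the default settings, and the checkpoint file name are illustrative assumptions, not values taken from the video.

```python
def build_text2img(ckpt_name, positive, negative, width=512, height=512,
                   seed=0, steps=20, cfg=7.0):
    """Return a text-to-image graph as a dict (hypothetical helper).

    Each link is [source_node_id, output_index]. The checkpoint loader is
    assumed to expose MODEL, CLIP, and VAE as outputs 0, 1, and 2.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        # Positive and negative prompts both read the checkpoint's CLIP output.
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # Width and height live on the empty latent, not on the sampler.
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": steps, "cfg": cfg,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        # Decode the sampled latent into pixels, then save the image.
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "text2img"}},
    }


graph = build_text2img("model.safetensors", "a castle on a hill", "blurry")
```

Reading the dict top to bottom mirrors the order in which the nodes are added on the canvas: checkpoint, prompts, empty latent, sampler, decode, save.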


🌌 Enhancing Workflow with LoRA and Multiple Prompts

This section covers enhancing the basic text-to-image workflow with LoRA (Low-Rank Adaptation) nodes. The process involves adding a 'Load LoRA' node and connecting its model and clip inputs to the matching outputs of the 'Load Checkpoint' node. The tutorial demonstrates how to connect the output of the LoRA node to the 'K sampler' and how to duplicate LoRA nodes for more complex workflows. It also shows how to chain the duplicated nodes so that each one feeds the next. The paragraph concludes with a test run of the enhanced workflow using a fantasy-themed prompt to illustrate the stylized results.
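The chaining rule for multiple LoRA nodes is easy to get wrong, so here is a small sketch of it in Python, again using ComfyUI's API-format dicts. It assumes the built-in `LoraLoader` class with `model`, `clip`, `lora_name`, `strength_model`, and `strength_clip` inputs; the helper name `chain_loras`, the starting node ID, and the LoRA file names are hypothetical.

```python
def chain_loras(graph, model_ref, clip_ref, loras, start_id=100):
    """Append one LoraLoader node per (file_name, strength) pair.

    Each new node takes its model and clip from the previous node in the
    chain. Returns the final (model_ref, clip_ref) pair, which should feed
    the sampler and the prompt nodes instead of the raw checkpoint outputs.
    """
    for offset, (lora_name, strength) in enumerate(loras):
        node_id = str(start_id + offset)
        graph[node_id] = {
            "class_type": "LoraLoader",
            "inputs": {"model": model_ref, "clip": clip_ref,
                       "lora_name": lora_name,
                       "strength_model": strength,
                       "strength_clip": strength},
        }
        # Outputs 0 and 1 of a LoRA node are the patched model and clip.
        model_ref, clip_ref = [node_id, 0], [node_id, 1]
    return model_ref, clip_ref


graph = {}
model_ref, clip_ref = chain_loras(
    graph, ["1", 0], ["1", 1],
    [("style_a.safetensors", 0.8), ("style_b.safetensors", 0.6)],
)
```

The design point is that only the first LoRA node reads from the checkpoint; everything downstream reads from the last node in the chain.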


🔍 Implementing Latent Upscale and Model Upscale Techniques

The paragraph introduces the process of adding latent upscale and model upscale techniques to the workflow. It starts by adding an 'Upscale Latent By' node and connecting it to the latent output of the initial workflow to enlarge the generated image. The tutorial explains how to set up a second 'K sampler' for the upscaled latent, including connecting the model and prompts. It also covers connecting the VAE (Variational Autoencoder) for decoding the upscaled latent image. The paragraph concludes with a test of the latent upscale workflow, comparing the results with the original text-to-image output and adjusting the denoising strength for better image quality.
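As a rough illustration of the latent upscale stage, the sketch below adds the three nodes involved to an API-format graph in Python. It assumes the built-in class names `LatentUpscaleBy`, `KSampler`, and `VAEDecode`; the helper name, the node IDs "10" to "12", and the default scale and denoise values are illustrative assumptions rather than the settings used in the video.

```python
def add_latent_upscale(graph, latent_ref, model_ref, positive_ref,
                       negative_ref, vae_ref, scale_by=1.5, denoise=0.55,
                       seed=0):
    """Add upscale -> second sampler -> decode, and return the image link."""
    # Enlarge the latent produced by the first sampler.
    graph["10"] = {"class_type": "LatentUpscaleBy",
                   "inputs": {"samples": latent_ref,
                              "upscale_method": "nearest-exact",
                              "scale_by": scale_by}}
    # Re-sample the enlarged latent. A denoise value below 1.0 keeps the
    # result closer to the first image; higher values add more new detail.
    graph["11"] = {"class_type": "KSampler",
                   "inputs": {"model": model_ref, "positive": positive_ref,
                              "negative": negative_ref,
                              "latent_image": ["10", 0],
                              "seed": seed, "steps": 20, "cfg": 7.0,
                              "sampler_name": "euler", "scheduler": "normal",
                              "denoise": denoise}}
    # Decode with the same VAE as the base workflow.
    graph["12"] = {"class_type": "VAEDecode",
                   "inputs": {"samples": ["11", 0], "vae": vae_ref}}
    return ["12", 0]


graph = {}
image_ref = add_latent_upscale(graph, ["5", 0], ["1", 0], ["2", 0],
                               ["3", 0], ["1", 2])
```

Lowering `denoise` is the knob to try first when the upscaled image drifts too far from the original composition.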


🖼️ Finalizing the Workflow with Model Upscale and Cleanup

The final paragraph focuses on the last steps of the workflow, which include adding a model upscale node to further increase the image resolution. The tutorial describes how to connect the decoded image from the latent upscale stage to the model upscale node and configure the scaling factors. It also discusses adjusting the denoising strength to balance added detail against fidelity to the original image. The paragraph concludes with a comparison of the different upscale methods and a demonstration of the final workflow's capabilities. The workflow is then organized into groups for clarity, and the tutorial ends with a prompt for viewer feedback and a sign-off.
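To make the model upscale stage and the resulting resolutions concrete, here is a short Python sketch. It assumes the built-in class names `UpscaleModelLoader` and `ImageUpscaleWithModel`, and it assumes the latent is 8 times smaller than the image per side, as in standard Stable Diffusion models. The helper names, node IDs, and the upscaler file name are hypothetical.

```python
def add_model_upscale(graph, image_ref, model_name):
    """Add an upscale-model loader and apply it to a decoded image."""
    graph["20"] = {"class_type": "UpscaleModelLoader",
                   "inputs": {"model_name": model_name}}
    graph["21"] = {"class_type": "ImageUpscaleWithModel",
                   "inputs": {"upscale_model": ["20", 0],
                              "image": image_ref}}
    graph["22"] = {"class_type": "SaveImage",
                   "inputs": {"images": ["21", 0],
                              "filename_prefix": "model_upscale"}}
    return ["21", 0]


def upscaled_size(width, height, latent_scale, model_scale):
    """Estimate the output size after a latent upscale and a model upscale.

    Assumption: the latent upscale rounds in latent units (pixels / 8),
    and the upscale model multiplies each side by a fixed integer factor.
    """
    lat_w = round(width / 8 * latent_scale) * 8
    lat_h = round(height / 8 * latent_scale) * 8
    return lat_w * model_scale, lat_h * model_scale


graph = {}
add_model_upscale(graph, ["12", 0], "4x-UltraSharp.pth")  # file name assumed
print(upscaled_size(512, 512, 1.5, 4))  # (3072, 3072)
```

A 4x model on top of a 1.5x latent upscale grows a 512x512 image to 3072 pixels per side, which is why the tutorial treats the scale factors as something to tune rather than maximize.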




💡ComfyUI

ComfyUI is a node-based user interface for Stable Diffusion and the environment in which the workflow is built. It is a key term because it sets the context for the entire tutorial: the instructions are specific to this interface. In the script, ComfyUI is used as the base for constructing a text-to-image workflow from scratch.

💡Text-to-Image Workflow

A text-to-image workflow is a process that converts textual descriptions into visual images. It is the central theme of the video, as the tutorial aims to guide viewers through building such a workflow on ComfyUI. The script details the steps to create this workflow, including adding nodes and setting parameters.

💡Latent Upscale

Latent upscale refers to enlarging the latent representation of a generated image before it is decoded, then re-sampling it with a second K sampler so that detail is added at the higher resolution. In the context of the video, latent upscale is one of the enhancements discussed for improving the text-to-image workflow, allowing for higher quality image outputs.

💡Model Upscale

Model upscale is another method of improving image resolution: a dedicated upscaling model, such as '4x-UltraSharp', is applied to the decoded image. The script adds model upscale to the workflow as a separate technique from latent upscale, and the two give different results.

💡Checkpoint Node

In the script, the checkpoint node is an essential component in building the workflow on ComfyUI. It loads a Stable Diffusion model checkpoint and exposes its model, CLIP, and VAE outputs, which the rest of the workflow connects to. Adding it is the first step in setting up the text-to-image workflow.

💡Positive Prompt

A positive prompt is a textual instruction that guides the image generation process towards a desired outcome. In the video script, setting up a positive prompt is part of configuring the workflow to specify what kind of images should be generated.

💡Negative Prompt

A negative prompt, as mentioned in the script, is used to guide the image generation away from certain outcomes. It is a part of the workflow that helps refine the results by specifying what should be avoided in the generated images.


💡K-Sampler

The K-Sampler node, as discussed in the script, is crucial for the image generation process within the workflow. It is used to determine the sampling method and other parameters that affect how the image is produced from the textual description.

💡VAE Decode

VAE stands for Variational Autoencoder, and in the context of the video, VAE decode refers to the process of decoding the latent space representation back into an image format. The script describes connecting the VAE decode node to finalize the image generation process.

💡Load LoRA Node

The Load LoRA node is introduced in the script as a component that can be added to the workflow for additional enhancements. LoRA (Low-Rank Adaptation) files are small add-on weights applied on top of the checkpoint; the node sits between the 'Load Checkpoint' node and the rest of the workflow, passing on a modified model and clip, and is used in the video to generate more stylized images.

💡Upscale Image

Upscale image refers to the process of increasing the resolution of an image while maintaining or enhancing its quality. In the script, this term is used in the context of adding a node to the workflow that performs this function, either through latent upscale or model upscale techniques.


Introduction to building a basic text to image workflow on ComfyUI from scratch.

Explanation of enhancing the workflow with latent upscale and model upscale.

Comparison with the Automatic1111 Stable Diffusion web UI to understand necessary workflow components.

Step-by-step guide to add a checkpoint node in ComfyUI.

Three methods to add prompt nodes for positive and negative prompts.

Importance of the Generation section with parameters like sampling method and scheduler types.

How to add a K sampler node for image generation.

Connecting the K sampler with positive and negative prompts for better image results.

Adding width and height parameters using an empty latent image node.

End of the workflow with the generation of the image and options to save or preview.

Testing the workflow with a 'D Vision Excel' prompt and generating a 512x512 image.

Adjusting the workflow to generate images at different resolutions like 768x1080.

Adding LoRA (Low-Rank Adaptation) to the workflow for enhanced image generation.

Demonstration of connecting multiple LoRA nodes for more complex image generation.

Building a latent upscale workflow to refine and enhance the generated images.

Incorporating a model upscale workflow to further improve image resolution and quality.

Final workflow review with a clean-up and organization for better understanding and usability.

Comparison of the original text-to-image output with latent upscale and model upscale results.

Adjusting denoising strength for better image similarity in the latent upscale process.

Completion of the workflow with a simple text-to-image, LoRA nodes, latent upscale, and model upscale.

Invitation for feedback and suggestions for further video content on ComfyUI workflows.