InvokeAI - Canvas Fundamentals

24 Sept 2023 · 38:06

TLDR: The video is an in-depth tutorial on using the Unified Canvas in Invoke AI for end-to-end creative workflows. It emphasizes the role of the bounding box in controlling image generation and gives a live demonstration of how to manipulate it for detailed work. It also explores several generation methods, including masks and ControlNet, and discusses the compositing options and the scale before processing feature for higher-quality results. The creator shares techniques for achieving specific outputs and encourages experimentation with the canvas tools.


  • The unified canvas is a powerful feature of Invoke AI that supports an end-to-end workflow for realizing a creative vision.
  • The bounding box is a critical canvas tool that controls where new imagery and content are generated.
  • Resizing and moving the bounding box allows for better compositions and focused detail work within larger images.
  • The AI model only understands the area visible within the bounding box; focusing on a specific area limits its context.
  • The denoising process relies on the initial image's context and structural hints to help the model work with the provided prompt.
  • Using a mask with a high denoising strength can produce detailed results, but the prompt must be adjusted to match the focused area.
  • ControlNet on the canvas helps regenerate small details by preserving structure even at high denoising strengths.
  • The 'scale before processing' feature improves the quality of regeneration by processing small details at a higher resolution.
  • Mask adjustments and the coherence pass are essential for seamless compositing, helping to blend new data into the original image.
  • The canvas offers several infill techniques (tile, PatchMatch, LaMa, and cv2) for extending or changing the composition of an image.
  • Experimentation with the canvas tools is encouraged to find the best way to communicate intent to the AI model and achieve the desired results.

Q & A

  • What is the primary function of the bounding box in the canvas?

    -The bounding box in the canvas is a crucial feature that controls where and how new imagery and content will be generated. It allows users to select specific areas of the image for editing or detailed work, and by resizing and moving it, users can create better compositions and focus the AI model on particular sections of the image.

  • How does the denoising process relate to the context provided in the initial image?

    -The denoising process involves providing the right type of context and structural hints within the initial image to help the AI model understand how to work with the prompt. It essentially helps the model 'squint' at the provided image and determine how to generate content that matches the user's prompt.

  • What is the purpose of using a mask in conjunction with the bounding box?

    -Using a mask in conjunction with the bounding box allows for more precise control over where the AI model regenerates content. By passing in a mask using the brush tool, the system can create a new image using the bounding box data and composite the new information into the selected area, while maintaining the desired structure and context.

  • How does the 'scale before processing' feature enhance the quality of regeneration?

    -The 'scale before processing' feature allows users to focus on a small area of detail, such as a face or a specific character, and improve the quality of regeneration by performing it at a higher resolution. This prevents loss of detail in smaller elements within a larger image, so they can reach the same quality as a close-up character or object that fills the entire image.

  • What are the different compositing options available within the canvas?

    -The canvas offers several compositing options, including mask adjustments, the coherence pass, and infill techniques such as tile, PatchMatch, LaMa, and cv2. These options help blend the regenerated content seamlessly into the original image while maintaining the desired structure and details.

  • What is the significance of the coherence pass in the image regeneration process?

    -The coherence pass is a second image-to-image process that runs after the initial regeneration and compositing. It is primarily meant to clear up any rough edges or seams that might have been introduced during the infill process or regeneration, resulting in a more seamless and polished final image.

  • How does the user adjust the AI's focus when regenerating a specific area of an image?

    -The user can adjust the AI's focus by changing the prompt and selecting a specific area using the bounding box. By providing detailed information about the desired area and using a higher denoising strength, the AI can generate content that closely matches the user's vision for that particular section.

  • What is the role of ControlNet in refining the structure and details of a generation?

    -ControlNet, when given an image, helps maintain the structure and details of the generation by following the edges of the initial sketch or image provided. Its level of control can be adjusted, allowing for a balance between structure and creative freedom.

  • How can the user experiment with different tools and settings to achieve their desired output?

    -Users can experiment with various tools and settings such as mask adjustments, denoising strengths, infilling techniques, and prompt modifications. By iterating through different combinations and adjustments, users can guide the AI towards generating content that aligns with their creative vision.

  • What is the importance of using negative prompts in the generation process?

    -Negative prompts are used to exclude certain elements or characteristics from the generated content. By specifying what the user does not want in the image, the AI can better understand and adhere to the desired output, avoiding unwanted features such as hats in the example of generating a wizard.

  • How does the aspect ratio and composition of the final image impact the overall effect?

    -The aspect ratio and composition of the final image play a crucial role in conveying the intended message and focusing the viewer's attention. Adjusting these elements can enhance the narrative, improve the visual appeal, and ensure that the image aligns with the desired emotional impact or story.



Introduction to the Canvas and Bounding Box

The video begins with an introduction to the canvas, emphasizing its role in enabling end-to-end creative workflows in Invoke AI. The bounding box is highlighted as a critical feature, controlling the generation of new imagery and content within the tool. The video creator demonstrates how the bounding box can be manipulated to edit different parts of an image, and how resizing and moving it can affect the composition. The importance of the bounding box in limiting the AI model's view and understanding of the image is discussed, along with the use of the denoising process to provide context and structural hints for better image generation.


Advanced Techniques: ControlNet and Infill Settings

This section covers advanced techniques for using the canvas, such as ControlNet and the infill settings. ControlNet allows finer control over the regeneration of smaller details by importing an image from the canvas and using it as a guide. The video creator explains how different infill techniques (tile, PatchMatch, LaMa, and cv2) can be used to achieve various results. The section also covers the use of the brush tool for creating solid color infills and the importance of the scale before processing feature for improving the quality of detailed areas in an image.
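As a rough illustration of what a tile-style infill does, the sketch below fills empty pixels by copying from randomly chosen tiles of existing image data. It is a simplified stand-in, not Invoke AI's implementation: the function name `tile_infill`, the `tile_size` and `seed` parameters, and the use of `None` for empty pixels are all assumptions made for this example.

```python
import random
from typing import List, Optional

Pixel = Optional[int]  # None marks an empty (transparent) pixel; assumption for this sketch


def tile_infill(image: List[List[Pixel]], tile_size: int, seed: int = 0) -> List[List[Pixel]]:
    """Fill empty pixels with pixels copied from randomly chosen, fully populated tiles."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])

    # Collect every grid-aligned tile that contains no empty pixels.
    source_tiles = []
    for ty in range(0, height - tile_size + 1, tile_size):
        for tx in range(0, width - tile_size + 1, tile_size):
            tile = [row[tx:tx + tile_size] for row in image[ty:ty + tile_size]]
            if all(p is not None for row in tile for p in row):
                source_tiles.append(tile)
    if not source_tiles:
        raise ValueError("no fully populated tile to sample from")

    # Walk the image tile by tile and fill only the empty pixels.
    result = [row[:] for row in image]
    for ty in range(0, height, tile_size):
        for tx in range(0, width, tile_size):
            source = rng.choice(source_tiles)
            for dy in range(min(tile_size, height - ty)):
                for dx in range(min(tile_size, width - tx)):
                    if result[ty + dy][tx + dx] is None:
                        result[ty + dy][tx + dx] = source[dy][dx]
    return result


# Left half of the image holds data, right half is empty.
half_empty = [[7, 7, None, None] for _ in range(4)]
filled = tile_infill(half_empty, tile_size=2)
```

The filled area is only a starting point: it gives the model color and texture to denoise from, which is why the choice of infill method changes the final result.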


Compositing and Coherence Pass Explained

The video continues with an explanation of the compositing process and coherence pass. When the bounding box is over an area with existing pixel data, the system uses this data as the initial image for regeneration. The use of masks and the compositing options, including preserving the mask area, are discussed. The coherence pass, a secondary generation process meant to smooth out rough edges or seams, is introduced. The video creator outlines the three options for the second pass: unmasked, masked, and mask edge, and shares personal preferences and recommendations for achieving the best results.
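The way the bounding box contents determine the kind of generation can be sketched as a small decision function. This is a hedged illustration only: `choose_generation_mode`, its mode labels, and the use of `None` for empty canvas pixels are hypothetical names chosen for this example, not part of Invoke AI.

```python
from typing import List, Optional

Pixel = Optional[int]  # None marks an empty canvas pixel; assumption for this sketch


def choose_generation_mode(region: List[List[Pixel]], has_mask: bool = False) -> str:
    """Pick a generation mode from what the bounding box currently covers."""
    pixels = [p for row in region for p in row]
    filled = sum(p is not None for p in pixels)
    if filled == 0:
        return "text-to-image"    # nothing under the box: generate a new image
    if filled < len(pixels):
        return "infill"           # partly empty: fill the gaps, then regenerate
    # Fully covered: existing pixels become the initial image.
    return "inpaint" if has_mask else "image-to-image"


empty_box = [[None, None], [None, None]]
partial_box = [[5, None], [5, 5]]
full_box = [[5, 5], [5, 5]]
modes = [
    choose_generation_mode(empty_box),
    choose_generation_mode(partial_box),
    choose_generation_mode(full_box),
    choose_generation_mode(full_box, has_mask=True),
]
```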


Practical Demonstration: Canvas in Action

The creator presents a practical demonstration of using the canvas to generate new content. Starting with a text-to-image generation, the video shows the process of refining the image through various iterations. The creator uses different tools and techniques, such as the brush and mask, to adjust the composition and focus on specific areas for regeneration. The process of removing unwanted elements and adding details to the image is shown, along with the use of different denoising strengths and the importance of matching the prompt to the desired outcome.


Enhancing Details and Finalizing the Composition

In this section, the focus is on enhancing specific details within the image and finalizing the composition. The creator zooms in on a background character and uses the scale before processing feature to increase the resolution and quality of the regeneration. The process of refining the character's appearance and adjusting the prompt for better results is demonstrated. The video also covers the use of mask adjustments and compositing blur to integrate new elements into the original image seamlessly. The creator shares their approach to iterating and experimenting with different settings to achieve the desired look.


Creative Exploration: Experimenting with the Canvas

The video concludes with a creative exploration of the canvas tools. The creator shares a rough sketch of a mage character and uses a soft edge ControlNet to establish the structure. The process of using prompts, negative prompts, and adjusting the ControlNet percentage to refine the generation is discussed. The creator demonstrates how to use the eraser tool to infill areas with colors from the composition and explores different infill settings such as tile, LaMa, and PatchMatch. The video encourages viewers to experiment with the canvas tools and share their creations.


Wrapping Up and Future Content

The video wraps up with a summary of the key points covered and an invitation for viewers to share their thoughts and suggestions for future content. The creator emphasizes the importance of experimenting with the canvas tools and the impact of prompts on the generated content. The potential of image prompts on the unified canvas is mentioned as a topic for an upcoming video. The video ends with a call to action for viewers to like, subscribe, and join the Discord server for updates and discussions on Invoke AI.



Unified Canvas

The Unified Canvas is a feature that enables an end-to-end workflow for creative visualization, allowing users to generate and manipulate new imagery and content. It is central to the video's theme of demonstrating the capabilities of Invoke AI for creative projects. The canvas is used throughout the video to edit, regenerate, and refine images, as shown when the speaker focuses on improving the details of a space fighter and later a character on stage.

Bounding Box

The bounding box is a critical tool within the canvas that controls the area where new content is generated. It can be moved and resized to focus on specific parts of an image, thus directing the AI's attention to certain details. In the video, the speaker uses the bounding box to select and edit different areas of the image, such as the cockpit of a space fighter and the details of a character's face.
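The idea that the model only sees what lies inside the bounding box can be sketched with a plain crop and paste. This is a minimal illustration on nested lists rather than real image data; the function names and the placeholder "regenerated" region are assumptions for the example.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale values, a stand-in for real image data


def crop_to_bounding_box(canvas: Image, x: int, y: int, width: int, height: int) -> Image:
    """Return only the pixels inside the bounding box; this is all the model gets to see."""
    return [row[x:x + width] for row in canvas[y:y + height]]


def paste_into_canvas(canvas: Image, region: Image, x: int, y: int) -> Image:
    """Return a copy of the canvas with the regenerated region written back at (x, y)."""
    result = [row[:] for row in canvas]
    for dy, region_row in enumerate(region):
        result[y + dy][x:x + len(region_row)] = region_row
    return result


canvas = [[10 * r + c for c in range(6)] for r in range(4)]
visible = crop_to_bounding_box(canvas, x=2, y=1, width=3, height=2)
regenerated = [[0 for _ in row] for row in visible]  # placeholder for the model's output
updated = paste_into_canvas(canvas, regenerated, x=2, y=1)
```

Everything outside `visible` is unknown to the model during generation, which is why a prompt describing the whole scene can mislead it when the box covers only a face or a cockpit.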

Denoising Process

The denoising process is how the model generates content: noise is added to the initial image and then removed step by step, guided by the prompt. The denoising strength sets how much of the initial image is replaced, so the context and structural hints that remain in the image help the AI understand and execute the user's prompt effectively. The video emphasizes the importance of this process in achieving high-quality results, especially when working with smaller or detailed areas of an image.
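One common convention in diffusion image-to-image pipelines is that the denoising strength decides how many of the scheduled steps are actually run on the noised initial image. The sketch below assumes that convention; whether Invoke AI computes it exactly this way is not stated in the video, and `steps_actually_run` is a hypothetical helper.

```python
def steps_actually_run(total_steps: int, denoising_strength: float) -> int:
    """Number of denoising steps applied to the initial image (assumed convention)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    # Low strength: few steps, result stays close to the initial image.
    # Strength 1.0: all steps run, the initial image is fully replaced by noise first.
    return min(int(total_steps * denoising_strength), total_steps)


light_touch = steps_actually_run(40, 0.5)   # half of the schedule
heavy_rework = steps_actually_run(40, 0.75) # most of the schedule
full_regen = steps_actually_run(40, 1.0)    # entire schedule
```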

ControlNet

ControlNet is a tool that helps maintain the structure and details of an image during the regeneration process. It works by analyzing the edges and details of an imported image or sketch and using this information to guide the AI's generation. The concept is introduced in the video as a way to refine the structure of the image while allowing for some flexibility in the final result.


Mask

A mask in the context of the video is a tool used to select specific areas of an image for regeneration or editing. It allows users to isolate parts of the image for detailed work or to exclude certain areas from the AI's view during the generation process. The mask is essential for achieving precise control over the final composition.


Inpainting

Inpainting is a technique where the AI fills in or regenerates missing or selected parts of an image based on the surrounding context and the user's input. It is used to add details, extend an image, or correct areas within the canvas. The video demonstrates how inpainting can be used to enhance specific areas of an image, such as adding details to a character's face or extending the background.

Scale Before Processing

Scale Before Processing is a feature that allows users to zoom in on a specific area of the canvas and generate content at a higher resolution before fitting it back into the original image. This technique improves the quality of the regenerated details, ensuring that smaller elements or background features maintain clarity and definition.
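The arithmetic behind generating a small region at a higher resolution can be sketched as follows. The target long side of 1024 pixels and the helper name `scaled_dimensions` are assumptions for illustration; snapping to a multiple of 8 reflects the usual size constraint of Stable Diffusion models, whose latent space is downsampled by a factor of 8.

```python
from typing import Tuple


def scaled_dimensions(box_width: int, box_height: int,
                      target_long_side: int = 1024, multiple: int = 8) -> Tuple[int, int]:
    """Scale a bounding box up so its long side matches the target, keeping aspect ratio."""
    scale = target_long_side / max(box_width, box_height)

    def snap(value: int) -> int:
        # Round to the nearest multiple the model can work with, never below one multiple.
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(box_width), snap(box_height)


# A 256x192 box around a background character is generated at 1024x768,
# then scaled back down and composited into the 256x192 area.
generation_size = scaled_dimensions(256, 192)
```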


Compositing

Compositing is the process of blending newly generated content with the existing image. It involves using a mask to determine which parts of the regenerated content should be merged with the original image. The video discusses different compositing options, such as mask adjustments and coherence pass, to achieve seamless integration of new elements.
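Mask-based compositing can be written as a per-pixel blend, where the mask value decides how much of the new content replaces the original. The sketch below works on a single row of pixels and uses a simple box blur to soften the mask edge; the blur type and the function names are assumptions for this example and may differ from what the canvas uses.

```python
from typing import List


def box_blur(mask: List[float], radius: int) -> List[float]:
    """Soften a mask by averaging each value with its neighbours within the radius."""
    blurred = []
    for i in range(len(mask)):
        window = mask[max(0, i - radius):min(len(mask), i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def composite(original: List[float], generated: List[float], mask: List[float]) -> List[float]:
    """Blend per pixel: mask 1.0 takes the generated value, mask 0.0 keeps the original."""
    return [m * g + (1.0 - m) * o for o, g, m in zip(original, generated, mask)]


original = [100.0] * 6
generated = [200.0] * 6
mask = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]

hard_edge = composite(original, generated, mask)               # abrupt seam at the mask border
soft_edge = composite(original, generated, box_blur(mask, 1))  # gradual transition
```

With the blurred mask, pixels near the border become a mix of old and new values, which is what hides the seam.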

Coherence Pass

A coherence pass is a secondary generation process applied after the initial content regeneration to smooth out any rough edges or seams that may have been introduced. It helps to ensure a more cohesive and natural-looking final image by refining the transitions between the new and existing content.
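To make the idea of reprocessing only the seam more concrete, the sketch below finds the masked pixels that touch unmasked ones. This is one plausible reading of what a "mask edge" region could be, not a description of how Invoke AI computes it; `mask_edge` and the 4-neighbour rule are assumptions.

```python
from typing import List


def mask_edge(mask: List[List[int]]) -> List[List[int]]:
    """Mark masked pixels that have at least one unmasked 4-neighbour."""
    height, width = len(mask), len(mask[0])
    edge = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < height and 0 <= nx < width and not mask[ny][nx]:
                    edge[y][x] = 1
                    break
    return edge


# A 3x3 masked block inside a 5x5 image: only its outer ring is on the edge.
mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
edge = mask_edge(mask)
```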

Image Prompt

An image prompt is a user-provided visual input that guides the AI in generating specific content. It is a powerful tool on the unified canvas that allows for more precise control over the final output by giving the AI a clear reference to work with. The video hints at the potential of using image prompts in future content, suggesting their importance in achieving detailed and accurate results.


The introduction of the bounding box feature in the canvas, which is crucial for controlling the generation of new imagery and content within the tool.

The demonstration of resizing and moving the bounding box to create better compositions and focus on specific areas of an image.

The explanation of how the AI model's view is limited when the bounding box is focused on a specific area, affecting the generated content.

The use of the denoising process to provide the right context and structural hints for the AI model to understand and work with the initial image.

The importance of changing the prompt when regenerating smaller areas of an image to match the desired outcome.

The introduction of ControlNet on the canvas, which allows for easier regeneration of smaller details with more control.

The explanation of the different generation methods available with the bounding box, including new image generation, using existing pixel data, and infilling empty spaces.

The discussion of the mask adjustments and compositing process, which allows for seamless integration of regenerated content into the original image.

The introduction of the scale before processing feature, which improves the quality of regeneration by focusing on small details at a higher resolution.

The explanation of the coherence pass, which is used to clear up any rough edges or seams in the regenerated image.

The practical demonstration of using the canvas to generate new content, including the process of refining and adjusting the image to achieve the desired result.

The use of the eraser tool to modify the base layer and allow for infilling based on the existing color composition.

The exploration of different infill techniques, such as tile, PatchMatch, LaMa, and cv2, and their impact on the final image.

The emphasis on the importance of experimenting with the canvas tools to communicate intent effectively to the AI model and achieve the best results.

The encouragement for users to share their thoughts and use cases, and the anticipation of future content and features for the canvas.