Mastering the Automatic1111 User Interface: A Comprehensive Guide for Stable Diffusion

Updated: Nov 12


a baby samurai in training
Your Diffusion Training Starts Now

Stable Diffusion - Automatic1111 User Interface

Pixar Control Panel with cute aliens

As explorers of new creative technology, we always have fresh peaks to scale, and today our target is the user interface of the amazing open-source Automatic1111.


If you've been toying with Stable Diffusion models and have had your fingers on the pulse of AI art creation, chances are you've already bumped into Automatic1111. You've seen the power it offers, the multitude of options, and let's not forget about those tantalizing dropdown menus, all promising an adventure.


"But what does it all mean?", you've asked yourself. "What do all these tabs, options, and sliders do?" I hear you, and I've been there too. That's why we're here today, ready to deep-dive into Automatic1111's user interface. From the subtle nuances of the 'txt2img' tab, the mysteries of the 'img2img' tab, to the ever-so-useful 'Upscaler' options - we're about to take them all apart, understand their functions, and unlock their potential.


The best part? It's designed to turn noobs into masters like da Vinci. So, whether you're just getting your feet wet with Automatic1111 or you're an intermediate user eager to know every last detail, this series is for you.


What is the Automatic1111 Web UI?


A1111, short for Automatic1111, is the go-to web user interface for Stable Diffusion enthusiasts, especially those on the advanced side. It is loaded with features that make it a first choice for many: generating images, adding extensions, experimenting with new features, even venturing into video production, and it keeps evolving to offer more. That richness comes at a price, though. The interface can be a bit of a maze for newcomers and seasoned users alike, and it only gets more daunting as it grows.


The official documentation exists but can be like reading a foreign language. It also doesn't cover usage of existing and new extensions. That's where this guide steps in - to take that complex, dense information and translate it into something clear and understandable, ensuring that users can really get to grips with everything the UI has to offer.


Control panel of a city

Table of Contents

 

The Interface and Basic Tabs

  1. The Primary Tabs at a Glance - Automatic1111 User Interface

  2. Text to Image - txt2img
     a. Prompt Engineering (Prompts and Negative Prompts)
     b. What are Tokens for? (Tokens and the 0/75 Indicator)

  3. CFG Scale - Classifier Free Guidance Scale Steering Your Prompts
     a. CFG Scale - Classifier Free Guidance
     b. CFG's Impact on Image Quality: Understanding the Trade-offs
     c. CFG Scale - Cheat Sheet

  4. Understanding Batch Sizes and Counts: Enhancing Efficiency
     a. Batch Count - Dialing Up the Numbers
     b. Batch Size - Maximizing Performance
     c. Synergy of Batch Count and Batch Size

  5. Playing with Dimensions: Understanding Height, Width, and Their Influence
     a. The Influence of Dimensions
     b. A Tool for Easy Aspect Ratio Adjustment
     c. Taking It Further with Hires.fix
     d. Tweaking the Denoising Strength (Hires.Fix)
     e. Taking Control with Hires.Fix
     f. Hires steps

  6. Sampling Method, What on Earth is it?
     a. Sampling Steps: Making Sense of the Numbers
     b. The XYZ Plot: Experiment With Ease

  7. Restore Faces Button - How does restore faces work in Stable Diffusion?

  8. Understanding the 'Seed' in Automatic1111's UI
     a. The Basics: What's a 'Seed'?
     b. Using the 'Seed'
     c. Additional Seed Tools: Dice, Recycle, and 'Extra'

  9. Image to Image - img2img

  10. Navigating the 'PNG Info' Tab

  11. Extras Tab for Upscaling Your Images

  12. The Stable Diffusion Checkpoints

  13. Including LoRA in the Mix
      a. Accessing the 'Lora to prompt' Tab

  14. Enhancing Visuals with SD VAE
      a. Download VAE
      b. Accessing the 'SD VAE' Tab

  15. CLIP Skip: Fine-Tuning AI Image Generation
      a. Accessing the 'Clip skip' Tab

  16. Checkpoint Merger: A Tool for Combining and Creating Models

  17. Learn to Train Your Own AI Art Models

  18. Uncovered Features in Automatic1111

  19. End Credits - Thank you Andrew

  20. Learn Stable Diffusion 101

 
The Samurai code, father and  son respecting each other
"Son, you will honor us with your diffusion ways."
 

The Interface and Basic Tabs

kiddie Alien user interface

The Automatic1111 user interface is the starting point of your AI art creation. It offers a plethora of tabs and options that may seem a bit overwhelming at first, but once you get to know them, you'll realize that each serves a specific purpose designed to streamline your AI-driven creative process.


1. The Primary Tabs at a Glance - Automatic1111 User Interface


Let's break down the primary tabs on the interface and understand their basic functions:

Automatic1111 User Interface

2. Text to Image - txt2img

An abandoned room full of scriptures
Learn the Basics of Prompt Engineering

This tab, short for Text-to-Image, stands as the cornerstone of the Automatic1111 user interface.


It's the favored tool of a new generation of creators called 'Prompt Engineers'. The concept is straightforward yet powerful: you describe an image using words in the prompt box, and the underlying Stable Diffusion algorithm does its best to materialize your textual description into a tangible image. This feature has truly revolutionized digital art creation, bridging the gap between imagination and visualization.

prompt and negative prompt box for automatic1111


Prompt Engineering

Prompt (press Ctrl+Enter or Alt+Enter to generate)

The 'Prompt' box in the 'txt2img' tab is essentially your canvas for ideas. This is where you write your textual description of the image you wish to generate. The Stable Diffusion algorithm takes this prompt as input and, using its complex learning and pattern recognition capabilities, attempts to create an image that corresponds to your description.

Negative Prompt (press Ctrl+Enter or Alt+Enter to generate)

The 'Negative Prompt' box is a complementary tool that refines your image generation process. Here, you can input what you specifically don't want in your image. This allows for a higher degree of control and precision over the output, helping you to avoid unwanted elements in your generated image.

Both the 'Prompt' and 'Negative Prompt' fields play critical roles in shaping the final output, making the 'txt2img' function an incredibly flexible and powerful tool for AI-based art creation. I'll talk more about how to properly design your prompt in a separate article.

 


 

What are Tokens for?

Golden Coins - What are the tokens for?
Understand What Tokens Do

  • Tokens and the 0/75 Indicator:

In the context of Stable Diffusion, a 'token' is essentially a unit of text input you can feed to the model. The number "0/75" displayed at the far end of your prompt bar represents the standard maximum limit of tokens you can use in a single prompt. Anything beyond that may not be considered by the model.

  • Tokens Beyond 75 - Infinite Prompt Length:

But what if you have more to say? That's where 'Infinite Prompt Length' comes in. If you type more than 75 tokens, the model cleverly divides your prompt into smaller pieces, each consisting of 75 tokens or fewer, and processes each chunk independently. So, if you have a 120-token prompt, it gets divided into a 75-token chunk and a 45-token chunk, which are processed separately and then combined. This allows you to provide more complex instructions to the model.

  • The 'BREAK' Keyword:

If you want to start a new chunk of tokens before reaching the 75-token limit, you can use the 'BREAK' keyword. This command will fill the remainder of the current chunk with 'empty' tokens, and any text you type afterwards will start a new chunk. This gives you more control over how your input is processed. And there you have it! A simplified overview of tokens, infinite prompt length, and the BREAK keyword in Stable Diffusion.

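  • Note: A token is not the same as a letter or a word. Common words usually map to a single token, rarer or longer words may be split into several tokens, and punctuation marks such as commas typically count as tokens of their own. Spaces between words are not counted separately.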
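To make the chunking rules above concrete, here is a minimal sketch in Python. It is an illustration only, not the web UI's actual code: the token strings, the `<pad>` filler, and the function names are all assumptions, and a real prompt would first be turned into tokens by the model's tokenizer.

```python
CHUNK_SIZE = 75  # the per-chunk limit shown by the 0/75 indicator
PAD = "<pad>"    # hypothetical stand-in for the 'empty' filler tokens


def split_into_chunks(tokens, chunk_size=CHUNK_SIZE):
    """Split a token list into padded chunks, starting a new chunk at BREAK."""
    chunks, current = [], []
    for token in tokens:
        if token == "BREAK":
            if current:
                # Fill the rest of the current chunk, then start a fresh one.
                chunks.append(current + [PAD] * (chunk_size - len(current)))
                current = []
            continue
        current.append(token)
        if len(current) == chunk_size:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current + [PAD] * (chunk_size - len(current)))
    return chunks


def used_lengths(chunks):
    """Count the non-filler tokens in each chunk."""
    return [sum(1 for t in chunk if t != PAD) for chunk in chunks]


# A made-up 120-token prompt ends up as a 75-token chunk and a 45-token chunk.
prompt_tokens = [f"tok{i}" for i in range(120)]
print(used_lengths(split_into_chunks(prompt_tokens)))  # [75, 45]
```

Every chunk comes out exactly 75 entries long, which mirrors the idea that a shorter chunk is topped up with 'empty' tokens before it is processed.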



3. CFG Scale - Classifier Free Guidance Scale Steering Your Prompts

Enchanted Forest Pixar Style
Steer your prompts using CFG Scale

CFG Scale - Classifier Free Guidance


Classifier Free Guidance (CFG) Scale acts like the steering wheel to your prompts. It's a crucial setting that dictates how rigidly Stable Diffusion adheres to your text prompt in both text-to-image (txt2img) and image-to-image (img2img) tasks.

Classifier Free Guidance (CFG Scale) in Automatic1111
Classifier Free Guidance (CFG Scale)

Imagine CFG as a sliding scale that controls your guide's attentiveness to your instructions. The default position, at a CFG value of 7, offers a harmonious balance, granting Stable Diffusion enough liberty to interpret your prompt creatively, while also ensuring it doesn't stray too far off course. If you notch it down to 1, you're essentially giving your guide free rein. Cranking it up above 15, however, is like you're micromanaging the guide to stick strictly to your instructions.


While the Stable Diffusion Web UI limits the CFG scale between 1 and 30, there are no such limits if you're working via a Terminal. You can push it all the way up to 999, or even enter negative territory!


Note: Currently, you cannot use Negative CFG in Automatic1111

Negative CFG values prompt Stable Diffusion to craft an image opposite to your text prompt, though this approach is less common as Negative Prompts typically yield more predictable and desirable outcomes.
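For the curious, the steering described above can be sketched in a few lines of Python. This uses the standard classifier-free guidance formula, where the model makes one prediction without the prompt and one with it, and the CFG value scales the difference between them. The two-number "predictions" and the function name are made up for illustration; real predictions are large tensors.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Push the prediction away from 'no prompt' and toward 'with prompt'."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt (hypothetical values)
cond = [1.0, 3.0]    # toy prediction with the prompt (hypothetical values)

print(apply_cfg(uncond, cond, 1.0))   # [1.0, 3.0]   plain prompt prediction
print(apply_cfg(uncond, cond, 7.0))   # [7.0, 15.0]  pushed hard toward the prompt
print(apply_cfg(uncond, cond, -1.0))  # [-1.0, -1.0] pushed away from the prompt
```

The larger the value, the further the result is pushed in the prompt's direction, which is one way to picture why very high values can overshoot and hurt image quality.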


CFG's Impact on Image Quality: Understanding the Trade-offs


CFG's role in steering the direction of your Prompt Engineering might sound pretty straightforward, but it's a bit more complex in practice. Different CFG values can lead to varied trade-offs.


As an example, let's consider a scenario where we're using the Euler A Sampling Method and 20 Sampling Steps, the default Stable Diffusion Web UI settings. When analyzing the output images from this setup, you'll notice that with an increase in CFG value, color saturation and contrast in your images also rise. However, push the CFG value too high, and your images may start losing detail, becoming blurrier.

CFG Scale - Cheat Sheet

CFG 2 - 6: A playful and creative setting, though it might steer away from the original prompt. Best suited for shorter, less precise prompts.


CFG 7 - 10: Ideal for a variety of prompts. Strikes a balance between imaginative outputs and staying true to the original direction.


CFG 10 - 15: Optimal when you're confident that your prompt is thoroughly detailed, leaving no room for ambiguity about the desired image.


CFG 16 - 20: Usually not the go-to choice unless your prompt is incredibly specific. Might potentially impact the image's coherence and quality.

CFG >20: Rarely useful; tends to result in less satisfactory outcomes.

Example:

Should you wish to recreate the example shown below, here are the settings I used.


Model Used: Cyberrealistic_v32

Negative Embedding: CyberRealistic Negative

SD VAE: vae-ft-mse-840000-ema-pruned-ckpt

Sampling Method: DPM++ SDE Karras

Restore Faces: On

Sampling Steps: 40

Seed: 760702598

Clip Skip: 2

CFG Scale Grid - Blonde Girl
CFG Scale 1-30

The CFG Scale, ranging from 1 to 30, demonstrates noticeable differences at each level. However, these variations are also influenced by other factors and settings.


Does that mean you're stuck with blurry images if you want to stick to higher CFG values? Not quite! You can counterbalance this by:

  1. Increasing Sampling Steps: Adding more sampling steps typically adds more detail to the output image, but be mindful of processing times, which could increase as well.

  2. Switching Sampling Methods: Some sampling methods perform better at specific CFG and sampling steps. For instance, UniPC tends to deliver good results at a CFG as low as 3, while DPM++ SDE Karras excels at providing detailed images at CFG values greater than 7.

To squeeze the best image quality from Stable Diffusion without blowing up memory and processing times, it's essential to strike the right balance between CFG, sampling steps, and the sampling method. The XYZ Plots technique is a handy tool to neatly display all your results in an organized grid. Be sure to check out my dedicated blog post on mastering XYZ Plots, linked below.

 
 


4. Understanding Batch Sizes and Counts: Enhancing Efficiency

Short Stubby Pixar Monsters
Dialing up the Batch Sizes and Counts

When playing around with image generation, two settings can significantly impact your outcomes and the way your system runs: Batch Count and Batch Size. Understanding these two parameters can enhance your creativity and make your workflow smoother.

Batch Size and Count - Stable  Diffusion Automatic1111 UI
Batch Size & Batch Count

Batch Count - Dialing Up the Numbers


Batch Count is your companion when you want to create multiple sets of images. This setting indicates how many batches of images you want to produce. It doesn't affect the system's performance or how much VRAM it uses, making it a carefree way to up your creativity. If you set the Batch Count to, say, 5, the system will independently generate five batches of images.


Batch Size - Maximizing Performance


On the other hand, Batch Size refers to the number of images to generate in one go within a single batch. While increasing the Batch Size can significantly boost the generation performance, be mindful that it comes at the cost of higher VRAM usage. It's all about balancing between performance and your system's capabilities.


Spooky Girl with Red paint on her face
This is a spooky girl with red paint with a batch of 12
 
  • Tutorial: Should you wish to learn how to create the images showcased in the thumbnail, drop a comment below. I'd be more than happy to prepare a tutorial for you!

 

Synergy of Batch Count and Batch Size


When used together, these settings can ramp up your productivity. For instance, with a Batch Count of 5 and a Batch Size of 5, your system will generate images five times, and each time it will produce five images. It's like setting your work on an assembly line, producing a collection of unique images efficiently.
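The arithmetic behind that assembly line is simple enough to sketch in Python. The function name and the image numbering are assumptions made for illustration; the point is only that the total is Batch Count multiplied by Batch Size, while just one batch's worth of images is worked on at a time.

```python
def plan_batches(batch_count, batch_size):
    """Return one list of image numbers per batch, numbered from 1."""
    return [
        [batch * batch_size + i + 1 for i in range(batch_size)]
        for batch in range(batch_count)
    ]


plan = plan_batches(batch_count=5, batch_size=5)
total = sum(len(batch) for batch in plan)

print(plan[0])  # [1, 2, 3, 4, 5]  images generated together in the first batch
print(total)    # 25               images in total
```

If VRAM is tight, the same 25 images could be planned as 25 batches of 1, trading speed for a smaller memory footprint.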


In the end, understanding and properly utilizing Batch Size and Batch Count gives you better control over your image generation process. It's all about finding the sweet spot that matches your creative needs and respects your system's limitations. Experiment, iterate, and discover what works best for you!



5. Playing with Dimensions: Understanding Height, Width, and Their Influence


dimensional influence
Understanding Height & Width

When working with Stable Diffusion, the image dimensions, specifically the height and width, play a crucial role. They do more than just define the size of the image; they can also influence the quality and outcome of your creations, especially when working with certain fine-tuned models.


The Influence of Dimensions


Some fine-tuned models show a preference for specific image sizes. For instance, a model might perform optimally with images of 512x768 pixels. The reason behind this preference typically lies in the model's training process. If the model was trained using images of that specific size, it would likely yield the best results when used with similar dimensions.


Doubling the size may seem like a straightforward way to get higher resolution images. However, this might not always be the case. Increasing the size can occasionally produce unusual results, especially if the model wasn't trained for that particular size. When using fine-tuned models, it's always worth checking their descriptions for any specific size preferences or recommendations.


A Tool for Easy Aspect Ratio Adjustment


To make size adjustments more intuitive, Automatic1111 places a button with up and down arrows next to the Width and Height sliders. Clicking it swaps the two values, making it effortless to switch between portrait and landscape formats.


Width and Height and Portrait to Landscape Bar
Up and Down arrow switches from Portrait to Landscape

Taking It Further with Hires.fix


Before and after image of girl with fiery hair
Hires.fix can drastically increase the resolution at lower VRAM cost.

Clicking on Hires.fix opens up a wealth of options for further refining your images. It introduces you to the Upscaler, a powerful tool for increasing the resolution of your creations. Alongside the Upscaler, you'll also find options like Hires Steps and Denoising Strength.


Before getting into the options of Hires.Fix, it's important to understand what you're trying to achieve. Are you aiming for a larger image or more detail? If your answer is both, Hires.Fix is your go-to tool. Begin by selecting your base resolution using the Width and Height sliders, then click on the Hires.Fix button to reveal the options described above.


By adjusting these settings, you can influence the final resolution and detail level of your image. Remember, a higher resolution isn't always better if the model wasn't trained for it - that's where the magic of the Upscaler steps in! It allows you to enhance resolution while preserving the integrity of the image.

Hires.fix - Stable Diffusion Automatic1111
Click on Hires.fix to open up controllers for this feature
 
  • Related - For a thorough explanation about Hires.Fix and Upscaler check out this blog dedicated to these features in Automatic1111.

 

Tweaking the Denoising Strength (Hires.Fix)


Denoising Strength is your superpower when it comes to preserving the essence of your image while enhancing its resolution. Picture it as the gatekeeper of the original content during the upscaling process.


The Denoising Strength scale swings from 0 to 1. At zero, it’s like having a guard dog that never barks – your image stays untouched. Slide it all the way to 1, and you could find yourself staring at an image that's decided to go on a wild tangent. For values less than 1, it’s about striking a balance - fewer processing steps are taken, yet the core charm of your image is maintained.
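One rough way to picture the "fewer processing steps" idea is the sketch below. It assumes the number of steps actually spent reworking the image scales in proportion to the Denoising Strength. That is a simplification for illustration: the function name is hypothetical, and the web UI's exact rounding and related settings may differ.

```python
def approx_denoise_steps(sampling_steps, denoising_strength):
    """Estimate how many steps rework the image (simplified, assumed model)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising strength must be between 0 and 1")
    return round(sampling_steps * denoising_strength)


print(approx_denoise_steps(20, 0.0))  # 0   the image is left untouched
print(approx_denoise_steps(20, 0.5))  # 10  half the steps rework the image
print(approx_denoise_steps(20, 1.0))  # 20  the image is fully regenerated
```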


Taking Control with Hires.Fix


It's exciting how Hires.Fix adds a whole new dimension to your image generation. Your image gets created and upscaled in two stages, and Denoising Strength plays an important role in the second one.

Boy vs hairy monster
What's under that fur?

Why? Because it's here that Denoising Strength helps you strike the right balance between preserving the original details and allowing for a dash of creativity. It's like having an artist who knows when to stick to the script and when to let their imagination run wild.


In Hires.Fix, your canvas size, or rather your image resolution, can soar to greater heights. But, just like flying, higher isn't always better. Here's where Denoising Strength steps in, preventing unwanted quirks from sneaking in during the upscaling.


In a nutshell, it's all about using the Denoising Strength in Hires.Fix to achieve a sweet spot between maintaining details and encouraging creativity. So, get experimental, and you'll soon be crafting stunning, high-resolution images that truly resonate with your vision.


Hires steps


'Hires steps' is an option that becomes available once you enable the Hires.fix function. It sets the number of steps used for the upscaling pass. If set to zero, the upscaling pass uses the same number of steps as the original image.
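The zero-means-same rule can be written as a one-line fallback. This is a sketch of the behaviour described above; the function and parameter names are made up for illustration.

```python
def hires_pass_steps(sampling_steps, hires_steps=0):
    """Steps for the upscaling pass: 0 falls back to the original step count."""
    return hires_steps if hires_steps > 0 else sampling_steps


print(hires_pass_steps(40))      # 40  Hires steps left at 0
print(hires_pass_steps(40, 15))  # 15  Hires steps set explicitly
```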


In the Hires.fix settings, you'll also find options to adjust the final width and height of your image using the 'Resize width to' and 'Resize height to' slider bars.


Twinning or Deformity in Stable Diffusion
If you're experiencing issues like twinning or deformities, your aspect ratio may not align with the model's training.

Employing the Hires.fix function effectively can help avoid issues such as twinning and loss of composition in your upscaled images. "Twinning," in this context, refers to the unwanted duplication or multiplication of features in your creations. For instance, this might result in characters with two faces or two heads, which can be visually disruptive unless intentionally desired.


6. Sampling Method, What on Earth is it?

Pixar style monsters having a snacking session
What are Sampling Steps?

'Sampling Method' is a key control you'll find in both 'txt2img' and 'img2img' tabs of the Automatic1111 interface. It represents the algorithmic strategy your AI uses to translate a text prompt into a unique image or transform an existing image. But what does that mean in everyday language?


Consider 'Sampling Method' as your AI's navigation system. It directs how Stable Diffusion charts its course through an ocean of potential image outputs derived from your prompt. This list of names like Euler, Heun, DPM, DDIM, and others? They're different navigational routes your AI can follow during its image-generation journey.


What's cool about this is there's no one-size-fits-all 'best' setting. The optimal choice is tied to what you're aiming to achieve with your image. Some methods guide the AI towards meticulously crafting every detail, while others prompt it to quickly sketch out a concept.


And don't stress about tripping over technical jargon. We get that terms like 'probability distribution' and 'numerical methods' aren't daily chatter for most. We'll break down each of these methods in future posts, making them simple and easy to grasp.


Sampling methods in Stable Diffusion
Available Sampling Methods
 
  • Related - In Depth Explanation of all the Sampling Methods.

 


Sampling Steps: Making Sense of the Numbers


Sampling Steps is not about brute force, but precision and finesse. You see, "Sampling Steps" is a slider on the interface that controls how many iterations, or steps, the Stable Diffusion model takes to craft your artwork. It’s kind of like the number of brush strokes an artist decides to put into their painting. But this is where it gets interesting.


Contrary to what one might think, bigger isn't always better with Sampling Steps. It's a dance - a balance, if you will. Cranking up the Sampling Steps doesn't necessarily gift you a better image. It’s more like the Goldilocks principle: you want to find the number that’s ‘just right’.

XYZ Plots - Stable Diffusion - Portrait of Girls in Paint Splatter
Use XYZ Plots as a Starting Point

How do you find that 'just right' number? Well, that involves a bit of trial and error. It's all about experimenting with different values, generating images, and checking which one hits that sweet spot of detail and computational efficiency. It’s all about finding that balance between a high-quality image and not having your system work overtime.


The XYZ Plot: Experiment With Ease


There's a little trick up Automatic1111's sleeve that can help here. Tucked away in the "Scripts" dropdown menu is a tool listed as "X/Y/Z plot", which this guide calls the XYZ Plot. It renders a grid of images so you can see at a glance how each level of Sampling Steps (and other settings) affects your end result.
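Conceptually, the grid is just every combination of the values you put on each axis. The Python sketch below lays out such a grid for two axes. The axis values and the function name are assumptions chosen for illustration, and no images are generated here.

```python
def build_grid(x_values, y_values):
    """Return rows of (x, y) settings: one row per y value, one column per x value."""
    return [[(x, y) for x in x_values] for y in y_values]


steps_axis = [10, 20, 40]  # X axis: Sampling Steps (example values)
cfg_axis = [3, 7, 12]      # Y axis: CFG Scale (example values)

grid = build_grid(steps_axis, cfg_axis)
for row in grid:
    print(row)
# [(10, 3), (20, 3), (40, 3)]
# [(10, 7), (20, 7), (40, 7)]
# [(10, 12), (20, 12), (40, 12)]
```

Three values on each axis already mean nine images, so it pays to keep the lists short while you are exploring.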


To sum up, Sampling Steps in Automatic1111 is not a parameter to gloss over. It’s a crucial part of understanding how Stable Diffusion operates and how to get the best results from it. Remember, it’s not a power game – it’s a balancing act. So, put on that balancing act hat, start experimenting, and see the magic happen!

 

Related - 'Maximize Your Workflow Efficiency: Pre-Planning with Stable Diffusion's XYZ Plot Image Grid'

 

7. Restore Faces Button - How does restore faces work in Stable Diffusion?


orthographic transformation art of a man's face cracking
Face Restoration

The 'Restore Face' feature in Automatic1111 is designed to refine and enhance facial features in your images. However, it's not a one-size-fits-all solution. Depending on the fine-tuned model you're using, the 'Restore Face' feature might do more harm than good, sometimes overcorrecting and altering the original feel of your image.

Restore Face Button on and off of a  woman's portrait
Which do you prefer? I personally like the left.

While it's programmed to bring a more realistic look to faces, it can also affect the overall integrity of the image. So, it's worth testing this feature to determine if it suits your needs or not.


Sometimes You're Better Off Without It


In some instances, leaving the 'Restore Face' function off could yield better results. On the flip side, if you notice an image where a face appears incomplete or unrefined, that might be a perfect scenario to engage the 'Restore Face' feature. In the end, it's all about experimenting and finding the balance that works best for your creative process.



8. Understanding the 'Seed' in Automatic1111's UI

Cute Pixar Sprouting Seed Smiling in the Grass
Understand the Importance of "Seeds"

The concept of the 'Seed' in Automatic1111's interface might sound a little technical, but trust me, it's an integral part of your creative journey with this tool. The Seed is your key to unlocking a world of consistency and controlled randomness in image generation.


The Basics: What's a 'Seed'?


Picture this - you're a gardener, and the 'Seed' is, well, your seed! It's the starting point for the image you're about to grow. The Seed value in Automatic1111 initializes a random tensor in latent space (think of it as a virtual garden plot) that controls the content of your image.

Let's say you're trying to generate an image of a "Turtle standing on two legs". By using a specific Seed value, you can ensure the same image sprouts every time you 'plant' that Seed. This is super helpful when you're tweaking settings or prompts and want to keep the base image consistent.


Using the 'Seed'


Here's how you plant your Seed: simply enter the Seed value of the image you're trying to recreate into the Seed box. If the Seed value is -1, Automatic1111 will choose a random Seed, creating a unique image every time.

Bear in mind, though, if you're using a Seed from an image someone else has generated, you might not get an identical result - there could be other factors at play, such as LoRAs.
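The reason a Seed makes results repeatable is that it fixes the starting random noise. The sketch below shows the idea with Python's built-in random number generator. It is an analogy, not the web UI's code: the function name and the tiny four-number "noise" are assumptions for illustration.

```python
import random


def make_noise(seed, size=4):
    """Return the seed actually used and the starting noise it produces."""
    if seed == -1:
        # -1 means "pick a random seed for me".
        seed = random.randrange(2**32)
    rng = random.Random(seed)
    return seed, [rng.gauss(0.0, 1.0) for _ in range(size)]


_, first = make_noise(760702598)
_, second = make_noise(760702598)
print(first == second)  # True: same Seed, same starting noise

used_seed, _ = make_noise(-1)
print(used_seed)  # whichever seed was picked this time
```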


Additional Seed Tools: Dice, Recycle and 'Extra'

Dice, recycle and extra button
I modified my icon, so yours will look different, but similar.

Next to the Seed box, there are a few icons that you'll want to get familiar with:

  • The dice icon: This resets your Seed to -1, letting Automatic1111 pick a random Seed every time.

  • The recycle icon: It's like hitting 'refresh' - this uses the Seed from your last generated image.


For those of you who love to get into the nitty-gritty, check the 'Extra' option. This reveals the Extra Seed menu with even more options:

  1. Variation seed: This is an additional Seed you can play with.

  2. Variation strength: Here, you can control how much of your original Seed and Variation Seed you want in the mix. A setting of 0 uses your original Seed, while a setting of 1 uses the Variation Seed.

For instance, if you have two images with Seeds 1 and 3, you can blend these images by setting the Seed to 1, the Variation seed to 3, and then sliding the Variation strength between 0 and 1. It's like mixing two colours on a palette to get the perfect shade!

  3. Resize seed from width/height: This one's a gem when you're resizing your image. Without it, changing the image size drastically alters your image, even with the same Seed. By using this setting, you maintain the content of the image when resizing.
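The Variation strength slider described above can be pictured as a blend between the noise from two Seeds. The sketch below uses a plain linear mix to keep the idea visible. That is a simplifying assumption: the web UI works on real noise tensors and may interpolate them differently, and the function name and numbers are made up.

```python
def blend_noise(noise_a, noise_b, strength):
    """Mix two noise lists: 0 keeps noise_a, 1 gives noise_b (linear sketch)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("variation strength must be between 0 and 1")
    return [(1 - strength) * a + strength * b for a, b in zip(noise_a, noise_b)]


seed_noise = [0.0, 2.0]       # toy noise from the main Seed
variation_noise = [2.0, 4.0]  # toy noise from the Variation seed

print(blend_noise(seed_noise, variation_noise, 0.0))  # [0.0, 2.0]
print(blend_noise(seed_noise, variation_noise, 0.5))  # [1.0, 3.0]
print(blend_noise(seed_noise, variation_noise, 1.0))  # [2.0, 4.0]
```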

That's a wrap on the 'Seed'! Understanding and utilizing this tool is like mastering a secret sauce in your creative kitchen. So, roll up your sleeves and get cooking with Automatic1111!



9. Image to Image - img2img


Pretty Pixar Twin Girls with long hair
Use an Image to Create an Image

When it comes to image creation, 'img2img' and ControlNet are my go-to tools in Stable Diffusion.

img2img stable diffusion green ranger
img2img of Green Ranger

This feature allows you to transform an existing image (your own photo, sketch, or any picture you've uploaded) through the application of AI processes, which helps in adding a layer of artistry and creativity to your work.


The true magic of the 'img2img' tab is the control it affords. One of its core tools is the Denoising Strength slider, which allows you to fine-tune the extent of transformation applied to your image. Lower values retain more of the original image's characteristics, whereas higher values allow for more dramatic and creative transformations.


Entering the 'img2img' tab opens up a plethora of options, each offering a different path for your image's transformation journey. These include features like Sketch, Inpaint, Inpaint sketch, Inpaint upload, and Batch, making them potent tools in your AI Workflow.


When used in tandem with the ControlNet tool, the 'img2img' feature becomes incredibly powerful. It gives you a granular level of control over your creations, paving the way for boundless creativity.

 
  • This potent duo of ControlNet and Img2Img deserves a detailed exploration, so stay tuned for my comprehensive tutorial on how to use them in your AI Workflow.

 


10. Navigating the 'PNG Info' Tab

painting of an old english town on an art table
How to paint like this

Primarily, the 'PNG Info' tab within Stable Diffusion caters to a more in-depth exploration of your image files. Despite the 'PNG' in its name, this function accommodates both PNG and JPEG formats, facilitating a comprehensive understanding of your generated artwork.


This tool provides you with a treasure trove of data including prompts, negative prompts, seeds, models used, and more. With this wealth of information at your fingertips, you're more equipped to replicate the images you're examining. It's like a backstage pass, letting you peer into the creative processes that shaped the final piece.


But there's a caveat to remember: images generated using the img2img feature or with LoRA models might prove more challenging to recreate due to their unique, process-driven characteristics.

Also worth mentioning is a nifty trick: in the settings, you can switch the file format for saved images from PNG to JPEG. This saves storage space and can make saving each image a little quicker. Thus, the 'PNG Info' tab doesn't merely give you insight into image files; it's also a gateway to optimizing your creative process.

How to use it: Go to the PNG Info tab > find an image to upload > drop it in "Drop Images Here".

The information of that image should appear to the right.
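If you're wondering where that information lives, it is stored as text inside the image file itself. The sketch below reads plain text entries from raw PNG bytes using only Python's standard library. Several things here are assumptions: that the generation details sit in a text entry named "parameters", that they are stored as a simple `tEXt` chunk (other chunk types are ignored), and the function names. The demo bytes are a stripped-down stand-in, not a viewable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data):
    """Return the key/value pairs found in the tEXt chunks of raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk starts with a 4-byte length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 8 + length + 4  # header + body + 4-byte checksum
    return texts


def _chunk(chunk_type, body):
    """Build one PNG chunk (used only to create the demo bytes below)."""
    checksum = struct.pack(">I", zlib.crc32(chunk_type + body))
    return struct.pack(">I", len(body)) + chunk_type + body + checksum


# Made-up example data: a prompt followed by a line of settings.
demo_png = (
    PNG_SIGNATURE
    + _chunk(b"tEXt", b"parameters\x00a baby samurai in training\nSteps: 20, Seed: 12345")
    + _chunk(b"IEND", b"")
)

info = read_png_text(demo_png)
print(info["parameters"])
```

With a real file you would pass in the result of `open(path, "rb").read()` instead of the demo bytes.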

PNG info of a Samurai kid
PNG info of a Samurai Kid


11. Extras Tab for Upscaling Your Images

Giant Titan in the clouds, looking down in the world

Primarily serving as an upscaling hub, the 'Extras' tab in Stable Diffusion brings an added level of finesse to your image creations. Think of it as the post-production suite of your image generation process. Once you've generated an image from other tabs, you can swiftly 'send to' the 'Extras' tab for a touch of upscaling magic.


Now, here's the deal: upscaling isn't a one-size-fits-all process. It requires a good understanding of the various options available and a knack for applying them effectively. But don't fret - that's a topic for another day (and another blog post).

 
  • In the future, we'll dive into how to upscale your images properly, using the 'Extras' tab to its fullest potential. Until then, remember that this tab is your key to elevate your images from good to great, all at your own pace and style. So, feel free to explore and experiment!

 


12. The Stable Diffusion Checkpoints

The Stable Diffusion Checkpoints section is your command center for choosing the Stable Diffusion Models you'll work with. Here's where you pick the fine-tuned models that suit your creative needs.


To ensure your selections are available, all downloaded checkpoint files (.ckpt or .safetensors) must be placed in the appropriate folder.


Follow this path: stable-diffusion-webui > models > Stable-diffusion. Once your files are nestled in that folder, you're good to start the WebUI.
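
If the dropdown comes up empty, a quick check from Python can confirm the files landed in the right place. This is only a sketch: `list_checkpoints` is a helper name I made up, it assumes the standard `models/Stable-diffusion` layout inside your WebUI folder, and it only looks at the top level of that folder, not at subfolders.

```python
from pathlib import Path

# File types the WebUI is expected to treat as checkpoints.
CHECKPOINT_SUFFIXES = {".ckpt", ".safetensors"}


def list_checkpoints(webui_root: str) -> list:
    """Return sorted checkpoint file names under models/Stable-diffusion."""
    model_dir = Path(webui_root) / "models" / "Stable-diffusion"
    if not model_dir.is_dir():
        return []
    return sorted(
        p.name for p in model_dir.iterdir()
        if p.is_file() and p.suffix.lower() in CHECKPOINT_SUFFIXES
    )
```

Point it at your own install, for example `list_checkpoints("stable-diffusion-webui")`, and anything it lists should also show up in the checkpoint selector after a refresh.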

Stable Diffusion Checkpoint selector in Automatic1111
Select your fine-tuned models here

Once the WebUI is up and running, it's time to choose your diffusion model. Head back to the Stable Diffusion Checkpoints section, select your desired model, and you're ready to dive into creating. It's as simple as that! Make sure to explore different models to discover the one that matches your artistic vision.



13. Including LoRa in the Mix

Train your mind young Warrior
Train Your AI Assassin

The 'Add Lora to prompt' section provides an avenue for integrating your custom-trained LoRas into your workflow. Creating and using your own LoRas, or leveraging the ones available for download at www.civitai.com, can open up new avenues for your creations.


Training your own LoRa can take your photography to a whole new level, imbuing your images with unique and personalized aesthetics.

add lora to prompt UI for automatic1111
Use this tab to add your LoRa prompt

Accessing the 'Add Lora to prompt' Tab

This particular tab only becomes visible when you activate it.

To do this, navigate to Settings > User Interface > Quicksettings list and add 'sd_lora'. Then click 'Apply settings' and reload the UI.

Once activated, you'll see the 'Add Lora to prompt' dropdown at the top of the page.
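
Under the hood, a LoRa is referenced in a prompt with a tag of the form `<lora:name:weight>`, and as far as I can tell the dropdown simply adds such a tag for you. The sketch below mimics that by hand; `add_lora_tag` and the LoRa name `myStyle` are hypothetical, purely for illustration.

```python
def add_lora_tag(prompt: str, lora_name: str, weight: float = 1.0) -> str:
    """Append a <lora:name:weight> tag unless that LoRa is already present."""
    if f"<lora:{lora_name}:" in prompt:
        return prompt
    tag = f"<lora:{lora_name}:{weight:g}>"
    return f"{prompt} {tag}".strip()


print(add_lora_tag("a baby samurai in training", "myStyle", 0.8))
# a baby samurai in training <lora:myStyle:0.8>
```

Lower weights make the LoRa's influence subtler, so it's worth trying a few values rather than leaving it at 1.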

 
  • Keep an eye out for an upcoming tutorial where I'll guide you through the process of training your own LoRa, transforming your photography game like never before. Stay tuned!

 


14. Enhancing Visuals with SD VAE

Magical Forrest struck by lightning
Enhance Diffusion with SD VAE

The Stable Diffusion Variational AutoEncoder, or SD VAE, is a crucial piece in Stable Diffusion. It translates between ordinary pixels and the 'latent space', a compressed format in which the diffusion model actually does its work: the encoder squeezes an image down into latents, and the decoder turns the finished latents back into the picture you see.


The SD VAE's impact is significant, particularly with fine-tuned models that don't ship with a good VAE baked in. Selecting one can noticeably improve the visual quality of generated images. Hence, while some models already include a VAE, it's worth checking which VAE is selected when working with those that don't.
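
To get a feel for how much smaller the latent space is, here is a little arithmetic sketch. It assumes the usual Stable Diffusion 1.x setup, where the VAE shrinks each side of the image by a factor of 8 and stores 4 channels per latent position; the function names are mine.

```python
def latent_shape(width: int, height: int, factor: int = 8, channels: int = 4):
    """Shape (channels, height, width) of the latent for a given image size."""
    if width % factor or height % factor:
        raise ValueError(f"width and height should be multiples of {factor}")
    return (channels, height // factor, width // factor)


def compression_ratio(width: int, height: int) -> float:
    """How many RGB pixel values there are per latent value."""
    c, h, w = latent_shape(width, height)
    return (3 * width * height) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

In other words, the decoder has to rebuild 48 pixel values from every latent value, which is why the choice of VAE shows up most in fine details.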

SD VAE for Stable Diffusion

Download VAE Here.

  1. Download the required file and place it in the stable-diffusion-webui/models/VAE/ directory.

  2. Navigate to the Settings tab in the A1111 WebUI, select Stable Diffusion from the left-side menu, click on SD VAE, and then choose 'vae-ft-mse-840000-ema-pruned'.

  3. Hit the 'Apply Settings' button and patiently wait for a successful application message.

  4. Proceed to generate your image in the usual manner, using any Stable Diffusion model in either the 'txt2img' or 'img2img' options.
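
If you run the WebUI with its API enabled (the `--api` launch flag), steps 2 and 3 can likely be done from a script as well. Treat the following as a sketch under assumptions: it targets the `/sdapi/v1/options` endpoint on the default local address, `build_set_vae_request` is my own helper name, and the file name you pass must match what your SD VAE dropdown shows, extension included.

```python
import json
import urllib.request


def build_set_vae_request(vae_filename: str, base_url: str = "http://127.0.0.1:7860"):
    """Build (but do not send) a POST that sets the sd_vae option."""
    payload = json.dumps({"sd_vae": vae_filename}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/sdapi/v1/options",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually apply it, send the request to a running WebUI, e.g.:
# urllib.request.urlopen(build_set_vae_request("<your VAE file name>"))
```

Building the request separately from sending it makes it easy to inspect what will be posted before touching your settings.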

Accessing the 'SD VAE' Tab

This particular tab only becomes visible when you activate it.

To do this, navigate to Settings > User Interface > Quicksettings list and add 'sd_vae'. Then click 'Apply settings' and reload the UI.

Once activated, you'll see the 'SD VAE' dropdown at the top of the page.



15. CLIP Skip: Fine-Tuning AI Image Generation

Fine tuning a car in the garage
Fine-tune the Beast

CLIP is a model that embeds both images and text into a common vector space, so that similar images and text end up close to each other. Stable Diffusion uses CLIP's text encoder to turn your prompt into numbers the image model can follow. Clip skip is a parameter that tells the WebUI to stop one or more layers early in that text encoder. This can be useful for getting more creative results, as the final layers can sometimes be too specific in their descriptions.


For example, say you want to generate an image of a cat. The CLIP text encoder used by Stable Diffusion 1.x has 12 layers, each one refining the description of the cat: its color, shape, size, fur, eyes, etc. With Clip skip = 1 (the default), nothing is skipped and you get the output of all 12 layers, which might give you a very realistic image of a cat, but not necessarily an interesting or unique one. With Clip skip = 2, the WebUI stops one layer early and uses the output of the 11th layer, so only the final layer is skipped. This means that you will ignore some of the finest details of the cat, such as its fur pattern or eye color, and focus more on its general features, such as its shape or size. This might give you a more abstract or creative image of a cat, but it might also be less realistic or accurate.
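
The numbering trips a lot of people up, so here is a tiny sketch of how I understand Automatic1111 to count: Clip skip = 1 takes the output of the last layer, Clip skip = 2 takes the output of the layer before it, and so on. The labels below are stand-ins for real tensors, and `select_clip_output` is a made-up name for illustration.

```python
def select_clip_output(hidden_states: list, clip_skip: int = 1):
    """Pick the text-encoder output that a given Clip skip value would use.

    hidden_states[0] is the token embedding and hidden_states[i] is the
    output of layer i, so a 12-layer encoder yields 13 entries.
    """
    num_layers = len(hidden_states) - 1
    if not 1 <= clip_skip <= num_layers:
        raise ValueError(f"clip_skip must be between 1 and {num_layers}")
    return hidden_states[-clip_skip]


# Stand-in labels instead of real tensors, for a 12-layer encoder.
states = ["embeddings"] + [f"layer {i} output" for i in range(1, 13)]
print(select_clip_output(states, clip_skip=1))  # layer 12 output
print(select_clip_output(states, clip_skip=2))  # layer 11 output
```

So a Clip skip of 2 skips just one layer, not two. Some fine-tuned models are reportedly meant to be used with a particular value, so check the model's download page if your results look off.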


Clip Skip Stable Diffusion