
Experiment

VQGAN+CLIP Keyword Modifier Comparison

We compared 126 keyword modifiers with the same prompt and initial image. These are the results.

August 2022 update: Community member tdraw_ai_art has recreated this study using the Stable Diffusion algorithm.
Base image (upsized). 400 iterations of "Scary skeleton astronaut in space".

Method

The Experiment Explained

We started by running the prompt "Scary skeleton astronaut in space" for 400 iterations at thumb resolution (400x400px). That gave us this base image.

Then, we evolved that creation 126 times, each time adding a different keyword modifier, and running for an additional 400 iterations. "Evolving" a creation uses the previous creation's output as the start image for the next, so every experiment started from the base image (i.e. NOT from scratch).
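The procedure above can be sketched as a small job plan. This is a minimal sketch, not NightCafe's implementation: `EvolveJob`, `plan_experiment` and the file name `base.png` are hypothetical names, appending the modifier after a single space is an assumption, and the three modifiers listed stand in for the full set of 126.

```python
from dataclasses import dataclass

BASE_PROMPT = "Scary skeleton astronaut in space"
ITERATIONS = 400          # iterations per run, as in the experiment
RESOLUTION = (400, 400)   # "thumb" resolution in pixels


@dataclass(frozen=True)
class EvolveJob:
    """One evolution run (hypothetical structure, for illustration only)."""
    prompt: str
    start_image: str   # every job starts from the same base image
    iterations: int
    resolution: tuple


def plan_experiment(modifiers, base_image="base.png"):
    """Build one job per modifier, appending the modifier to the base prompt."""
    return [
        EvolveJob(
            prompt=f"{BASE_PROMPT} {modifier}",
            start_image=base_image,
            iterations=ITERATIONS,
            resolution=RESOLUTION,
        )
        for modifier in modifiers
    ]


# Three of the modifiers from the results; the study used 126.
jobs = plan_experiment(["Unreal Engine", "charcoal drawing", "watercolor"])
for job in jobs:
    print(job.prompt, "<-", job.start_image)
```

Because every job points at the same start image, differences between results can be attributed to the modifier rather than to a different starting point.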

This experiment was inspired by this fantastic album by Reddit user u/kingdomakrillic.

Want to try your own modifiers on this base image? Click "Evolve It" below then add your modifier to the prompt.

Evolve It


VQGAN+CLIP Keyword Modifiers
Clockwise from top left: "A dog on the beach", "A dog on the beach Thomas Kinkade", "A dog on the beach Unreal Engine", "A dog on the beach detailed painting".

How do VQGAN+CLIP Modifiers Work?

Taken from our VQGAN+CLIP tutorial on Medium.

Modifiers are just keywords that have been found to have a strong influence on how the AI interprets your prompt. In most cases, using one or more modifiers in your prompt will dramatically improve the resulting image. Here’s an example using the text prompt “A dog on the beach”. The top left image (without any modifiers) is noticeably worse than the others.

So why do modifiers have such a dramatic effect? It comes down to the data that the CLIP network was trained on: millions of image and caption pairs from the internet. CLIP has seen a huge number of images on the internet, and the ones that include the words “Thomas Kinkade” in the caption tend to be nicely textured paintings like the “Thomas Kinkade” example above. Likewise, the images that were paired with a caption containing the words “Unreal Engine” tend to look like scenes from a video game (because Unreal Engine is a video game rendering engine).

Thus, when you include modifiers like “Thomas Kinkade” or “Unreal Engine”, CLIP knows that the image should look a certain way. Note that in the examples above, it’s not so much the shapes that improve with modifiers as the finer textures.
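To make this concrete: in a VQGAN+CLIP setup, the image is repeatedly adjusted so that CLIP's embedding of the image scores as more similar to CLIP's embedding of the prompt. The sketch below shows only that scoring idea, using cosine similarity on tiny made-up vectors. The three-dimensional vectors and the meaning assigned to each axis are invented for illustration; real CLIP embeddings have hundreds of dimensions and come from the trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-d "embeddings"; axes loosely mean (dog, beach, painterly texture).
# These numbers are NOT real CLIP outputs.
text_plain = [1.0, 1.0, 0.0]        # "A dog on the beach"
text_modified = [1.0, 1.0, 1.0]     # "A dog on the beach Thomas Kinkade"

image_flat = [0.9, 0.9, 0.1]        # right content, little texture
image_painterly = [0.9, 0.9, 0.9]   # right content, rich texture

for name, text in [("plain", text_plain), ("modified", text_modified)]:
    flat = cosine_similarity(text, image_flat)
    painterly = cosine_similarity(text, image_painterly)
    print(f"{name}: flat={flat:.3f} painterly={painterly:.3f}")
```

In this toy setup the plain prompt scores the flat image higher, while the prompt with the modifier scores the painterly image higher, so an optimiser following that score would be pulled towards richer texture.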

A few of our favourites
Clockwise from top left: "Unreal Engine", "VRay", "SketchFab", "CryEngine".

Observations

A few key takeaways

3D rendering engines as modifiers

The modifiers that are 3D rendering engines (pictured: Unreal Engine, CryEngine, VRay, SketchFab) really shine here. Interestingly, the rendering engines targeted at games (Unreal Engine, CryEngine) both ended up with a spaceship interior in the background.

Some didn't affect the base image much

Some modifiers like "futuristic", "mystical", "dream" and a few others didn't end up deviating far from the base image. Perhaps this means that CLIP doesn't have a strong concept of what these keywords should look like? Or maybe these modifiers are a bit too broad and therefore hard for CLIP to steer towards any particular look? It would be interesting to do more experiments with these modifiers to get a better idea of what's going on.
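One way such follow-up experiments could quantify "deviating far from the base image" is a per-pixel difference score. This is a sketch under assumptions: the comparison on this page was made by eye, the function name `mean_abs_diff` is hypothetical, and images are represented as plain nested lists of grayscale values (0 to 255) rather than loaded from files.

```python
def mean_abs_diff(base, evolved):
    """Average absolute per-pixel difference between two same-sized grayscale images."""
    same_rows = len(base) == len(evolved)
    if not same_rows or any(len(r1) != len(r2) for r1, r2 in zip(base, evolved)):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_base, row_evolved in zip(base, evolved):
        for p, q in zip(row_base, row_evolved):
            total += abs(p - q)
            count += 1
    return total / count


# Tiny 2x2 stand-ins for real images (values invented for illustration).
base = [[10, 20], [30, 40]]
subtle = [[12, 18], [30, 41]]      # e.g. a modifier that barely changed the image
strong = [[200, 5], [90, 250]]     # e.g. a modifier that changed it a lot

print(mean_abs_diff(base, subtle))
print(mean_abs_diff(base, strong))
```

A raw pixel difference is a crude measure, since it ignores composition and texture, but it would be enough to rank modifiers by how much they moved the image away from the base.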

Results

Scroll through the results and vote for your favourites by liking them.

  • matte painting
  • charcoal drawing
  • VRay
  • trending on Artstation
  • Unreal Engine
  • watercolor
  • Kandinsky
  • isometric
  • ZBrush central
  • ink drawing
  • 35mm

Copyright © NightCafe Studio