This is cool! Thanks for sharing. I've played around with some random Colab notebooks that have surfaced on HN but the results have been underwhelming compared with some of the polished AI Art I've seen in the wild. Some questions that popped into my head:
What's your setup (Cloud/Colab/Custom hardware)? Did you borrow the code in its entirety or is there a secret sauce? How long did it take you to fiddle around with the hyperparameters until you were happy with the results? How many iterations did you settle on before stopping? Thanks!
re: secret sauce, I tend to be in the 'sharing is caring' camp. The code for this was based on the popular notebook by @RiversHaveWings (VQGAN+CLIP) although I've edited it back and forth a few times.
I usually run for a few hundred iterations (e.g. 250).
EDIT: here's a Google Colab that replicates my plant generation: https://colab.research.google.com/drive/1b1UfblpdhPJ7f1WRjfC...
My guess is that you've solved this already, but I run it in a loop by replacing do_loop() at the bottom of the last cell with the following:
    import random

    # prompt/modifiers feed do_run(), which is defined earlier in the notebook
    # (as is tqdm), so nothing beyond random needs importing here.
    prompt = 'crystalline alien spacecraft on a flower'
    modifiers = ['', ' macro photography', ' with hyperrealistic 35mm depth of field']
    original_prompt = prompt

    while True:
        for modifier in modifiers:
            prompt = original_prompt + modifier
            seed = random.randrange(1000)
            tqdm.write(f'Prompt {prompt}, Seed {seed}')
            do_run()
Dang it! I knew this existed but I couldn't find it for some reason. Thank you!!! This has been a wonderful journey...my poor kids have been getting spammed with my latest incantations for the past week lol.
I keep trying to think of ways to make this more consumable for the less technically inclined, but GPUs are just not cheap. I was on a V100 for probably 36 hours through Colab via my $9 monthly subscription; the same amount of time on an AWS p3 instance would run about $100. So then I'd need to figure out a tip jar or something to keep the gas tank full.
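For anyone curious, here's the rough math behind that comparison (both rates are my assumptions; on-demand p3 pricing varies by region and over time):

    # Back-of-envelope GPU cost comparison; the rates below are assumptions, not quotes.
    hours = 36
    colab_pro_monthly = 9.99   # flat monthly subscription, roughly
    aws_p3_hourly = 3.06       # approx. on-demand p3.2xlarge (single V100)

    print(f"Colab Pro: ${colab_pro_monthly:.2f} flat for the month")
    print(f"AWS p3:    ${hours * aws_p3_hourly:.2f} for {hours} hours")  # ~$110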
Had a play the other day - I love it! But it's quite slow, so for most of my experiments (which involve video and motion) I haven't yet found a way to incorporate it into my workflows without slowing things down considerably.
Thanks for sharing though, the work she's doing is so great!
Not dumb at all. This code is at the bottom of the notebook cell that actually generates the image. By default it's a one-shot deal; this loop lets me continuously generate images with some perturbations in the input to build in some variety (the seed value, plus some prompt 'modifiers' that tend to have a style-transfer effect).
I'm curious: is there an advantage to doing style transfer implicitly by adding text over doing it explicitly by providing a target image to copy the style from?
Thanks for taking the time to reply to my Qs; all useful insights, and much appreciated.
Mostly I'm excited about the direction AI art is going and the velocity of innovation, but partly I'm anxious about what the eventual implications will be for human artists.
Packaging is meh, but you can use my repo if you want to generate your images using the CLI, a Discord bot, or an IRC bot with your own hardware: https://github.com/luc-leonard/clip_generators. It includes VQGAN+CLIP and guided-diffusion models.
It can be hit-and-miss. With a bit of trial and error you start to see which prompts work well to generate pleasing images. Some people have done excellent work exploring ways to change the look, like https://imgur.com/a/SnSIQRu
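If it helps, here's a sketch of what I mean by varying the look with appended modifier strings (these particular strings are just illustrative, not taken from that album):

    # Purely illustrative: append stylistic modifiers to a base prompt to change the look.
    base_prompt = 'a greenhouse full of alien plants'
    modifiers = ['', ' in the style of an oil painting', ' macro photography', ' trending on ArtStation']
    prompts = [base_prompt + m for m in modifiers]  # try each and compare the results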
As for hardware, a lot of my experimenting is done on Google Colab. For this plant generation stuff I rented a GPU via vast.ai for ~$0.20 an hour and set it running overnight.