Category Archives: Art

Things related to the visual side.

A blog post detailing my obsessive dive into generative AI

An image that says "Seth's AI tools" next to a bad-ass skeleton that was generated by stable diffusion

Over the last few months I created a fun little toy called Seth's AI Tools (creative name, huh?). It's an open-source Unity program that has become a playground for me to test a mishmash of AI stuff.

If you click the "AI Paintball" button inside of it, you get the thing shown in the YouTube video above.

This shitty proof-of-concept game generates every character sprite immediately before it's used on-screen, based on the subject entered by the player. None of the art is included in the download. (well, a few things are, like the forest background and splat effects – although I did make them with this app too)

It's 100% local and does not use any internet functionality. (behind the scenes, it's using Stable Diffusion, GFPGAN, ESRGAN, CLIP interrogation, and DIS, among other ML/AI tech)

If I leave this running for twelve days, it will have generated and displayed over one million unique images during gameplay.

What can generative art bring to games?

Well, I figured this test would be interesting because having the AI make unlimited unique but SIMILAR images of your opponents & teammates and pop them up randomly forces your brain to constantly make judgement calls.

You can never memorize the art patterns because everything is always new content. Sounds tiring now that I think about it.

If you don’t shoot an opponent fast enough, they will hit you. If you hit a friendly, you lose points.

Random thought: It might be interesting to render a second frame where I modify the first image and force a “smile” on it or something, but the whole thing looks like a bad flash game and I got kind of bored of working on it for now.

The challenge of trying to use dynamic AI art inside of a game

It's neat to type in "purple corndog" and get a brand new picture in seconds. But as far as gamedev goes, what can you really do with a raw AI-created image on the fly?

Uhh… I guess you could…

  • Show pictures in a frame on a wall
  • Simple art for a “find the matching tiles” or a match three game
  • Background art, for gameplay or a title screen
  • Texture maps (can be tiled)

Your options are kind of limited.

To control the output better, one trick is to start with an existing image and use a mask so new data is only generated in certain parts. This gives you a lot more control; for example, you could change only someone's shirt and not touch their face.
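If you want to play with this masked-generation (inpainting) idea yourself, here's a minimal sketch using the Hugging Face diffusers library. To be clear, this isn't the code my tools actually run (they talk to my server instead), and the model name and file paths are just placeholder examples:

```python
# A minimal masked-generation (inpainting) sketch using the Hugging Face
# "diffusers" library. This is NOT the code my tools use; the model name
# and file paths below are placeholder examples.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("person.png").convert("RGB").resize((512, 512))
# White areas of the mask get regenerated, black areas are kept as-is.
mask_image = Image.open("shirt_mask.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a loud hawaiian shirt",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("person_new_shirt.png")
```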

I used this technique for my pizza screensaver test – I generated a pizza to use as a template once, then asked the AI to only fill in the middle of it (inpainting) without touching the outer crust. This is why every pizza has the same crust.

It works pretty well: since I can hardcode the alpha mask, every pizza ends up as a nice circle-shaped sprite and I don't have to worry about shapes and edges at all. (see video below)

The “pizza” button in Seth’s AI tools. Every single pizza is unique and generated on the fly.

But with a newer technique called Dichotomous Image Segmentation (DIS) that I hacked in a few days ago, I can now create an alpha-masked sprite dynamically in real time. (A sprite being an object/creature image with a transparent background)

Using DIS works much better than my earlier tests with chroma or luma keying. It can pick out someone in a green shirt in front of a green background, for example.

It’s a generally useful thing to have around, even if it isn’t perfect. (and like with everything in this field, better data from more training will improve it)
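If you'd like to try the same idea without my app, the Python rembg package ships an IS-Net model that, as far as I know, is from the same DIS family. A quick stand-in sketch (again, not my actual implementation, which lives in my server):

```python
# A rough way to try DIS-style background removal from Python: the "rembg"
# package ships an IS-Net model ("isnet-general-use"). My app runs DIS
# through its own server, so treat this as a stand-in, not my implementation.
from rembg import new_session, remove
from PIL import Image

session = new_session("isnet-general-use")   # DIS / IS-Net weights

img = Image.open("skeleton.png")
sprite = remove(img, session=session)        # RGBA image with an alpha mask
sprite.save("skeleton_sprite.png")
```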

This video shows a valid use: (I call it “removing background” in the video below, but it’s the same thing)

This shows how the “remove background” button works NOT in the game

Now moving on to the AI Paintball demo.

This isn’t a Rorschach ink blot test, it’s the starting shape I use to create all the characters in the AI Paintball test.

This image is the target of inpainting with a given text prompt; then the background is removed (by creating an alpha mask of the subject) and voilà, there's your chipmunk, skeleton, or whatever, ready to pop up from behind a bush.
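Putting the two earlier snippets together, the per-character flow is roughly this. Consider it a hedged sketch: pipe is the diffusers inpainting pipeline and session is the rembg session from above, and the template/mask filenames are invented for illustration.

```python
# A hedged sketch of the per-character flow described above, reusing the
# inpainting pipeline ("pipe") and rembg session ("session") from the earlier
# snippets. The template/mask filenames are made up for illustration.
from PIL import Image
from rembg import remove

def make_character_sprite(pipe, session, subject: str) -> Image.Image:
    template = Image.open("blob_template.png").convert("RGB").resize((512, 512))
    mask = Image.open("blob_mask.png").convert("RGB").resize((512, 512))

    # 1) Inpaint the blob area with whatever the player typed in.
    generated = pipe(prompt=subject, image=template, mask_image=mask).images[0]

    # 2) Cut the subject out so it can pop up from behind a bush.
    return remove(generated, session=session)

# sprite = make_character_sprite(pipe, session, "angry chipmunk")
```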

A note on the hardware I’m using to run this

I'm using three RTX 3090 GPUs, which is how I can generate roughly an image per second. This means simply playing this game or running the pizza screensaver uses 1000+ watts of power on my system.

In other words, it’s the worst, most inefficient screen saver ever created and you should never use it as one.

If you only have one GPU, the game/pizza demo will look much emptier as it will be slower to make images. (this could be worked around by re-using images, but this kind of thing isn't really for mass consumption anyway so I didn't worry about it)

Oh, want to run my AI Tools server + app on your own computer?

Well, it's a bit convoluted, so this is only for the dedicated AI lovers who have decent graphics cards.

My app requires that you also install a special server; this allows the two pieces to be updated separately and lets me offload the documentation on installing the server to others. (it can be tricky…)

There are instructions here, or google “automatic1111 webui setup tutorial for windows” and replace where they mention https://github.com/AUTOMATIC1111/stable-diffusion-webui with https://github.com/SethRobinson/aitools_server instead.

The setup is basically the same because my customized server *is* that one, just with a few extra features added, along with care taken to ensure it hasn't broken compatibility with my tools.

The dangers of letting the player choose the game subject dynamically

The greatest strength and weakness of something like this is that the player enters their own description and can shoot at anything or anyone they want.

A shirtless Mario, something I created as an, uh, example of what you shouldn’t do. Unless that’s your thing, I mean, nobody is going to know.

Unfortunately, stable diffusion weight data reflects the biases and stereotypes of the internet in general because, well, that’s what it’s trained on. Turns out the web has become quite the cesspool.

Tim Berners-Lee would be rolling in his… oh, he's still alive actually, which really underscores how quickly everything has changed.

The pitfalls are many: for example, if someone chooses the opponent “terrorist”, you can guess what ethnicity the AI is going to choose.

Entering the names of well-known politicians and celebrities works too – there is no end of ways to create something offensive to someone with just a few keystrokes.

Despite this being a silly little tech demo nobody will see, I almost changed the name to "Cupid's Arrows", where you shoot hearts or something, in an effort to side-step the 'violence against X' issue, but that seemed a bit too… I don't know, condescending and obvious.

So I went with a paintball theme as a compromise; at least nobody is virtually dying now.

The legality of AI and the future

Well, this is my blog so I might as well put down some random thoughts about this too.

AI image generation is currently in the hot seat for being able to mimic popular artists’ style and create copyrighted or obscene material easier than ever before. (or for a good time, try both at once)

The stable diffusion data (called the weights) is around 4 GB, or 4,294,967,296 bytes. ALL images are created using only this data. It's reportedly trained on 2.3 billion images scraped from around the internet.

Assuming that's true, 4,294,967,296 bytes divided by 2.3 billion works out to less than two bytes per image on average. *

Two bytes is enough space to store a single number between 0 and 65,535. How can all this be possible with only one number per image?! Well, it's simple: it's merely computing possibilities in noise space that are tied to tokens, which are tied to words, and… uh… it's all very mathy. Fine, I don't really get it either.
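For the curious, the back-of-the-envelope math:

```python
# Back-of-the-envelope check of the bytes-per-training-image claim above.
weights_bytes = 4_294_967_296           # ~4 GB of weight data
training_images = 2_300_000_000         # reported size of the training set
print(weights_bytes / training_images)  # ~1.87 bytes per image
```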

This data (and code to use it) was released to the public for free and is responsible for much of the explosion we’re seeing now.

Our copyright system has never had to deal with concepts like "AI training". How would it ever be feasible to get permission to use 2.3 billion images, and is it really necessary if it results in only a few bytes of data for each?

I'm hoping legally we end up with an opt-out system instead of requiring permission for all training, because keep this in mind: if you want to remove someone from a picture or upscale it, the AI will do the best job if it's been trained on similar data. Using crippled data sets will make things less useful across the board.

To remove the birdy, the AI has to understand faces to fill in the missing parts.

Copyright as it applies to AI needs to evolve as fast as the technology, but that's unlikely to happen. We have to find a balance that protects IP without hamstringing humanity's ability to use and create the most amazing thing since mp3s.

Image generation has gotten a lot of attention because, well, it's visual. But the AI evolution/revolution happening now is also going to make your phone understand what you're saying better than any human and help provide assistance to hurricane victims.

Any rules on what can and can’t be used for training will have implications far beyond picture tools.

* it's a bit more complicated, as some images are trained at a higher resolution, a celebrity's face or a popular artist's work may appear in thousands of images, etc.

Uh, anyway

So that’s what I’ve been playing with the last few months. Also doing stuff with GPT-3 and text generation in general (Kobold-AI is a good place to start there).

Like any powerful tool, AI can be used for good or evil, but I think it’s amazing that an art pleb like me can now make a nice apple.

It's still early; improvements are happening at an amazing pace, and it's going to get easier to use and install on every kind of device – but a warning:

How to create Simon Belmont with DALL·E 2

Simon Belmont as he appears in Castlevania: Grimoire of Souls (Src: Wikipedia)

This morning OpenAI changed the rules – we can share pictures with faces now! To celebrate, I figured I'd have DALL·E create a real-life photo of Castlevania hero Simon Belmont. He should look something like the above picture, right?

I’ll just enter the name and the style of photo I want and with the magic of AI we get…

“Simon Belmont , Professional Photograph in the studio, perfect lighting, bokeh”

…some bikers and Neo wannabes. DALL·E has been programmed to ignore (?) famous people, and I guess that extends to fictional characters as well. I had poor results with Mickey Mouse and Shrek too.

It will never closely duplicate a celebrity's face, or anybody's face for that matter; it will only output heavily "mixed" results. (this is a legal/ethical choice rather than a technological limitation, I believe)

So the secret is to forget the name and craft a worthy sentence to describe the target in textual detail. Actually, I get slightly better results including the name so I’ll keep that too.

As a representative of lazy people everywhere, I'll use OpenAI's GPT-3 DaVinci to create the description for me. (Their text AI tools have no qualms about referencing famous people or anything else)
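I did this through OpenAI's web tools, but for reference, the same two-step flow looks roughly like this with their Python library (the 0.x-era API; the exact model names are placeholders, so treat this as a sketch rather than gospel):

```python
# A sketch of the two-step flow: GPT-3 writes the description, then the
# description is fed into the image endpoint. Model names and the API key
# are placeholders -- use whatever variants you have access to.
import openai

openai.api_key = "sk-..."  # your API key

# Step 1: let GPT-3 write the physical description for us.
desc = openai.Completion.create(
    model="text-davinci-002",
    prompt="Describe the physical appearance of Simon Belmont from Castlevania.",
    max_tokens=150,
)["choices"][0]["text"].strip()

# Step 2: feed that description (plus the style tail) into the image model.
result = openai.Image.create(
    prompt=desc + ", Professional Photograph in the studio, perfect lighting, bokeh",
    n=1,
    size="1024x1024",
)
print(result["data"][0]["url"])
```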

Perfect. Now we feed the AI created description into DALL·E and get…

“Simon Belmont is a tall and muscular man with long, flowing blond hair. He has piercing blue eyes and a chiseled jawline. He typically wears a red tunic with a white undershirt, brown trousers, and black boots. He also wears a red cape and golden cross around his neck, Professional Photograph in the studio, perfect lighting, bokeh”

Well, much closer. You know, we should have added a whip.

The quality stands up pretty well at full resolution too:

What a hero! We may have found the box art for Dink Smallwood 2… ! Or a romance novel. Oh, wait, we can’t use any of this generated stuff commercially yet, too bad.

Add an eye patch for Goro Majima Belmont

Conclusion

Being a skilled writer (unlike the person typing) will probably result in better images. All those pages of boring descriptive prose in The Hobbit would create masterpieces!

I’ve been dabbling with creating creature sprites/concept art to fit existing games (Like Dink Smallwood) but inpainting techniques have not been producing good results yet. Still learning and playing with things.

A 3D printed World Lock

I’ve been eyeing 3D printers for quite a while and finally pulled the trigger and purchased the rather economical Flashforge Dreamer.

I actually wanted a Taz 5, but I couldn't find anywhere that would ship it to Japan at a reasonable price, so whatever.

Anyway, despite feeling a bit limited due to the smallish build area it’s been a lot of fun.


Printing an elephant


The finished elephant. The legs move with no assembly as they are printed that way!

Was up and printing in thirty minutes. So far stuff has worked without hairspray, glue sticks, painter's tape, and the other things I read about that scared me.

Some Dreamer tips:

  • The SD card shows as "error" in the Dreamer when you have Wifi enabled. (as of the latest firmware available on 4/22/2015) I think it has something to do with Wifi mode taking over the SD card, as it uses the SD card as a cache while printing. They should really change the message to "busy" or something. If you need to print from SD, turn off Wifi first.
  • I almost always print with a fairly hot build plate. (65C)  I let it cool before removing the print.
  • DO NOT USE the putty knife it comes with. It's way too thick. Buy a much thinner-edged one at the store and you will be amazed at how much more easily your prints come off the bed!
  • If a print is going to fail horribly, it’s probably going to be within the first 5 minutes, so check around then.


No really, that’s exactly what I was going for

Thingiverse seems like "the place" to get 3D files. Any other good places out there?

If you have a Dreamer, the first thing you notice is that those spools of filament you bought from Adafruit are too big. No problem! I used this design and a skateboard bearing to create a nice lazy-susan-style spool holder, and it works great.

Also printed a solder spool holder just because.


A printed solder holder. See, I’m saving money already


A 3D printed Pi2+PiTFT case for my Growtopia monitor so it looks less like a bomb; found it here.

Ok, that’s all fine and dandy, but the real reason I wanted a 3D printer is to make my own stuff.

I used Inkscape (sort of the Blender of vector art) to generate a shape from the 2D bitmap of the Growtopia logo, imported that into Blender, and extruded it. Well, as I expected, it's a bit hard to see and crap in general. I printed another in black filament to sort of use as a "drop shadow", which helped a bit.


Can you recognize this logo? Er.. maybe if I raised parts to make the letters stand out more, I dunno

It was suggested on Twitter to print Growtopia characters, but man, that’s hard to do. Akiko whipped up a 3D model of a world lock for me though.


Have a 3D printer and want to print your own World Lock?  You can get the .stl from here!

What other simple Growtopia things would make sense to print?  Hrm.  Is a character really possible?  What if we painted it…

A tip about STL scale when going from Blender to Flashprint/Simplify3D

In Blender, I set the units to metric, then set the scene scale to 0.01.  When doing the final export I set the STL export scale to 1000 and this keeps the measurements in Blender exactly matching the final print size.  (Use the Ruler/Protractor tool in Blender to measure pieces easily)
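For anyone who prefers doing it from Blender's scripting tab, here's the same setup as a small Python sketch (assuming the bundled STL exporter add-on is enabled; the output path is a placeholder):

```python
# The same settings as a quick Blender Python snippet (run from Blender's
# scripting tab). Assumes the bundled STL exporter add-on is enabled; the
# output path is a placeholder.
import bpy

scene = bpy.context.scene
scene.unit_settings.system = 'METRIC'
scene.unit_settings.scale_length = 0.01   # scene scale of 0.01

# Export with the matching 1000x scale so measurements in Blender equal
# the final print size.
bpy.ops.export_mesh.stl(filepath="/tmp/part.stl", global_scale=1000.0)
```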


Also, keep in mind Blender now has some helpful options to check whether your models are set up right for 3D printing; you just need to enable them. Too bad the export STL button on the 3D Print menu doesn't have a scale setting – I need that.

Conclusion

I bought a 9-pin dot matrix printer for $220 for my Commodore 128 a looong time ago. You'd laugh at the low-resolution pictures I downloaded off QuantumLink and printed. You had to stand across the room to figure out which movie star it was. So noisy. So slow you could read faster than it could print!

Similar to the path 2D printing has taken, I believe 3D printing is now accelerating its journey towards detailed full-color prints that will become a standard we all take for granted in just a few years. Exciting times.

Ludumdare 20 – Who’s in? #LD48

It’s that time again.

Three times a year hundreds of masochistic geeks from around the planet push themselves to the limit by creating the best game they can (individually) in only two days. The winning theme you must base your game around is announced as it begins.

Check out the keynote (done by Sos this year, nice job!) and get more info at Ludum Dare.

As for me…

I don't think I'm going to be able to devote enough time to make anything this year, but at the very least I'll be hanging out in IRC cheering the brave on as usual.

My unhelpful guide for first-timers is here.

Warm up your compilers, gas up your image editor, and change the tires on your music program because it starts in 23 hours.

Dev Journal: Tank combat meets Mario Kart?

New game project!

So I’m sort of working on a new cross-platform game. The basic idea is “local splitscreen/networked multiplayer tank combat with easy touch control that’s fun for me and my kid”.

Basic movement. Ugly as hell but hey, four players!

I spent a lot of time getting really flexible split-screen support in. I can add as many local players as I want. In addition to specifying the window size, I can specify their rotation. Touch controls smartly adjust.

A real physics engine?

The irrBullet hello example running on an iPad. Man, what is with that floor texture!

Hey, how about real physics for the tank movement? Let's integrate Bullet (with some wrapper help from irrBullet).

I had a feeling I wasn’t going to end up using it because of speed and networking issues, so this was mostly a for fun side diversion and practice.

Ran irrBullet's Hello World example on the iPad and the Nexus One (Android). Decent speeds, especially on the iPad.

Plugged it in for the tank physics. The only way I could get reasonable tank-like movement was to use eight “raycast wheels” per tank.

It turned out quite computationally expensive and I could see it was going to take approximately four hundred years of tweaking to get player controls to feel “right”. Screw this.

So I dumped Bullet and decided to just do my usual homegrown cheapo physics. Not as good, but easier to tweak and runs fast. Looking forward to using Bullet in the future for something though.

Making a test level with lightmaps

A crappy test level is created in Max. I use "Point helpers" to mark the position and rotation of spawn points. They're easier to see in Max than dummies.

I use Max's "render lightmaps to a texture in an intelligent way and apply the texturing to the second map channel automatically" feature.

Real tank models and basic combat

Tanks can now smoke and blow each other up.

More progress:

  • Tank models licensed from Mighty Vertex
  • Functional health bar
  • Tank shadows
  • Reaction physics when shooting/being shot
  • Crash sound effect when hitting another tank.

Next up: Turret movement…