
A blog post detailing my obsessive dive into generative AI

An image that says "Seth's AI tools" next to a bad-ass skeleton that was generated by stable diffusion

Over the last few months I created a fun little toy called Seth’s AI Tools (creative name, huh?). It’s an open source Unity program that has become a playground for me to test a mishmash of AI stuff.

If you click the “AI Paintball” button inside of it, you get the thing shown in the YouTube video above.

This shitty game proof of concept generates every character image sprite immediately before it’s used on-screen based on the subject entered by the player. None of the art is included in the download. (well, a few things are, like the forest background and splat effects – although I did make them with this app too)

It’s 100% local and does not use any internet functionality. (behind the scenes, it’s using Stable Diffusion, GFPGAN, ESRGAN, CLIP interrogation, and DIS, among other ML/AI tech)

If I leave this running for twelve days, it will have generated and displayed over one million unique images during gameplay.
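A quick sanity check on that number, as a Python sketch. It assumes a steady rate of about one image per second; the real rate depends on the hardware and will wobble.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: roughly one generated image per second, around the clock.
images_per_second = 1
days = 12

total_images = images_per_second * SECONDS_PER_DAY * days
print(f"{total_images:,} images in {days} days")
```

At that rate the million-image mark is crossed partway through day twelve.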

What can generative art bring to games?

Well, I figured this test would be interesting because having AI make unlimited unique but SIMILAR images of your opponent & teammates and popping them up randomly forces your brain to constantly make judgement calls.

You can never memorize the art patterns because everything is always new content. Sounds tiring now that I think about it.

If you don’t shoot an opponent fast enough, they will hit you. If you hit a friendly, you lose points.

Random thought: It might be interesting to render a second frame where I modify the first image and force a “smile” on it or something, but the whole thing looks like a bad flash game and I got kind of bored of working on it for now.

The challenge of trying to use dynamic AI art inside of a game

It’s neat to type in “purple corndog” and get a brand new picture in seconds. But as far as gamedev goes, what can you really do with a raw AI created image on the-fly?

Uhh… I guess you could…

  • Show pictures in a frame on a wall
  • Simple art for a “find the matching tiles” or a match three game
  • Background art, for gameplay or a title screen
  • Texture maps (can be tiled)

Your options are kind of limited.

To control the output better, one trick is to start with an existing image and use a mask to only generate new data in certain parts. This gives you a lot more control: for example, you could change only someone’s shirt and not touch their face.

I used this technique for my pizza screensaver test – I generated a pizza to use as a template once, then asked the AI to only fill in the middle of it (inpainting) without touching the outer crust. This is why every pizza has the same crust.

It works pretty well because I can hardcode the alpha mask, so it’s a nice circle-shaped sprite and I don’t have to worry about shapes and edges at all. (see video below)

The “pizza” button in Seth’s AI tools. Every single pizza is unique and generated on the fly.
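The masking idea is easy to sketch in a few lines of Python. To be clear, this is not how Stable Diffusion inpaints internally and it is not code from the app. It only illustrates the end effect of a hardcoded circular mask, using made-up grayscale values and hypothetical helper names (`circle_mask`, `composite`): pixels inside the circle come from the newly generated image, pixels outside stay as the template.

```python
import math


def circle_mask(size, inner_radius):
    """Return a size x size grid: 1.0 inside the circle (regenerate), 0.0 outside (keep)."""
    c = (size - 1) / 2
    return [
        [1.0 if math.hypot(x - c, y - c) <= inner_radius else 0.0 for x in range(size)]
        for y in range(size)
    ]


def composite(template, generated, mask):
    """Per pixel: take `generated` where the mask is 1, `template` where it is 0."""
    return [
        [m * g + (1 - m) * t for t, g, m in zip(trow, grow, mrow)]
        for trow, grow, mrow in zip(template, generated, mask)
    ]


size = 8
template = [[10] * size for _ in range(size)]    # stand-in for the fixed crust image
generated = [[200] * size for _ in range(size)]  # stand-in for freshly generated pixels
mask = circle_mask(size, inner_radius=2.5)
result = composite(template, generated, mask)

print(result[0][0], result[3][3])  # corner pixel vs. a pixel near the centre
```

The corner keeps the template value and the centre takes the generated one, which is the same reason every pizza ends up with the same crust.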

But with a newer technique called Dichotomous Image Segmentation (DIS), which I hacked in a few days ago, I can now create an alpha-masked sprite dynamically in real time. (A sprite being an object/creature image with a transparent background)

Using DIS works much better than other tests I did trying to use chroma or luma keying. It can pick up someone in a green shirt in front of a green background, for example.
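To see why keying struggles with that case, here is a deliberately naive chroma key in Python. It is an illustration only, not the keying code from those tests; the threshold and the pixel colors are invented.

```python
def chroma_key_alpha(pixel, threshold=60):
    """Naive green-screen key: fully transparent (0) when green dominates red and blue."""
    r, g, b = pixel
    return 0 if g - max(r, b) > threshold else 255


# Hypothetical sample colors (R, G, B).
samples = {
    "background": (30, 200, 40),   # green backdrop
    "green_shirt": (35, 190, 50),  # part of the subject, but also green
    "skin": (220, 180, 150),
}

alphas = {name: chroma_key_alpha(pixel) for name, pixel in samples.items()}
print(alphas)
```

The key only looks at color, so the shirt is erased along with the backdrop. A segmentation model decides per pixel whether it belongs to the subject, which is why it can keep the shirt.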

It’s a generally useful thing to have around, even if it isn’t perfect. (and like with everything in this field, better data from more training will improve it)

This video shows a valid use: (I call it “removing background” in the video below, but it’s the same thing)

This shows how the “remove background” button works outside of the game

Now moving on to the AI Paintball demo.

This isn’t a Rorschach ink blot test, it’s the starting shape I use to create all the characters in the AI Paintball test.

This image is the target of inpainting with a given text prompt. Then the background is removed (by creating an alpha mask of the subject) and voilà, there’s your chipmunk, skeleton, or whatever, ready to pop up from behind a bush.

A note on the hardware I’m using to run this

I’m using three RTX 3090 GPUs, which is how I can generate an image per second or so. This means simply playing this game or using the pizza screen saver uses 1000+ watts of power on my system.

In other words, it’s the worst, most inefficient screen saver ever created and you should never use it as one.

If you only have one GPU, the game/pizza demo will look much emptier as it will be slower to make images. (this could be worked around by re-using images but this kind of thing isn’t really for mass consumption anyway so I didn’t worry about it)

Oh, want to run my AI Tools server + app on your own computer?

Well, it’s a bit convoluted so this is only for the dedicated AI lovers who have decent graphic cards.

My app requires that you also install a special server. This allows the two pieces to be updated separately and lets me offload the documentation on installing the server to others. (it can be tricky…)

There are instructions here, or google “automatic1111 webui setup tutorial for windows” and replace where they mention https://github.com/AUTOMATIC1111/stable-diffusion-webui with https://github.com/SethRobinson/aitools_server instead.

The setup is basically the same because my customized server *is* that one, just with a few extra features added, as well as ensuring that it hasn’t broken compatibility with my tools.

The dangers of letting the player choose the game subject dynamically

The greatest strength and weakness of something like this is that the player enters their own description and can shoot at anything or anyone they want.

A shirtless Mario, something I created as an, uh, example of what you shouldn’t do. Unless that’s your thing, I mean, nobody is going to know.

Unfortunately, Stable Diffusion’s weight data reflects the biases and stereotypes of the internet in general because, well, that’s what it’s trained on. Turns out the web has become quite the cesspool.

Tim Berners-Lee would be rolling in his… oh, he’s still alive actually, which really underscores how quickly everything has changed.

The pitfalls are many: for example, if someone chooses the opponent “terrorist”, you can guess what ethnicity the AI is going to choose.

Entering the names of well known politicians and celebrities works too – there is no end of ways to create something offensive to someone with just a few keystrokes.

Despite this being a silly little tech demo nobody will see, I almost changed the name to “Cupid’s Arrows”, where you shoot hearts or something, in an effort to side-step the ‘violence against X’ issue, but that seemed a bit too… I don’t know, condescending and obvious.

So I went with a paintball theme as a compromise, at least nobody is virtually dying now.

The legality of AI and the future

Well, this is my blog so I might as well put down some random thoughts about this too.

AI image generation is currently in the hot seat for being able to mimic popular artists’ style and create copyrighted or obscene material easier than ever before. (or for a good time, try both at once)

The Stable Diffusion data (called the weights) is around 4 GB, or 4,294,967,296 bytes. ALL images are created using only this data. It’s reportedly trained on 2.3 billion images from all around the internet.

Assuming that’s true, 4,294,967,296 bytes divided by 2.3 billion is a little under two bytes (about 1.87) per image on average. *

Two bytes is enough space to store a single number between 0 and 65535. How can all this be possible with only one number per image?! Well, it’s simple, it’s merely computing possibilities in noise space that are tied to tokens which are tied to words and … uh.. it’s all very mathy. Fine, I don’t really get it either.
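For anyone who wants to check the arithmetic, here it is as a few lines of Python. The inputs are just the figures quoted above (a 4 GB weights file and a reported 2.3 billion training images), so the result is only as good as those numbers.

```python
weights_bytes = 4 * 1024**3          # 4,294,967,296 bytes
training_images = 2_300_000_000      # reported figure, taken at face value

bytes_per_image = weights_bytes / training_images
two_byte_values = 2 ** 16            # how many distinct values fit in two bytes

print(f"{bytes_per_image:.2f} bytes per image on average")
print(f"two bytes can hold {two_byte_values:,} distinct values (0 to {two_byte_values - 1:,})")
```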

This data (and code to use it) was released to the public for free and is responsible for much of the explosion we’re seeing now.

Our copyright system has never had to deal with concepts like “AI training”. How would it ever be feasible to get permission to use 2.3 billion images, and is it really necessary if it results in only a few bytes of data for each?

I’m hoping legally we end up with an opt-out system instead of requiring permission for all training because, keep this in mind: If you want to remove someone from a picture or upscale it, it will do the best job if it’s been trained on similar data. Using crippled data sets will make things less useful across the board.

To remove the birdy, the AI has to understand faces to fill in the missing parts.

Copyright as it applies to AI needs to evolve as fast as the technology, but that’s unlikely to happen. We have to find the balance in protecting IP but also not at the cost of hamstringing humanity’s ability to use and create the most amazing thing since mp3s.

Image generation has gotten a lot of attention because, well, it’s visual. But the AI evolution/revolution happening is also going to make your phone understand what you’re saying better than any human and help give assistance to hurricane victims.

Any rules on what can and can’t be used for training will have implications far beyond picture tools.

* it’s a bit more complicated as some images are trained at a higher resolution, a celebrity’s face or popular artist may be in thousands of images, etc.

Uh, anyway

So that’s what I’ve been playing with the last few months. Also doing stuff with GPT-3 and text generation in general (Kobold-AI is a good place to start there).

Like any powerful tool, AI can be used for good or evil, but I think it’s amazing that an art pleb like me can now make a nice apple.

It’s still early, improvements are happening at an amazing pace and it’s going to get easier to use and install on every kind of device – but a warning:

The 100 Prisoners Problem riddle interactive web app simulation I did in Unity

So this is one of those times where I made something in a few hours and want it to be indexed on the web rather than just the ethereal world of twitter so I’m making this post about it in the hopes that people will find it with a very specific Google search. (probably some kid stealing this for his homework.. steal away, I don’t mind!)

The app looks like this. You can pan and zoom around and click buttons to control the simulation.

Play it here

Full source code of my unity project (github)

So if you’ve never heard of the 100 Prisoners Problem Riddle, it’s an amazing math trick where the solution seems to defy all logic. The way I was introduced to it was with Veritasium’s easy-to-understand video on the subject:

Still here? Fine, go check out the Monty Hall Problem then!
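If you’d rather poke at the riddle in code, here is a minimal Python sketch of the loop-following strategy that is usually given as the solution: each prisoner opens the box with their own number, then the box named on the slip inside, and so on, for at most half the boxes. This is not the Unity project’s source, and the function names are my own.

```python
import random


def prisoners_survive(n=100, rng=random):
    """One round: True if every prisoner finds their number within n // 2 boxes."""
    boxes = list(range(n))
    rng.shuffle(boxes)  # boxes[i] is the number on the slip inside box i
    for prisoner in range(n):
        box = prisoner  # start at the box with your own number
        for _ in range(n // 2):
            if boxes[box] == prisoner:
                break            # found it, next prisoner
            box = boxes[box]     # follow the slip to the next box
        else:
            return False         # ran out of tries, everyone loses
    return True


def success_rate(trials=2000, seed=1):
    """Fraction of simulated rounds in which all prisoners succeed."""
    rng = random.Random(seed)
    return sum(prisoners_survive(rng=rng) for _ in range(trials)) / trials


rate = success_rate()
print(f"all prisoners succeeded in {rate:.1%} of rounds")
```

The rate should land somewhere around 31%, compared with effectively zero if everyone just opens boxes at random.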

How to get your Unity LLAPI/WebSocket WebGL app to run under https with AutoSSL & stunnel

<continuing my “blog about whatever random issue I last dealt with in the hopes that some poor soul with the same issue will google it one day” series>

The problem

So you made your new Unity WebGL game using the LLAPI and it works fine from an http:// address.  But when you try with https, even with a valid https cert installed, you get this error:

“Uncaught SecurityError: Failed to construct ‘WebSocket’: An insecure WebSocket connection may not be initiated from a page loaded over HTTPS.”

This is your browser saying “Look, the website is https, but don’t let that fool you; it’s using a normal old web socket to send data under the hood which isn’t encrypted, so don’t trust this thing with your credit card numbers”.

Unity (at the time of this writing) has no internal support for what we really need to be using:  a Secure Web Socket.  So where http has https, ws has wss.  So how do we connect securely if our unity-based server binary can’t serve wss directly?

A little background info about CPanel & AutoSSL

Note: I’m using CentOS 7 on a dedicated server with WHM/CPanel

Setting up your website for proper SSL so it can have that wonderful green padlock used to be a painful and sometimes expensive ordeal.

But no longer!  Enter the magic of CPanel’s AutoSSL.  (I think it’s using Let’s Encrypt under the hood as a plugin?)  Behind the scenes, it will handle domain validation and set up everything for you.  While it does need to renew your cert every three months, it’s free and automatic.  Add four new domains?  They will all get valid certs within a day or so, it’s great.

We can use this same cert to make your websockets secure as long as they are hosted at the same domain.

Setting up stunnel

This is an open source utility that is likely already included on your Linux server box. If it isn’t, go install it with yum or something.

It allows you to convert any socket into a secure socket.  For example, if you have a telnet port at 1000, you could set up stunnel to listen at 1001 securely and relay all information back to 1000.

The telnet connection has no idea what’s happening and sees no difference, but as long as the outside user can only access 1001, plain text information isn’t sent along the wire and one or both sides can be sure of the identity of who’s connecting.

Depending on the stunnel settings, it might be set up like https where the client doesn’t have to have any certain keys (what we want here), or it could be like ssh where the client DOES need a whitelisted key.

A way to test an SSL port is to use OpenSSL from the command line on the host server via ssh.  For example (keep in mind 443 is the standard https port your website is probably using):

<at ssh prompt> openssl s_client -connect localhost:443

<info snipped>
subject=/OU=Domain Control Validated/OU=PositiveSSL/CN=host.toolfish.com
issuer=/C=US/ST=TX/L=Houston/O=cPanel, Inc./CN=cPanel, Inc. Certification Authority
---
No client certificate CA names sent
Peer signing digest: SHA512
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 4946 bytes and written 415 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
 Protocol : TLSv1.2
<info snipped>
Start Time: 1518495864
 Timeout : 300 (sec)
 Verify return code: 0 (ok)

Hitting enter after that will probably cause the website to return an HTML error message because we didn’t send a valid request. That’s ok, it shows your website’s existing SSL stuff is working so we can move on.

So first edit your /etc/stunnel/stunnel.conf to something like this:

pid = /etc/stunnel/stunnel.pid

#we won't screw with changing this because we don't want to relocate/change permissions on our files right now
#setuid = nobody
#setgid = nobody

sslVersion = all
options = NO_SSLv2

#for testing purposes.. these should be removed later:
output = /etc/stunnel/log.txt
foreground = yes
debug = 7

[websitename1]
accept = 29000
connect = 80
cert = /var/cpanel/ssl/apache_tls/oversi.io/combined

[websitename2]
accept = 30000
connect = 20000
cert = /var/cpanel/ssl/apache_tls/oversi.io/combined

Next, still from the ssh prompt, run stunnel by typing stunnel.

Because we have foreground=yes set above it will run it in the shell, showing us all output directly, instead of in the background like it normally would. (Ctrl-C to cause stunnel to stop and quit)

Look for any issues or errors it reports.  The .conf file I listed above shows how to set it up for two or more tunnels at once, you likely only need one of those settings.

The “websitename1” part doesn’t matter or have to match anything.
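If you want to double-check which port forwards where, a few lines of Python can list the tunnels in a config like the one above. This is only a convenience sketch: `configparser` is not a real stunnel parser, and I’m assuming it is close enough for simple files (no repeated options inside a section, no `%` characters in values). The helper name `list_tunnels` is made up.

```python
import configparser


def list_tunnels(conf_text):
    """Return {service_name: (accept, connect)} for each [service] section."""
    parser = configparser.ConfigParser()
    # stunnel allows global options before any section header; configparser
    # does not, so park them under a placeholder section.
    parser.read_string("[__global__]\n" + conf_text)
    return {
        name: (parser[name]["accept"], parser[name]["connect"])
        for name in parser.sections()
        if name != "__global__"
    }


# A trimmed copy of the example config (cert lines left out for brevity).
conf = """
pid = /etc/stunnel/stunnel.pid
sslVersion = all
options = NO_SSLv2

[websitename1]
accept = 29000
connect = 80

[websitename2]
accept = 30000
connect = 20000
"""

tunnels = list_tunnels(conf)
for name, (accept, connect) in tunnels.items():
    print(f"{name}: TLS on port {accept} -> plain port {connect}")
```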

The SSL cert is the most important setting.  You need to give it your private & public & CA info in the same file.

Now, initially, you might try to set up your keys using the files in ~/ssl/keys and ~/ssl/certs but they seem to not have everything all in one nice file including the CA certs.  I figured out ‘bundled’ ones already exist in a cpanel directory so I linked straight to them there.  (replace oversi.io with your website name)

If stuff worked, you should be able to test your SSL’ed port with OpenSSL again.  In the example above under “websitename1” I told it to listen at 29000 and send to port 80, for no good reason.

So to test from a remote computer we can do:

(you did open those ports in your firewall so outside people can connect, right?)

C:\Users\Seth>openssl s_client -connect oversi.io:29000
Loading 'screen' into random state - done
CONNECTED(00000270)
depth=2 /C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO RSA Certification Authority
verify error:num=20:unable to get local issuer certificate
verify return:0
---
Certificate chain
 0 s:/CN=oversi.io
 i:/C=US/ST=TX/L=Houston/O=cPanel, Inc./CN=cPanel, Inc. Certification Authority
 1 s:/C=US/ST=TX/L=Houston/O=cPanel, Inc./CN=cPanel, Inc. Certification Authority
 i:/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO RSA Certification Authority
 2 s:/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO RSA Certification Authority
 i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
---
Server certificate
-----BEGIN CERTIFICATE-----
<snipped>
-----END CERTIFICATE-----
subject=/CN=oversi.io
issuer=/C=US/ST=TX/L=Houston/O=cPanel, Inc./CN=cPanel, Inc. Certification Authority
---
No client certificate CA names sent
---
SSL handshake has read 5129 bytes and written 453 bytes
---
New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
 Protocol : TLSv1
<snipped>
 Key-Arg : None
 Start Time: 1518497616
 Timeout : 300 (sec)
 Verify return code: 20 (unable to get local issuer certificate)
read:errno=10093

Despite the errno=10093 and return code 20 errors, it’s working and properly sending our CA info (“cPanel, Inc. Certification Authority”).
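If OpenSSL isn’t installed on the machine you’re testing from, roughly the same check can be done with Python’s standard `ssl` module. This is a sketch, not a drop-in replacement for `s_client`: the helper names are mine, and unlike the run above it verifies the certificate chain and hostname against the system CA store, so it raises an exception instead of printing a verify code when something is off.

```python
import socket
import ssl


def make_context():
    """Default client context: verifies the certificate chain and the hostname."""
    return ssl.create_default_context()


def check_tls_port(host, port, timeout=5.0):
    """Do a TLS handshake with host:port and return (protocol, cipher name, subject)."""
    context = make_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return tls.version(), tls.cipher()[0], cert.get("subject")


# Example (needs network access; host and port are the ones from the test above):
#   print(check_tls_port("oversi.io", 29000))
```

A successful return means the handshake completed and the cert matched the hostname, which is the same thing the browser’s padlock is telling you.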

Or, easier, since this particular tunnel forwards to port 80, just try it in a browser:

https://oversi.io:29000

It worked, see the green padlock?  Oh, ignore the error the website is sending, I assume that’s apache freaking out because the URL request is different from what it’s expecting (http vs https or the port difference?) so it can’t match up the virtual domain.

From here, you should probably remove the debug options in the .conf (including the foreground=yes) and set it up to run automatically.  I just placed “stunnel” in my /etc/rc.d/rc.local file. (this gets run at boot)

Actually connecting using the Unity LLAPI

Congratulations, everything is set up on the server and you’re sure your web socket port is listening and ready to go.

While your server binary doesn’t need to change anything, your WebGL client does.

You now need to connect to WSS instead of WS.  Example:

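try
{
    // Note the wss:// prefix. portNum should be the port stunnel accepts on, not the server's raw port.
    _connectionID = NetworkTransport.Connect(_hostID, "wss://oversi.io", portNum, 0, out error);
}
catch (System.Exception ex)
{
    Debug.Log("RTNetworkClient.Connect> " + ex.Message);
}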

That’s pretty much it.  If someone doesn’t care about https and decides to play over http, it still works fine. (internally the websocket code will still connect via wss)

If you want to see it in action, check out my webgl llapi multiplayer test project https://www.oversi.io

Unity snippet: Finding a GameObject by name, even inactive or disabled ones

I use GameObject.Find() in Unity for things like enabling or fading in/out a menu or to grab an object reference via code to store for later.   (I usually prefer doing things in code rather than drag and dropping references using the Unity Editor when I can)

A problem is GameObject.Find() won’t locate inactive gameobjects which causes me problems because I tend to have inactive object trees in a scene that are just turned on/off when they are being used, like a GUI menu for example.  It’s just kind of my programming style to do things that way.

I couldn’t find a clean full snippet for this online that used scene.GetRootGameObjects, so figured I’d post one.

Cut and paste this to MyUtils.cs or your own utils class:

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.SceneManagement;

public class MyUtils 
{

    //hideously slow as it iterates all objects, so don't overuse!
    public static GameObject FindInChildrenIncludingInactive(GameObject go, string name)
    {

        for (int i=0; i < go.transform.childCount; i++)
        {
            if (go.transform.GetChild(i).gameObject.name == name) return go.transform.GetChild(i).gameObject;
            GameObject found = FindInChildrenIncludingInactive(go.transform.GetChild(i).gameObject, name);
            if (found != null) return found;
        }

        return null;  //couldn't find crap
    }
    
    //hideously slow as it iterates all objects, so don't overuse!
    public static GameObject FindIncludingInactive(string name)
    {
        Scene scene = SceneManager.GetActiveScene();
        if (!scene.isLoaded)
        {
            //no scene loaded
            return null;
        }

        var game_objects = new List<GameObject>();
        scene.GetRootGameObjects(game_objects);

        foreach (GameObject obj in game_objects)
        {
            if (obj.transform.name == name) return obj;

            GameObject found = FindInChildrenIncludingInactive(obj, name);
            if (found) return found;
        }

        return null;
    }

}

And use it from anywhere like:

GameObject obj = MyUtils.FindIncludingInactive("MyMenuName");