Universal Game Translator – Using Google’s Cloud Vision API to live-translate Japanese games played on original consoles (try it yourself!)

Why I wanted a “translate anything on the screen” button

I’m a retro gaming nut.  I love consuming books, blogs, and podcasts about gaming history.  The cherry on top is being able to experience the identical game, bit for bit, on original hardware.  It’s like time traveling to the 80s.

Living in Japan means it’s quite hard to get my hands on certain things (good luck finding a local Speccy or Apple IIe for sale) but easy and cheap to score retro Japanese games.

Yahoo Auction is kind of the eBay of Japan.  There are great deals around if you know how to search for ’em.  I get a kick out of going through old random games; I have boxes and boxes of them.  It’s a horrible hobby for someone living in a tiny apartment.

Example haul – I got everything in this picture for $25 US! Well, plus another $11 for shipping.

There is one obvious problem, however

It’s all in Japanese.  Despite living here over fifteen years, my Japanese reading skills are not great. (don’t judge me!) I messed around with using Google Translate on my phone to help out, but that’s annoying and slow to try to use for games.

Why isn’t there a Google Translate for the PC?!

I tried a couple utilities out there that might have worked for at least emulator content on the desktop, but they all had problems.  Font issues, weak OCR, and nothing built to work on an agnostic HDMI signal so I could do live translation while playing on real game consoles.

So I wrote something to do the job called UGT (Universal Game Translator) – you can download it near the bottom of this post if you want to try it.

Here’s what it does:

  • Snaps a picture from the HDMI signal and sends it to Google to be analyzed for text in any language
  • Studies the layout and decides which text is dialog and which bits should be translated “line by line”
  • Overlays the frozen frame and translations over the gameplay HDMI signal
  • Allows copy/pasting the original language or looking up a kanji by clicking on it
  • Can translate any language to any language without needing any local data as Google is doing all the work, can handle rendering Japanese, Chinese, Korean, etc  (The font I used is this one)
  • Controlled by hotkeys (desktop mode) or a control pad (capture mode, this is where I’m playing on a real console but have a second PC controller to control the translation stuff)
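
To give a feel for the first step in that list (sending a captured frame to Google for OCR), here is a minimal Python sketch against the public Cloud Vision REST endpoint, using `images:annotate` with a `TEXT_DETECTION` feature. It is an illustration only, not UGT's own code; the function names `build_ocr_request` and `detect_text` are made up for this example.

```python
import base64
import json
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> dict:
    """Build the JSON body for a Cloud Vision text-detection request."""
    return {
        "requests": [
            {
                # The image travels inline as base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def detect_text(image_bytes: bytes, api_key: str) -> dict:
    """POST one frame to Cloud Vision and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=json.dumps(build_ocr_request(image_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In the response, `responses[0]["textAnnotations"]` holds the recognized text: the first entry is the whole detected text and the following entries are individual pieces with a `boundingPoly`, which is the kind of position data needed to draw a translation over the original spot.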

In the video above, you’ll notice some translated text is white and some is green.  The green text means it is being treated as “dialog”; UGT uses a weighting system to decide what is and isn’t dialog.

If a section isn’t determined to be dialog, “Line by line” is used.  For example, options on a menu shouldn’t be translated all together (Run Attack Use Item), but little pieces separately like “Run”, “Attack”, “Use item” and overlaid exactly over the original positions.  If translated as dialog, it would look and read very badly.
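
The dialog versus “line by line” split can be sketched roughly like this. The post doesn't describe UGT's real weighting system, so the scoring below (longer lines and multi-line blocks look more like dialog) and the threshold are purely hypothetical, as are the names `TextBlock`, `dialog_score` and `translation_units`.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    lines: list[str]  # OCR'd lines that sit close together on screen


def dialog_score(block: TextBlock) -> float:
    """Hypothetical weighting: long lines in a multi-line block look like dialog."""
    if not block.lines:
        return 0.0
    average_length = sum(len(line) for line in block.lines) / len(block.lines)
    multi_line_bonus = 1.0 if len(block.lines) >= 2 else 0.0
    return average_length * 0.5 + multi_line_bonus


def translation_units(block: TextBlock, threshold: float = 6.0) -> list[str]:
    """Return the strings to send for translation."""
    if dialog_score(block) >= threshold:
        # Dialog: translate the whole passage together so sentences stay intact.
        return [" ".join(block.lines)]
    # Menu-like text: translate each line on its own and overlay it in place.
    return list(block.lines)
```

With these made-up weights, a menu block such as `["Run", "Attack", "Use item"]` comes back as three separate strings, while two long lines of conversation come back as one joined passage.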

Here is how my physical cables/boxes are set up for “camera mode”. (Not required, desktop mode doesn’t need any of this, but I’ll talk about that later)

Happy with how merging two video signals worked with a Roland V-02HD on the PlayStep project, I used a similar method here too.  I’m doing luma keying instead of chroma as I can’t really avoid green here. I modify the captured image slightly so the luma is high enough to not be transparent in the overlay. (of course the non-modified version is sent to Google)
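
The brightness tweak for luma keying can be sketched as follows. This assumes the mixer keys out (makes transparent) any pixel whose luma falls below some cutoff, and it uses the standard Rec. 601 luma weights; the cutoff of 20 and the function names are illustrative guesses, not values taken from UGT.

```python
import math


def luma(rgb):
    """Rec. 601 luma of an (r, g, b) pixel with 0-255 channels."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def lift_pixel(rgb, min_luma):
    """Brighten a pixel just enough that a luma keyer won't treat it as transparent."""
    y = luma(rgb)
    if y >= min_luma:
        return rgb
    # The three weights sum to 1, so adding the same amount to every channel
    # raises luma by that amount (unless a channel clamps at 255).
    boost = math.ceil(min_luma - y)
    return tuple(min(255, channel + boost) for channel in rgb)


def lift_frame(pixels, min_luma=20):
    """Apply lift_pixel to every pixel in a frame given as rows of (r, g, b) tuples."""
    return [[lift_pixel(pixel, min_luma) for pixel in row] for row in pixels]
```

Only the overlay copy would be lifted this way; the untouched capture is what gets sent off for OCR, as noted above.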

This setup uses the Windows camera interface to pull HDMI video (using ESCAPI by Jari Komppa) to create screenshots that it sends to Google.  I’m using an Elgato Cam Link for the HDMI input.

Anyway, for 99.99999999% of people this setup is overkill as they are probably just using an emulator on the same computer, so I threw in a “desktop mode” that just lets you use hotkeys (default is Ctrl-F12) to translate the active window. It’s just like having Google Translate on your PC.

Here’s desktop mode in action, translating a JRPG being played on a PC Engine/TurboGrafx 16 via emulation. It shows how you can copy/paste the recognized text if you want as well, useful for kanji study, or getting text read to you.  You can click a kanji in the game to look it up as well.  (Update: It now internally can handle getting text read as of V0.60, just click on the text.  Shift-Click to alternate between the src/dest language)

Try it yourself

Before you download:

  • All machine translation is HORRIBLE – this in no way replaces the work of real translators, it’s just (slightly) better than nothing and can stop you from choosing “erase all data” instead of “continue game” or whatever
  • You need to rename config_template.txt to config.txt and edit it
  • Specifically, you need to enter your Google Vision API key.  This is a hassle but it’s how Google stops people from abusing their service
  • Also, you’ll need to enable the Translation API
  • Google charges money for using their services after you hit a certain limit. I’ve never actually had to pay anything, but be careful.
  • This is not polished software and should be considered experimental meant for computer savvy users
  • Privacy warning: Every time you translate you’re sending the image to Google to analyze.  This also could mean a lot of bandwidth is used, depending how many times you click the translate button.  Ctrl-F12 sends the active window only, Ctrl-F11 translates your entire desktop.
  • I got bad results with older consoles (NES, Sega Master System, SNES, Genesis), especially games that are only hiragana and no kanji. PC Engine, Saturn, Dreamcast, Neo-Geo, PlayStation, etc. worked better as they usually have sharper fonts with full kanji.
  • Some game fonts work better than others
  • The config.txt has a lot of options, each one is documented inside that file
  • I’m hopeful that the OCR and translations will improve on Google’s end over time, the nice thing about this setup is the app doesn’t need to be updated to take advantage of those improvements or even additional languages that are later supported
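
For a sense of why the Translation API has to be enabled on the same key, here is a rough Python sketch of a call to the public Cloud Translation v2 REST endpoint. It is not UGT's code; the function names are made up, and the sample response in the usage note is hand-written to show the shape, not real output.

```python
import json
import urllib.request

TRANSLATE_URL = "https://translation.googleapis.com/language/translate/v2"


def build_translate_request(texts, target="en"):
    """Build the JSON body for a v2 translate call (source language auto-detected)."""
    return {"q": list(texts), "target": target, "format": "text"}


def parse_translations(response: dict):
    """Return (translated_text, detected_source_language) pairs from a v2 response."""
    return [
        (item["translatedText"], item.get("detectedSourceLanguage"))
        for item in response["data"]["translations"]
    ]


def translate(texts, api_key, target="en"):
    """POST the strings to the Translation API and return the parsed pairs."""
    request = urllib.request.Request(
        f"{TRANSLATE_URL}?key={api_key}",
        data=json.dumps(build_translate_request(texts, target)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_translations(json.load(response))
```

A response shaped like `{"data": {"translations": [{"translatedText": "Continue", "detectedSourceLanguage": "ja"}]}}` would parse to `[("Continue", "ja")]`. If the API isn't enabled for the key, the request fails with an HTTP error instead, which `urllib` raises as `urllib.error.HTTPError`.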

While a translation is displayed, you can hit ? to show additional options.  Also, this is outdated, use the real app to see the latest.

5/8/2019 – V0.50 Beta – first public release, experimental
5/13/2019 – V0.51 Beta – Added S to screenshot, better error checking/reporting if translation API isn’t enabled for the Google API key, minor changes that should offer improved translations
5/30/2019 – V0.53 Beta – Added input_camera_device_id setting to config.txt for systems with multiple cameras.  Moves mouse offscreen for “camera” mode captures
9/5/2019 – V0.54 Beta – Fixes crash on startup problem some people had, adds “audio|none” config.txt command to optionally disable all sound.  Added “minimum_brightness_for_lumakey” setting to config.txt in case the default isn’t right
9/15/2019 – V0.60 Beta – New feature, text to speech!  You’ll need to enable Google’s Text To Speech API.  Fixed a crash bug, added some in-app persistent settings, gamepad can now move around the cursor and click things.  Controls changed a bit. Added automatic reading of detected dialog, can choose to read src or dest langs, can hide text overlays if you want now.  A few new options in the config.txt. Switched to FMOD audio, SDL_Mixer has buggy mp3 playback which was causing me some grief. Changed the translate button sound to something more soothing.

Note: I plan to open source this, just need to get around to putting it on Git, if you’re someone who would actually do something with the source, please hassle me into doing it.

Download Universal Game Translator for Windows (64-bit) (Binary code signed by Robinson Technologies)

Conclusion and the future

Some possible upgrades:

  • Built-in text to speech on the original dialog (well, by built in I mean using Google’s text to speech API and playing it in UGT, easier than the copy and paste method possible now)
  • A built in Kanji lookup also might be nice,  Jim Breen’s dictionary data could work for this.
  • My first tests used Tesseract to do the OCR locally, but without additional dataset training it appeared to not work so hot out of the box compared to results from Google’s Cloud Vision.  (They use a modified Tesseract?  Not sure)  It might be a nice option for those who want to cut down on bandwidth usage or reliance on Google.  Although the translations themselves would still be an issue…
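
For anyone who wants to try the local OCR route, a minimal Python sketch that shells out to the Tesseract command-line tool might look like this. It assumes `tesseract` is installed with the Japanese (`jpn`) language data; the helper names and the choice of page segmentation mode 6 (treat the image as a single uniform block of text) are just illustrative defaults, not something UGT does.

```python
import shutil
import subprocess


def tesseract_command(image_path, lang="jpn", psm=6):
    """Build the argument list for a one-shot Tesseract run that prints to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_locally(image_path, lang="jpn"):
    """Run Tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        encoding="utf-8",  # Tesseract writes UTF-8 text
        check=True,
    )
    return result.stdout
```

This only covers the OCR half; the recognized text would still need to go to a translation service afterwards.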

I like the idea of old untranslated games being playable in any language, in fact, I went looking for famous non-Japanese games that have never had an English translation and really had a hard time finding any, especially on console.  If anyone knows of any I could test with, please let me know.

Also, even though my needs focus on Japanese->English, keep in mind this also works to translate English (or 36 other languages that Google supports OCR with) to over 100 target languages.

Test showing English being translated to many other languages in an awesome game called Growtopia

35 thoughts on “Universal Game Translator – Using Google’s Cloud Vision API to live-translate Japanese games played on original consoles (try it yourself!)”

  1. Victor

    Incredibly wonderful work, thank you for sharing it with the world. I wonder if you were aware of an unclaimed bounty that could be relevant?


    Also, one of the first things that popped up in my mind when reading over your post was the text-only annotated translation of Mizzurna Falls, a late-period PlayStation release, available at https://projectmizzurna.tumblr.com/ . It would be interesting to know how and if OCR / computer-vision&translation could assist or further these kinds of nearly completed but stalled fan translation projects.

    Thanks again for your amazing work.

  2. Dominic Tarason

    This is something I’ve been hoping for since Google Translate was a thing. Thank you so much, and I hope you continue work on it! This feels like the first step in a seriously big deal.

  3. Dan Z

    I’m having issues where the text is being pulled from the window, but no translation appears (desktop mode). I can copy the original text and paste in to Translate though.

  4. Seth Post author

    Hey Dan, can you check for a log.txt file in the same directory as the .exe? It might show an error that happened. Also, an “error.txt” file might exist, if so, it will show the error that Google is giving if it refused to translate something. Strange it would do the OCR but not the translation. Feel free to send the files to me at seth at rtsoft.com to check out as well.

  5. Maxwell Yezpitelok

    Hi, amazing invention! I’m having the same issue as Dan Z above. I don’t see anything marked as an error on log.txt, and there’s nothing new on error.txt, but I’d be happy to send them to that address too. Also, not sure if this is relevant, but sometimes after pressing Control + F12 I get like 8-10 messages saying “Target language is Japanese” really fast.

  6. Seth Post author

    Hi Maxwell, feel free to also contact me by email, I’ll fix the issue if I can recreate it.

    Pressing the # 2 on your keyboard is a shortcut for Japanese by default, not sure why Ctrl-F12 would cause the app to think 2 was hit, strange. Is the keyboard or Windows Keyboard layout non-US by any chance? If so, you might try editing the config.txt file and changing Ctrl-F12 to a different hotkey. It should say “Analyzing” when it’s hit.

    If the target language is the same as the detected language in a block of text it won’t actually do any translation, maybe that’s happening? Press 1 to change the target language back to English.

    For a test, I downloaded the zip on a different machine but it worked here. (I did rename config_template.txt to config.txt and set my google API key, nothing else should be required for it to work).

  7. Seth Post author

    Oh, one more thing, if you look in the directory for a “temp.jpg” and view the image, that is the last image sent to Google, might be useful to make sure it’s sending what you think.

  8. Seth Post author

    Hey guys, I might have guessed the problem. I think the Cloud Translation API has to be enabled, similar to the Cloud Vision API.

    Can you check here? https://console.cloud.google.com/apis/api/translate.googleapis.com

    On mine, if I click Overview, it shows “Activation status: Enabled”. There may be a button you need to click to enable it. Under credentials, the API key listed is the same as the one for my Cloud Vision API. (I think Google does that automatically, I didn’t set it or anything)

    I’m going to update the app soon to detect this issue and give a clear message about it. The “error.txt” thing isn’t working for this error currently.

  9. Doug T

    Thank you so much Seth, that was exactly it. I had to enable the Translation API and then add it to my key under credentials. And now it works!! Amazing!

  10. Dan Z

    Yup, that did it for me too! Feel free to ignore the log.txt I just sent you, I should have checked the later replies.

  11. Maxwell Yezpitelok

    Just wanted to confirm that the issue was solved for me too!

    One feature I think would be cool, if it’s possible, is the ability to keep the translated screenshots in a folder. I’m pressing Print Screen frantically to save all this good stuff.

  12. Seth Post author

    Thanks for the updates. I uploaded a new version with some tweaks, same URL as before. (will say 0.51 instead of 0.50 on title screen).

    Maxwell, I added an S to screenshot option. (It saves as screenshot_0001.jpg, screenshot_0002.jpg, etc) Should probably change to save to png later, but this only took a few minutes so good enough for now I figure.


  13. Ramzero

    Congratulations on the excellent work.

    – OCR detection fails on poor fonts

    I have a suggestion for the problem of poor results with old consoles (at least in desktop mode and for Western characters).

    I remembered a process I used in the past with SubRip, a program that extracts subtitles from DVD movies.

    The process used OCR detection, but you had to type in each character it found (once for each different character), and the program then detected all the rest by comparison.

    This worked very well for subtitles where the characters in the movies were very poor, and the same approach fits old games perfectly, because the font is normally the same throughout the entire game.

    Why not include this feature in UGT?

    It would take a little work at the start of the game, so the program could build a kind of table where it “learns” which letter is which, and in theory it would then work for the whole game without detection and translation errors.

    I just don’t know how this could help with Eastern characters.

    PS: Sorry for any typos, I’m Brazilian and I used Google Translate to post this message.

  14. Mark

    I’m using desktop mode on a laptop, and any time I use Ctrl+F12 or Ctrl+F11, I get an image back that is much too big for the active window I’m using, and it cuts off the lower right half of the image by a good deal. Ends up looking like so:
    https://imgur.com/a/3K1sh2g . The original window is the second image in the link.

  15. Harvey Smith

    I’m having an issue where it isn’t capturing the entire screen/window.
    Looking at the log file it looks like the program is setting the video mode to 1024 x 768 instead of 1920 x 1080
    Any clue how I might be able to fix this?

    Other than that it’s been working great for me, thanks!

  16. Seth Post author

    Hey guys, sorry for the slow replies, the anti-spam filter has been catching comments until I manually un-spam them.

    Ramzero: I think Google DOES have a way to “train” the dataset so in theory this is possible now. Wonder if anybody has messed with that?

    Mark: Hmm, yeah I can see the scale is off, like it’s zooming in or something. Do you have any custom scaling options (like 150% scale so fonts are bigger) on your desktop by any chance? Also, if you press S so it takes a screenshot, is the screenshot cut off too, or is it correct?

    Harvey – In desktop mode it’s normal for the app to start in 1024X768, as soon as you do a “window scan” or “desktop scan” it should resize itself to grab (and overlay) the correct data. The custom resolutions are ignored in desktop mode, it should auto-resize as needed – I wonder if it’s the same problem as Mark had above.

  17. Luc

    “All machine translation is HORRIBLE – this in no way replaces the work of real translators”
    I don’t agree with this one, because official translations are often worse than machine translation; a lot of them make up their own dialog instead of translating the original.
    I hope I can get voice translation too, because I’m fed up with hearing mismatched translations from official game translators.

  18. Z4

    Hi, I have a problem with a game in fullscreen. When I scan, everything works perfectly, but when I hit space to resume playing, my game minimizes and I’m back on my desktop. Can anyone help me with that?

  19. James

    Everytime I open UGT.exe, it opens up in my task manager for a few seconds then closes. Any solutions?

    I tried to run compatibility under other versions, also tried to run as admin: no go. Any advice?

  20. chloe

    Same here. A black splash screen appears and after 2 seconds the window closes.

    I have put in the API key,
    and I enabled the Translation API and Cloud Translation API.

  21. Seth Post author

    Hmm, Chloe and James – is there a log.txt file in the directory? Near the end, maybe it has some error message that would give us an idea of why it shut down?

  22. chloe

    Valid key names:
    Reading config.txt
    Registered hotkey hotkey_to_scan_whole_desktop to Control,F11
    Registered hotkey hotkey_to_scan_active_window to Control,F12
    Registered hotkey hotkey_to_scan_draggable_area to Control,Shift,F11
    Setting native video mode to 1024, 768 – Fullscreen: 0 Aspect Ratio: 0.00
    Setting size for GUI
    Window is already 1024, 768
    App got focus
    Initial window pos is 448, 156
    Initting Universal Game Translator 0.53 Beta by Seth A. Robinson (www.rtsoft.com)
    GL Version = 4.6.0 NVIDIA 436.02

    GL Vendor = NVIDIA Corporation

    GL Renderer = GeForce RTX 2060 SUPER/PCIe/SSE2

    That is all log.txt says.

  23. chloe

    I checked that it works on another computer (Win10 1809) with the same config.txt file.

    My desktop is on the latest Win10 and it still isn’t working.

  24. Seth Post author

    Hmm. Even if the key was wrong or missing, it shouldn’t crash. (It will show a nice error message if the key isn’t accepted)

    It looks like the app is failing to initialize either OpenGL or the audio system.

    Is there a valid sound device on your system?

    I’ve tested with a GeForce 1080 as well as a GeForce 2080 RTX so I think the GeForce RTX 2060 should be ok.

    If nothing works drop me an email at seth at rtsoft.com and I might be able to do a special debug build to isolate the problem.

  25. chloe

    I think it’s related to the audio system, because I’m using a soundfx system to enhance Windows audio. I will test it.

    However, it has never affected the behavior of other programs. It seems to be a problem that needs to be fixed.

  26. Nester

    Laptop is still working fine but desktop loads and then closes right away. Here’s my current log.txt:

    Valid key names:
    Reading config.txt
    Registered hotkey hotkey_to_scan_whole_desktop to Control,F11
    Registered hotkey hotkey_to_scan_active_window to Control,F12
    Registered hotkey hotkey_to_scan_draggable_area to Control,Shift,F11
    Setting native video mode to 1024, 768 – Fullscreen: 0 Aspect Ratio: 0.00
    Setting size for GUI
    Window is already 1024, 768
    App got focus
    Initial window pos is 448, 156
    Initting Universal Game Translator 0.53 Beta by Seth A. Robinson (www.rtsoft.com)
    GL Version = 4.6.0 NVIDIA 436.15

    GL Vendor = NVIDIA Corporation

    GL Renderer = GeForce RTX 2070/PCIe/SSE2

  27. Seth Post author

    Hey Nester and others, please download the file again, I’ve updated it to V0.54 which should fix the crash on startup problem. Thanks to Chloe for testing this via email earlier today to verify it worked.

  28. Mike

    What a great application, Seth! Any plans to add logging of the source+translations for the PC/desktop mode? I’d throw in on that if you were to consider adding such a feature. Do you have a patreon?

    The big barrier to that feature, in my mind, would be the persistence of text on each screen. Would need some way of throwing out duplicates when writing to file, I’d imagine. Or perhaps just maintain an in-memory expanding list of translations and then provide a button that dumps it to file.

    Regardless, I came here to express my admiration and gratitude for this work.

  29. Seth Post author

    Thanks Mike. Don’t have a patreon, but thanks for asking.

    Thinking about logging – hmm. Unfortunately it’s pretty shaky as it is for my system to determine what is menu options and what is dialog (colored green) – it tends to make a lot of mistakes which would confuse logging. What might work is something like you can drag-out a rectangle to specify “log all text that is written in this area” and perhaps that would work for certain kinds of games, but you’d have to re-draw the rect if it changed, like if you went into a shop and the dialog box was a different size.

    Ignoring already posted dialog would be a challenge as you mentioned but should be possible in theory. Hrm.
