Universal Game Translator – Using Google’s Cloud Vision API to live-translate Japanese games played on original consoles (try UGT yourself!)

(download links near the bottom)

Why I wanted a “translate anything on the screen” button

I’m a retro gaming nut.  I love consuming books, blogs, and podcasts about gaming history.  The cherry on top is being able to experience the identical game, bit for bit, on original hardware.  It’s like time traveling to the 80s.

Living in Japan means it’s quite hard to get my hands on certain things (good luck finding a local Speccy or Apple IIe for sale) but easy and cheap to score retro Japanese games.

Yahoo Auction is kind of the eBay of Japan.  There are great deals around if you know how to search for ’em.  I get a kick out of going through old random games, and I have boxes and boxes of them.  It’s a horrible hobby for someone living in a tiny apartment.

Example haul – I got everything in this picture for $25 US! Well, plus another $11 for shipping.

There is one obvious problem, however

It’s all in Japanese.  Despite living here over fifteen years, my Japanese reading skills are not great. (don’t judge me!) I messed around with using Google Translate on my phone to help out, but that’s annoying and slow to try to use for games.

Why isn’t there a Google Translate for the PC?!

I tried a couple utilities out there that might have worked for at least emulator content on the desktop, but they all had problems.  Font issues, weak OCR, and nothing built to work on an agnostic HDMI signal so I could do live translation while playing on real game consoles.

So I wrote something to do the job called UGT (Universal Game Translator) – you can download it near the bottom of this post if you want to try it.

Here’s what it does:

  • Snaps a picture from the HDMI signal and sends it to Google to be analyzed for text in any language
  • Studies the layout and decides which text is dialog and which bits should be translated “line by line”
  • Overlays the frozen frame and translations over the gameplay HDMI signal
  • Allows copy/pasting the original language or looking up a kanji by clicking on it
  • Can translate any language to any language without needing any local data, as Google is doing all the work.  Can handle rendering Japanese, Chinese, Korean, etc.  (The font I used is this one)
  • Controlled by hotkeys (desktop mode) or a control pad (capture mode, this is where I’m playing on a real console but have a second PC controller to control the translation stuff)
  • (Added in later versions) Can read the dialog out loud in either the original or translated language
  • Can drag a rectangle to only translate a small area (Ctrl-F10 by default)
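To make the "sends it to Google" step a bit more concrete, here is a minimal sketch of what a Cloud Vision text-detection request looks like over REST with an API key. It only builds the URL and JSON body; it doesn't send anything. The function name and the choice of DOCUMENT_TEXT_DETECTION (rather than TEXT_DETECTION) are assumptions for illustration, not necessarily what UGT does internally.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes: bytes, api_key: str) -> tuple[str, str]:
    """Return (url, json_body) for a text-detection call on one screenshot.

    Hypothetical helper: the name and structure are illustrative only.
    """
    body = {
        "requests": [
            {
                # The image is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Assumption: dense-text OCR; TEXT_DETECTION is the other option.
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }
    url = f"{VISION_ENDPOINT}?key={api_key}"
    return url, json.dumps(body)


url, body = build_vision_request(b"fake screenshot bytes", "YOUR_API_KEY")
print(url)
```

The response comes back as JSON with the recognized text and bounding boxes, which is what makes overlaying translations at the original positions possible.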

In the video above, you’ll notice some translated text is white and some is green.  The green text means it is being treated as “dialog” using its weighting system to decide what is/isn’t dialog.

If a section isn’t determined to be dialog, “Line by line” is used.  For example, options on a menu shouldn’t be translated all together (Run Attack Use Item), but little pieces separately like “Run”, “Attack”, “Use item” and overlaid exactly over the original positions.  If translated as dialog, it would look and read very badly.

Here is how my physical cables/boxes are set up for “camera mode”. (Not required, desktop mode doesn’t need any of this, but I’ll talk about that later)

Happy with how merging two video signals worked with a Roland V-02HD on the PlayStep project, I used a similar method here too.  I’m doing luma keying instead of chroma as I can’t really avoid green here. I modify the captured image slightly so the luma is high enough to not be transparent in the overlay. (of course the non-modified version is sent to Google)
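As a rough illustration of the "make the luma high enough" trick: a luma keyer treats dark pixels as transparent, so any pixel below a minimum brightness gets nudged up until it is just bright enough to stay opaque. The sketch below uses the common Rec. 601 luma weights and blends dark pixels toward white; the threshold of 40 and the blending approach are assumptions, not UGT's actual code.

```python
import math


def luma(r: float, g: float, b: float) -> float:
    """Rec. 601 luma from 0-255 RGB values."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def lift_pixel(rgb: tuple[int, int, int], min_luma: float = 40) -> tuple[int, int, int]:
    """Brighten a pixel just enough that its luma reaches min_luma.

    Pixels already at or above the threshold are returned unchanged.
    """
    y = luma(*rgb)
    if y >= min_luma:
        return rgb
    # Blend toward white: luma is linear in RGB, so this fraction lands
    # exactly on min_luma before rounding. No channel can exceed 255.
    t = (min_luma - y) / (255 - y)
    # Round up so the result never falls short of the threshold.
    return tuple(min(255, math.ceil(c + (255 - c) * t)) for c in rgb)


print(lift_pixel((0, 0, 0)))        # pure black gets lifted
print(lift_pixel((200, 200, 200)))  # bright pixel is left alone
```

Only the overlay copy would be processed this way; the untouched capture is what goes to Google.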

This setup uses the windows camera interface to pull HDMI video (using Escapi by Jari Komppa) to create screenshots that it sends to Google.  I’m using an Elgato Cam Link for the HDMI input.

Anyway, for 99.99999999% of people this setup is overkill as they are probably just using an emulator on the same computer, so I threw in a “desktop mode” that just lets you use hotkeys (default is Ctrl-F12) to translate the active window. It’s just like having Google Translate on your PC.

Here’s desktop mode in action, translating a JRPG being played on a PC Engine/TurboGrafx 16 via emulation. It shows how you can copy/paste the recognized text if you want as well, useful for kanji study, or getting text read to you.  You can click a kanji in the game to look it up as well.  (Update: As of V0.60 it can handle getting text read internally, just click on the text.  Shift-Click to alternate between the src/dest language)

Try it yourself

Before you download:

  • All machine translation is HORRIBLE – this in no way replaces the work of real translators, it’s just (slightly) better than nothing and can stop you from choosing “erase all data” instead of “continue game” or whatever
  • You need to rename config_template.txt to config.txt and edit it
  • Specifically, you need to enter your Google Vision API key.  This is a hassle but it’s how Google stops people from abusing their service
  • Also, you’ll need to enable the Translation API
  • Google charges money for using their services after you hit a certain limit. I’ve never actually had to pay anything, but be careful.
  • This is not polished software and should be considered experimental, meant for computer-savvy users
  • Privacy warning: Every time you translate, you’re sending the image to Google to analyze.  This could also mean a lot of bandwidth is used, depending on how many times you click the translate button.  Ctrl-F12 sends the active window only, Ctrl-F11 translates your entire desktop.
  • I got bad results with older consoles (NES, Sega Master System, SNES, Genesis), especially games that are only hiragana and no kanji. PC Engine, Saturn, Dreamcast, Neo-Geo, Playstation, etc worked better as they have sharper fonts with full kanji usually.
  • Some game fonts work better than others
  • The config.txt has a lot of options, each one is documented inside that file
  • I’m hopeful that the OCR and translations will improve on Google’s end over time, the nice thing about this setup is the app doesn’t need to be updated to take advantage of those improvements or even additional languages that are later supported
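The changelog below quotes one config.txt command as “audio|none”, which suggests settings are written as name|value lines. Assuming that holds for the whole file, a minimal reader could look like the sketch below. The parser itself, the example values, and the rule of skipping any line without a pipe are assumptions; the real file documents its own format.

```python
def parse_config(text: str) -> dict[str, str]:
    """Parse name|value lines into a dict, ignoring everything else.

    Hypothetical parser: assumes each setting sits on its own line and
    that lines without a '|' are notes or blanks.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if "|" not in line:
            continue
        # Split on the first pipe only, in case the value contains one.
        name, value = line.split("|", 1)
        settings[name.strip()] = value.strip()
    return settings


# Setting names come from the changelog; the values are made up.
sample = """
audio|none
input_camera_device_id|1
minimum_brightness_for_lumakey|30
"""
print(parse_config(sample))
```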

While a translation is being displayed, you can hit ? to show additional options.  Note that the options shown here are outdated; use the real app to see the latest.

5/8/2019 – V0.50 Beta – first public release, experimental
5/13/2019 – V0.51 Beta – Added S to screenshot, better error checking/reporting if translation API isn’t enabled for the Google API key, minor changes that should offer improved translations
5/30/2019 – V0.53 Beta – Added input_camera_device_id setting to config.txt for systems with multiple cameras.  Moves mouse offscreen for “camera” mode captures
9/5/2019 – V0.54 Beta – Fixes crash on startup problem some people had, adds “audio|none” config.txt command to optionally disable all sound.  Added “minimum_brightness_for_lumakey” setting to config.txt in case the default isn’t right
9/15/2019 – V0.60 Beta – New feature, text to speech!  You’ll need to enable Google’s Text To Speech API.  Fixed a crash bug, added some in-app persistent settings, gamepad can now move around the cursor and click things.  Controls changed a bit. Added automatic reading of detected dialog, can choose to read src or dest langs, can hide text overlays if you want now.  A few new options in the config.txt. Switched to FMOD audio, SDL_Mixer has buggy mp3 playback which was causing me some grief. Changed the translate button sound to something more soothing.
11/22/2019 – V0.61 Beta – Replaced audio system with Audiere to prepare for putting it on GIT, added more logging and error checking with libCURL –  I’ve put the complete source on Github, feel free to bugfix or add some features if you’re a programmer!

4/12/2020 – V0.62 Beta:

* FEATURE: Added draggable window option (Changed hotkey to Ctrl-F10 in the default config.txt, if upgrading, it will be Shift-Ctrl-F11 though)
* Removed an include for wiringpi (it isn’t used)
* Added “FMOD Release” MSVC configuration profile, this enables FMOD as well, it will be the default now as I noticed some clicks/pops from Audiere sometimes when playing text to speech generated by Google
* Added status at the bottom that shows what is happening with uploading/download, in situations where “nothing is happening” these status updates will let you know what it’s doing, useful for slow internet or whatever
* Can now cancel spoken audio by clicking it again
* Added a font so rendering Hindi is supported (hotkey is 0)
* Initiating a translation when a translate dialog is already on the screen now just toggles it off instead of doing weird things
* Added “audio_device” option to config.txt, if text matches an audio device that will be used instead of the default
* Joystick deadzone increased from 0.15 to 0.20, needed because my 360 sticks are just bad
* BUGFIX: Word wrap no longer sometimes causes spaces to be missing between words

4/23/2020 – V0.63 Beta

* Added support for rendering Punjabi (note: the only open source font I could find doesn’t have English letters in it..hope to find something better later)
* Added support for setting a source language hint in the config.txt. Required to read some non latin languages, for example setting to “pa” for Punjabi allows that language to be read. Hint language is shown on startup screen, “auto” means no hint
* Now shows exact google error text onscreen (like bad API key or whatever) instead of saying “open error.txt” (error.txt is still written also though)
* Shows “<language code> language not supported for audio” if Google can’t do text to speech on it (Punjabi for example)
* Punjabi as a translation target is now one of the included languages you can cycle through using [ and ] or L and R on a control pad. Note: these languages can be changed/added via the config.txt, the first one set will be the default on startup
* “Press space to continue or ? for help – rtsoft.com” changed to “<Space or ?>” and doesn’t show at all for extremely small drag rects, so it doesn’t overlay the translation
* Shows “Nothing found” if there is zero text to translate, better than looking like it crashed or something
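For the curious, Cloud Vision accepts language hints through the imageContext.languageHints field of a request. Assuming the source language hint from the list above maps onto that field (a guess, not confirmed here), attaching it could look like this. The helper name is hypothetical.

```python
import copy


def with_language_hint(request: dict, hint: str) -> dict:
    """Return a copy of one Vision request entry with a language hint attached.

    "auto" means no hint, matching the behavior described in the changelog.
    """
    hinted = copy.deepcopy(request)
    if hint != "auto":
        hinted.setdefault("imageContext", {})["languageHints"] = [hint]
    return hinted


request = {
    "image": {"content": "<base64 screenshot>"},
    "features": [{"type": "TEXT_DETECTION"}],
}
print(with_language_hint(request, "pa"))
```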

NOTE FOR UPGRADING: It’s recommended to start with the config_template.txt again, just copy over your google API key and rename it config.txt again.

Download Universal Game Translator for Windows (64-bit) (Codesigned by Robinson Technologies)

This is an open source project on Github

Conclusion and the future

Some possible upgrades:

  • A built-in kanji lookup might also be nice; Jim Breen’s dictionary data could work for this.
  • My first tests used Tesseract to do the OCR locally, but without additional dataset training it appeared to not work so hot out of the box compared to results from Google’s Cloud Vision.  (They use a modified Tesseract?  Not sure)  It might be a nice option for those who want to cut down on bandwidth usage or reliance on Google.  Although the translations themselves would still be an issue…

I like the idea of old untranslated games being playable in any language, in fact, I went looking for famous non-Japanese games that have never had an English translation and really had a hard time finding any, especially on console.  If anyone knows of any I could test with, please let me know.

Also, even though my needs focus on Japanese->English, keep in mind this also works to translate English (or 36 other languages that Google supports OCR with) to over 100 target languages.
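Switching language pairs is mostly a matter of changing the codes in the translation request. As a sketch, here is how a request to the basic (v2) Cloud Translation REST endpoint can be put together with an API key. It builds the URL and body without sending them; the helper name is hypothetical, and whether UGT formats its requests exactly this way is an assumption.

```python
import json

# Public REST endpoint for the basic (v2) Cloud Translation API.
TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_translate_request(texts, target, api_key, source=None):
    """Return (url, json_body) asking Google to translate a batch of strings."""
    body = {
        "q": list(texts),    # one entry per piece of recognized text
        "target": target,    # e.g. "en"
        "format": "text",    # plain text rather than HTML
    }
    if source:
        body["source"] = source  # omit to let Google auto-detect
    url = f"{TRANSLATE_ENDPOINT}?key={api_key}"
    return url, json.dumps(body, ensure_ascii=False)


url, body = build_translate_request(["たたかう", "にげる"], "en", "YOUR_API_KEY", source="ja")
print(body)
```

Sending menu entries as separate strings in one batch keeps each translation tied to its own on-screen position.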

Test showing English being translated to many other languages in an awesome game called Growtopia

70 thoughts on “Universal Game Translator – Using Google’s Cloud Vision API to live-translate Japanese games played on original consoles (try UGT yourself!)”

  1. Seth Post author

    Chris – thanks for the feedback. Would love to support vertical writing properly. Haven’t actually run into it with the games I’ve been playing but I’m sure it’s going to come along at some point and annoy me enough to add support. :)

    In any case, if you do get around to looking at the source and maybe submitting additions I’d definitely work with you on getting it integrated/accepted on github.

  2. Gimmy

    In fact, for some time now I was wondering … but why on earth can’t you somehow use Google Translator to automatically translate games? It is true that automatic translations are not very precise but it is also true that they can be modified, perhaps manually where needed … NICE WORK MATE.

  3. Zak

    Hey, Just wanna say Thank you! SO SO MUCH. Using this I was finally able to make it through Ys V on PS2. it’s been my white whale of games I want to play but can’t understand and this tool did a wonderful job helping me know where to go or what I needed to find. Seriously can’t thank you enough.

    If I may make a suggestion…. Could there be a way to map the translate active window hotkey to buttons on my USB controller when using emulators to trigger the translation window, rather than having to take my hands off the controller?

  4. Zak

    Sorry if this is a double post.

    Thank you so much! With this I was able to finally get through Ys V on ps2. If I may make a suggestion for desktop mode could we somehow map it to a button combo on a USB controller rather than Keyboard only? that way I wouldnt have to put my controller down to translate active window.

    Seriously. Thank you so much.

  5. Anonymous

    Looks like Google has switched to charging $45/hour or $80/2m characters of text for API access.
    Am I missing something, or did this suddenly become an expensive way to play games?

  6. fairylander

    do you know if there is some way i can translate a old outdated graphics game on PC that is chinese?some text in-game were big enough and got translated, but most of the chinese letter shows in game is quite small, and i was not able to get it translated with this program. thanks for answering

  7. Will

    I might need some help, been trying to set it up to play emulators but I haven’t been successful. Any help?

  8. Gunwant Bhambra

    Hi developer, you have done great work here. My friend does not understand English, only Punjabi, but Punjabi is unable to render and shows boxes. Is there anything that can be done about that, like changing the font??

  9. CP

    Question, this seems to be working FOR THE MOST PART, but, no matter what I try it will only capture the upper left portion of my screen at an EXTREMELY zoomed in segment

  10. Seth Post author

    Zak> I agree that would be useful. I don’t think it’s possible to share control with a game on the same controller (via XInput or DirectInput – has anybody ever seen that done?) – perhaps it could work with a second USB controller or something else though.

    Anonymous> Hmm, I haven’t yet been hit with any charges, but I did get a notice that my “free year” was up a month or two ago. I set an alarm to email me when it hits $5 to be safe. You can do that from the Google Cloud Platform console by choosing “Billing”, then “Budgets & Alert” and then “Create Budget”. So far I still owe $0 but everybody should be careful just in case.

    fairylander> Sorry, super small text gives me bad results too.

    Gunwant Bhambra> I added support in V0.63 for Punjabi rendering (can hit [ and ] to cycle to that language)

    CP> Hmm. If you check display properties in Windows 10, it might fix it to set your “scale” to 100% if it’s set to something other than that.

  11. Jason

    Seth> Any plans on supporting other auth types besides the API key? The more advanced translation options (V3 and by extension autoML) require a service account for the software to use.

    I was also curious about how the two systems were interacting with one another and didn’t see too much in github (though I didn’t actually open the source files to dig around)

  12. Luis

    Hello! I have been playing with this for a while now. I am getting the following in error.txt:

    “Error 400 (Bad Request)!!1 Your client has issued a malformed or illegal request. That’s all we know.”

    I have entered my API key in various formats to try and resolve this. I don’t see what else it could be on my end. It seems it will accept anything as long as BEGIN/END PRIVATE KEY exist somewhere in the entered string. I only get a bad API Key error when I remove these prefix/suffix.

    Any ideas what may be going wrong?

  13. Seth Post author

    Jason: I would definitely add support for V3 communication if it meant better translations – my understanding is that using AutoML for, say, OCR of games would require work though, a training dataset and such. If anybody knows of or has such a thing made for retro games I’d love to try it. Like, could a set designed for NES level Japanese kana work perfectly? Different sets made for certain game types? Without that, I don’t think there is any advantage in using the v3 API though, correct me if I’m wrong.

    Luis: Hmm. Your cloud API key shouldn’t need any BEGIN/END stuff, I think maybe you’re using a service key instead of a Cloud API Key. A cloud API key should look like this: https://prnt.sc/s5l9du

    You know, I think Google changed their docs and removed the link to make the API key from the help page I linked in the .txt file. It looks like it can still be made though, under “Credentials”, choose “+Create Credentials” and then choose API key. See pic: https://prnt.sc/s5lc0j

    Hmm, maybe I really do need to support service keys… For now, I’ve updated the config_template.txt file inline instructions with better info.

  14. Jason

    Yea there isn’t really any benefit to the new auth besides not needing to edit a config file to put in the API key (since I believe it uses an actual cert file).

    From what I was reading on building an AutoML lib (or just basic vs advanced) it should let you have it learn from specific phrases you enter so some game terminology would probably come across making more sense.

    You technically could have different AutoML libs for different game types (like if you wanted one so the translation came across more formal for medieval / fantasy games, and another with a lot of slang for a GTA style game)

    Big problem being I’m pretty sure you’d have to train them separately, but might be worth for some of the more avid gamers.
    (also I’m 100% not a programmer aside from slight edits to JSON files for azure templates and powershell scripts, so I have no idea how hard getting the new auth setup would be)

    Link to the comparison page: https://cloud.google.com/translate/docs/editions

  15. Fredator

    I can’t thank you enough for your work. I beat Sakura Taisen on PS2 flawlessly, understanding most of it. It was awesome.

    I have a suggestion, I don’t know if it is possible but, could you implement a live translation system? Same as Google Translate when the smartphone constantly scans the picture and replaces the text/kanji live (no need to push keys to translate, it’s translating all the time)
    I hope you see what I mean ^^

  16. Jinzo

    The software works well, ocr is more accurate than other software, if there is an extra copy of the translated text, it is more wonderful, I will not need to go to google translate to translate it again. Google translate is a little bad at translating it, it doesn’t tell the line of the sentence.
    Thank the author very much

  17. RubenMG

    Hey, I think this project is really cool, I’ve been thinking about this approach myself for some time. I haven’t tested it yet but I have a question/suggestion. Are custom glossaries supported? That way you could add more accurate translations to each game and maybe some translators would be more interested in translating games seeing an easier approach, more similar to making movie/show subtitles or manga translations, since the only thing they would need would be the script transcribed and organized to put the translations in.

  18. Brandon

    You paid only $25 for ALL those games???

    DAMN, son! I haven’t seen even THE BIGGEST ‘game lots’ have *that much* here on eBay (esp. for such a low price) :o
