Machine Translation On Local Computer: the AI Way or Drained-Wallet Way


Image by ddzphoto from Pixabay


Short Introduction

Are you aware that you can have an almost fully fledged machine translation engine running on your own laptop? I know: who needs this when Google offers translation for free on its site? Not to mention Microsoft, which always seems to be a step behind Google in the Internet game.

Yet try to use either the Google or the Microsoft translation service from your own application, and you are in trouble. You have two options with them, and in the most common use case you are the losing player.

Strategy no. 1

You take the already published web interface and generate HTTP requests against it automatically. This strategy almost works, and it is implemented in several Python libraries. Unfortunately, both Google and Microsoft are resilient to such “attacks” and can block your requests. You can catch the resulting errors, delay your HTTP requests for a second or two, then try again. It is a classic cat-and-mouse game, and in the long run the cats win almost every time.
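The catch-delay-retry loop described above can be sketched roughly as follows. This is a minimal illustration, not the code of any particular library: the function name retry_with_backoff and its parameters are hypothetical, and doubling the pause after each failure is an assumption on top of the "second or two" delay mentioned above.

```python
import time


def retry_with_backoff(request, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request() until it succeeds, pausing longer after each failure.

    request    -- any zero-argument callable that performs the HTTP call (hypothetical)
    attempts   -- total number of tries before giving up
    base_delay -- pause in seconds after the first failure; doubled each time
    sleep      -- injectable for testing; defaults to time.sleep
    """
    for attempt in range(attempts):
        try:
            return request()
        except Exception:
            # Out of tries: let the caller see the last error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Even with such a loop, a blocked client only slows down; nothing here prevents the service from blocking it again.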

Strategy no. 2

Pay for a translation API. Both Google and Microsoft will give you an API key; you use it within your application as much as you need and pay for what you have used. It puzzles me that the translations provided this way are so different from the translations on their free web pages, and the differences are enormous. I have been there, tried them, and was surprised by the results. I can’t say this for sure, because I have no proof from the inside, but looking at them as black boxes, they appear to run different engines for the paid and the free translation services. Oh, and they are not as cheap as they might seem when you look at the per-character prices. I just paid $45 to Microsoft for translation services used by a small test application that ran several translations over a text weighing 7 kB. Try to put this in production with texts of 200+ kB (a realistic scenario for my application), run the translations several times, since the final result needs manual tuning here and there, and just try to calculate the costs.

Neither Google nor Microsoft is consistent in this game: they change their translation REST APIs, and you have to spend unplannable time on maintaining your integration.

Of course, there are smaller players in this game that specialize in translation services only. They are far better than both Google and Microsoft; take a look at https://www.deepl.com/translator. It is almost the same story again: good quality on the free web translation. They also provide an API for automated translation, and they are consistently good there. And expensive: twice as costly as the paid Google or Microsoft translation services.

Long story short: you either lose on quality by using Google or Microsoft, or you get high quality and reliability from DeepL and let it drain your wallet.
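For a rough feeling of what "just try to calculate the costs" means, here is a back-of-the-envelope sketch. It assumes the bill grows linearly with text size and that the number of translation runs stays the same as in my test; real pricing tiers may well behave differently. The function name scale_cost is made up for this illustration.

```python
def scale_cost(observed_cost, observed_kb, target_kb):
    """Scale an observed bill linearly from one text size to another.

    Assumption: cost is proportional to text size, with the same
    number of translation runs as in the observed case.
    """
    return observed_cost * target_kb / observed_kb


# $45 for the 7 kB test text, scaled to a single 200 kB production text
print(round(scale_cost(45, 7, 200), 2))  # → 1285.71
```

Under these assumptions, one production-sized text would land in four-digit territory.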

Strategy no. 3: Use AI on Your Local Computer

Back to the original question: who needs this at a time when translation is freely available on the web? The answer: you do, if you need it in your own application.

So, you need an automated translation service that is reachable from within your application. What you are translating is not that complicated: it’s not an essay, nor Shakespeare translated into German. Let’s say you need to translate customer feedback. If you are lucky, that is a bunch of short texts, and you need them translated for some kind of automated analysis. One very important feature of such texts: they are short and mostly written in a hurry.

It is entirely possible to set up and run machine translation locally on your computer. Of course, you cannot compete with the web-based translation of Google, Microsoft, or DeepL on quality and speed. But for a use case where all you need is automatic translation of texts that are neither complex nor large, all you need is:

  1. A Python environment
  2. PyTorch and the Hugging Face transformers package installed
  3. A willingness to wait a couple of seconds per translation, depending on the speed of your local machine. With a CUDA-enabled GPU, you’ll wait about 10 times less.
  4. A few lines of Python code, as follows:
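from transformers import pipeline

# Build an English-to-German translation pipeline; the model is downloaded on first use
translation = pipeline("translation_en_to_de")
# The pipeline returns a list with one dict per input text
translated_text = translation('I love AI', max_length=40)[0]['translation_text']
print(translated_text)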
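
For the customer feedback scenario, the same pipeline can be pointed at a GPU and fed a whole list of short texts. The following is a sketch under assumptions, not a tuned setup: the helper names build_translator and translate_feedback are invented for this example, and max_length=40 is simply carried over from the snippet above.

```python
def build_translator():
    """Create the English-to-German pipeline, on the GPU when CUDA is available."""
    # Imported here so translate_feedback() below stays usable on its own.
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1  # 0 = first GPU, -1 = CPU
    return pipeline("translation_en_to_de", device=device)


def translate_feedback(translator, texts, max_length=40):
    """Translate a list of short texts and return the translations in order."""
    results = translator(list(texts), max_length=max_length)
    return [item["translation_text"] for item in results]


# Usage (downloads the model on first run):
# translator = build_translator()
# print(translate_feedback(translator, ["I love AI", "The delivery was late"]))
```

Building the pipeline once and reusing it for every text matters here, since loading the model is the slow part.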

Enjoy your slow but free AI lunch.

See you tomorrow!
