How Well Does MTL Work?
-
@oathkeeper95 There are specialized otaku, as well as "general" otaku.
Someone can be a car otaku, or a train otaku, with no interest in anime, unless it dealt with their particular hobby.
Getting back to the subject of this (split-off) topic...
I found that the MTL of Mahouka Koukou no Rettousei was NOT "impossible" to read, because I had familiarity with the characters. I "knew" how various ones acted and spoke, and whether they were male/female, etc., so I could gloss over most of the gaffes in my mind.
But, if it were presented here as a pre-pub, the topics about what needed changing would be dozens of messages long, per section. And, if you didn't know the characters, figuring them out would be extremely difficult.
And therein lies the issue of MTL - the computers don't "understand" the book. There's a comment by the author in volume 9 of Rokujouma when he found out that it was being translated into other languages, about how they dealt with dialogs.
In Japanese, we can distinguish the characters based on how they refer to themselves. Here’s a list of how it generally works out.
Ore = Koutarou
Atashi = Sanae
Warawa = Theia
Watashi = Yurika
Waga = Kiriha
Watakushi = Ruth
Oira = The Haniwas
On top of this, the characters can be differentiated by what they’re saying and their tone. When it’s all taken together, dialogue tags to label the speaker aren’t necessary.
But a question popped into my mind the other day. How would this work in another language? Take English, for example. In English, all subjects refer to themselves as “I.” As a result, Sanae, the haniwas, and everyone else would all talk about themselves the same way. It would be impossible to distinguish them based on pronouns.
MTL does not take any of that into account, because it's working on a sentence, not a complete work.
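The pronoun list in the author's comment can be sketched as a simple lookup, which shows the kind of book-level knowledge a sentence-by-sentence translator does not have. This is only an illustration: the table is copied from the list above, while the function name `guess_speaker` and the romanized sample lines are made up for the example.

```python
# Hypothetical lookup table built from the author's list of first-person pronouns.
PRONOUN_TO_SPEAKER = {
    "ore": "Koutarou",
    "atashi": "Sanae",
    "warawa": "Theia",
    "watashi": "Yurika",
    "waga": "Kiriha",
    "watakushi": "Ruth",
    "oira": "The Haniwas",
}


def guess_speaker(romanized_line):
    """Return the speaker implied by the first known pronoun, or None."""
    # Strip basic punctuation, then compare whole words only,
    # so "watashi" is never confused with "watakushi".
    cleaned = romanized_line.lower().replace(",", " ").replace(".", " ")
    for word in cleaned.split():
        if word in PRONOUN_TO_SPEAKER:
            return PRONOUN_TO_SPEAKER[word]
    return None


# A Japanese line carries the hint; its English rendering does not.
print(guess_speaker("Warawa wa shiranu."))  # Theia
print(guess_speaker("I do not know."))      # None
```

Once every character says "I", the lookup has nothing to work with, which is the author's point about English.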
-
Anyone know what text to speech engine capture 2 text uses? I've been using that for a manga to get the gist of what is going on, and I find it helpful enough (not good or perfect by any stretch though).
I've "read" volume 1 of two light novels (ZoaHunter + Attacking the Dungeon with My Boss is OT Work) with MTL, and it worked... OK at times; at other times you just couldn't understand anything going on for pages. It definitely works to give you an idea of what a novel is if you want to see if it's worth requesting to JNC or in the Seven Seas survey or something, but I wouldn't rely on it to make it through a 20 volume series. T_T
-
Use both Google and Bing to translate; they complement each other very well.
-
The word of the day is tsundoku.
-
@nosgoroth Is that where you have to put girls' names into a grid? ;-)
-
I just got curious about this subject, so I played around with Google Translate, which I hadn’t touched for more than 10 years. I thought, well, it may have improved significantly over the last 10 years. So I read a couple of chapters of Bookworm through Google Translate to see how well it does.
It is still junk. Its accuracy is 30% at the most (I’m being generous here. “Accurate” here means “it is not a mistranslation”, hardly a good translation). But of course you wouldn’t know which 30% that is. Here are the patterns of errors I see:
- It is often confused about what the subject of the sentence is.
- It is often confused about what the object of the sentence is, too.
- I was hugely disappointed with the quality of the dictionary. I would expect Google to be good with big data. It made really puzzling mistakes even with what I considered relatively easy words (words without many context-dependent meanings).
- And for words with multiple context-dependent meanings, the wrong one is often selected.
- It seems to know very little about phrases, even the very basic ones.
- When things get a little bit complicated, I’ve quite often seen Google Translate just drop what it gave up on, without any trace that it did so.
- It can completely misunderstand the structure of a sentence with just a little bit of grammatical complexity.
- With all these issues combined, it is particularly bad with dialogue, since dialogue tends to rely heavily on context.
In a nutshell, you’ll get a jumble of questionable 5Ws, so you could completely misunderstand what is going on. Yes, you sometimes get a pretty "accurate" translation, but those are few and far between.
The most important thing it misses is anything to do with the expression of feelings and emotions, which tends to require more careful attention to translate right, with delicate vocabulary that has many subtle differences depending on the context. Combine that with all the issues mentioned above, and its ability to convey those is nonexistent.
It is possible that Google Translate works much better with more logical source documents, like computer manuals or business documents, where conveying information is the main purpose.
But when confronted with literature sprinkled with tricks of language meant to stir the reader's emotions, it falls flat on its face.
I would imagine reading a novel through Google Translate is like watching a movie with three layers of sunglasses on while somebody blasts a boombox right next to you.
I cannot see how you would not get a headache.
-
Love is definitely where it struggles the most.
That and Japanese phrases / proverb type things.
And of course names. I'm actually trying to read a book with MTL currently, and there are some names that I know it's butchering completely.
But if you just want to know the movement of a story, the general premise, you can kind of get that. But expect to just not understand a lot of lines, or to have to guess at some based on what is around them.
-
I remember reading a comparison of translations of FFVI, between the original SNES Woolsey translation, the newer GBA translation, a popular fan translation, and Google MTL.
It turned out Google MTL could actually catch a few nuances that the Woolsey and fan translations missed. The reviewer speculated that since Google has some form of collective user input and AI Deep Learning for how well a translation works, the places where it does surprisingly well are possibly stuff that is commonly plugged into Google Translate.
Of course, for the vast majority, Google Translate is stiff and grammatically questionable, and often completely incoherent.
(If it matters, Woolsey's translation has lots of issues as well, understandable due to the constraints he was working under, and often getting the context wrong. The fan translation was even worse, often making things up wholesale and cribbing from Woolsey's incorrect translations at other times. I believe the original creator of the fan translation has long since disavowed it.)
-
@unsynchedcheese said in How Well Does MTL Work?:
It turned out Google MTL could actually catch a few nuances that the Woolsey and fan translations missed. The reviewer speculated that since Google has some form of collective user input and AI Deep Learning for how well a translation works, the places where it does surprisingly well are possibly stuff that is commonly plugged into Google Translate.
The problem is that you need good source/target pairs as input to train the system. People just entering source text does not improve the system unless a good corresponding translation is entered along with it.
That kind of dataset is really hard to come by.
EDIT
My understanding is that such high-quality data exists among European languages because high-quality translations are created continuously as a regular EU activity, and, being government documents, they are in the public domain.
But still, literature is by definition copyrighted, so it could still be weak even among European languages.
-
@hiroto said in How Well Does MTL Work?:
EDIT
My understanding is that such high-quality data exists among European languages because high-quality translations are created continuously as a regular EU activity, and, being government documents, they are in the public domain.
To a certain extent this is true; to a certain extent it's simply because boilerplate text (think of some forms of contract, for example) is easier to translate by rote than something that needs more nuance. (Maybe Google wouldn't be such a good way to translate a sensitive diplomatic communiqué, for example.)
More to the point, big database or no, literature often demands a sure touch with nuance and implication in both the source and target languages that a huge pile of A-equals-B phrases isn't necessarily designed to cope with.
-
@hiroto Already read that thread earlier today. And yes, I know that verb objects & subjects are frequently incorrect. Can totally ignore any gender in pronouns since as someone else stated "... swapped gender more often than Ranma." And plenty of additional problems. The lack of quality of MTL is a major reason I'm also reading the professionally translated light novels.
I would definitely like to learn Japanese, but that's going to be a long term project, especially without having anyone accessible who speaks the language. Can see some things that make the process simpler, one of which being that the katakana and hiragana are syllabaries instead of alphabets, but then again, some things become more complex such as the numerous kanji and the frustratingly annoying lack of spaces or other delimiters between words.
For instance, to use the example of 領主. What makes that a "single word"? It's composed of 2 kanji, where 領 means territory and 主 means main. So I can see the concept of main one for a territory which the word "Lord" is a reasonable description. In google translate, it converts 領主 to the phonetic expression "Ryōshu" whereas for 領主会議, it uses "Ryōshu kaigi", so obviously it considers 領主会議 to be two separate "words" with the division being between the 2nd and 3rd kanji. So to my way of thinking, it looks like not only does one need to learn a large number of kanji, but also one needs to learn frequent combinations of kanji that are used to represent common concepts. Definitely a non-trivial problem.
Will say I found it quite interesting when I learned a few weeks ago that Japanese basically has three types of sounds in the language. Those being vowels, consonant followed by a vowel, and a nasal stop (ン). Learning that explained to me why loan words borrowed by the Japanese frequently have an extra vowel tacked onto the end since the vast majority of Japanese words end in a vowel.
Also explained to me a cartoon I saw many years ago when someone wanted to play a word chain game with someone else who didn't want to play. The one not wanting to play kept ending their word with ン which could be used to end a word, but can't ever be used to start one. The cartoon has an explanation about that, but I never really internalized the explanation until I understood how vowel heavy Japanese is and the uniqueness of ン.
And I know that there are some translations I've read that can't be accurately performed. One scene that comes to mind is the game played between blank and Jibril in "No Game No Life". They played that word chain game and I can't imagine any English translation being able to handle that. So the words used are effectively random from an English speaker's point of view with no discernible connection between the words, while in Japanese the connections should be obvious.
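The word chain rule described above can be sketched in a few lines, which also shows why a translated chain stops looking like a chain. This is a simplified version that I'm assuming for illustration: it compares only the final and first kana, ignores small kana and long-vowel marks, and the sample words are my own, not the ones from No Game No Life.

```python
def links(prev_word, next_word):
    """True if next_word starts with the kana that prev_word ends on."""
    return prev_word[-1] == next_word[0]


def ends_game(word):
    """A word ending in the nasal ん/ン cannot be continued."""
    return word[-1] in ("ん", "ン")


def check_chain(words):
    """True if every word links to the next and none ends the game early."""
    for prev_word, next_word in zip(words, words[1:]):
        if ends_game(prev_word) or not links(prev_word, next_word):
            return False
    return True


japanese = ["りんご", "ごりら", "らっぱ"]     # ringo, gorira, rappa
english = ["apple", "gorilla", "trumpet"]  # the same words, translated

print(check_chain(japanese))  # True
print(check_chain(english))   # False
print(ends_game("ぱん"))       # True
```

The Japanese list is a valid chain, while its translation has no visible connection between the words at all.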
-
@jcochran said in How Well Does MTL Work?:
@hiroto Already read that thread earlier today. And yes, I know that verb objects & subjects are frequently incorrect. Can totally ignore any gender in pronouns since as someone else stated "... swapped gender more often than Ranma." And plenty of additional problems. The lack of quality of MTL is a major reason I'm also reading the professionally translated light novels.
If you know that much, you should stay away from it. It is really a waste of time. It is a much worse experience than not reading it. You can pick up some factual information and the rough flow of the story (with a good mix of misinformation), but for what purpose?
You'll be completely missing out on the emotions of the characters who weave through those facts/events and tie them together, leaving you with a very empty experience.
I would definitely like to learn Japanese, but that's going to be a long term project, especially without having anyone accessible who speaks the language.
Bookworm will take 5 more years to complete. You have a long time to learn.
I respect your energy and attention to detail. But I really hate to see that energy misguided by your obsession with MTL.
-
@hiroto said in How Well Does MTL Work?:
Bookworm will take 5 more years to complete. You have a long time to learn.
From what some people have told me, it could take that long to acquire sufficient proficiency to be able to read LN in Japanese.
-
I think people that use MTL do it to be able to know what to ask for in request topics / the Seven Seas survey (for things you can't find on blogs or in the fan translation scene, especially in unpopular genres). At least that's why I do it.
There's folks that spend time and energy reading web novel and manga adaptation translations to inform themselves on what LN to request, media which isn't always an accurate reflection of the LNs either. They're all different kinds of valid to inform yourself on something, but none are perfect, and if they do come out in English you get a whole new / better experience.
But learning Japanese is obviously the dream. On that note, time to do my Duolingo for the day... I'm very rusty. T_T
-
@hiroto Much of what you say is true. But even professional translators and good fan translators make mistakes. When I read volume 1 of Ascendance of a Bookworm, I had absolutely no idea what craft Myne used to make Tuuli's hair ornament. The translation by Quof implies multiple needles, has Gunther looking at Effa's larger needles as an example of the shape of the needle tip, which is a bit of overkill for a simple knitting needle. And since Effa had larger needles of the correct shape, it strongly indicates that Myne simply adapted a craft that Effa was already familiar with. Her innovation was to use thread instead of yarn for the craft and to use it in a decorative instead of a utilitarian fashion. At no point in that volume was the word crochet ever used. And the fan translation by Blastron also has the same details. Multiple needles. Using Effa's needles as a model for Myne's smaller needles, etc. And also, no mention at all about crochet.
But the machine translation of WN chapter 15 does mention crochet. The needle is singular, not plural.
Additionally, confirmation of the craft used being crochet is then verified in the manga, anime, and if you examine the cover art for volume 1, you will see a crochet needle to the right of the book Myne has on her lap amidst what look to be multiple crochet flowers. And the cover art for this series is noted for having each element of the cover art somewhere within the novel, although the actual scene shown on the cover isn't within the novel. So I have three different confirmations as to the craft Myne used for Tuuli's hair ornament being crochet. Yet, crochet is mentioned nowhere in the translations performed by two humans, one being a professional and the other I assume to be a talented amateur. But the lowly machine translation makes that detail clear.
I'll agree that currently machine translations are of poor quality and you have to take a lot of what they produce in context in order to actually extract the correct meaning. One thing that sticks in my mind was a statement in the WN (I forget the chapter, so I'll be paraphrasing the result) where it said that since "Myne's cook was the one who prepared a meal, Fran didn't need to poison it." Obviously that translation is wrong since it doesn't make any sense for Fran to make sure that Myne's meal is poisoned. So I think it would be safe to assume that the actual intent was that because Myne's cook prepared the meal, Fran didn't need to check the result for poison.
And frankly, in order to make a good translation, I believe that the translator needs to be good in both the source and target languages. But that still isn't sufficient. The translator also needs to be at least familiar with any crafts or technologies mentioned in detail in the source language. I suspect the "crochet" issue was that both Quof and Blastron were unaware of crochet, saw the word "knit" in nearby context, and then distorted the actual meaning to be something of a strange mutation involving knitting. Basically they knew the languages, but were ignorant of the craft being described. And Honzuki no Gekokujou uses a lot of different crafts in much greater detail than most novels, making it a rather difficult story to translate. As for myself, I knew of crochet because as a child, I would visit my grandmother several times a week and she did a lot of crochet and taught me a little bit. And the hair stick I was aware of since I have a few friends with lovely long hair who use hair sticks to control their manes. Quof was unaware that a "hair stick" is an actual item and that the simple phrase "hair stick" is the correct English phrase for that object. And that's understandable since I wasn't aware of hair sticks until I met those friends a few years ago.
So in conclusion, is MTL suitable for creating a polished readable product? The answer is obviously "no". But is MTL a useful tool for extracting the meaning from a foreign language document? And I would say "Yes, as long as you're aware of its limitations and pay attention to context."
-
@jcochran said in How Well Does MTL Work?:
For instance, to use the example of 領主. What makes that a "single word"? It's composed of 2 kanji, where 領 means territory and 主 means main. So I can see the concept of main one for a territory which the word "Lord" is a reasonable description. In google translate, it converts 領主 to the phonetic expression "Ryōshu" whereas for 領主会議, it uses "Ryōshu kaigi", so obviously it considers 領主会議 to be two separate "words" with the division being between the 2nd and 3rd kanji. So to my way of thinking, it looks like not only does one need to learn a large number of kanji, but also one needs to learn frequent combinations of kanji that are used to represent common concepts.
Not to take us too far off topic, but I wanted to briefly address this. Consider this entire post as being in parentheses.
In the case you've cited, 領主会議 (ryoushu kaigi) is being rendered as two separate words because, in a meaningful sense, it is. 領主 and 会議 are discrete vocabulary items in Japanese, the first meaning (e.g.) "governor," "ruler," or "territorial lord" and the second meaning "meeting," "council," or the like. If you broke the kanji up some other way, the combinations would have other meanings, or perhaps no meaning, but if you think of the actual vocabulary words (ryoushu and kaigi) first and work from there, the division makes sense.
(By the way, in this case, 主 has a meaning of ruler or master, as you seem to have intuited. By itself, it can be read nushi when it has this meaning. The meaning "main" is sometimes associated with the reading omo, as in 主に [omo ni], "mainly," "chiefly," "primarily." I suppose I'm not making this sound any less complicated.)
There's no question that the lack of spaces can make Japanese text look intimidating, but as with the above, it's mostly a matter of recognizing enough vocabulary to distinguish one word from the next. Oddly enough, kanji are more of an aid than an impediment in this task. (Reading a text rendered entirely in hiragana, now that can be a challenge.)
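The idea of starting from known vocabulary and working outward can be sketched as a greedy longest-match segmenter. To be clear, this is just an illustration of the principle, not how Google Translate or any real tokenizer works, and the two-word `VOCAB` and the `segment` function are made up for the example.

```python
# Tiny hypothetical dictionary; a real one would hold many thousands of entries.
VOCAB = {"領主", "会議"}


def segment(text, vocab=VOCAB):
    """Split text by repeatedly taking the longest known word at each position."""
    longest = max(len(word) for word in vocab)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink toward one character.
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # An unknown single character is kept as its own piece.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


print(segment("領主会議"))  # ['領主', '会議']
```

With both vocabulary items known, the break falls between the 2nd and 3rd kanji, matching the "Ryōshu kaigi" split discussed above. Without them, every kanji would come out as a separate piece.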
-
@kevin-s アイドントノーフアットユーミーン、ジスイスベリーイージーツリードフォアミー、嘘だけど
-
@kevin-s said in How Well Does MTL Work?:
Oddly enough, kanji are more of an aid than an impediment in this task.
This video makes a really good point of that - Why Kanji Are Your Best Friend
-
@nosgoroth said in How Well Does MTL Work?:
@kevin-s アイドントノーフアットユーミーン、ジスイスベリーイージーツリードフォアミー、嘘だけど
Microsoft Translator:
Idon't Noh at Eumeen, Ji-Swiss Berry Easy Treed Foremy, It's a Liex)
-
@hiroto said in How Well Does MTL Work?:
This video makes a really good point of that - Why Kanji Are Your Best Friend
Good video, and I can see how knowing the kanji can make some previously unseen combinations of kanji understandable. And since you provided a video, I'll return with another video... https://www.youtube.com/watch?v=bcdYKxHT8kY