Guest post by Vasco Pedro, from Unbabel
Think of 5 things without which you could not live. Chances are they’re around you right now. It is very likely that your computer or your mobile phone, or even both, are on that list. We spend a good part of the day at the computer or staring at the screen of a mobile phone, touching information. What made screens ubiquitous in our day-to-day lives was essentially the desire to consume, spread and share written information. Even in an age of hyperconnectivity, we are creatures of old habits: we like stories.
An important part of what we do online is to read and write these stories. We read and share news and blogs, we comment on forums, we read reviews and compare products before buying them, we write emails, we post our own story every day. Every day millions of words are written, shared, “tweeted”, posted. We share experiences (Facebook), travel (TripAdvisor), knowledge (Quora, Wikipedia) and opinions about products (Amazon); we sell, trade and share our goods (eBay), our houses (Airbnb), our cars (Uber), and our time and professional knowledge, free and paid (Fiverr, oDesk, Elance).
There is a new global economy based on sharing without frontiers, and yet we are limited by language. The experience of a netizen fluent in English is very different from the experience of someone who speaks only Portuguese, Spanish or Russian. Communication and information sharing between people who do not share a common language is very difficult, if not impossible.
And now that the internet has been around for several decades, now that we have found so many and such diverse uses for it, now that we communicate with people across the world as long as we speak the same language, it’s time to break the language barrier through technology, through sharing and, inevitably, via our computer and mobile phone.
Google Translate is perhaps the most widely used translation software in the world. Although there is no official data, we can assume that billions of words are translated with it every day. Google Translate uses a statistical method to translate. Briefly, this means that the software analyzes millions of human translations of books, official documents and websites to find patterns and predict the likelihood that “car” is equivalent to “voiture”, or that “right” is equivalent to “droit” or perhaps “juste”. For each word, the software chooses the option that seems most probable. This probabilistic analysis includes elements of artificial intelligence, to the extent that the software learns to translate better every time it is used. Machine translation today produces translations of reasonable quality in many language pairs, English to French, Spanish or Portuguese for example. Translation into Asian languages, on the other hand, still offers dismal results. And anyway, reasonable is not enough. You have certainly experienced this: an automatic translation was enough to understand that email written in another language. But did you feel comfortable enough to reply using machine translation, without fear of sounding unnatural or just plain wrong? Probably not. Is there a solution for this?
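To make the statistical idea concrete, here is a toy sketch in Python. It assumes a tiny, made-up list of aligned word pairs (the data and the function names are hypothetical, chosen only for illustration) and simply picks the translation seen most often. Real engines work with far larger corpora and take whole phrases and context into account, so treat this as an illustration of the principle, not of how Google Translate is actually built.

```python
from collections import Counter, defaultdict

# Hypothetical aligned word pairs, as if extracted from human translations.
ALIGNED_PAIRS = [
    ("car", "voiture"), ("car", "voiture"), ("car", "auto"),
    ("right", "droit"), ("right", "droit"), ("right", "juste"),
]

def build_table(pairs):
    """Estimate P(target | source) from relative frequencies."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    table = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        table[src] = {tgt: n / total for tgt, n in tgt_counts.items()}
    return table

def most_probable(table, word):
    """Return the most likely translation, or the word itself if unseen."""
    options = table.get(word)
    if not options:
        return word  # unknown word: leave it untranslated
    return max(options, key=options.get)

table = build_table(ALIGNED_PAIRS)
print(most_probable(table, "right"))  # prints "droit" (seen 2 times out of 3)
```

Because “droit” wins on frequency alone, the sketch gets “the right answer” wrong whenever “juste” was meant, which is exactly the kind of error that makes a reasonable translation not good enough.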
The online job market for translation has never been so big. Several companies are trying to reinvent the age-old work of the translator and bring it to the internet. However, despite all the tools and aids that translators have gained over the past few years, translation is still viewed more as an art than a science, as work that can only be done by a community of dedicated and experienced professionals. But a much larger community also exists: according to Wikipedia, there are more bilingual people in the world than monolingual ones. Surprising, right?
What if there were a way to harness the power of all these bilingual people, people who are willing to share their knowledge of languages, and combine it with the artificial intelligence of software to create a platform able to translate millions of words each day, breaking down the barriers created by languages? Companies like Unbabel are currently developing technology for this novel approach.
Many and varied pieces of technology are needed: a statistical translation engine that learns from each translation; an algorithm that delivers each translation task to the right person according to the languages he or she understands, and also according to the reputation each person has on the platform, the quality they are able to deliver and the topics that interest them; an algorithm to analyze said quality; a way to show the original text and the translation; an advanced text editor; the ability to choose a different alternative for a word and have that choice influence the text thereafter; software to collect millions of texts that require translation and to post the completed translations automatically; and an application to do all this on your computer, tablet or mobile phone, that mobile phone you no longer know how to live without. All this must come together in a frictionless product, where money exchange (when it happens) is arbitrated and users can have fun: all 3 ingredients for a successful sharing economy product.
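The task-routing piece described above, matching each task to a person by languages, reputation and topics of interest, could look something like the following Python sketch. Everything here is an assumption made for illustration: the `Translator` record, the 0 to 1 reputation scale and the fixed topic bonus are hypothetical and do not describe how Unbabel or any other company actually does it.

```python
from dataclasses import dataclass, field

@dataclass
class Translator:
    name: str
    languages: set            # language codes the person understands
    reputation: float         # hypothetical score between 0 and 1
    topics: set = field(default_factory=set)

TOPIC_BONUS = 0.2  # assumed weight for a matching topic of interest

def pick_translator(translators, source_lang, target_lang, topic):
    """Return the best eligible translator, or None if nobody qualifies."""
    # Only people who understand both languages can take the task.
    eligible = [t for t in translators
                if {source_lang, target_lang} <= t.languages]
    if not eligible:
        return None

    def score(t):
        # Reputation plus a small bonus when the topic interests them.
        return t.reputation + (TOPIC_BONUS if topic in t.topics else 0.0)

    return max(eligible, key=score)

community = [
    Translator("ana", {"en", "pt"}, 0.7, {"travel"}),
    Translator("ben", {"en", "pt"}, 0.8, {"law"}),
    Translator("chen", {"en", "zh"}, 0.95),
]
best = pick_translator(community, "en", "pt", "travel")
print(best.name)  # prints "ana": the topic bonus outweighs ben's reputation
```

A real platform would presumably also weigh availability, past quality scores and workload, but even this small version shows why reputation and interests have to be tracked per person.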
The challenge is large: imagine a world where we can conserve every different language and culture, really help them thrive, and at the same time communicate universally. The pieces of technology required for this are already being developed by universities and companies. They only work with human intervention, with the precious help of the translators and the bilingual community. This integration may be the key. Man and Machine working together to create a smaller, more connected world.
