The Defense Advanced Research Projects Agency (DARPA) uses software that translates thousands of Arabic documents every day using state-of-the-art machine translation technology. But as communication shifts from written documents (which tend to follow formal Arabic grammar conventions) to social media such as Twitter, instant messages, and blog posts (which are less formal, carry less context, and follow looser grammar conventions because they are written in regional dialects), DARPA needed to adapt its translation system. The agency needed to generate a library of thousands of translations from “Social” Arabic to English and turned to Mechanical Turk for help.
DARPA compiled a library of Arabic messages from social media sources and asked Workers to translate them into English. The agency chose Mechanical Turk because professional translators would have cost too much and taken too long.
Each sentence was first translated from Arabic to English by three Workers. These translations were sent to a second set of Workers, who were asked to review each translation and correct its grammar. Finally, DARPA created a new HIT asking Workers to rank the three edited translations from best to worst. DARPA used the translation with the highest score for its training library and was also able to identify a set of preferred translators, who were paid a higher rate and whose work did not need to undergo as rigorous quality control.
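The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case study does not say how the rankings were turned into a score, so the sketch assumes a simple Borda count (in a ranking of three, first place earns 2 points, second 1, third 0). The function names, worker IDs, sample sentences, and the thresholds used to flag preferred translators are all hypothetical and are not taken from DARPA's system.

```python
from collections import defaultdict


def borda_scores(rankings):
    """Sum Borda points per translator; each ranking lists IDs from best to worst."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, translator in enumerate(ranking):
            scores[translator] += n - 1 - position
    return dict(scores)


def select_best(edited, rankings):
    """Return (translator_id, text) for the top-scoring edited translation.

    Ties go to whichever translator appears first in `edited` (an arbitrary choice).
    """
    scores = borda_scores(rankings)
    best = max(edited, key=lambda translator: scores.get(translator, 0))
    return best, edited[best]


def preferred_translators(results, min_sentences=2, min_win_rate=0.5):
    """Flag translators whose work is chosen often.

    `results` holds one (participant_ids, winner_id) pair per sentence.
    Both thresholds are assumptions made for this sketch.
    """
    attempts = defaultdict(int)
    wins = defaultdict(int)
    for participants, winner in results:
        for translator in participants:
            attempts[translator] += 1
        wins[winner] += 1
    return sorted(
        t for t in attempts
        if attempts[t] >= min_sentences and wins[t] / attempts[t] >= min_win_rate
    )


# Hypothetical data: three edited translations of one sentence, ranked by three Workers.
edited = {
    "worker_a": "I am going to the market now.",
    "worker_b": "I'm heading to the market now.",
    "worker_c": "Now I go market.",
}
rankings = [
    ["worker_b", "worker_a", "worker_c"],
    ["worker_b", "worker_c", "worker_a"],
    ["worker_a", "worker_b", "worker_c"],
]

winner, text = select_best(edited, rankings)
print(winner, text)
```

Other aggregation rules, such as averaging rank positions or counting only first-place votes, would fit the same description; the Borda count is used here only because it is simple to state.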
DARPA translated 1.5 million words in two months at a cost of around $0.03 per word, one-tenth the cost of using professional translators.
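As a rough check on these figures, the short calculation below works out the totals they imply. The word count, per-word rate, and tenfold ratio come from the text above; the dollar totals are derived estimates, not numbers reported by DARPA, and since the per-word rate is itself approximate the results are too.

```python
# Figures stated in the case study.
WORDS_TRANSLATED = 1_500_000
MTURK_CENTS_PER_WORD = 3          # about $0.03 per word
PROFESSIONAL_COST_MULTIPLE = 10   # professionals cost roughly ten times as much

# Work in whole cents to avoid floating-point rounding, then convert to dollars.
mturk_total_usd = WORDS_TRANSLATED * MTURK_CENTS_PER_WORD / 100
professional_total_usd = mturk_total_usd * PROFESSIONAL_COST_MULTIPLE
implied_savings_usd = professional_total_usd - mturk_total_usd

print(f"Mechanical Turk total (implied): ${mturk_total_usd:,.0f}")
print(f"Professional total (implied):    ${professional_total_usd:,.0f}")
print(f"Savings (implied):               ${implied_savings_usd:,.0f}")
```

Under these assumptions the project cost roughly $45,000, against roughly $450,000 for professional translation.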
“By using Mechanical Turk we were able to not only cut costs by an order of magnitude, but at our peak we were translating over 200,000 words per week, a feat unachievable with our previous methods.”
Chris Callison-Burch is an Associate Research Professor in Computer Science at Johns Hopkins University who was the chief consultant for this project.