What is Statistical Machine Translation?

Over the past few years, machine translation tools have been transformed as rule-based translation systems have given way to statistical language analysis techniques. These techniques use known translations (e.g., United Nations archives and other open content) to derive nuances and meanings not easily addressed by rule-based systems. Tools like Google Translate have used statistical methods to move machine translation to the point where it is now a viable, low-cost, and easy option for automated, rapid translation of web content. While the translation tools are not yet perfect, they are fairly accurate in most cases and are well suited to credible on-the-fly translations.

The ability to embed translation tools quickly and easily into websites, so that the viewer may choose a preferred language, removes the need to prepare individual copies of online material in different languages. This simplifies upkeep and maintenance and makes it easier to deploy new content quickly. Statistical machine translation is an increasingly robust and low-cost option that has developed to the point where it is a viable and easy solution for museums looking to make general information available in multiple languages.

INSTRUCTIONS: Enter your responses to the questions below. This is most easily done by moving your cursor to the end of the last item and pressing RETURN to create a new bullet point. Please include URLs whenever you can (full URLs will automatically be turned into hyperlinks; please type them out rather than using the linking tools in the toolbar).

Please "sign" your contributions by marking them with the code of 4 tildes (~) in a row so that we can follow up with you if we need additional information or leads to examples. This produces a signature when the page is updated, like this: - Sam Jul 21, 2011

(1) How might this technology be relevant to the museums you know best?


(2) What themes are missing from the above description that you think are important?

  • - nik.honeysett Aug 25, 2011 I think this is a subset of a larger trend based on increases in computational power and storage. Rule-based and algorithmic approaches to solving problems are losing out to statistical and inference-based solutions, because we can. It's not just about museums providing tools for audiences to consume their content, it's about museums being able to consume content from currently untapped sources.
  • - susan.chun Sep 1, 2011 Nik's is one way to re-categorize this subject. But I'm going to repeat my suggestion from last year that we treat this topic under the broader rubric of multilingual content and include in it not just machine translation, but also the myriad ways in which both technology and social norms encourage museums to create and manage content in many languages. Rob L. has started a thread on this topic in Q2: more from me there.
  • I'd agree with Nik that considered strictly as a technology, statistical machine translation is a subset of the set of all such solutions to a wide assortment of problems. That said, in a different way this translation subset also inhabits another area of special importance to museums: multilingual content handling, and its associated fields of opportunity and cultural expectation. Seeing this more as a trend than as a technology as such, I added an item on it to Q3 (which seemed a slightly better typological fit than Q2). - rob.lancefield Sep 1, 2011

(3) What do you see as the potential impact of this technology on education and interpretation in museums?

  • Reusing resources, more cross-institutional sharing of resources, cost efficiencies - beth.harris Aug 31, 2011

(4) Do you have or know of a project working in this area?


Please share information about related projects in our Horizon Project sharing form.