NASSLLI 2012 June 18 - 22

Statistical Machine Translation

Room: UTC 4.104


Google Translate can instantly translate between any pair of over fifty human languages (for instance, from French to English). How does it do that? Why does it make the errors that it does? And how can you build something better? Modern translation systems *learn* how to translate by reading millions of words of already translated text, and this course will show you how they work. Despite demonstrable success over the last decade, much work remains to be done, so we will also identify open questions at the heart of current research, as well as computational and linguistic insights that may help solve them. The course covers a diverse set of fundamental building blocks from linguistics, machine learning, algorithms, data structures, and formal language theory, along with their application to a real and difficult problem in artificial intelligence.
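To give a flavor of what "learning to translate from already translated text" can mean, here is a minimal sketch of one standard word-based approach, IBM Model 1 trained with expectation-maximization. It is an illustration only and is not taken from the course materials: the three-sentence French-English corpus, the function names `train_ibm_model1` and `best_translation`, and the iteration count are all assumptions made for this example, and the usual NULL source word is omitted for brevity.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus (French, English), invented for illustration.
PAIRS = [
    ("la maison", "the house"),
    ("la maison bleue", "the blue house"),
    ("la fleur", "the flower"),
]


def train_ibm_model1(pairs, iterations=20):
    """Estimate word translation probabilities t(e | f) with EM.

    Returns a dict keyed by (english_word, french_word).
    """
    corpus = [(f.split(), e.split()) for f, e in pairs]
    e_vocab = {w for _, e_sent in corpus for w in e_sent}

    # Start from a uniform distribution over English words.
    t = defaultdict(lambda: 1.0 / len(e_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) links
        total = defaultdict(float)  # expected count of f being linked at all

        # E-step: split each English word's unit of "credit" among the
        # French words in the same sentence pair, in proportion to t.
        for f_sent, e_sent in corpus:
            for e in e_sent:
                norm = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac

        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(
            float, {(e, f): c / total[f] for (e, f), c in count.items()}
        )
    return t


def best_translation(t, f_word):
    """Return the English word with the highest t(e | f_word)."""
    candidates = {e: p for (e, f), p in t.items() if f == f_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    table = train_ibm_model1(PAIRS)
    for f_word in ["la", "maison", "bleue", "fleur"]:
        print(f_word, "->", best_translation(table, f_word))
```

No dictionary is supplied anywhere: the model sees only which words appear together in aligned sentence pairs, and repeated re-estimation shifts probability toward the consistent pairings. Real systems apply richer models to far larger corpora.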


  1. An introduction to statistical machine translation: how can machines learn to translate?
  2. Probabilistic modeling, language models, and finite-state translation models based on words.
  3. Finite-state translation models based on phrases, and decoding of finite-state models.
  4. Context-free models and their decoding algorithms, and unsupervised learning of translation models.
  5. Evaluation and supervised learning of translation models.


Adam Lopez

Email: alopez (AT) cs (DOT) jhu (DOT) edu


Adam Lopez is a research scientist at Johns Hopkins University in the Human Language Technology Center of Excellence. His research and teaching focus on technology that will break the language barrier, in particular systems that learn how to translate from vast amounts of data (like Google Translate); his work draws on core ideas from algorithms, machine learning, formal language and automata theory, and computational linguistics. Previously he was a research fellow in the machine translation research group at the University of Edinburgh, where he moved after earning his Ph.D. in computer science from the University of Maryland.