- Translation is Risky Business (December 30 2008)
Posted by Shankar Kumar and Wolfgang Macherey
At Google, we like search. So it's no surprise that we treat language translation as a search problem. We build statistical models of how one language maps to another (the translation model) and models of what the target language is supposed to look like (the language model), and then we search for the best translation according to those models (combined into one big log-linear model, for those of you taking notes).
But, just as putting all of your money in the investment with the highest historical return is not always the best idea, choosing the translation with the highest probability is not always the best idea either - especially when you have a relatively flat distribution among the top candidates. Instead, we can use the Minimum Bayes Risk (MBR) criterion. Essentially, we look at a sample of the best candidate translations (the so-called n-best list) and choose the safest one, the one most likely to do the least amount of damage (where 'damage' is defined by our measurement of translation quality). You might want to view this as choosing a translation that is a lot like the other good translations, instead of choosing that strange one that happened to have a good model score.
If this is our 'diversification' strategy, how can we make things even safer? E
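To make the MBR idea above concrete, here is a minimal Python sketch of choosing from an n-best list. It is an illustration under stated assumptions, not the system described in the post: the similarity measure (`unigram_f1`) is a toy stand-in for a real translation-quality metric, the posterior is simply a softmax over made-up log-scores, and every name and sentence in it is hypothetical.

```python
import math
from collections import Counter


def unigram_f1(hyp, ref):
    """Toy similarity in [0, 1]: F1 over word counts (assumed stand-in for a real metric)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def posteriors(log_scores):
    """Softmax over model log-scores, normalized over the n-best list only."""
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def map_select(nbest):
    """Pick the candidate with the highest model score."""
    return max(nbest, key=lambda pair: pair[1])[0]


def mbr_select(nbest, similarity=unigram_f1):
    """Pick the candidate with the highest expected similarity to the other
    candidates, i.e. the lowest expected loss under the list's posterior."""
    probs = posteriors([score for _, score in nbest])

    def expected_gain(hyp):
        return sum(p * similarity(hyp, other)
                   for (other, _), p in zip(nbest, probs))

    return max((hyp for hyp, _ in nbest), key=expected_gain)


# Hypothetical n-best list: (translation, model log-score), nearly flat scores.
nbest = [
    ("one feline rested upon the rug", -1.0),
    ("the cat sat on the mat", -1.1),
    ("the cat sat on a mat", -1.2),
    ("the cat is on the mat", -1.2),
]

print("highest score:", map_select(nbest))
print("MBR choice:   ", mbr_select(nbest))
```

On this toy list the top-scoring candidate is the odd one out, while the MBR choice is the candidate that agrees most with the rest of the list. With a sharply peaked score distribution the two selections would usually coincide.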
- plop: Probabilistic Learning of Programs (November 10 2008)
Posted by Moshe Looks
Cross-posted with the Open Source at Google blog
Traditional machine learning systems work with relatively flat, uniform data representations, such as feature vectors, time-series, and probabilistic context-free grammars. However, reality often presents us with data which are best understood in terms of relations, types, hierarchies, and complex functional forms. The best representational scheme we computer scientists have for coping with this sort of complexity is computer programs. Yet there are comparatively few machine learning methods that operate directly on programmatic representations, due to the extreme combinatorial explosions involved and the semantic complexities of programs.
The plop project is part of a new approach to learning programs, being developed at Google and elsewhere, that takes on these challenges in a unified way: reducing programs to a hierarchical normal form, building sequences of specialized representations for programs as search progresses, maintaining alternative representations, managing uncertainty probabilistically by applying estimation-of-distribution algorithms over program spaces, and exploiting probabilistic background knowledge.
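For readers unfamiliar with estimation-of-distribution algorithms, the following Python sketch shows the basic loop on a deliberately simple problem. It is not plop's method: it searches over fixed-length bit strings rather than program spaces, uses a univariate model (one independent probability per bit), and maximizes the number of ones. The function name and all parameters are hypothetical choices for illustration.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=30, generations=40, seed=0):
    """Univariate EDA on bit strings: sample from a model, keep the best
    samples, re-estimate the model from them, and repeat."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # model: independent probability that each bit is 1
    best = None
    for _ in range(generations):
        # Sample a population from the current probabilistic model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Rank by fitness (number of ones) and keep the top n_select.
        population.sort(key=sum, reverse=True)
        elite = population[:n_select]
        if best is None or sum(elite[0]) > sum(best):
            best = elite[0]
        # Re-estimate each bit's probability from the elite samples,
        # clamped so that no bit value becomes impossible to sample.
        for i in range(n_bits):
            freq = sum(individual[i] for individual in elite) / n_select
            probs[i] = min(max(freq, 0.05), 0.95)
    return best, probs


best, probs = umda_onemax()
print(sum(best), "ones out of", len(best))
```

The point of the sketch is the shape of the loop: the search state is a probability distribution over candidate solutions rather than a single candidate. Applying the idea to programs would need a far richer representation and model than this one.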
For more information on this approac
- New Technology Roundtable Series (October 3 2008)
Posted by Alfred Spector, VP of Research and Special Initiatives
We've just posted the first three videos in the Google Technology Roundtable Series. Each one is a discussion with senior Google researchers and technologists about one of our most significant achievements. We use a talk show format, where I lead a discussion on the technology.
While the videos are intended for a reasonably technical audience, I think they may be interesting to many as an overview of the key challenges and ideas underlying Google's systems. And of course they offer a glimpse into the people behind Google.
The first one we made is "Large-Scale Search System Infrastructure and Search Quality." I interview Google Fellows Jeff Dean and Amit Singhal on their insights into how search works at Google.
The next title is "Map Reduce," a discussion of this key technology (first at Google, and now having a great impact across the field) for harnessing the parallelism provided by very large-scale clusters of computers, while mitigating the component failures that inevitably occur in such big systems. My discussion is w
- Doubling Up (September 29 2008)
Posted by Franz Josef Och
Machine translation is hard. Natural languages are so complex and have
so many ambiguities and exceptions that teaching a computer to
translate between them turned out to be a much harder problem than
people thought when the field of machine translation was born over 50
years ago. At Google Research, our approach is to have the machines
learn to translate by using learning algorithms on gigantic amounts of
monolingual and translated data. Another knowledge source is user
suggestions. This approach allows us to constantly improve the
quality of machine translations as we mine more data and
get more and more feedback from users.
A nice property of the learning algorithms that we use is that they
are largely language independent -- we use the same set of core
algorithms for all languages. So this means if we find a lot of
translated data for a new language, we can just run our algorithms and
build a new translation system for that language.
As a result, we were recently able to significantly increase the number of
languages on translate.google.com. Last week, we launched eleven new
languages: Catalan, Filipino,
- Remembering Randy Pausch (July 26 2008)
Posted by Kevin McCurley, Research Team
It is with great sadness that we note the passing of Randy Pausch, who taught computer science at Carnegie Mellon University. Randy was well-known by many within the research community, including quite a number of us here at Google. Alfred Spector, our Vice President of Research, was his Ph.D. advisor. Rich Gossweiler, a Senior Research Scientist, was his first Ph.D. student. Several other former colleagues and coauthors (Joshua Bloch, Adam Fass, and Ning Hu) now work here.
All of us strive to make an impact with our research, and Randy was no exception. He will be remembered for his work, but also for his contributions to humanity at large. Millions have watched the video on YouTube from his lecture titled Achieving your Childhood Dreams. The strength of his character was already known to his family, his colleagues, and the broader computer science research community. The courage and optimism that he displayed at the end of his life became inspirational to millions more.
I've seen Randy repeatedly go to bat for what is right. As a leader, he consistently evoked incredible enthusiasm and optimism for the subjects he embraces. Randy had a very human passion about people and not just who they are, but their potential, despite any flaws or obstacles in their way. His contributions will be remembered for
