
O'Reilly: Data is the Intel Inside

A somewhat stale but interesting post from Tim O'Reilly on Google's free 411 service. Tim suggests the angle isn't advertising but data collection. Emphasis mine:
There's a hidden story here about the speech recognition itself... speech recognition took a huge leap in capability when automated speech recognition started being used for directory assistance. All of a sudden, there were millions of voices, millions of accents to train speech recognition systems on, and much less need for the individual user to train the system. This is reminiscent of a comment that Peter Norvig, Director of Research at Google, made to me last year about automated translation, and why it's getting better. "We don't have better algorithms. We just have more data." In short, I'm speculating that the 1-800-GOOG-411 service is designed to harvest voice data to build Google's own speech database, rather than licensing from Nuance or another player. If I'm right about this, we see here another demonstration of my Web 2.0 principle that "data is the Intel Inside", and that many of the future battles between industry giants will be around who owns data, rather than who controls software APIs.