FEATURE NEWS

Waiting for Alexa 2.0 (and Other Voice AI Assistant Improvements)

By Doug Mohney January 03, 2017

There's nothing like a new holiday gift to start the "What if" train, and the arrival of an Amazon Echo on December 25 was clearly an eye-opener. The Amazon Echo is a sweet piece of hardware and the only place it can go is up.

For those who haven't seen the Echo (or got a Google Home instead), it is a tall black cylinder a little over nine inches high, containing seven microphones plus a beautiful set of speakers. Echo is always on, always listening for the wake word "Alexa," which triggers recording and processing of voice commands.

(Note to Google - "OK Google" as a voice assistant trigger is one word too many. And you didn't really get why Amazon and Apple named their assistants differently from their services, right?)

It is fair to say everyone has only begun to scratch the surface of what a virtual assistant can do through a dedicated appliance (i.e., the Echo and the lower-cost Dot). Amazon provides API hooks so third-party services, from pizza delivery to Lyft and Uber rides, can be ordered via voice, but it is going to take time and education (thank you, Amazon, for the daily TV reminders!) for people to truly understand and get the most out of a virtual assistant.

There are three things I'd like Alexa to do in the future. First, I'd love for it to have unique user profiles keyed to voice identification/voice biometrics. If you can order goods and services through Echo via voice, there will be a need and demand for better security beyond “The order came through your device.” At some point this will happen, probably the day after the news story where an eight-year-old goes wild buying video games and toys by voice through Amazon Prime while the parents are in a different room or sleeping in.

Voice biometrics provide authentication and the ability to create user profiles. Amazon is probably already working on this feature, as it will give more granular details on consumer behavior, which it can leverage to sell more things. It wouldn't shock me if Alexa is already keeping a background count of how many different users access it through an Echo and for how long each day.

Once Alexa can figure out who is speaking, I want to see it move into the business arena to ride shotgun with me – an area where Google might have an upper hand. Going through my cloud-based email, it would be nice to say things like, "Alexa, please schedule this meeting." Yes, I know there are all kinds of ways you can do this already through Google-based assistants and voice recognition and my phone, but if I'm sitting at my desk and Alexa is already "listening" and playing music in the background, why can't I get it to do a bit more work for me?

Key applications in the work environment would include hands-free operation during presentations (“Alexa, next slide please”) and anywhere else a voice interface would make sense in the trinity of business applications. Microsoft's Cortana already has a leg up with its integration into Windows 10, so I suspect we will see a Microsoft Surface Cortana gizmo down the road, with an LED screen and projector built in. An actual image of Cortana could appear, pictures and videos could be displayed as part of the assistant's functions, and so on.

"Telephony" would be the final integration I'd like to see, but that gets into the hybrid oddness we have today between traditional-ish SMS text messaging, voice telephony, Voice over IP (VoIP), and IP-based messaging. Still, it would be nice to say "Alexa" (or "Cortana"), "please call Jeff Rodman," and simply have the call happen. Again, I know I can do this by pulling out my cell phone and saying "OK Google, call Jeff Rodman," assuming I have programmed Jeff's phone number into my contact list, but we don't yet have the ease of use we should have.

If you had to pick a loser out of the AI assistant race, Apple is probably it. It had the opportunity to open up Siri to third-party developers relatively early, but didn't. Meanwhile, Siri's original developers built Viv, a better Siri emphasizing APIs and a third-party ecosystem, which was bought by Samsung shortly after its unveiling. Amazon can use Alexa and Echo to collect more consumer data and sell more things, while Microsoft has Cortana embedded across its platforms and is no doubt building deeper hooks between Cortana and cloud and business applications.

Google Home, for its part, can currently tap into a larger ecosystem and knowledge base than Echo, but like all things Google, there are a lot of rough edges in how things work together. If Google could combine its product builds with a healthy seasoning of Apple's (former) "get it right" product polishing, it would have a category winner.

Edited by Alicia Young
