Amazon Echo: Should I buy?

The Amazon Echo is a home automation maker’s dream. It provides an easy way to use voice recognition to interact with your devices, but there are a couple of things you should know about developing for it before you buy one and get started.

  1. Despite being called an Echo “app”, your code actually lives in a web service hosted in the cloud that can answer its calls. The Echo translates what it hears into text and then hands it off to your service by calling your API with a package that contains the information.
  2. Creating an “app” with Amazon for the Echo requires you to fill out an “Interaction Model”, which consists of an “intent schema” and “sample utterances”, in addition to programming your web service.
    • The “intent schema” is pretty straightforward: you create a JSON array of “intents”, each of which contains a name and “slots”. Slots are used like parameters, and you must define a type for each one.
    • The “sample utterances” are a list of lines, each pairing an “intent name” with a sample phrase that should trigger it.
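
To make the interaction model a little more concrete, here is a rough sketch of what the two pieces might look like, written as plain JavaScript data. The intent names, the slot name, and the `DEVICE_NAME` slot type are all made up for illustration, and the `{Device}` placeholder syntax is an assumption on my part, so check the exact format against Amazon's documentation before pasting anything into the developer portal.

```javascript
// Hypothetical interaction model for a tiny home automation skill.
// All names below are placeholders, not values Amazon defines.
const intentSchema = {
  intents: [
    {
      intent: "TurnOnIntent", // the intent name
      slots: [
        // A slot acts like a typed parameter. "DEVICE_NAME" is a placeholder type.
        { name: "Device", type: "DEVICE_NAME" }
      ]
    },
    { intent: "StatusIntent", slots: [] } // an intent with no parameters
  ]
};

// Sample utterances: one per line, intent name first, then a sample phrase.
const sampleUtterances = [
  "TurnOnIntent turn on the {Device}",
  "TurnOnIntent switch on the {Device}",
  "StatusIntent what is the status"
];

// Sanity check: every utterance should start with an intent from the schema.
function findUnknownIntents(schema, utterances) {
  const known = new Set(schema.intents.map((entry) => entry.intent));
  return utterances
    .map((line) => line.split(" ")[0])
    .filter((name) => !known.has(name));
}

console.log(JSON.stringify(intentSchema, null, 2));
console.log(sampleUtterances.join("\n"));
console.log(findUnknownIntents(intentSchema, sampleUtterances)); // prints []
```

The little `findUnknownIntents` helper is not part of anything Amazon provides. It is just a cheap way to catch a typo in an intent name before you upload the model.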

Making it talk to a web service hosted in Azure using Node.js turned out to be fairly trivial, and I was able to get a basic implementation hooked into the OpenSmartHub that I have been developing in less than a couple of hours. I even created a sample in a GitHub repository for those who want simple instructions and an easy place to start.
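
For a feel of what the web service side involves, here is a minimal sketch of a Node.js handler. This is not the code from my sample repository. The request and response shapes follow my understanding of the JSON the Echo exchanges with a service, the `TurnOnIntent` intent and `Device` slot are hypothetical names, and the sketch skips the request verification a real service needs.

```javascript
const http = require("http");

// Build the JSON reply that the Echo will read aloud.
function speak(text, endSession = true) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text },
      shouldEndSession: endSession
    }
  };
}

// Hypothetical handlers keyed by intent name.
const intentHandlers = new Map([
  ["TurnOnIntent", (slots) => {
    const device = slots.Device && slots.Device.value;
    if (!device) return speak("Which device should I turn on?", false);
    // A real service would call into the smart hub here.
    return speak(`Turning on the ${device}.`);
  }]
]);

// Pure function: takes the parsed request body, returns the reply object.
function handleEchoRequest(body) {
  const request = (body && body.request) || {};
  if (request.type === "LaunchRequest") {
    return speak("What would you like to do?", false);
  }
  if (request.type === "IntentRequest") {
    const intent = request.intent || {};
    const handler = intentHandlers.get(intent.name);
    if (handler) return handler(intent.slots || {});
  }
  return speak("Sorry, I did not understand that.");
}

// Minimal HTTP wiring. Nothing listens until you call startServer yourself.
function startServer(port) {
  return http.createServer((req, res) => {
    let raw = "";
    req.on("data", (chunk) => { raw += chunk; });
    req.on("end", () => {
      let reply;
      try {
        reply = handleEchoRequest(JSON.parse(raw));
      } catch (err) {
        reply = speak("Sorry, something went wrong.");
      }
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify(reply));
    });
  }).listen(port);
}
```

Keeping `handleEchoRequest` separate from the HTTP plumbing means you can test your intents with plain objects before any of it is deployed to the cloud.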

It really is amazing to see it come together and respond to your voice commands in a custom scenario that you have developed, but the Echo still has a long way to go in improving its voice recognition. It works really well with the pre-programmed functions, but there aren’t that many that I find particularly useful in an everyday scenario, and it doesn’t do well with brands or non-dictionary words. For example, it recognizes “Pandora” because it’s a vital part of the pre-programmed functionality, but it doesn’t recognize “Yamaha” or “Wemo” well.

Another thing that I’ve noticed is that it can sometimes mix up the singular and plural versions of words when converting speech to text. (For example, mine would sometimes hear “lights” when I say “light”.)

Overall, I think it’s only going to improve from here, and I think it’s a worthwhile investment if you want to integrate voice recognition and voice commands into your homemade projects!

Add Speech Recognition Easily!

Just found out about a cool HTML5 Web API that is built into Google Chrome.


Since it is vendor prefixed, this API will only work if a user’s browser supports it (for now, Chrome). The cool thing about it, though, is that to use it on your site, all you need to do is write some JavaScript that checks for “webkitSpeechRecognition”, creates an instance of it, and then uses it. Combine this with your IoT devices and you’ve got voice-enabled commands for interacting with your devices!
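
Roughly, the check-create-use flow looks something like the sketch below. The `createRecognizer` helper, its parameters, and the `en-US` setting are my own choices for illustration. Only `webkitSpeechRecognition` and the properties set on it come from the browser.

```javascript
// Returns a configured recognizer, or null when the browser lacks the API.
// globalObj is normally `window`; passing it in keeps the function testable.
function createRecognizer(globalObj, onCommand) {
  if (!("webkitSpeechRecognition" in globalObj)) {
    return null; // not supported in this browser
  }
  const recognition = new globalObj.webkitSpeechRecognition();
  recognition.continuous = false;     // stop after one phrase
  recognition.interimResults = false; // only deliver final results
  recognition.lang = "en-US";         // assumed language
  recognition.onresult = (event) => {
    // First alternative of the first result is the best guess.
    const transcript = event.results[0][0].transcript;
    onCommand(transcript.trim().toLowerCase());
  };
  recognition.onerror = (event) => {
    console.error("Speech recognition error:", event.error);
  };
  return recognition;
}

// Usage in a page. The guard keeps this from running outside a browser.
if (typeof window !== "undefined") {
  const recognizer = createRecognizer(window, (command) => {
    // Hand the recognized text to your own device logic here.
    console.log("Heard command:", command);
  });
  if (recognizer) {
    recognizer.start(); // Chrome asks the user for microphone permission
  } else {
    console.log("Speech recognition is not supported in this browser.");
  }
}
```

Lowercasing and trimming the transcript before handing it on makes it easier to match against a fixed list of commands for your devices.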

This tutorial walks you through the basics of how it gets integrated:

This is the full sample: