Skill Spotlight: The Misty Conversation Skill

As more Mistys are released into the wild, the Misty Team is excited to see the skills that developers build. Today, we’re talking with Cameron Henneke, a developer from the Boulder area, about the Misty Conversation Skill he recently built and shared in the Community Forum.

Cameron, welcome to the Misty blog! Before we dive into your skill, can you tell us a little about yourself and how you got involved with Misty Robotics?

Cameron Henneke

Sure! I’m an entrepreneur and founder of GQueues, which is a collaborative task manager for businesses that run on G Suite. At my core, I’m a software engineer who has enjoyed building products on the web for the last 15 years. 

My first experience programming a robot was a year ago when I first got my hands on a Misty prototype at a hackathon here in Boulder.  I was instantly hooked by the power to write code that causes an action to happen in the real world, instead of just lighting up pixels on a computer screen.

Is this the first skill you built?

Yes, this is the first skill I’ve built using the new JavaScript SDK that allows code to run on the robot itself. Last year I programmed a very early version of Misty using the REST API. It was very cool, but since the code lived and ran on my laptop, it came with limitations.

With the new platform for Misty II, developers can really start making the robot more autonomous, self-sufficient and useful. And isn’t that what all parents want for their robots when they grow up?

What does the Misty Conversation Skill do?

With this skill Misty can have a conversation with you on any range of topics, and it’s fully customizable! Misty listens for her wake word, records your voice input, and sends the audio off to Google DialogFlow for processing. DialogFlow converts the audio to text, uses machine learning to recognize intent, and sends back an appropriate response as an audio file for Misty to play. After you initially set up the skill, you can easily add training phrases and responses through DialogFlow’s web interface so Misty can have a conversation about anything!

Setting up the code for Misty and Google Cloud Functions 

Note: Be sure to check out Cameron’s Misty Conversation Skill on GitHub, where he describes how to set up and configure DialogFlow in addition to sharing his code.

Step 1: Listen for keyphrase:
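
The full implementation lives in Cameron’s repo; as a rough sketch of what this step can look like with Misty’s JavaScript SDK (the `misty` object exists only on the robot, so it’s stubbed here, and the exact argument lists are assumptions):

```javascript
// Stub of the on-robot misty object so this sketch runs off-robot;
// on Misty II the JavaScript SDK provides the real implementation.
const log = [];
const misty = {
  StartKeyPhraseRecognition: () => log.push("StartKeyPhraseRecognition"),
  RegisterEvent: (eventName) => log.push("RegisterEvent:" + eventName),
};

// Start listening for the wake word ("Hey, Misty") and register an
// event that fires when the keyphrase is recognized.
misty.StartKeyPhraseRecognition();
misty.RegisterEvent("KeyPhraseRecognized", "KeyPhraseRecognized", 10, false);

// By SDK convention, the callback is the event name prefixed with an underscore.
function _KeyPhraseRecognized(data) {
  log.push("wake word heard");
}
```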

Step 2: Record audio:
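
A sketch of the recording step, again with a stubbed `misty` object; the fixed four-second window and file name are illustrative choices, not Cameron’s actual values:

```javascript
// Stubbed misty object so this runs anywhere; the real calls run on the robot.
const log = [];
const misty = {
  StartRecordingAudio: (name) => log.push("record:" + name),
  Pause: (ms) => log.push("pause:" + ms),
  StopRecordingAudio: () => log.push("stop"),
  GetAudioFile: (name, base64) => log.push("get:" + name),
};

// Record the user's speech to a named file, wait a fixed window
// (the real skill can be smarter about detecting when speech ends),
// stop, then fetch the recording as base64 so it can be sent off-robot.
misty.StartRecordingAudio("capture_dialogflow.wav");
misty.Pause(4000);
misty.StopRecordingAudio();
misty.GetAudioFile("capture_dialogflow.wav", true);
```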

Step 3: Send audio to DialogFlow:
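
For the DialogFlow call, the skill uses Misty’s external-request capability. A sketch of the idea, with a stubbed `misty` object, a hypothetical project ID, and a placeholder token (the real `SendExternalRequest` takes more arguments; see the repo for the exact call):

```javascript
// Stub: the real SendExternalRequest has a longer argument list
// (auth type, save-to-file options, callback, etc.).
const log = [];
const misty = {
  SendExternalRequest: (method, url, token, body) => log.push(method + " " + url),
};

const accessToken = "placeholder-token";   // fetched from the Cloud Function in practice
const projectId = "my-dialogflow-project"; // hypothetical project ID
const sessionId = "misty-session-1";       // any unique conversation ID

// DialogFlow's v2 detectIntent endpoint accepts base64 audio plus an
// audio config, and can return the spoken reply as outputAudio.
const body = {
  queryInput: {
    audioConfig: { audioEncoding: "AUDIO_ENCODING_LINEAR_16", languageCode: "en-US" },
  },
  inputAudio: "<base64 audio from step 2>",
  outputAudioConfig: { audioEncoding: "OUTPUT_AUDIO_ENCODING_LINEAR_16" },
};

misty.SendExternalRequest(
  "POST",
  "https://dialogflow.googleapis.com/v2/projects/" + projectId +
    "/agent/sessions/" + sessionId + ":detectIntent",
  accessToken,
  JSON.stringify(body)
);
```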

Step 4: Process response from DialogFlow:
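
And a sketch of playing the reply: detectIntent returns the response audio as base64, which can be saved on the robot and played back (method names follow the Misty JavaScript API; the signatures are assumptions):

```javascript
// Stubbed misty object so the sketch runs off-robot.
const log = [];
const misty = {
  SaveAudio: (name, data, immediatelyApply, overwrite) => log.push("save:" + name),
  PlayAudio: (name) => log.push("play:" + name),
};

// Pretend this came back from DialogFlow; outputAudio is base64-encoded audio.
const response = { outputAudio: "<base64 reply audio>" };

// Save the reply on Misty, then play it as her side of the conversation.
misty.SaveAudio("dialogflow_response.wav", response.outputAudio, false, true);
misty.PlayAudio("dialogflow_response.wav");
```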

Why did you decide to build this skill? 

I built this skill because I believe voice input will be the most natural way for people to interact with Misty. All the other skills I’m dreaming about will be built on top of this primary conversation skill, whether it’s giving Misty commands, asking her questions, or having her share data she’s gathered. Plus, I want a Misty to be friendly, so she needs to learn how to hold an interesting conversation!

What capabilities does the Misty Conversation Skill incorporate? 

The conversation skill uses the wake word functionality to start the conversation, audio recording to listen, external requests to connect with DialogFlow, and saving/playing audio to respond to the person.

I laughed when Misty responded to your question asking about her age. Did you mean to make her funny?

I envision Misty being a funny, friendly robot who adds a little joy to people’s daily lives. So, that’s why in my tests I put some tongue-in-cheek answers to the questions you might ask Misty. However, the beauty of the Conversation skill (and the Misty platform as a whole) is that you can give Misty any kind of personality you want.

Do you envision others being able to use this skill in specific ways (i.e., for a specific business or personal purpose)?

Yes, the possibilities are endless! Since you can add your own intents and responses to your DialogFlow project, you can enable Misty to have a conversation about any topic you want. In a business setting, Misty could serve as a robot concierge, welcoming visitors, answering questions and getting people where they need to go.

How could others expand upon the Misty Conversation Skill? 

One natural expansion of the Conversation skill is to have Misty perform actions based on voice input. So you might say “Hey Misty, I’m thirsty” and Misty could respond “Would you like me to get you some water?” and then if you say “Yes,” Misty navigates to the kitchen and asks your in-home chef for a glass of water to deliver back to you.  

Why did you choose to use DialogFlow over other conversational interfaces?

I chose DialogFlow for two reasons:

  1. I listened to the voice samples on many other platforms and they all sounded very robotic. Ironically, I want my Misty robot to sound like a human. Unlike voices on other platforms that use concatenative text-to-speech, DialogFlow uses WaveNet technology developed by Google that employs machine learning models to generate speech that mimics human voices and sounds more natural.  
  2. I use Google APIs extensively for my startup GQueues, which helps people get organized and integrates with Google products.  So I’m very familiar with Google’s API platform, which made it easy for me to get a project in DialogFlow up and running quickly.

Did you run into any unexpected challenges while building and deploying your skill?

The Google DialogFlow API requires that a short-lived access token be sent along with each request to confirm authorized use. Importing and using a JavaScript JWT library on Misty seemed pretty complex, so instead I created a Google Cloud Function that runs a little Node.js code to provide an access token whenever needed. Admittedly, this is less secure than signing a JWT on the robot, but it’s sufficient for my current experiments.
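
The shape of that Cloud Function might look like the sketch below. The auth client is stubbed for illustration (a real version would use Google’s Node.js auth library with the service account’s credentials), and the token value is made up:

```javascript
// Stand-in for a real Google auth client; on Cloud Functions this would
// mint a genuine short-lived OAuth token for the DialogFlow scope.
const auth = {
  getAccessToken: () => ({ token: "stub-access-token" }),
};

// HTTP handler in the (req, res) shape Cloud Functions uses for HTTP
// triggers: Misty calls this endpoint and gets a token in the response body.
function getToken(req, res) {
  const { token } = auth.getAccessToken();
  res.send(token);
}

// Local smoke test with a fake response object.
const sent = [];
getToken({}, { send: (t) => sent.push(t) });
```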

Final question: Do you plan to incorporate the Misty Conversation Skill into a use case or to expand it into a larger skill?

Absolutely! Conversation is just the beginning. Next, I’m planning to develop an “Empathy Skill” which builds on top of Conversation.

For Empathy, Misty will roam around, and when she detects a new face she will stop moving, start face recognition, and then ask the person their name so she can store that info for the future. When she encounters that person again, she will take a picture of the person’s face and send it to the Google Vision API, which will determine if the face is happy, sad, angry, or surprised. Depending on the emotion detected, Misty will say something like “Hi Cameron, you look sad. What’s wrong?”

I’ve already started working on that Empathy Skill, actually. So far, Misty can take my picture, send it to the Vision API, and then say “You look sad. What’s wrong?”  I still need to build the face recognition part and then pull everything together.

Cameron, thank you for taking the time to share your Misty Conversation Skill today!

If you’re a developer who has a Misty skill you’d like us to highlight, share it in the Misty Community Forums and let us know.

Part of our team’s excitement is in learning more about the work you do: not only hearing your skill ideas but seeing you bring Misty to life through your code. We can think of so many ways other developers might take the skill Cameron shared and include it in their own use cases. This is only the beginning.

