Last month Boston Dynamics released a video showing off a new feature of their BigDog robotic mule: voice control.  I had been working on a similar project for a while, though I think mine is actually cooler: voice control of a flying robot drone.

Above is a video I made last night of the current state of my project, which shows an AR.Drone doing whatever I tell it (mostly).

In the video I'm talking to my custom voice control software running on my laptop, which uses ar-drone-rest and node-ar-drone to talk to the quadrotor.

I'm currently working on doing the speech recognition on an iPhone or Android phone, so you don't need to talk into an unwieldy laptop.


Comment by Ruwan on January 11, 2013 at 4:20pm

Great work John

Comment by Chris Webb on January 11, 2013 at 5:59pm

Amazing work, but did anyone else feel sorry for the drone when he had to force it to land?

Comment by John Wiseman on January 11, 2013 at 6:39pm

No drones were harmed in the making of this video.

Comment by Chris Anderson on January 12, 2013 at 4:27pm

So cool! Will this work with/through MAVelous?

Comment by John Wiseman on January 12, 2013 at 6:55pm

My original plan was to do voice control with the ArduCopter, and mavelous was really just a side project to learn about the mavlink protocol.  When I learned how easy it was to use the API developed by Felix Geisendörfer to control AR.Drones, I bought one and created this demo in a couple of hours.

I'm interested in multi-modal interfaces: Talking to the drone when that makes sense, tapping on a map or using a GCS like mavelous when that's appropriate, and occasionally grabbing the sticks and flying it by hand.  The goal is to make it easy to switch between those interfaces fluidly, resolving conflicts in a reasonable way.

The military and NASA are interested in drones that can take instructions from air traffic controllers in case the radio link between the drone and the pilot has been lost, and one way they've considered doing that is via voice commands.  The idea is that this will make integration of drones into the air space safer.

Like I said, this demo was created quickly.  I'll be working on adding a lot more language.  Some examples I've thought of:

  • Giving it information about the state of the world: "There's a person standing 3 feet in front of you.  Fly around them."
  • More flight modes: "Follow me," "Fly to the red marker."
  • Plans: "Go 10 feet forward, then climb to 20 feet, then go 10 feet left."
  • Queries: "What's the battery level?" "How high are you?" "Where are you?" "How many cars do you see?"
  • Named locations: "Position Alpha is 30 degrees bearing from me, 100 feet away. Go to position alpha."
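
As a rough illustration of the named-locations idea, here is a minimal sketch of turning "30 degrees bearing, 100 feet away" into an offset the drone could fly to. None of this exists yet: the function names, the north/east frame, and the assumption that bearing is measured clockwise from north are all hypothetical.

```javascript
// Hypothetical sketch: remember named positions as offsets from the speaker.
// Assumes bearing is measured clockwise from north, distance in feet.
function offsetFromBearing(bearingDegrees, distanceFeet) {
  const radians = (bearingDegrees * Math.PI) / 180;
  return {
    north: distanceFeet * Math.cos(radians),
    east: distanceFeet * Math.sin(radians),
  };
}

// Names are stored lowercased so "Alpha" and "alpha" refer to the same place.
const namedLocations = new Map();

function defineLocation(name, bearingDegrees, distanceFeet) {
  namedLocations.set(
    name.toLowerCase(),
    offsetFromBearing(bearingDegrees, distanceFeet)
  );
}

function lookupLocation(name) {
  return namedLocations.get(name.toLowerCase()) || null;
}

// "Position Alpha is 30 degrees bearing from me, 100 feet away."
defineLocation('Alpha', 30, 100);

// "Go to position alpha."
const alpha = lookupLocation('alpha');
console.log(alpha.north.toFixed(1), alpha.east.toFixed(1));
```

With those numbers, Alpha ends up roughly 86.6 feet north and 50 feet east of the speaker.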

A couple other project ideas that follow naturally from this one & thinking about ArduCopter:

  • Extend the mavlink project with a JavaScript generator, to take advantage of the energy behind the robotjs movement.
  • Develop an API that's as simple to use as Felix's node-ar-drone library, but for any vehicle that speaks mavlink (and takes advantage of the extra capabilities of ArduCopter etc.)

Comment by Jesper Andersen on January 18, 2013 at 3:46am

What translates the words into actual commands? How is "move 3 feet forward" interpreted (in particular, how is the x feet achieved: through some sensor, or by calculating distance from speed)?

Comment by John Wiseman on January 18, 2013 at 9:22am

The speech recognition & parsing code is running on my laptop, so I talk to it--there is no microphone on the drone, and if there was it would be hard for it to hear anything over the sound of the motors.  I am working on a version that lets you talk to your phone.

This is the code I use for parsing the text that comes back from the speech recognizer:

Then there's some custom code that processes the results that come back from the parser and makes calls to the drone API.
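
To give a feel for the parsing step, here is a minimal sketch of turning recognized text into a command object. It is not the actual parser: the tiny vocabulary, the regular expression, and the shape of the returned object are all assumptions made up for this example.

```javascript
// Hypothetical sketch: map recognized speech to a simple command object.
// The vocabulary and command shape here are illustrative assumptions.
const NUMBER_WORDS = new Map([
  ['one', 1], ['two', 2], ['three', 3], ['four', 4], ['five', 5],
  ['six', 6], ['seven', 7], ['eight', 8], ['nine', 9], ['ten', 10],
]);

// Accept either a spelled-out number ("three") or digits ("3").
function parseNumber(token) {
  if (NUMBER_WORDS.has(token)) return NUMBER_WORDS.get(token);
  const n = Number(token);
  return Number.isFinite(n) ? n : null;
}

// Returns a command object, or null if the text is not understood.
function parseCommand(text) {
  const s = text.toLowerCase().trim().replace(/[.!?]+$/, '');
  if (/^take ?off$/.test(s)) return { action: 'takeoff' };
  if (s === 'land') return { action: 'land' };

  const m = s.match(
    /^(?:move|go|fly) (\S+) (?:foot|feet) (forward|back|left|right|up|down)$/
  );
  if (m) {
    const feet = parseNumber(m[1]);
    if (feet !== null && feet > 0) {
      return { action: 'move', direction: m[2], feet };
    }
  }
  return null;
}

console.log(parseCommand('Move 3 feet forward'));
console.log(parseCommand('Take off.'));
console.log(parseCommand('Do a barrel roll'));
```

The code that processes the results would then switch on `action` and make the matching drone API calls; anything that comes back as null is simply ignored.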

The AR.Drone can send telemetry data that includes its best estimate of its position, which they say is a result of fusing data from "all sensors", so I assume it's based on force readings from the accelerometer plus distance estimates from the downward facing camera doing optical flow (with help from the sonar to get altitude).  To "move three feet forward" I do the following: 1. Start watching the position telemetry data. 2. Set the forward velocity. 3. Stop the drone when the desired distance has been traveled.
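
Those three steps can be sketched roughly as follows. This is a simplified illustration rather than the real code: the `drone` object with `onPosition`, `front`, and `stop` methods is a stand-in invented for the example (it is not the node-ar-drone API), positions are assumed to arrive as `{x, y}` in meters, and a fake drone is used so the sketch runs without hardware.

```javascript
// Hypothetical sketch of "move N feet forward" using position telemetry.
// `drone` is any object with onPosition(cb), front(speed) and stop();
// these names are assumptions for the sketch, not a real library's API.
function moveForward(drone, targetMeters, speed, onDone) {
  let start = null;
  let finished = false;

  // 1. Start watching the position telemetry.
  drone.onPosition((pos) => {
    if (finished) return;
    if (start === null) start = pos; // first sample becomes the origin
    const traveled = Math.hypot(pos.x - start.x, pos.y - start.y);
    // 3. Stop the drone when the desired distance has been traveled.
    if (traveled >= targetMeters) {
      finished = true;
      drone.stop();
      if (onDone) onDone(traveled);
    }
  });

  // 2. Set the forward velocity.
  drone.front(speed);
}

// A fake drone that records calls and lets us feed in position samples.
function makeFakeDrone() {
  const listeners = [];
  const log = [];
  return {
    log,
    onPosition(cb) { listeners.push(cb); },
    front(speed) { log.push(`front ${speed}`); },
    stop() { log.push('stop'); },
    emit(pos) { listeners.forEach((cb) => cb(pos)); },
  };
}

const FEET_TO_METERS = 0.3048;
const fake = makeFakeDrone();
let result = null;
moveForward(fake, 3 * FEET_TO_METERS, 0.2, (d) => { result = d; });

// Simulated telemetry: the drone drifts forward 0.3 m per sample.
for (const x of [5.0, 5.3, 5.6, 5.9, 6.2]) fake.emit({ x, y: 0 });
console.log(fake.log, result);
```

Because the drone is only stopped after a telemetry sample shows the target has been reached, it will overshoot a little; how much depends on the speed and on how often position updates arrive.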


ArduCopter has the advantage of having GPS, so it can know its absolute position (while the AR.Drone only knows its relative position), but I'm curious to experiment and see if ArduCopter is as accurate as the AR.Drone over small distances (0-10 feet).

Comment by Jesper Andersen on January 18, 2013 at 12:00pm

Terribly interesting! I'd like to do some voice interaction with my usav drone. I will probably bug you with 10000000 questions...

Comment by Joshua Johnson on March 17, 2013 at 9:06pm

John, I purchased a voice-controlled helicopter that can be controlled by voice while also being controlled, very fluently, via an infrared controller.  I was just about to post a discussion asking for assistance in reverse engineering (sort of) the helicopter and the controller to build a product that is compatible with the APM 2.5 and MAVelous.  Would you be interested in more information on this helicopter?

