Voice Controlled Drone

Last month Boston Dynamics released a video showing off a new feature of their Big Dog robotic mule: voice control. I had been working on a similar project for a while, though I think mine is actually cooler: voice control of a flying robot drone.

Above is a video I made last night of the current state of my project, which shows an AR.Drone doing whatever I tell it (mostly).

In the video I'm talking to my custom voice control software running on my laptop, which uses ar-drone-rest and node-ar-drone to talk to the quadrotor.

I'm currently working on doing the speech recognition on an iPhone or Android phone, so you don't need to talk into an unwieldy laptop.



  • John, I purchased a voice-controlled helicopter that responds to voice commands and can also be flown very fluently with its infrared controller. I was just about to post a discussion asking for help reverse engineering (sort of) the helicopter and its controller to build a product that is compatible with the APM 2.5 and MAVelous. Would you be interested in more information on this helicopter?

  • Terribly interesting! I'd like to do some voice interaction with my usav drone. I will probably bug you with 10000000 questions...
  • The speech recognition & parsing code is running on my laptop, so I talk to the laptop. There is no microphone on the drone, and if there were, it would be hard for it to hear anything over the sound of the motors. I am working on a version that lets you talk to your phone.

    This is the code I use for parsing the text that comes back from the speech recognizer: https://github.com/wiseman/energid_nlp

    Then there's some custom code that processes the results that come back from the parser and makes calls to the drone API.

    The AR.Drone can send telemetry data that includes its best estimate of its position, which they say is a result of fusing data from "all sensors", so I assume it's based on force readings from the accelerometer plus distance estimates from the downward facing camera doing optical flow (with help from the sonar to get altitude). To "move three feet forward" I do the following:

    1. Start watching the position telemetry data.
    2. Set the forward velocity.
    3. Stop the drone when the desired distance has been traveled.
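
    Here is a minimal sketch of those three steps in Python. It is not my actual code and does not use the node-ar-drone API: `SimulatedDrone`, `set_forward_velocity`, `read_position`, and `stop` are hypothetical names, and I'm assuming the position estimate is reported in meters. The simulated drone simply integrates its velocity so the loop can be run without hardware.

```python
import math

FEET_TO_METERS = 0.3048


class SimulatedDrone:
    """Hypothetical stand-in for a drone client (not the node-ar-drone API)."""

    def __init__(self, dt=0.05, max_speed_m_s=1.0):
        self.x = 0.0          # estimated position, meters (assumed unit)
        self.y = 0.0
        self.vx = 0.0         # current forward velocity, m/s
        self.dt = dt          # seconds between telemetry samples
        self.max_speed = max_speed_m_s

    def set_forward_velocity(self, fraction):
        """Set forward speed as a fraction (0..1) of the maximum speed."""
        self.vx = fraction * self.max_speed

    def stop(self):
        self.vx = 0.0

    def read_position(self):
        """Advance the simulation one tick and return the (x, y) estimate."""
        self.x += self.vx * self.dt
        return (self.x, self.y)


def move_forward(drone, distance_feet, speed=0.3, max_ticks=10_000):
    """Fly forward until the position estimate says we covered the distance."""
    target_m = distance_feet * FEET_TO_METERS
    start_x, start_y = drone.read_position()    # 1. start watching telemetry
    drone.set_forward_velocity(speed)           # 2. set the forward velocity
    travelled = 0.0
    try:
        for _ in range(max_ticks):              # bounded so a stuck estimate can't loop forever
            x, y = drone.read_position()
            travelled = math.hypot(x - start_x, y - start_y)
            if travelled >= target_m:
                break
    finally:
        drone.stop()                            # 3. stop once the distance is covered
    return travelled


drone = SimulatedDrone()
travelled_m = move_forward(drone, 3)            # "move three feet forward"
```

    Because the check only happens when a new telemetry sample arrives, the drone overshoots by up to one sample's worth of travel (plus however long a real drone takes to brake), so a lower speed gives a more accurate stop.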


    ArduCopter has the advantage of having GPS, so it can know its absolute position (while AR.Drone only knows its relative position), but I'm curious to experiment and see if ArduCopter is as accurate as AR.Drone over small distances (0-10 feet).

  • What translates the words into actual commands? How is "move 3 feet forward" interpreted (especially, how is x feet achieved: through some sensor, or by calculating distance from speed)?

  • My original plan was to do voice control with the ArduCopter, and mavelous was really just a side project to learn about the mavlink protocol.  When I learned how easy it was to use the API developed by Felix Geisendörfer (https://github.com/felixge/node-ar-drone) to control AR.Drones, I bought one and created this demo in a couple hours.

    I'm interested in multi-modal interfaces: Talking to the drone when that makes sense, tapping on a map or using a GCS like mavelous when that's appropriate, and occasionally grabbing the sticks and flying it by hand.  The goal is to make it easy to switch between those interfaces fluidly, resolving conflicts in a reasonable way.

    The military and NASA are interested in drones that can take instructions from air traffic controllers in case the radio link between the drone and the pilot has been lost, and one way they've considered doing that is via voice commands.  The idea is that this will make integration of drones into the air space safer.

    Like I said, this demo was created quickly.  I'll be working on adding a lot more language.  Some examples I've thought of:

    • Giving it information about the state of the world: "There's a person standing 3 feet in front of you.  Fly around them."
    • More flight modes: "Follow me," "Fly to the red marker."
    • Plans: "Go 10 feet forward, then climb to 20 feet, then go 10 feet left."
    • Queries: "What's the battery level?" "How high are you?" "Where are you?" "How many cars do you see?"
    • Named locations: "Position Alpha is 30 degrees bearing from me, 100 feet away. Go to position alpha."
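
    For the named-location example, the spoken bearing and distance have to be turned into a position offset before the drone can fly there. Here is a rough sketch of that arithmetic in Python. The function and dictionary names are hypothetical, and I'm assuming "bearing" means a compass bearing measured clockwise from north, with the speaker at the origin.

```python
import math

FEET_TO_METERS = 0.3048

# Hypothetical store of named positions as (north_m, east_m) offsets.
named_locations = {}


def offset_from_bearing(bearing_deg, distance_feet):
    """Convert a compass bearing (clockwise from north) and distance to (north_m, east_m)."""
    distance_m = distance_feet * FEET_TO_METERS
    theta = math.radians(bearing_deg)
    return (distance_m * math.cos(theta), distance_m * math.sin(theta))


def define_location(name, bearing_deg, distance_feet, origin=(0.0, 0.0)):
    """Remember a named position relative to the speaker's position (origin)."""
    north, east = offset_from_bearing(bearing_deg, distance_feet)
    position = (origin[0] + north, origin[1] + east)
    named_locations[name.lower()] = position    # lower-case so "Alpha" and "alpha" match
    return position


def lookup_location(name):
    """Return the stored (north_m, east_m) for a previously defined name."""
    return named_locations[name.lower()]


# "Position Alpha is 30 degrees bearing from me, 100 feet away."
alpha = define_location("Alpha", 30, 100)
# "Go to position alpha."
goal = lookup_location("alpha")
```

    An AR.Drone only knows its position relative to where it started, so in practice the origin would also have to be expressed in the drone's own frame; that step is left out here.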

    A couple other project ideas that follow naturally from this one & thinking about ArduCopter:

    • Extend the mavlink project with a JavaScript generator, to take advantage of the energy behind the robotjs movement.
    • Develop an API that's as simple to use as Felix's node-ar-drone library, but for any vehicle that speaks mavlink (and takes advantage of the extra capabilities of ArduCopter etc.)

  • 3D Robotics

    So cool! Will this work with/through MAVelous?

  • No drones were harmed in the making of this video.

  • Amazing work, but did anyone else feel sorry for the drone when he had to force it to land?

  • Moderator

    Great work John
