This summer, during my internship at Navstik Labs, I took up a project to implement vision-based object tracking and following, using a gimbal-mounted camera on a drone. Some ready-to-fly drones already offer this capability. However, integrating it into your own custom-built drone is still quite complex.

Anybody can guess that tracking and following objects with a gimbal needs some OpenCV code and a PID controller; what else could possibly be required! And that is all I had to write. The challenge, however, was to get it up and running on the hardware and to provide a user-friendly web interface on a remote machine for the live video feed, object selection and controller tuning. All this had to be done in a matter of days, not months, by a person with no prior experience of working with drones (me!).

Thanks to the FlytPOD/FlytOS platform, I did not have to spend any time interfacing the autopilot, gimbal and camera with the companion computer, installing and configuring ROS and OpenCV to work with it, or even setting up the wifi link. FlytPOD/FlytOS provided all of that out of the box. That was a huge timesaver!

Drone with camera, 3-axis gimbal and FlytPOD

Just a few connections between the gimbal and FlytPOD and I was ready to roll. I had APIs to command the gimbal and drone from ROS, access to the camera feed in OpenCV, and a high-bandwidth wifi link between FlytPOD and my laptop to monitor live data and set control parameters.

Starting off with OpenCV, I got a color- and shape-based object detector working, with the detection handled by the Hough circle transform. Later, I also used Zdenek Kalal's TLD algorithm, ported to FlytOS, which uses machine learning to track objects in a video stream. Then I put together an onboard app, which calls ROS services for the FlytGimbal API from FlytPOD. The next step was to extract the centroid of the object, which gives me the angular setpoints for the gimbal.
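
To illustrate that last step, here is a minimal sketch of how a centroid in pixel coordinates could be turned into yaw and pitch errors for the gimbal. It assumes a simple pinhole camera model with no lens distortion. The function name, the image size and the field-of-view values are hypothetical and are not taken from the FlytOS code.

```python
import math


def pixel_to_angle_error(cx, cy, width, height, hfov_deg, vfov_deg):
    """Return (yaw_error_deg, pitch_error_deg) of a centroid relative to the image centre.

    Assumes an ideal pinhole camera; hfov_deg and vfov_deg are the full
    horizontal and vertical fields of view (hypothetical parameters).
    """
    # Focal lengths in pixels, derived from the field of view.
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = (height / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)

    # Offset of the centroid from the image centre, in pixels.
    dx = cx - width / 2.0
    dy = cy - height / 2.0

    yaw_error = math.degrees(math.atan2(dx, fx))
    # Image y grows downward, so flip the sign to make "up" positive.
    pitch_error = -math.degrees(math.atan2(dy, fy))
    return yaw_error, pitch_error


# Example: an object in the top-right corner of a 640x480 frame.
yaw, pitch = pixel_to_angle_error(640, 0, 640, 480, hfov_deg=60.0, vfov_deg=45.0)
```

An object at the image centre gives zero error on both axes, and an object at the edge of the frame gives half the field of view. These errors are what a controller would then try to drive to zero.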

FlytGimbal: web app for selecting objects to track and tuning gains

Implementing and tuning the PID controller was the real experimental part, and it could have taken me a lot of time had there not been a web app to change my settings during testing. One of my fellow interns built this nice-looking web app, FlytGimbal, within a day, using the RESTful APIs available in FlytOS. Now, from my browser, I could select any object from the video feed with just a mouse drag, watch for falsely detected objects, or tune my PID gains. FlytOS took care of all the data plumbing between my ROS code and this web app.
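
For readers who have not written one before, a PID loop of this kind can be sketched in a few lines. This is a generic textbook form, not the code that runs on FlytPOD: the class name, the gains and the output limit are all hypothetical, and the output is simply clamped with no anti-windup on the integral term.

```python
class PID:
    """Minimal PID controller with a clamped output (illustrative only)."""

    def __init__(self, kp, ki, kd, out_limit):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.out_limit = out_limit  # symmetric limit on the command
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the command for the current error; dt is the time step in seconds."""
        self.integral += error * dt
        # No derivative on the first call, since there is no previous error yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error

        output = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp the command to what the gimbal is allowed to do.
        return max(-self.out_limit, min(self.out_limit, output))


# Example: one controller per axis, fed with the angular error of the target.
yaw_pid = PID(kp=0.8, ki=0.05, kd=0.1, out_limit=30.0)
command = yaw_pid.update(error=5.0, dt=0.05)
```

The three gains are exactly the values that are convenient to expose in a tuning interface, since each one changes the behaviour of the loop without touching the code.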

Now, with this I can ask the gimbal/camera to keep looking at my dog or car, while the drone is free to fly all over the place. I am planning to take this work ahead to solve interesting problems, like, having a drone follow me visually, or land on a moving object, etc.

Read more about FlytOS, FlytPOD here.

Here is the link to the code for the web app and the OpenTLD algorithm ported to FlytOS.

FlytPOD is live on Indiegogo, support us and grab your own Flight Computer.



  • With USB CAM (Logitech C920 USB HD Camera) at the FlytOS - Pi3, I am able to get the live video feed, see following video clip;

    From the FlytConsole, I am able to get and change parameters and waypoints in the Pixhawk, but the following things are still not working:

    1. No GPS status update from the Pixhawk while it is connected to the Pi3; however, in standalone mode, when I connect the Pixhawk to a PC via USB, the GPS status comes up and the GPS gets a lock.

    2. When the 5MP CSI camera module is hooked up to the Pi3 instead of the USB camera, no live video feed comes through.

    3. HUD does not change in the FlytConsole when I tilt the Pixhawk which is connected with Pi3 running FlytOS.

    4. Can't get the object tracking working in the flytfollowme.

  • well, it certainly wouldn't do miracles

    higher precision, the angles are set as floats, for small angle ranges the PWM resolution is ok, but for e.g. 330° it's just 0.3°

    with PWM you can't do a 360° turn around, so the target must always be to some extent forward

    once established you easily can use all other functions, have free look, etc.

  • You are absolutely right, we have endless power on our side. The onboard STorM32 takes care of the internal loop rate, as you rightly mentioned. We can surely have a better algorithm running on our side, on top of the STorM32 gimbal controller. Currently we only have a PID loop running on FlytOS, commanding the STorM32 gimbal controller, but we can surely improve on that, and all your suggestions are being taken into consideration.

    Regarding your point about serial communication versus the PWM commands we are using: could you please elaborate on what advantage you think serial communication would give us?

  • thx a lot, helped a lot

    I would think that the D in the PID achieves similar to a prediction of the target object location, or the beta in the a-b-g filter.

    I thought a bit more about what you're doing, and I'm not sure if what I had originally in mind is really helpful to you. I mean, you have, compared to the gimbal controller, endless computational power available on your side and can run the most sophisticated algorithms. Any code in the gimbal would "only" have the advantage that it can run and correct at the internal loop rate (and bypass the acceleration limiter). Considering how slow any desired gimbal movements are in comparison to that, this likely is no advantage at all. With the serial commands you could easily send high-precision data at a 100 Hz rate, or faster, which should be more than enough.

    Anyways, if at some point it occurs to you that the STorM32 could do something better which would help you to achieve a more accurate object tracking, please just contact me, I will listen. :)

  • Ok, to begin with we can have two simultaneous tracking algorithms working in tandem.

    The first one is in OpenTLD (where TLD stands for Tracking-Learning-Detection). This kind of visual tracking typically involves predicting and correcting where the target object is going to be in the next frame, as a simple Kalman filter would. It just provides the pixel coordinates of the target object in the current camera/image frame.

    The second one is in the controller, which makes sure that the camera (using the gimbal) is always pointing towards the target object. Here the algorithm would predict what the gimbal angle command should be in the next iteration and then use feedback from OpenTLD about the exact location of the target object to send the correct gimbal commands. Currently we are NOT using any tracking algorithm in this part. We have simply implemented a PID control strategy, which DOES NOT predict the target object's location. It simply expects feedback from OpenTLD and sends gimbal commands.

    I hope it helps!

  • ah, thx for the clarification. I would think you would profit from using a serial data line instead of PWM.

    I might be confused by the wording "tracking part", but in my little understanding OpenTLD gives you the target coordinates (angles, position, whatever), which are the crucial input for tracking, but doesn't do the tracking itself. Please correct me if that's wrong and OpenTLD does in fact include some code to minimize the difference between a target and a current orientation, which I would consider the tracking part.

    I've not thought through what this PD control strategy would be in detail, but I would think it's similar to an alpha-beta filter. These filters, as far as I understand them, are a form of simplified Kalman filter, meaning that a real KF would behave better. A comparative study would indeed be useful.

    Thx for the link, I totally missed that. Need to read that now. :)


  • Hi,

    Sorry for the misunderstanding, but we are indeed using the STorM32 controller for the gimbal. As you may already know, the controller does provide its current orientation, but we don't need that information as we are using a PID control strategy, as my colleague suggested before. It also accepts other modes of commands for the gimbal angles, but we were comfortable with using PWM signals, so we went ahead with them.

    Regarding the a-b-g filter, it is a nice find. We are also facing issues with fast-moving objects. Currently we are using OpenTLD, which takes care of the tracking part. We have also implemented HSV-based object detection and Kalman-filter-based object tracking, but we have not yet done a comparative study of the two techniques.

    GitHub repository: flytbase/flyt_open_tld_3d
  • It looks interesting stuff. Thanks for the info OlliW.

  • You say it can only accept PWM commands, so it is not a STorM32 controller, right? I was kind of sure it is, from the looks of the gimbal in the video. LOL.

    I happen to be the developer of the STorM32, and recently got attracted to the idea of implementing some specific tracking code in it. The primary motivation was digaus' great tracking application, where, however, a significant lag in "fast" motions remained, which I thought could be improved with better tracking algorithms. I figured that such a thing could also be of potential interest in your application. I had scanned the web a bit and found that a-b-g filter, which seems to be kind of a standard (besides a controller). I was just very curious about what you're doing in that regard. But since it's not a STorM32 it's of no relevance.

    Keep up the excellent work.
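
For reference, the alpha-beta filter discussed in the comments above can be sketched as follows. This is the generic textbook form, applied to a single coordinate such as the target's horizontal pixel position or yaw angle. It is not code from FlytOS or the STorM32 firmware, and the class name and gain values are hypothetical. Unlike a plain PID loop fed with raw measurements, it keeps a velocity estimate and can therefore predict where the target will be one step ahead.

```python
class AlphaBetaFilter:
    """Textbook alpha-beta filter for one coordinate (illustrative only)."""

    def __init__(self, alpha, beta, x0=0.0, v0=0.0):
        self.alpha = alpha  # weight of the residual in the position correction
        self.beta = beta    # weight of the residual in the velocity correction
        self.x = x0         # estimated position
        self.v = v0         # estimated velocity

    def predict(self, dt):
        """Predicted position dt seconds ahead, without changing the state."""
        return self.x + self.v * dt

    def update(self, measurement, dt):
        """Fold in a new measurement taken dt seconds after the previous one."""
        x_pred = self.predict(dt)
        residual = measurement - x_pred
        # Correct position and velocity in proportion to the residual.
        self.x = x_pred + self.alpha * residual
        self.v = self.v + (self.beta / dt) * residual
        return self.x


# Example: smooth a stream of noisy target positions sampled at 20 Hz.
f = AlphaBetaFilter(alpha=0.5, beta=0.1)
for z in [0.0, 1.1, 1.9, 3.2, 4.0]:
    estimate = f.update(z, dt=0.05)
next_position = f.predict(dt=0.05)
```

Adding a third gain for acceleration gives the alpha-beta-gamma variant. Both use fixed gains, whereas a Kalman filter recomputes its gain at each step from the modelled noise.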
