Ars Technica has a good primer on UAV control theory and machine learning. The good stuff starts around page 2. Here's an excerpt (citing this Stanford research):
The helicopter is performing an entire airshow sequence, but the flights in the video are completely automated! As the helicopter flies through the various maneuvers, the computer has to deal with all kinds of complicated aerodynamics and aero-elastic effects. The rotor blades flex and twist and the helicopter has to fly through turbulence generated by its own rotor. These things are hard to measure and almost impossible to model mathematically with accuracy.
Given how nasty the aerodynamics get, the Stanford researchers couldn’t design the controller the same way as the quadrotor designs we talked about above. Instead, the Stanford team turned to the best authority they could find on helicopter flight: a skilled human radio-control (RC) pilot. They had the helicopter learn to fly by watching the pilot and flying itself using machine learning.
Again, there were two parts to this process: low-level control and high-level trajectory generation. The researchers needed control algorithms that could keep the helicopter stable and in control during the kind of extreme flights they wanted to perform. They also had to come up with some way to represent the sequence of airshow maneuvers to fly.
For the control algorithms, the team used a technique known as reinforcement learning, or RL. They started by recording the position, orientation, and speed of the helicopter, plus control data like joystick and throttle positions from a skilled human pilot. Then they took a basic model of the helicopter in simulation and paired it with a basic control algorithm, and simulated the flights thousands of times. Each time, the computer adjusted the model and the control algorithms to try to get the simulated angles and speeds closer to the real flight from the human pilot. By running the flights in simulation but with real data, they were able to learn a control algorithm that could fly the helicopter at the extreme conditions for the aerobatics, as well as a model for how the helicopter behaves in those conditions.
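To make the "adjust the model until the simulation matches the recorded flight" idea concrete, here is a minimal Python sketch. It is not the Stanford team's method: it assumes a hypothetical one-dimensional model x[t+1] = a*x[t] + b*u[t], uses synthetic data in place of real pilot recordings, and swaps the iterative adjustment described above for a one-shot least-squares fit. The names simulate, fit_model, true_a, and true_b are made up for illustration.

```python
import numpy as np

def simulate(a, b, x0, controls):
    """Roll out the assumed toy model x[t+1] = a*x[t] + b*u[t]."""
    xs = [x0]
    for u in controls:
        xs.append(a * xs[-1] + b * u)
    return np.array(xs)

def fit_model(states, controls):
    """Least-squares estimate of (a, b) from recorded states and controls."""
    # Each row pairs the current state and control with the next state.
    inputs = np.column_stack([states[:-1], controls])
    targets = states[1:]
    coeffs, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    return float(coeffs[0]), float(coeffs[1])

rng = np.random.default_rng(0)

# Synthetic stand-in for the pilot's recorded stick inputs.
pilot_controls = rng.uniform(-1.0, 1.0, size=200)

# Hypothetical "real" dynamics, unknown to the learner.
true_a, true_b = 0.9, 0.5
recorded = simulate(true_a, true_b, 0.0, pilot_controls)
recorded += rng.normal(0.0, 0.01, size=recorded.shape)  # sensor noise

# Fit the model, then replay the same controls through it.
a_hat, b_hat = fit_model(recorded, pilot_controls)
replay = simulate(a_hat, b_hat, recorded[0], pilot_controls)
replay_error = float(np.sqrt(np.mean((replay - recorded) ** 2)))

print(f"a_hat={a_hat:.3f}, b_hat={b_hat:.3f}, replay RMS error={replay_error:.4f}")
```

A real helicopter model has many more states and is strongly nonlinear, so the fit would be far harder than this, but the loop is the same in spirit: replay the pilot's controls through the model and shrink the gap between simulated and recorded motion.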
Now, to get the helicopter to perform the extreme maneuvers shown in the video, the research team also had to specify the angles and positions of the helicopter (just like the paths generated for the quadrotors before). But this turns out to be quite complicated.
During these flights, the helicopter is performing in a highly nonlinear way. This is engineer-speak for being very unpredictable: as the helicopter spins and pirouettes, it flies through its own prop wash and turns through large angles. In this case, the aerodynamics get so complicated that it’s impossible to do what we talked about before: picking out desired intermediate points and having the computer simply connect the dots. If the helicopter doesn’t exit one maneuver with the right speed and orientation, it won’t be able to complete the next one. The whole sequence is an intricate, interdependent chain.
To get around this, the Stanford team again turned to their human expert pilot. They had the pilot take the helicopter through the entire airshow routine again and again, giving the helicopter a set of examples. However, this was just the start. The human pilot isn’t perfect; every performance differs slightly from the others. Sometimes turns are cut a little short, or a flip is faster than the pilot might have wanted. But each flight has thousands of data points, each with different angles and positions. So how do you know which part of each routine is correct?
In this case, you can’t combine the trajectories and try to average out the error. Some maneuvers were performed slightly faster or slower, and it’s almost impossible to line things up. Simply cutting and pasting different maneuvers and stunts from different flights gets nonsensical results: the start of a flip may depend on the speed coming out of the loop before, and the slightest error means the difference between a completed maneuver and a crash. You can’t simply take a flip from one flight and a loop from another and just hope they work together.
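A small Python sketch shows why point-by-point averaging breaks down when the demonstrations are not lined up in time. Everything here is an illustrative assumption: demo is a hypothetical stand-in for the pitch-angle trace of a single flip, modeled as a bump whose timing and duration scale with how fast the pilot flew it.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 500)

def demo(speed):
    """Hypothetical pitch trace of one flip flown at a relative speed."""
    # A faster flight reaches the flip earlier and finishes it sooner.
    center = 0.5 / speed
    width = 0.05 / speed
    return np.exp(-0.5 * ((t - center) / width) ** 2)

# Five flights of the "same" maneuver at slightly different speeds.
demos = np.array([demo(s) for s in (0.8, 0.9, 1.0, 1.1, 1.25)])

# Naive combination: average the flights sample by sample.
naive_average = demos.mean(axis=0)

print("peak of each flight:", np.round(demos.max(axis=1), 3))
print("peak of naive average:", round(float(naive_average.max()), 3))
```

Every individual flight reaches nearly the full peak, but the averaged trace is a smeared, much lower bump that matches none of them. The errors do not cancel; the misaligned timing blurs the maneuver itself.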