The current architecture of drone autopilots is wrong: drone hardware architecture needs to be completely re-engineered.

I know this statement will raise the question: who is this guy coming and telling us that current-day autopilots are all wrong? Well, I have been flying RC planes since the age of 10 and have used autopilots ranging from the early ArduPilot of 2009-2010 vintage to more recent ones from 3DR, DJI and Feiyutech, including their cheap clones, over the last 5-6 years. I have been coding since the age of 15 and am now 21 years old.
Based on my experience with this wide range of autopilots, I have come to the conclusion that the hardware of the majority of autopilots is adapted from the world of data-based computing, which is made for processing huge chunks of predefined data and giving an appropriate notification or display. In data-based computing, inputs come from a slow-response data source such as Ethernet/internet or some sensor network; this data is processed, and the outputs are either notifications or a display and, in a few cases, some very slow-speed controls. Nothing involves high-speed control of a dynamic object, even on a single axis.
Hence the question: are these processors and this hardware made for controlling a dynamic moving object with freedom across three axes, like a drone?
After using all types of available autopilots, I realized that drone control at its core requires the following steps to be done repeatedly, as fast as possible:
1. reading sensor values and conveying them to the controller/processor
2. filtering these sensor values
3. pushing the filtered values into a PID loop
4. transferring control commands to the actuators for immediate action.
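
As a rough illustration of one pass through the four steps above, here is a minimal single-axis sketch in C++. It is not the code of any particular autopilot: the struct name AxisController, the first-order low-pass filter and all gain values are assumptions made for this example only.

```cpp
// Hypothetical single-axis controller; names and gains are illustrative only.
struct AxisController {
    float kp, ki, kd;   // PID gains (assumed, must be tuned per airframe)
    float alpha;        // low-pass coefficient, 0 < alpha <= 1
    float filtered;     // filter state (step 2)
    float integral;     // accumulated error (step 3)
    float prevError;    // previous error, used for the derivative term

    AxisController(float p, float i, float d, float a)
        : kp(p), ki(i), kd(d), alpha(a),
          filtered(0.0f), integral(0.0f), prevError(0.0f) {}

    // rawRate is the sensor reading obtained in step 1.
    // Returns the command that step 4 would pass to the actuator.
    float update(float rawRate, float targetRate, float dt) {
        // Step 2: first-order low-pass filter on the raw reading.
        filtered += alpha * (rawRate - filtered);

        // Step 3: PID on the filtered value.
        const float error = targetRate - filtered;
        integral += error * dt;
        const float derivative = (error - prevError) / dt;
        prevError = error;

        // Step 4: command to be sent to the actuator.
        return kp * error + ki * integral + kd * derivative;
    }
};
```

The loop that calls update() would run at a fixed rate, and dt is the period of that loop, so a faster loop simply means a smaller dt and fresher sensor data on every pass.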

This cycle needs to be repeated over and over again, the faster the better. This is what determines the stability of the drone: the higher the cycle rate (that is, the shorter the cycle time), the higher the stability. So what is needed in the case of drones is a continuous, high-speed input-output, action-reaction control system. I realized that drone control is not so much about data crunching as about the speed of the control cycle.

If the use of drones is to grow, developers have to be given the freedom to code for their applications without compromising this core control cycle. In the case of drones, developer code that causes a system hang will result in catastrophic outcomes such as crashes or fly-aways, both of which have been regularly reported with current autopilots. Achieving high control cycle speeds and isolating the flight controls is not possible with the current architecture of sequential processing; hence the future of drones is limited by the architecture of currently available autopilots.

So unless a new thought process emerges, drone use cannot grow exponentially. What is needed is a motherboard that is radically different from anything available today.

I have been working on this for a while now, and my first-hand experience is that the moment I shifted my focus to achieving higher-speed control loops with my self-designed autopilot, the level of stability and performance I have been able to get has been awesome, even in very high coastal wind speeds on a small 250 mm racer. I achieved this with the most primitive of microcontrollers, the one used in the first ArduPilot: the ATmega328. Things got even better when I replaced the MPU 6050 IMU with the MPU 9250.

With my custom-made Distributed Parallel Control Computing Bus I have been able to achieve altitude hold with a total drift of less than 1 meter, a very accurate heading hold, and GPS navigation on the 250 mm racer. All I did was add another ATmega328 in parallel with the first one to add these features.

Thanks to this I am able to completely isolate the core flight control loop from the app development code, so the drone is never compromised by faulty app code.

Distributed parallel control computing, I have found from my own experience, is an architecture that really has the potential to create exponential growth in drone applications. I would be interested to know of any other ways in which others are trying to address these unique core control-processing requirements of drones.


Replies to This Discussion

This makes a lot of sense from a couple of viewpoints. On a practical design level, multicore CPUs of even lower spec should perform better on short independent loops than a higher-spec processor (assuming proper threading of software). There are several paradigms in biology, the most obvious being reflex nerve paths for a lot of low-level 'processing'. Carrying this model forward, it would make sense to have cheap multicore processors handle well-threaded software for control loops, and perhaps a slightly heftier 'brain' processor that handles route planning, obstacle avoidance, etc. This seems to be a natural extension of the dual trends toward more purpose-specific architecture and natural or accidental mimicry of biological systems. 3 billion years of system optimisation for survival is hard to argue with.

I trust a lot in your forward thinking idea. Keep me updated!

@Dan Wilson:

“Separating the flight critical tasks from non flight critical”: yes, that is the purpose of my architecture, but I would phrase it as “isolating the flight-critical tasks from the non-flight-critical ones”. Access to the KCU (Core 1) is provided through a bus interface similar to USB, so the IHOU (Interface and Higher Operations Unit, Core 2) can only exert control via that interface (similar to a mouse giving commands to a PC's OS to move the pointer). Core 2's access to Core 1 functions is limited to a specific instruction set (like a mouse's driver software). Hence even erroneous code on Core 2 will not affect Core 1 in terms of timing dependencies or instruction execution, because the access instruction set is predefined and limited and we know exactly how long an access should take. Even if the bus hangs during an access (very unlikely), Core 1 simply withdraws once the time allocated for the access is over (one of the reasons why a mouse still works even if the PC's OS hangs). Hence the reliability of the system is very high without needing redundant sets of sensors, GPS units or processors.
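
To make the idea of a predefined instruction set plus a fixed access time more concrete, here is a minimal C++ sketch of what the Core 1 side of such an interface could look like. The opcodes, the names Command, isAllowed and serviceBus, and the microsecond budget are all hypothetical; the actual bus protocol is not described in this thread.

```cpp
#include <cstdint>

// Hypothetical command set: the only requests Core 2 may make of Core 1.
enum class Command : uint8_t {
    SetTargetHeading  = 0x01,
    SetTargetAltitude = 0x02,
    ReadAttitude      = 0x03
};

// True only for opcodes that belong to the predefined set.
bool isAllowed(uint8_t opcode) {
    switch (opcode) {
        case static_cast<uint8_t>(Command::SetTargetHeading):
        case static_cast<uint8_t>(Command::SetTargetAltitude):
        case static_cast<uint8_t>(Command::ReadAttitude):
            return true;
        default:
            return false;
    }
}

enum class AccessResult { Waiting, Accepted, Rejected, TimedOut };

// Called by Core 1 once per control cycle while a bus access is open.
// It never blocks: Core 1 gives up as soon as the time budget is spent,
// whatever state Core 2 or the bus is in.
AccessResult serviceBus(bool byteReady, uint8_t opcode,
                        uint32_t elapsedMicros, uint32_t budgetMicros) {
    if (elapsedMicros >= budgetMicros) {
        return AccessResult::TimedOut;   // withdraw, flight loop carries on
    }
    if (!byteReady) {
        return AccessResult::Waiting;    // nothing yet, check again next cycle
    }
    return isAllowed(opcode) ? AccessResult::Accepted : AccessResult::Rejected;
}
```

Because serviceBus() only inspects its arguments and returns, the worst-case time it adds to the control cycle is fixed, which is the property the mouse analogy relies on.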


Yes, very early on in my development I realized that the algorithms need to work perfectly, taking into account all the possible scenarios. For example, when I was designing and testing my special semi-cardioid-locus-based cross-track correction algorithm, which corrects cross-track error from the reference line using ONLY the TARGET HEADING, I did not take into account the effect of wind on the algorithm: if (referencePosition.angleToTarget - currentPosition.angleToTarget) > 50, the TARGET HEADING would deviate by 70 degrees from currentPosition.angleToTarget and the vehicle would start oscillating about the reference line. I then had to add a smoother that gives control based on the rate of change of cross-track error if (referencePosition.angleToTarget - currentPosition.angleToTarget) > 30. Then I developed a control experiment to test whether it works correctly, using real-time data along with a Java-based software simulator (with this algorithm no additional tuning is required for navigation, only a decent heading-hold tuning). Additionally, all my new navigation algorithms are tried out on an autonomous car before being implemented on an airplane or multirotor, as any small deviation is more easily identified on a car running a tight navigation track than on a flying object. So yes, a lot of control experiments and testing are done to make sure all algorithms and sensor fusers work consistently in all scenarios, including inside a fridge at -5 degrees as well as at 60 degrees (India is not a cold country, it is DAMN HOT here ;-) ).


Why I said an EKF type of SLAM: as of now we use EKF only for fusing our local sensors (the ones that treat the drone as a free body in free space, like IMU, baro and GPS) to get a three-dimensional idea of where we are in free space relative to our reference plane. If we additionally add environmental sensors that can map and create environmental variables, such as positions of objects in space, their dimensions and our distances from them in local space, these can be fused with the 3D free-space position to give very accurate local-space and global-space positions, actually merging both into a single positional entity. This is basically localization (our 3D free-space position) and mapping (position with respect to objects in local space). One of the means to do that is EKF, but we are also using (free-space model + Madgwick algorithm) and (Dijkstra's node model + Madgwick algorithm) to compute this. Theoretically speaking, EKF will do it more accurately, as it also estimates dynamic parameters like the x velocity vector, y velocity vector, heading rate vector, etc. As mentioned earlier, all my fusers for (IMU speed + GPS speed) and (IMU Z vector + baro Z vector) are based on EKF, but they are implemented not as blocks but as parts.
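
For readers who have not seen this kind of fuser, here is a small C++ sketch of the (IMU Z + baro Z) case. It is a plain two-state linear Kalman filter, used here as a simplified stand-in for an EKF, and it is not the implementation described above. The name AltitudeFilter, the initial covariance and the noise values qH, qV and r are assumptions for illustration.

```cpp
// Two-state linear Kalman filter: altitude h and vertical velocity v.
// The IMU vertical acceleration drives the prediction, the baro corrects it.
struct AltitudeFilter {
    float h, v;                 // state estimate
    float p00, p01, p10, p11;   // estimate covariance (2x2)
    float qH, qV;               // process noise added per step (assumed)
    float r;                    // baro measurement variance (assumed)

    AltitudeFilter(float qh, float qv, float baroVar)
        : h(0.0f), v(0.0f), p00(1.0f), p01(0.0f), p10(0.0f), p11(1.0f),
          qH(qh), qV(qv), r(baroVar) {}

    // Predict using IMU vertical acceleration (gravity already removed).
    void predict(float accelZ, float dt) {
        h += v * dt + 0.5f * accelZ * dt * dt;
        v += accelZ * dt;
        // P = F P F^T + Q with F = [[1, dt], [0, 1]]
        const float n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + qH;
        const float n01 = p01 + dt * p11;
        const float n10 = p10 + dt * p11;
        const float n11 = p11 + qV;
        p00 = n00; p01 = n01; p10 = n10; p11 = n11;
    }

    // Correct using a barometric altitude reading (measures h only).
    void correct(float baroAlt) {
        const float s  = p00 + r;     // innovation variance
        const float k0 = p00 / s;     // Kalman gain for h
        const float k1 = p10 / s;     // Kalman gain for v
        const float y  = baroAlt - h; // innovation
        h += k0 * y;
        v += k1 * y;
        // P = (I - K H) P with H = [1, 0]
        const float n00 = (1.0f - k0) * p00;
        const float n01 = (1.0f - k0) * p01;
        const float n10 = p10 - k1 * p00;
        const float n11 = p11 - k1 * p01;
        p00 = n00; p01 = n01; p10 = n10; p11 = n11;
    }
};
```

In use, predict() would be called on every IMU sample and correct() whenever a new baro reading arrives. Note that the filter also produces a vertical velocity estimate even though no sensor measures it directly, which is the kind of dynamic parameter mentioned above.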

“But for the second point I'd say that the second CPU is as critical as the first one, as it manages the GPS and waypoint management : if it has any bug in its code it can result as well in crash or fly-away.”

Precisely, that is why it should be isolated and the source not left open for anyone to code. Efficiency makes a lot of difference when coding a GPS navigation block: mine occupies about 12 KB of physical memory on chip (coded in C++, so portable to any embedded system with GCC), with maximum use of unary operators and step-wise formula implementation instead of block-based formula implementation:

E.g.:

If we want to compute:

F(y) = x*sin(y*M_PI) + z*tan(y*M_PI); where x and y are of type int16_t

Instead of one block, I write:

S(y) = (float)(((float)z)*tan(((float)y)*M_PI));

By doing this I can give explicit type conversions, reuse the function Q(y) and debug each step while implementing, to ensure efficiency, consistency and reliability.
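
As one possible reading of this step-wise style, the formula could be split in C++ roughly as follows. The helper names angleTerm, sineTerm and tangentTerm are invented for this sketch, and kPi is defined locally because M_PI is not guaranteed by the C++ standard.

```cpp
#include <cmath>
#include <cstdint>

static const float kPi = 3.14159265f;  // local constant instead of M_PI

// Step 1: the shared angle term, computed once and reused.
float angleTerm(int16_t y) {
    return static_cast<float>(y) * kPi;
}

// Step 2: x * sin(angle), with an explicit integer-to-float conversion.
float sineTerm(int16_t x, float angle) {
    return static_cast<float>(x) * std::sin(angle);
}

// Step 3: z * tan(angle), with an explicit integer-to-float conversion.
float tangentTerm(int16_t z, float angle) {
    return static_cast<float>(z) * std::tan(angle);
}

// Step 4: combine the intermediate results.
float F(int16_t x, int16_t y, int16_t z) {
    const float angle = angleTerm(y);
    return sineTerm(x, angle) + tangentTerm(z, angle);
}
```

Each intermediate value can be printed or checked on its own, and the angle is computed only once even though both terms need it.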

Now, I provide an interface with a specified instruction set for building apps in an SDK, as described in my previous replies. Through this interface the app developer has no access to core variables (like angleToTarget); hence, unless the app developer deliberately codes an app to fly away, there will not be a fly-away ;-). Whether or not any app is running, the efficiency, performance and reliability remain the same (no dependencies on any additional apps or user code).


@John Arne Birkeland

The Parallax processor has a similar architecture to the NVIDIA K1 with 256 cores.

They are a good choice for making an autopilot, but then again you will never have total isolation of processes. Why? It's pretty simple: the cores that many companies advertise (e.g. the Snapdragon S4 quad-core processor) are not actually isolated cores. The advertised cores are basically separate ALUs with separate reference instruction sets. How it works is that the instructions in the instruction registers are popped at the same time onto two or more ALUs, so in one clock cycle we can execute two or more instructions. But the hardware resources are the same: memory is shared and the hardware multiplexer for output is the same. In some cases they go the extra step of adding two NM interrupts for each sub-module, as in the Intel Atom, so hardware and resources are accessed in real time after the ALU has computed, and not in a buffered fashion.

What I am talking about is a multicore control computing environment with totally isolated hardware layers, where tasks are separated in terms of hardware resources, instruction registers, ALU, memory and flash.

@Bill Bonney

Thanks for your reply and appreciation.

Current autopilots do not have a separate hardware core whose only purpose is to do one specific control task, so the actual kinematic control, GPS math, etc. are all computed on a single core.

What I am doing is like developing a distributed organisation. I have X amount of work. Instead of getting one highly paid, smart employee to do this work, I get 5 average, cheap employees and break the work X into 5 different tasks A, B, C, D and E.

Each employee is trained to do only his task, repeatedly, so his focus is to complete the task in hand with full attention. Hence Emp1 completes task A, Emp2 completes task B, Emp3 task C, and so on. The time each takes is much shorter than the time for the project as a whole; since the focus is on only one task, the perfection with which it is done is better; and none of the employees is overworked, so they will not fail to show up or quit. Apply this to computing and you will understand what I am trying to build: with multiple low-cost, reliable chipsets I can do complex tasks by distributing and isolating their processing, and do it repeatedly and reliably.


  • One of the reasons is that when an ATmega can do the job, why use a DSP? Any additional functions can be added as applications via the interface anyway.
  • Since low-cost DSPs only have interface channels like SPI, I2C, USB and CAN (like the Intel Edison architecture), I can't use one as Core 2: along with interfacing I need to control a gimbal via PWM, decode PWM RX data and decode analog ultrasonic data, which are core MCU functions. So I would have to add a slave MCU to the DSP MP to do this, which reduces efficiency and throughput, and that is bad for control computing. For data computing you can use a DSP like a GPU, where an MCU is the main core: we create instruction sets on the DSP according to what we want to calculate, give it values via a bus like USB, and get the computed data back. In this architecture the throughput and hardware flexibility of the MCU are not hindered.

@Joseph Owens

Bang on, you have understood the concept perfectly.

Thanks to you, and for the benefit of the other forum members, the concept is as below:

The proposed architecture of parallel processing mimics the body.

We all have two ways of responding to external information:

a) Reflex response to sensory information: fast, instantaneous reactions to sensory information; not much thought or processing goes into these actions.

b) Thought-out response to sensory information: we assimilate the information, understand it and respond after processing it.

Core 1 (KCU) in my architecture works like a) above as far as responses to sensory information are concerned: it deals with the effect of ambient conditions on the drone along 3 axes, like a reflex, with high-speed responses.

Core 2 (IHOU) is like b): it handles the higher-level processing such as navigation, app expansion, etc. This processor need not necessarily be an ATmega328; it could be any processor, like an M3 or M4, depending on the requirements of the application.

As things progress, yes, the architecture will get more specific to applications, as happens in nature, through the evolution of the drone space.

Joseph, thanks once again :)

Sure will please ping me at


There is a guy here using three ATmega328s for his FC



Thanks Andy, this proves the concept works well.


Bill has a point there.
That is: these are two different approaches, with differing sets of design goals/strategies, priorities, and audiences, complementing each other by adding choice and leading to more new ideas. So neither idea has to be wrong, in the same way that Linux can't be called a bad/wrong design from a Windows point of view and vice versa.
So perhaps, and hopefully, it was just a clumsy choice of title, but humbly calling it 'a new or alternate approach to hw architecture' or something like that would've been a better choice imho, as there's no point in risking inadvertently offending others who're doing a fantastic job at their set of goals.

