Hello everyone, the BeaglePilot project has reached two important milestones in the last couple of weeks: the first Arduplane flight, by Andrew Tridgell (First Flight Plane), and the first Arducopter flight, by Victor Mayoral Vilches (First Flight Quad). As Victor mentioned in his blog post, he was running Arducopter at 100Hz, and we are working on enabling 400Hz. In this post I'll share some figures showing the performance of Arducopter running at 400Hz under three conditions:

1. No load: The code is executed without any stress, i.e. no extra software is running besides background processes.

2. Very heavy load: The code is executed in a very high-stress environment: 2 CPU-bound processes, 1 I/O-bound process, and 1 memory-allocator process. In this environment the CPU runs at 100% and overall memory usage is above 50%.

3. CPU-only load: The code is executed with load only on the CPU (100%), using 7 CPU-bound processes.

The tool used to generate the load is Stress, a simple workload generator for POSIX systems.
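
The three conditions above could be reproduced roughly as in the sketch below. The exact command lines used for these tests are not given in this post, so the mapping is an assumption: the I/O-bound process is taken to be a `--io` worker and the memory allocator a `--vm` worker. The helper name `stress_command` is made up for illustration, and the sketch only builds the commands, it does not run them.

```python
import shlex


def stress_command(cpu=0, io=0, vm=0, timeout_s=None):
    """Build an argv list for the `stress` workload generator.

    Hypothetical helper: it uses only the --cpu, --io, --vm and
    --timeout options of stress and does not launch anything.
    """
    argv = ["stress"]
    if cpu:
        argv += ["--cpu", str(cpu)]   # CPU-bound workers
    if io:
        argv += ["--io", str(io)]     # I/O-bound workers (assumed mapping)
    if vm:
        argv += ["--vm", str(vm)]     # memory-allocating workers (assumed mapping)
    if timeout_s is not None:
        argv += ["--timeout", str(timeout_s)]  # stop after N seconds
    return argv


# Assumed mapping of the three test conditions to stress invocations.
SCENARIOS = {
    "no load": None,                                   # nothing extra is started
    "very heavy load": stress_command(cpu=2, io=1, vm=1),
    "CPU-only load": stress_command(cpu=7),
}

for name, argv in SCENARIOS.items():
    print(name, "->", shlex.join(argv) if argv else "(no stress process)")
```

To actually apply a load, the argv list can be passed to `subprocess.Popen` before starting the autopilot code and terminated afterwards.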

Following is a chart of results obtained:

[Chart image] Click on the image to check out the full perf gists.

Description of the kernels used:

Preempt-rt: kernel 3.8.13 with the preempt-rt patch (about) and basic preemption enabled.

Vanilla: plain 3.8.13 kernel without any patches or extra config options.

Preempt: plain 3.8.13 kernel with the preemptible kernel option enabled.

You are all welcome to share your conclusions or critiques of this data.

Thank you.

Regards,

Siddharth Bharat Purohit


Comments

  • Michael, the term APM 4.0 refers to a hardware version, like APM1, APM2.x, APM3.0 (referred to as Pixhawk), and APM4.0 (the upcoming Linux autopilot platform). For software, the nomenclature APM [version] is not used; instead APM:[vehicle_name] [version] is used, because every vehicle has its own development cycle. So there are no release names like APM 3.x; instead you have APM:Copter 3.x (or AC 3.x for short) or Arducopter/Arduplane/APMRover [version_name]. In short, APM4.0 is a new name for BBB+PXF. I hope this clears things up.

  • @Siddharth, can we get the post title changed to replace "APM 4.0" with "APM 3.3 for PDC Cape and BBB" to avoid confusion, or is "APM 4.0" now official hardware?

    (see http://diydrones.com/profiles/blogs/apm4-0-first-copter-flight#c_0f4).

  • I'd be interested to know how performance is affected when network resources are in use.  For example, suppose you hooked up an IP camera and wanted to transmit that video stream over an 802.11 link while simultaneously running the Ardupilot software.

  • Developer

    Looking good, nice to see the tests.

    When I give ArduCopter a performance test (normally a couple of weeks before an official release), I stick it into Loiter mode and hold the throttle up and pitch forward.  This ensures that the navigation and althold controllers are working their hardest.

    The numbers in the chart are from running at 400Hz, so it's already very clear that it's running a lot faster than the Pixhawk.  The Pixhawk consumes about 80% CPU vs 17.5% for Linux.  The number of slow loops on the Pixhawk is also well over twice what it is on Linux, although I think that has something to do with some badly behaved I/O on the Pixhawk that we haven't gotten to the bottom of yet.

  • Hi folks, we came to similar conclusions for PenguPilot: PREEMPT_RT does not have any advantage over PREEMPT... in the end it's more important to set up the priorities and scheduling policy correctly. Then you can start to compile your Linux kernel in flight, even using a single-core CPU :)

  • Hi Tridge,

    Thanks for your comments. I'll certainly look into the points you mentioned and come up with a new analysis. The preempt-rt kernel had your I2C patch included, and for the preempt kernel I used the one you had shared earlier, but the vanilla kernel is the plain distributed kernel without any extra settings or patches.

  • Developer

    Hi Sid,

    Thanks for this analysis!

    It is indeed a puzzle why PREEMPT_RT isn't as good. I'd suggest trying with/without various drivers, especially things like network drivers. The idea would be to identify which drivers (if any) are causing the issue. It would also be good to try with LOG_BITMASK set to zero (to eliminate logging).

    You may also like to graph the PM results rather than relying on just the single value every 10s. The perf results are available in the logs in the /var/APM/logs/ directory, but only if you arm the copter. You should really test when armed anyway, and perhaps when in loiter, as that is a somewhat more valid test (although given the CPU speed of the BBB, I'd be very surprised if the extra processing in loiter makes any difference).

    You might also like to look in /proc/interrupts and see if there are any significant interrupt sources while you are testing (e.g. network, disk, UARTs etc).

    Cheers, Tridge

    PS: I presume all 3 kernels had the patch to run both I2C buses at 400kHz?
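
    One way to follow the /proc/interrupts suggestion is to take two snapshots during a test run and compare the counters. The sketch below assumes the usual layout (a header row of CPU names, then one row per interrupt source with one count per CPU); the sample snapshots are invented for illustration and are not measurements from the BBB, and the function names are hypothetical.

```python
def parse_interrupts(text):
    """Return {source label: total count across CPUs} for one snapshot."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for tok in fields[:n_cpus]:         # some rows carry fewer counts
            if not tok.isdigit():
                break
            counts.append(int(tok))
        desc = " ".join(fields[len(counts):])
        totals[f"{label.strip()} {desc}".strip()] = sum(counts)
    return totals


def busiest(before, after, top=5):
    """List (delta, source) pairs for sources that fired between snapshots."""
    deltas = {k: after[k] - before.get(k, 0) for k in after}
    active = [(d, k) for k, d in deltas.items() if d > 0]
    return sorted(active, reverse=True)[:top]


# Invented sample snapshots, only to show the shape of the data.
BEFORE = """           CPU0
 28:       100      INTC  mmc0
 57:      5000      INTC  eth0
ERR:         0
"""
AFTER = """           CPU0
 28:       150      INTC  mmc0
 57:      9000      INTC  eth0
ERR:         0
"""

for delta, source in busiest(parse_interrupts(BEFORE), parse_interrupts(AFTER)):
    print(f"{delta:8d}  {source}")
```

    On the target board the two snapshots would come from reading /proc/interrupts a few seconds apart while the test is running.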

  • Thanks for putting all this together, Sid. This is amazing work and it nicely complements the analysis I performed during BeaglePilot.

    These results are indeed quite interesting, and I believe there's room for discussing why exactly PREEMPT (Vanilla with the PREEMPT option enabled) outperforms PREEMPT_RT.

    I'm sharing below a table with some results I got a few months ago:

    Kernel      Min (us)  Avg (us)  Max (us)
    vanilla     14        19        193
    PREEMPT     16        21        68
    RT_PREEMPT  20        27        91
    Xenomai     15        23        630
    It matches nicely with the PERF results Sid just got. PREEMPT outperforms PREEMPT_RT, but shouldn't it be the other way around?

  • So how does that describe the effect of CPU load on latency?  Why not toggle a pin every time the Kalman filter iterates and record a sound file during heavy CPU load?

  • Sorry Daniel, my bad. I included that detail in my first draft, but lost it in a browser crash and then forgot to add it back. Anyhow, performance is measured using PERF values, which are in this format:
    [No. of slow loops]/[No. of loops analyzed] [Max time in us taken by any of the analyzed loops]. "Loops" here means main loops. Ideally the value for 400Hz would be "0/4000 2500", but we can live with (0-5)/4000 (<4000). I hope this clears things up.
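
    As a small illustration of the PERF format described above, the sketch below parses one value and checks it against the "we can live with" range. The thresholds are one reading of "(0-5)/4000 (<4000)": at most 5 slow loops and a maximum loop time under 4000 us. The function names are made up for this example, and the sample values other than "0/4000 2500" are invented.

```python
import re

LOOP_RATE_HZ = 400
# One main loop at 400Hz lasts 1/400 s = 2500 us, the ideal max time.
IDEAL_LOOP_US = 1_000_000 // LOOP_RATE_HZ

# "<slow loops>/<loops analyzed> <max loop time in us>"
PERF_RE = re.compile(r"^\s*(\d+)/(\d+)\s+(\d+)\s*$")


def parse_perf(value):
    """Split a PERF value into (slow_loops, loops_analyzed, max_time_us)."""
    match = PERF_RE.match(value)
    if match is None:
        raise ValueError(f"not a PERF value: {value!r}")
    slow, total, max_us = (int(g) for g in match.groups())
    return slow, total, max_us


def acceptable(value, max_slow=5, max_us_limit=4000):
    """Apply the assumed tolerance: few slow loops and a bounded worst case."""
    slow, _total, max_us = parse_perf(value)
    return slow <= max_slow and max_us < max_us_limit


for sample in ("0/4000 2500", "3/4000 3100", "12/4000 5200"):
    print(sample, "->", "ok" if acceptable(sample) else "too slow")
```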
