LOITER and RTL crashing bug

Hi there,

I'm writing to report a very likely bug affecting at least firmware versions 3.1 rc8 to 3.1.2, at least on hex X frames. The bug manifests when RTL is the last step of an AUTO mission and the RTL return altitude differs from the current altitude; the drone crashes consistently under these conditions. Additionally, the drone crashes from LOITER after a non-deterministic time - sometimes quickly (tens of seconds), sometimes only after loitering for minutes. In both cases the symptoms are the same: the drone goes from hover-holding output (motors ~1600) to shutting down 1-2 motors within 100-200 ms, and then shutting down the others within another 100-200 ms. The way it shuts down the motors is very specific: it reduces throttle to 12xy, where xy is the same for all motors. In the attached logs, for example, the motor commands after LOITER are as follows:

RCOU, 487396, 1695, 1645, 1672, 1667, 1443, 1874, 32767, 32767
RCOU, 487495, 1523, 1597, 1517, 1602, 1231, 1824, 32767, 32767
RCOU, 487595, 1398, 1421, 1368, 1451, 1231, 1581, 32767, 32767
RCOU, 487696, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 487796, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 487897, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 487996, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 488096, 1329, 1444, 1401, 1372, 1231, 1545, 32767, 32767
RCOU, 488195, 1392, 1380, 1545, 1231, 1299, 1474, 32767, 32767
RCOU, 488297, 1404, 1369, 1501, 1271, 1231, 1545, 32767, 32767

Obviously 1231 is not sufficient to sustain flight, and the fact that all motors receive exactly the same value (1231) makes it more likely that this output is the result of a bug.
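For anyone who wants to check their own dataflash logs for the same signature, here is a minimal sketch in Python. It assumes RCOU lines laid out as in the excerpt above (the `RCOU` tag, a timestamp, then the channel outputs, with the first six being the hex motors); the function name `find_uniform_rcou` and the `n_motors` parameter are made up for illustration and are not part of any ArduCopter tool.

```python
def find_uniform_rcou(lines, n_motors=6):
    """Return (timestamp, value) for every RCOU row whose first
    n_motors outputs are all identical (the symptom described above)."""
    hits = []
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        # Skip anything that is not an RCOU row with enough columns.
        if len(fields) < 2 + n_motors or fields[0] != "RCOU":
            continue
        try:
            timestamp = int(fields[1])
            motors = [int(v) for v in fields[2:2 + n_motors]]
        except ValueError:
            continue  # malformed row, ignore it
        if len(set(motors)) == 1:
            hits.append((timestamp, motors[0]))
    return hits


# Rows copied from the LOITER excerpt above.
sample = """\
RCOU, 487595, 1398, 1421, 1368, 1451, 1231, 1581, 32767, 32767
RCOU, 487696, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 487796, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 487897, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 487996, 1231, 1231, 1231, 1231, 1231, 1231, 32767, 32767
RCOU, 488096, 1329, 1444, 1401, 1372, 1231, 1545, 32767, 32767
""".splitlines()

for timestamp, value in find_uniform_rcou(sample):
    print(timestamp, value)
```

To scan a whole log, pass the open file object instead of `sample`. Note that identical outputs also occur legitimately (for example while disarmed or idling on the ground), so hits only matter when they fall in the middle of a flight.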

The RTL example is similar - the RTL command is executed right after the first RCOU:

RCOU, 742799, 1697, 1603, 1542, 1756, 1571, 1728, 32767, 32767
RCOU, 742898, 1241, 1241, 1241, 1241, 1241, 1241, 32767, 32767

This time the motors are all at 1241 - interestingly, different from 1231 but still very close, nowhere near enough for flight, and again identical across all motors.

In case you wonder whether something is wrong with the drone, both bugs have been reproduced several times on two different drones: the LOITER crash three times and the RTL crash twice.

In case it helps, in addition to the dataflash logs I have telemetry logs from Mission Planner as well as videos, but I assume the dataflash logs are sufficient, as they had practically everything relevant enabled (I think) - certainly ATT, CTUN, NTUN, IMU, RCOU, GPS, and RCIN.

If this is a known bug, please drop a line here. If I can help in tracking it down, let me know.

Best,

Mihai

loiterCrash2.log

rtlFromAutoCrash.log

Replies

  • Developer
    In reply to Oliver's question: there will be a patch out later today that includes the 7 missing characters in the code.

    It is also worth mentioning that this problem has been in the code since 2.9 (I think). There have been thousands of trouble-free hours flown during this time. There is little to panic about.

    Having said that I am relieved this bug is no longer present and would rate this as a MUST upgrade because this bug has the potential to impact all copter frame types.
    • Could you please explain how to apply the patch or modify the code directly?  I would like to get this applied before I get my copter back up in the air with the gimbal and camera!  Also thanks for your contributions to the code, we non-code-savvy people really appreciate it.

    • Absolutely. No reason not to upgrade.

  • Mihai, nice work helping to track this down.  And thanks to Jonathan for finding the actual bad line.  Good work guys, this is what open source is about. :)

    • Developer

      Yeah, this bug has been in the code since 2.9, so it was very unusual to have it come to the surface so reliably. I think it is because our control algorithms are getting better and are able to use more and more of the motor range.

      • Speaking of which, Leonard, I do really appreciate the attempt to push the envelope, and we'll do the same soon. However, for our upcoming event this coming Saturday we don't want to push the envelope. On the contrary - we want to err on the side of caution and have the drones as stable as possible. What parameters should we change to make the flying less dramatic, i.e., less abrupt braking when reaching waypoints, and things as calm as possible?  Our drones are a bit portly at 2 kg, so they brake quite hard when they do. Furthermore, the stability params are on the high side (e.g., P is 0.22, D is 0.14), which, if I understood the code well, results in more aggressive action (which may have precipitated triggering the bug).

        Thank you for your help,

        Mihai

        • This may not be the same issue, but I hope it can help others.

          I have had similarly unstable performance with our quad and hexa when the params were too aggressive/high. I suggest relaxing the params. I also noticed that twitching, dipping, and other similar symptoms disappeared when I turned off logging of the heavier (compute-cycle-intensive) messages such as IMU once I was done using them; reducing the number of logged messages may help too. A combination of both could cause severe problems such as crashing. In addition, movement (whether swinging or bumping) of nearby ferrite rings suspended from the power wires going to the APM board can cause unstable accel and gyro performance during flight. After these adjustments and improvements, the copters are performing well.

          Using Arducopter 3.1.2 firmware.

        • Developer
          I would suggest that you do no more than drop your autotune Stab P by 25 percent, but I wouldn't even do that.

          The most important parameter for smooth and relaxed mission performance is a low acceleration setting. I am on my mobile at the moment so I can't tell you precisely what that parameter is, but I think it is something like WP_Accel. Keep it between 100 and 200 and you should be good. Just remember that this will result in a slower stop if you switch to RTL.

          On the issue of greater control and pushing the envelope: the whole motivation for pushing the performance harder is to get much smoother and more controlled flight. The better we can make the control system work and the greater the motor range it can effectively use, the safer and smoother we can make the copter fly. It all goes hand in hand. It is all about control WITHOUT sacrifice. :)
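To put a rough number on the "slower stop" trade-off mentioned above, here is a back-of-the-envelope sketch in Python. It is plain constant-deceleration kinematics, not the firmware's actual braking logic; it assumes the acceleration setting is expressed in cm/s/s, and the 500 cm/s cruise speed and the function name `stopping_estimate` are illustrative choices only.

```python
def stopping_estimate(speed_cms, accel_cmss):
    """Time (s) and distance (m) needed to stop from speed_cms,
    assuming constant deceleration accel_cmss (both in cm-based units)."""
    if accel_cmss <= 0:
        raise ValueError("acceleration must be positive")
    time_s = speed_cms / accel_cmss                        # t = v / a
    distance_m = speed_cms ** 2 / (2 * accel_cmss) / 100.0  # d = v^2 / (2a), cm -> m
    return time_s, distance_m


# Hypothetical cruise speed of 500 cm/s, compared across the suggested range.
for accel in (100, 200):
    time_s, distance_m = stopping_estimate(500, accel)
    print(f"accel={accel} cm/s/s: stops in {time_s:.1f} s over {distance_m:.2f} m")
```

Under these assumptions, halving the acceleration doubles both the stopping time and the stopping distance, which is worth keeping in mind when flying near obstacles.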
          • Got it, thanks - I believe you are referring to WPNAV_ACCEL, which I think defaults to 100 in 3.1.2. I'll make sure it's not any larger.

            Thanks again,

            Mihai

            • Developer

              Hi Mihai,

              Sorry, the other parameters that make a significant difference are HLD_LAT_P and HLD_LON_P. If you reduce them, the copter will be less aggressive in the way it starts to accelerate. It will also be a little gentler in the way it reacts to disturbances.
