I decided it was time to dust off some of my old ArduPilot boards. As we all know, we *should* be running the latest code at all times, per the development staff. *Supported* means running latest.
As several other users have pointed out: "What I really don’t like is the lack of trust I have after every update" - http://diydrones.com/xn/detail/705844:Comment:1064781 Because of this, I am always hesitant to jump onto the latest code revision. Still, any time my gear has sat for an extended period, I poke around the forum for *known* issues that I may have missed.
The current top post, "Naza M vs. APM 2", is what actually sparked my interest in spinning up some of my old APM gear, since I've been flying Naza a lot lately. http://diydrones.ning.com/forum/topics/naza-m-vs-apm-2 The tone of the conversation is Ford vs. Chevy, as expected; however, there has been some interesting commentary on how DJI as a company chooses to inform its end users about issues.
Will Snodgrass put it quite well here http://diydrones.ning.com/xn/detail/705844:Comment:1065480 "They obviously have taken the time to setup a system that is there just so they contact you with important info regarding your product. So Why not use it?"
Somehow I got to poking around on YouTube in my hunt for recent bugs and wound up stumbling upon Marco Robustini's channel. Marco (http://diydrones.com/profile/marco67) says the following about himself on the channel: "I'm coordinator and developer of the ArduCopter Tester Team. My goal is to test the various electronic flight board to compare them and find bugs/problems and the related solutions for stable and safe flying". I thought the channel was pretty cool and I shared it with a few multi-rotor enthusiast friends of mine. One pointed out that if I had not seen Marco's Acro mode bug video, it was something I HAD to see.
After a bit of hunting for his "Acro mode bug" crash, this gem was presented for my viewing pleasure.
I can see in the YouTube comments that Tridge said there were fixes out for this specific issue:
"Andrew Tridgell 11 months ago
this bug has now been fixed in ArduCopter master. Many thanks to Marco for his patience in working through the bug report with us and allowing us to find this bug!"
For whatever reason, I simply can't remember hearing a single word about this *bug* anywhere on the forums. It most certainly was not trumpeted as something that folks should look out for. So, oddly enough, I once again find myself combing through the horrible Ning forum interface looking for answers. "you switched to kamikaze mode?" clearly was not what I was looking for.
After quite a bit of hunting this is what I uncovered:
Marco Robustini on January 9, 2012 at 2:51am
"I have also almost completed my heavy octo (destroyed after the crash for the "acro I-term bug" in the code = < R5)"
Based on the above comment, we know the bug impacts code =< R5 in the ArduCopter 2.1.1 tree.
Marco Robustini on January 10, 2012 at 2:41pm
"Do not fly with this version for know reason: - i2c library is not update (possible bus lockup) - "acro I term " bug is unfixed in this version. You are warned! :P"
We are told by Marco in a random forum comment not to fly ArduCopter 2.1.1 alpha. The root of the topic also has more generic bug info: "Update R5: This is a quick patch based on a bad crash Marco had. My theory was an I term that built up during wind that needed to be reset, but wasn't. It's a corner case but It bit Marco pretty bad."
R_Lefebvre on January 23, 2012 at 9:54am
http://diydrones.com/xn/detail/705844:Comment:766380
"I have been following this closely pretty much since 2.0.55 came out, and I can only think of 2 fatal bugs that definitely caused crashes. Marco's I-term bug, which was instituted in the code.... Might have been before 2.0.49?"
Marco Robustini on January 27, 2012 at 5:18pm
http://diydrones.com/forum/topics/arducopter-2-2-beta?commentId=705844%3AComment%3A769961
"Now John will try to explain something, but dunno if I can, I'll try anyway.
Explain it with a video that until now was reserved only for development team (back at the end of December), the incident was found that was due a code bug with "Acro I Term" (fixed after this event), and now it's time that people like you see this."
The first thing that caught my attention was that this bug was, seemingly intentionally, hidden from public view. Why? I am especially concerned about this when I see comments like "I had the same, 2 times uncontrolled flights and crash." - http://diydrones.com/xn/detail/705844:Comment:769934
Marco Robustini on February 3, 2012 at 6:15am
http://www.diydrones.com/xn/detail/705844:Comment:774992
"I destroyed my heavy octo because these two lines were missing."
So… after all that reading and hunting I am still not sure exactly which versions were affected, what exactly triggers the issue, and *when* it was fully addressed in the code. Given the criticism that DJI has email addresses on file yet fails to use them to contact its end users, I felt it appropriate to mention this "corner case". 3DR has all of our attention via the Ning forum package; additionally, they have a Tumblr blog AND all of our info from the initial purchase. Is there any reason there are not better efforts to let us know what we should and should not be worrying about?
When I put my quadcopter on the shelf 11 months ago it was working fine… I am hesitant to update to the latest for obvious reasons, yet the version that I am running may have some latent / partially explained bug waiting to chop my face off.
Thoughts? Do we just circle back to the "It is DIY, what do you expect?" mantra? Why are ANY "incidents" held from public view and made "reserved only for development team"? Should we not be sharing this stuff? Luckily no one has gotten hurt...
Replies
Just so I'm clear on this, we're talking about a bug that was apparently found and fixed a long time ago?
If anything, I see that the Dev community has been doing exactly what you would expect.
When I made the comment you quoted, I was talking about the system that DJI utilizes, where you must submit and verify an email address before you can use their software. Yes, 3DR has your email when you purchase something, but you really can't expect them to hand over those addresses to an open source Dev community. Maybe something could be set up where you register to receive emails about bugs when they happen....but if you're not interested in hearing about bugs in real time then that won't help you either.
If you insist on comparing this with DJI, it is at least worth mentioning that any bug found by the Naza team will be strictly an internal matter, and as here it will be fixed....you just won't know about it. (unless it is a major problem)
I've dug it all out again :-)
The fix is here: "ACM: reset all I terms on gyro calibration". It was put in master on Jan 3, 2012.
The bug was introduced a few revisions before (unclear exactly when), when the acro I terms were split out based on feedback from Multi-Wii flyers. However, resetting those new I-terms on arming was forgotten. This allowed the acro I terms to wind up while Marco's copter was still unarmed on the ground, without them being reset on arming. This didn't hurt as long as he didn't switch to acro mode. But at some point he switched to acro instead of loiter. The copter immediately got the "bad" I-term applied, which commanded a 70°/s roll rate, with the known result.
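To make the failure mode concrete, here is a minimal Python sketch of integrator windup with a missing reset. It is not the ArduCopter code: the PIController class, the gains, the constant 1°/s error on the ground, and the 70°/s integrator limit (picked only to match the figure above) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Toy PI rate controller (hypothetical, not the ArduCopter class)."""
    kp: float
    ki: float
    i_max: float
    integrator: float = 0.0

    def update(self, error: float, dt: float) -> float:
        # Accumulate the I term and clamp it to +/- i_max.
        self.integrator += self.ki * error * dt
        self.integrator = max(-self.i_max, min(self.i_max, self.integrator))
        return self.kp * error + self.integrator

    def reset_i(self) -> None:
        self.integrator = 0.0


def simulate(reset_acro_on_arm: bool) -> float:
    """Return the roll-rate command at the instant of switching to acro."""
    # Gains and limits are made up for the illustration.
    stabilize_roll = PIController(kp=4.0, ki=0.5, i_max=20.0)
    acro_roll = PIController(kp=1.0, ki=2.0, i_max=70.0)
    dt = 0.01

    # Disarmed on the ground: assume a small constant rate error keeps
    # feeding both loops for 100 s of simulated time.
    for _ in range(10000):
        stabilize_roll.update(1.0, dt)
        acro_roll.update(1.0, dt)

    # Arming: the original reset covered the existing loop only.
    stabilize_roll.reset_i()
    if reset_acro_on_arm:
        acro_roll.reset_i()

    # Pilot switches to acro with zero rate error; whatever the controller
    # outputs now comes purely from the stored I term.
    return acro_roll.update(0.0, dt)


buggy_command = simulate(reset_acro_on_arm=False)
fixed_command = simulate(reset_acro_on_arm=True)
print(buggy_command, fixed_command)
```

With these assumed numbers the un-reset acro loop commands its full integrator limit the moment acro is selected, while the reset version commands nothing.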
From the devlist (after a lot of posts hunting the bug):
> set for ACRO only. This was a request based on feedback from the
> Multi-Wii guys.
>
> The new g.pi_acro_ PI terms didn't get added to the reset call.
great, that's an easy fix then. I'd also suggest we wipe all the I terms
at the end of startup_ground(), and all the places where we call
imu.init_accel() and imu.init_gyro(). Maybe have a reset_I_all() call ?
Cheers, Tridge
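One way to read Tridge's reset_I_all() suggestion is that a single call should cover every I term, so a controller added later cannot be left out of the reset. The Python sketch below shows that idea only; ControllerSet, PI, and this init_gyro are hypothetical stand-ins, not the real ArduCopter functions or their signatures.

```python
from typing import Dict


class PI:
    """Toy PI controller; only the integrator matters for this sketch."""

    def __init__(self, kp: float, ki: float) -> None:
        self.kp = kp
        self.ki = ki
        self.integrator = 0.0

    def update(self, error: float, dt: float) -> float:
        self.integrator += self.ki * error * dt
        return self.kp * error + self.integrator

    def reset_i(self) -> None:
        self.integrator = 0.0


class ControllerSet:
    """Owns every PI loop, so a newly added loop cannot miss the reset."""

    def __init__(self) -> None:
        self._loops: Dict[str, PI] = {}

    def add(self, name: str, kp: float, ki: float) -> PI:
        loop = PI(kp, ki)
        self._loops[name] = loop
        return loop

    def reset_i_all(self) -> None:
        # One call clears every registered integrator.
        for loop in self._loops.values():
            loop.reset_i()

    def integrators(self) -> Dict[str, float]:
        return {name: loop.integrator for name, loop in self._loops.items()}


def init_gyro(loops: ControllerSet) -> None:
    # Stand-in for a gyro calibration step; the calibration itself is
    # omitted, only the I-term wipe is shown.
    loops.reset_i_all()


loops = ControllerSet()
rate_roll = loops.add("rate_roll", kp=4.0, ki=0.5)
acro_roll = loops.add("acro_roll", kp=1.0, ki=2.0)

# Let both loops wind up on a constant error.
for _ in range(100):
    rate_roll.update(1.0, 0.01)
    acro_roll.update(1.0, 0.01)

wound_up = loops.integrators()
init_gyro(loops)
cleared = loops.integrators()
print(wound_up, cleared)
```

The design point is that the reset iterates over whatever is registered instead of naming each controller by hand, which is the kind of omission described in the quoted mail.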
Acro was hardly used at that time, and nobody on the dev team flew acro, which is why the bug went unnoticed for a few versions.
I remember this issue. It was this incident that got Marco on the devlist and shortly after to be appointed as lead tester.
It caused a rather big stir in the developer group and we immediately focussed our efforts on tracking down the bug. If I remember correctly, it was Tridge who finally nailed it with a very thorough analysis.
I have found the original thread on the devlist, but not yet the thread where Tridge explains it and fixes it. I do remember he did this in a separate thread.
Will post the fix if I can find it.
Also, from the initial thread it's clear that making the video private was Marco's own initiative. He was not a developer or official tester yet at that time, and he didn't want to cause a panic with his clearly disturbing video, before it was clear if it was a bug, or some hardware failure from his octo. I think this was understandable and commendable behaviour.
When it became clear that it was indeed a bug, he proposed to make the video public, so others could be warned. I guess that's what he meant by the quote "It's now time that you see this video". He meant it's now confirmed as a bug, be warned.
I don't think there was any cover-up intended on this one. First post in the original topic on the devlist is from Chris, encouraging everyone to get to the bottom of this crash ASAP.
Interesting read, but based on your quoted text this all surrounds an alpha release, which I'd sort of expect not to be announced on the front page in red letters but 'hidden' in the developers' mailing list and a contributor's personal YouTube comments.
I do tend to wait a bit after a new release, but I do that with all software (especially drivers or things that can chop my face off :)).
All defects in the software are logged here: http://code.google.com/p/arducopter/issues/list
That list can be viewed by anyone at any time. Defects are not hidden from the public. There is a repository for them at the above link so that developers can work on them and keep track of the progress.