Incidents "reserved only for development team", are there more? Why hide this from us?

I decided it is time to dust off some of my old ArduPilot boards. As we all know we *should* be running the latest code at all times, per the development staff. *Supported* means running latest.

As several other users have pointed out: "What I really don’t like is the lack of trust I have after every update" (http://diydrones.com/xn/detail/705844:Comment:1064781). Because of this I am always hesitant to jump onto the latest code revision. Still, any time I have gear sit for an extended period of time I always poke around the forum for *known* issues that I may have missed.

The current top post "Naza M vs. APM 2" is what actually sparked my interest in spinning up some of my old APM gear, since I've been flying Naza a lot lately. http://diydrones.ning.com/forum/topics/naza-m-vs-apm-2 The tone of the conversation is Ford vs. Chevy as expected; however, there has been some interesting commentary with regard to how DJI as a company chooses to inform its end users about issues.

Will Snodgrass put it quite well here http://diydrones.ning.com/xn/detail/705844:Comment:1065480 "They obviously have taken the time to setup a system that is there just so they contact you with important info regarding your product. So Why not use it?"

Somehow I got to poking around on YouTube in my hunt for recent bugs and wound up stumbling upon Marco Robustini's channel. Marco (http://diydrones.com/profile/marco67) says the following about himself on the channel: "I'm coordinator and developer of the ArduCopter Tester Team. My goal is to test the various electronic flight board to compare them and find bugs/problems and the related solutions for stable and safe flying". I thought the channel was pretty cool and I shared it with a few multi-rotor enthusiast friends of mine. One pointed out that if I had not seen Marco's Acro mode bug video, it was something that I HAD to see.

After a bit of hunting to find his "Acro mode bug" crash, this gem was presented for my viewing pleasure.

I can see in the YouTube comments that Tridge said there were fixes out for this specific issue:
"Andrew Tridgell 11 months ago
this bug has now been fixed in ArduCopter master. Many thanks to Marco for his patience in working through the bug report with us and allowing us to find this bug!"

For whatever reason I simply can't remember hearing a single word about this *bug* anywhere on the forums. It most certainly was not trumpeted as something that folks should look out for. So oddly enough, once again I find myself combing through the horrible Ning forum interface looking for answers. "you switched to kamikaze mode?" clearly was not what I was looking for.

After quite a bit of hunting this is what I uncovered:

Marco Robustini on January 9, 2012 at 2:51am
"I have also almost completed my heavy octo (destroyed after the crash for the "acro I-term bug" in the code = < R5)"

Based on the above comment we know the bug impacts code <= R5 for the ArduCopter 2.1.1 tree.

Marco Robustini on January 10, 2012 at 2:41pm
"Do not fly with this version for know reason: - i2c library is not update (possible bus lockup) - "acro I term " bug is unfixed in this version. You are warned! :P"

We are told by Marco in a random forum comment not to fly ArduCopter 2.1.1 alpha. The root of the topic also has more generic bug info: "Update R5: This is a quick patch based on a bad crash Marco had. My theory was an I term that built up during wind that needed to be reset, but wasn't. It's a corner case but It bit Marco pretty bad."

R_Lefebvre on January 23, 2012 at 9:54am
http://diydrones.com/xn/detail/705844:Comment:766380
"I have been following this closely pretty much since 2.0.55 came out, and I can only think of 2 fatal bugs that definitely caused crashes. Marco's I-term bug, which was instituted in the code.... Might have been before 2.0.49?"

Marco Robustini on January 27, 2012 at 5:18pm
http://diydrones.com/forum/topics/arducopter-2-2-beta?commentId=705...
"Now John will try to explain something, but dunno if I can, I'll try anyway.
Explain it with a video that until now was reserved only for development team (back at the end of December), the incident was found that was due a code bug with "Acro I Term" (fixed after this event), and now it's time that people like you see this."

The first thing that caught my attention was that this bug was seemingly intentionally hidden from public view. Why? I am especially concerned about this when I see comments like "I had the same, 2 times uncontrolled flights and crash." - http://diydrones.com/xn/detail/705844:Comment:769934

Marco Robustini on February 3, 2012 at 6:15am

http://www.diydrones.com/xn/detail/705844:Comment:774992
"I destroyed my heavy octo because these two lines were missing."

So… after all that reading and hunting I am still not sure exactly which versions were affected, what exactly triggers the issue and *when* it was fully addressed in the code. Given the criticism that DJI has email addresses on file yet fails to use them to contact its end users, I felt it appropriate to mention this "corner case". 3DR has all of our attention via the Ning forum package; additionally they have a Tumblr blog AND all of our info from the initial purchase. Is there any reason that there are not better efforts to let us know what we should and should not be worrying about?

When I put my quad copter on the shelf 11 months ago it was working fine… I am hesitant to update to latest for obvious reasons, yet the version that I am running may have some latent / partially explained bug waiting to chop my face off.

Thoughts? Do we just circle back to the "It is DIY, what do you expect?" mantra? Why are ANY "incidents" held from public view and made "reserved only for development team"? Should we not be sharing this stuff? Luckily no one has gotten hurt...

Replies to This Discussion

All defects in the software are logged here: http://code.google.com/p/arducopter/issues/list

That can be viewed by anyone at anytime. Defects are not hidden from the public. There is a repository for them at the above link so that developers can work on them and keep track of the progress.

The awesome thing about open source software is that you did not pay for it. The devs put in a TON of hours in return for nothing. That being said, it is really up to the user to be aware that this software is free to use and comes with no warranty. I would definitely suggest browsing the defect list if it is something that interests you. There are minor defects, feature requests, and occasionally major defects.

I don't believe the developers would ever hold back a potentially dangerous defect if it was discovered and verified. You also have to understand that the developers and testers are from all over the world, operating in different time zones, and many speak English as a second language. So by the time the whole dev list hears about a major bug, verifies it's a defect in the code and then publishes something about it, it may seem like a long time. Users in the community are also not only welcome but encouraged to post about any issues so that they can be fixed and we can all benefit from it.

Also, I think something that you put forth in your post conveys a misconception: "...per the development staff..." The developers are not staff and are not paid.

You may be right that some devs are paid now. I have been out of the dev circuit for a while now. Either way it doesn't really matter to me. I am happy to work on stuff for free!

I guess my point is just that this software project is really a community effort. Marco is not obligated to discuss the issues he sees with the broader community although it is excellent when he does (often). He is just part of a volunteer group of testers that are self-organized. Perhaps you could head a separate testing effort that would be as transparent as you see fit? Otherwise if you just want to monitor the issues in real-time, just visit the link I forwarded. Perhaps this software/hardware solution is not for you, but there are a lot of people who like to grow with this solution and contribute in a constructive way.

Either way, I hope your experience improves!

Adam: I'll share the same advice with you that I too often ignore myself. Don't feed the troll. Kevin/Drone Savant knows his agenda. I just wish that agenda would change, because he has a lot to offer, if he would simply decide to be constructive.

Yes, very cleverly hidden in plain sight, expertly done in real time as the issues are being discovered and fixed. Perhaps some issues might be presented and organized better. You have a lot of energy, why don't you do it?

Sorry, I don't quite understand. You have a free, open membership, free access to the source code, revision control history, public discussion, hundreds of thousands of man-hours of free developer time, a free forum, and I realize it's "not [your] baby" and you have identified an issue you are certain you could address, namely organizing and presenting per-version bugs, but rather than contribute in a constructive way, you'd rather kick the baby because you don't feel respected because you have a long history of contentious and argumentative interactions?

You've sometimes had valid concerns, but those have been devalued by your own behavior. And there are times when some of the volunteer moderators have shown too much reflexively defensive cult of personality opposition to your curt, tense argumentative writing style. But really, if you don't want to help, then what exactly is your purpose?

Interesting read, but based on your quoted text this all surrounds an alpha release, which I'd sort of expect to not be announced on the front page in red letters but 'hidden' in the developers' mailing list and a contributor's personal YouTube comments.

I do tend to wait a bit after a new release, but I do that with all software (especially drivers or things that can chop my face off :)).

I remember this issue. It was this incident that got Marco on the devlist, and shortly after he was appointed lead tester.

It caused a rather big stir in the developer group and we immediately focused our efforts on tracking down the bug. If I remember correctly it was Tridge who finally nailed it with a very thorough analysis.

I have found the original thread again in the devlist, but not yet the thread where Tridge explains it and fixes it. I do remember he did this in a separate thread.

Will post the fix if I can find it.  

Also, from the initial thread it's clear that making the video private was Marco's own initiative.  He was not a developer or official tester yet at that time, and he didn't want to cause a panic with his clearly disturbing video, before it was clear if it was a bug, or some hardware failure from his octo.  I think this was understandable and commendable behaviour.

When it became clear that it was indeed a bug, he proposed to make the video public, so others could be warned. I guess that's what he meant with the quote "It's now time that you see this video". He meant: it's now confirmed as a bug, be warned.

I don't think there was any coverup intended on this one. The first post in the original topic on the devlist is from Chris, encouraging everyone to get to the bottom of this crash ASAP.

Actually that video from Marco led the developer team to focus a lot more on checking code quality. It shifted focus from feature development to reliability development. Models for hexa and octo were developed for the auto-test suite (an automatic software test that runs every day), and a whole bunch of things that were not tested automatically before (e.g. mode switches) were added to it.

Since then the way we implement code changes and releases has changed. Now we work with windows in which code can be changed, followed by a window in which code is tested and no new features can be introduced, only bugfixes.

In short, you should interpret the advice to run the latest code as "run the latest code that is generally accepted as the most stable we have". At this time that would be 2.8.1.

I admit that it will cost you a bit of reading and research to find the code that is generally considered as the most stable we have... :-)  Maybe some improvement can be made there.

I've dug it all out again :-)

The fix is here: "ACM: reset all I terms on gyro calibration". It was put in master on Jan 3, 2012.

The bug was introduced a few revisions before (unclear exactly when), when the acro I terms were split out based on feedback from Multi-Wii flyers. However, resetting those new I terms on arming was forgotten. This allowed the acro I terms to wind up while Marco's copter was still unarmed on the ground, without them being reset on arming. This didn't hurt as long as he didn't switch to acro mode. But at some point he switched to acro instead of loiter. The copter immediately got the "bad" I term applied, which commanded a 70°/s roll rate, with the known result.
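
To make that mechanism concrete, here is a minimal Python sketch. It is not the ArduCopter code: the `PIController` class, the gains, the 100 Hz loop and the steady error of 2.0 seen while disarmed are all made-up values for illustration. Only the idea comes from the description above, namely a second set of acro PI terms that keeps integrating on the ground and is left out of the reset on arming.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Minimal PI controller. Gains and limits are illustrative assumptions."""
    kp: float
    ki: float
    i_max: float
    integrator: float = 0.0

    def update(self, error: float, dt: float) -> float:
        # Accumulate the I term and clamp it to +/- i_max.
        self.integrator += self.ki * error * dt
        self.integrator = max(-self.i_max, min(self.i_max, self.integrator))
        return self.kp * error + self.integrator

    def reset_i(self) -> None:
        self.integrator = 0.0


def simulate(reset_acro_on_arm: bool) -> float:
    """Return the first acro output after arming, with zero rate error."""
    rate_pi = PIController(kp=0.1, ki=0.5, i_max=5.0)
    acro_pi = PIController(kp=0.1, ki=0.5, i_max=5.0)

    # Disarmed on the ground: both controllers see a steady error
    # (an assumed value) for 5 seconds, so both I terms wind up.
    for _ in range(500):
        rate_pi.update(error=2.0, dt=0.01)
        acro_pi.update(error=2.0, dt=0.01)

    # Arming: the original rate terms are reset. The buggy variant
    # forgets the newly added acro terms.
    rate_pi.reset_i()
    if reset_acro_on_arm:
        acro_pi.reset_i()

    # Pilot switches to acro with the sticks centred (zero error).
    return acro_pi.update(error=0.0, dt=0.01)


if __name__ == "__main__":
    print("acro output, acro I term reset on arming:", simulate(True))
    print("acro output, acro I term NOT reset:      ", simulate(False))
```

With the reset in place the first acro output is zero; without it the stale I term is applied in full the moment acro is selected, even though the sticks ask for nothing.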

From the devlist (after a lot of posts hunting the bug):

> A few revisions back, We split the g.pi_rate_ term to give a duplicate
> set for ACRO only. This was a request based on feedback from the
> Multi-Wii guys.

> The new g.pi_acro_ PI terms didn't get added to the reset call.

great, that's an easy fix then. I'd also suggest we wipe all the I terms
at the end of startup_ground(), and all the places where we call
imu.init_accel() and imu.init_gyro(). Maybe have a reset_I_all() call ?

Cheers, Tridge
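
A single reset call like the one Tridge suggests could look roughly like the Python sketch below. `PITerm`, `PIRegistry` and `calibrate_gyro` are hypothetical names for this example, and this is not how ArduCopter actually organises its `g.pi_*` objects. The point is only that if every PI term is created through one registry, a set added later (such as the acro terms) cannot be left out of the reset.

```python
from typing import List


class PITerm:
    """Holds only the integrator; the P part does not matter for this sketch."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.integrator = 0.0


class PIRegistry:
    """Every PI term is created through add(), so reset_i_all() cannot miss one."""

    def __init__(self) -> None:
        self._terms: List[PITerm] = []

    def add(self, name: str) -> PITerm:
        term = PITerm(name)
        self._terms.append(term)
        return term

    def reset_i_all(self) -> None:
        # Zero every registered integrator in one place.
        for term in self._terms:
            term.integrator = 0.0

    def wound_up(self) -> List[str]:
        # Names of all terms whose integrator is currently non-zero.
        return [t.name for t in self._terms if t.integrator != 0.0]


def calibrate_gyro(registry: PIRegistry) -> None:
    # Stand-in for the real calibration step (assumption); the only part
    # shown here is that calibration ends by wiping all I terms.
    registry.reset_i_all()


if __name__ == "__main__":
    reg = PIRegistry()
    rate_roll = reg.add("rate_roll")
    acro_roll = reg.add("acro_roll")  # the later addition
    rate_roll.integrator = 1.5
    acro_roll.integrator = -3.0
    print("before:", reg.wound_up())
    calibrate_gyro(reg)
    print("after: ", reg.wound_up())
```

The same `reset_i_all()` call would then go at each of the places named in the mail (end of ground startup, accel init, gyro init).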

Acro was hardly used at that time, and nobody in the dev team flew acro. That's why the bug went unnoticed for a few versions.

Just so I'm clear on this, we're talking about a bug that was apparently found and fixed a long time ago?

If anything I see that the dev community has been doing exactly what you would expect.

When I made the comment you quoted, I was talking about the system that DJI utilizes, where you must submit and verify an email address before you can use their software. Yes, 3DR has your email when you purchase something, but you really can't expect them to hand over those addresses to an open source dev community. Maybe something could be set up where you register to receive emails about bugs when they happen... but if you're not interested in hearing about bugs in real time then that won't help you either.

If you insist on comparing this with DJI, it is at least worth mentioning that any bug found by the Naza team will be strictly an internal matter, and as here it will be fixed... you just won't know about it (unless it is a major problem).


© 2014   Created by Chris Anderson.
