How Free Software Development Should Work

While at EuroOSCON, I went to a presentation by a company called Zimbra. They have a very cool Ajax/DHTML (both nasty terms, but what’s better?) email and collaboration suite. Check out the demo. Afterwards, I mentioned I was from the Foundation, and one of their developers said that there was one particular bug in our XMLHttpRequest implementation which caused Firefox on Linux to spin using 100% CPU when used with Zimbra – bug 273578. They’d had to make their own internal builds of Firefox for testing to get around the problem. We looked the bug up, and there was a patch with review and super-review, but it hadn’t been checked in because the author didn’t have checkin rights and no-one had been asked to do it.

I sent some email from my laptop via the conference WiFi, and around an hour later Darin checked it in on the trunk. Zimbra QAed the builds on their web application and others, and confirmed that nothing seemed to be broken. So, 48 hours later, the patch made it into the branch as well, and Firefox 1.5 on Linux will work well with Zimbra.

I keep coming across people and companies who tell me they have problems with Firefox, but when I ask them, it turns out that they have never filed or CCed themselves on bugs. They just suffer in silence. This is an extreme case – the patch was a one-liner, and the lack of it hosed their product entirely! This isn’t how it’s supposed to work; as a free software project, we can be responsive to the needs of our users – but only if we know what they are. Once Zimbra had approached us about it and explained the problem, we were able to correctly prioritise the bug. Importantly, Zimbra stayed involved in the process and did their part (the QA) to make sure that the patch got where it needed to be. So if you’re in a similar situation, communicate! :-)

28 thoughts on “How Free Software Development Should Work”

  1. Pete: what’s nonsense? My points about how development needs to be cooperative and how we need to listen to our customers? Or something else?

  2. I can tell you two things I can see wrong with users helping find bugs with Firefox:

    1. Bugzilla is scary… downright terrifying, even. All those text boxes; and what on earth do half of them even do?! I think a cut-down “user” version would do miracles here.

    2. Talkback doesn’t work, and as far as I can tell, never has. It’s a bit discouraging to have a handy-dandy crash reporting pop up, and just sit there doing nothing. :(

  3. Daniel: By users, I don’t mean end-users, I’m talking here about web developers, addon developers and other such technically knowledgeable people. And Talkback does work – at least for me, and all the people whose reports are sitting here :-)

  4. Gerv: … ok, then. In that case, I wish to retract point 2 I made earlier, and instead replace it with this:

    2. Talkback has never worked on my machine. Stupid machine…

  5. I hate to agree, but bugzilla is really complicated for new people.

    That huge form that’s always displayed at the top of each bug is probably the biggest hurdle. It’s about as user-friendly as forcing Fx/Suite users to use about:config for everything.

  6. Pete, Gerv and I both talked to Kees at EuroOSCON, and his words were rather different to yours.

    Plus, there is a significant difference between a one line fix and a “bz and darin try to find the right thing” bug. The latter is *way scary*, and for one sure thing, nothing to be pushed this late. And that bug does have attention by two of our best heads (mentioned before), so this is far from ignorance on the side of Mozilla.

  7. https://bugzilla.mozilla.org/show_bug.cgi?id=277547

    So you’re happy to write patches even after lockdown from big-name companies like Zimbra (and remember 1 line patch got OpenBSD hosed), but not for ordinary users like me.

    I got hit by that little hyperthreading bug there. It’s lucky that I’m tech-literate enough to follow the workaround, but that Mozilla apps handle threading so badly in the first place is pretty crap coding. Your code reviews must really suck.

    Imagine some poor grandma who’s bought a new HT PC (which are quite common these days) getting hit!

    The worst is that it was assigned to Blake Ross who isn’t even working on Firefox any more but on his startup! There is no chance of this getting fixed despite causing crashes for tens of thousands of hyperthreaded machines, even in 1.5 Beta 2! But he’s too lazy even to reassign this reproducible crash to someone who could do something about it.

    The crappy QA doesn’t stop there. As I understand it you’ve got loads of patches which aren’t reviewed at all – how encouraging for new prospective developers – NOT!

    And this is only going to get worse now you’re the Mozilla “Corporation”. You’ve already shown you’re willing to prioritize non-urgent corporate bugs over real end-user frustrations.

    Firefox QA sucks. The worst is that there’s no accountability. When was the last time someone was told to take a mandatory safe-coding course when they checked in a security bug? Whoever’s in charge of QA should be canned.

  8. Perhaps before you launch into a whine-fest about the quality of the code someone else is giving you for free you should at least investigate the bug you say is obviously indicative of “crap coding” and “reviews [that] suck”. Currently no one who’s commented on it has given any clues at all as to where the problem is in the code, exactly, although several people have mentioned that upgrading their BIOS fixed the issue. Certainly if the reviews are as sucky as you say and this sort of thing can be easily caught, it will be easy for you to track down the problem and at least point it out to someone else, so you have more of a leg to stand on when complaining about how no one will write a patch for your bug.

    As far as I can tell, all that the Mozilla crew have shown is that they are willing to push through bugs that (a) have real patches, (b) have review, (c) are low- or no-risk for adding new bugs, and (d) have actual customer interest. Prioritizing those sorts of bugs over ones that have (d) only doesn’t sound like a sign of “poor QA” to me, but what do I know, I only do commercial programming for a living.

  9. If anything, this story highlights the need for change in the review/checkin process. Someone should be going through bugs with patches on a regular basis and checking them in.

  10. I’m sure Zimbra, as a recent startup with a handful of employees and no way of making money yet that I can see, would be flattered to be called a “big-name company”.

  11. “Perhaps before you launch into a whine-fest about the quality of the code someone else is giving you for free you should at least investigate the bug you say is obviously indicative of ‘crap coding’ and ‘reviews [that] suck’.”

    I am a supporter of open-source development and the free-software movement, however, I have to say it utterly turns me off when I hear FOSS people rub it into the user that they’re getting something for free.

    As long as this attitude is hanging around FOSS, continue to expect the majority to be willing to pay for commercial and proprietary solutions. After all, their selling point is, “you get what you pay for”, and comments like the quoted one above are simply reinforcing this selling point. A more helpful and humble attitude goes much further than “just be happy you’re getting this much for free”.

    I apologize for going off-topic, but I felt it necessary to speak out against this.

  12. One might suppose that the person who posted the patch in bugzilla thought they *were* communicating. After all, is there any other established channel for a non-insider to get their changes into the codebase? I thought that was exactly what bugzilla was for. Doesn’t the fact that a reviewed and super-reviewed patch (which fixed a real problem) just sat there uncommitted mean someone dropped the ball?

  13. From prior experience here at Zimbra (prior to the instance Gerv starts this post with), I can testify that the Mozilla/Firefox folks have been responsive in addressing some bugs that we’ve filed in the past….

  14. “Doesn’t the fact that a reviewed and super-reviewed patch (which fixed a real problem) just sat there uncommitted mean someone dropped the ball?”

    I haven’t looked at the bug in question, so it may not apply in this case, but in general, the answer is unfortunately “no”. It’s effectively been decided that there aren’t enough people resources to have a situation where no balls get dropped, so it’s quite possible for bugs to be the responsibility of the world in general, and thus get picked up by nobody.

    Hopefully someone (I’ll have a look at some point, but mostly I only have time to complain at the moment, and not do anything…) has responded to Boris’ call for help with rescuing patches from bugs that got expired.

  15. 1) BugZilla sux. It’s a developer tool, not an end-user error reporting frontend. Mozilla needs a simple form whose submissions are then catalogued by someone responsible.

    2) By “communicate” you mean what exactly? They filed a bug and obviously kept themselves updated via bugzilla. That seems as far as they should have to go to have it fixed ASAP, not after accidentally meeting you at a con… I think this whole example is wrong: users/businesses have no reason to waste time trying to catch developers at cons, on blogs or on IRC; they just want to file it and see it fixed (assuming they cooperate when possible and provide enough info, of course).

    Jan

  16. I find Andy’s post unfortunate. Notice how he qualifies his statement. I wish he could’ve been stronger in his statement, but obviously something’s kept him back.

  17. Is bugzilla really all that bad? It’s not meant for end users, and it’s probably a good thing as well, because end users probably won’t be able to file useful bug reports and would just fill the database with confused information. For end users there is Talkback (and apparently there was something called Hendrix or something, but I haven’t heard anything about it lately).

  18. “It’s effectively been decided that there aren’t enough people resources to have a situation where no balls get dropped”

    Given this well-known state of affairs, why would someone waste time posting to Bugzilla, triaging, creating a testcase, patching, etc.?

    I’ve pretty much stopped contributing to Bugzilla for this reason — most (I haven’t measured exactly how much) of my labor goes down the drain.

  19. Bugzilla is a very useful tool, but there is a *huge* backlog of qa/triaging work that needs to be done at b.m.o, as evidenced by the need for the recent mass auto-expiring of bugs. It makes b.m.o unusable for some, and bug reports can sit there without being touched for years. We’ve had Bug Days in the past that started to make a dent, but those always fizzled out after a few weeks.

    I agree with Alan that bugzilla isn’t meant for end users, who often file duplicate or incomplete bug reports, which creates work for the rest of us. Fortunately, now they have Hendrix and Reporter.

    mconner and others are discussing some good ideas for improving the bugzilla workflow here, such as having an additional READY state to denote bugs that have had all the proper qa done on them and are ready to be worked on by developers. I’d love to have extra keywords like “needsTestcase”, “needsConfirm” (or “needsConfirmLinux”, etc.), “needsTriage”, “notDupe”, etc., as opposed to the more generic “helpwanted” or “qawanted”. Then one can know exactly what needs to be done to a bug just by looking at the state and the keywords, and someone who likes to, say, make testcases (like me) can simply search for bugs with “needsTestcase”.
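
    To make the keyword idea concrete, here is a minimal sketch in Python. It is only an in-memory illustration, not Bugzilla’s actual data model or query interface: the Bug class, the bugs_needing helper and the sample bugs are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    """Hypothetical, simplified bug record (not Bugzilla's real schema)."""
    bug_id: int
    summary: str
    keywords: set = field(default_factory=set)


def bugs_needing(bugs, keyword):
    """Return the bugs tagged with the given fine-grained keyword."""
    return [bug for bug in bugs if keyword in bug.keywords]


# Made-up sample data, for illustration only.
SAMPLE_BUGS = [
    Bug(1, "Crash when opening a large page", {"needsTestcase"}),
    Bug(2, "Rendering glitch on Linux", {"needsConfirmLinux", "needsTriage"}),
    Bug(3, "Typo in a dialog", set()),
]

# Someone who likes writing testcases asks only for the bugs that need one.
for bug in bugs_needing(SAMPLE_BUGS, "needsTestcase"):
    print(bug.bug_id, bug.summary)
```

    The point is that a specific keyword turns “what does this bug need?” into a simple membership test, which a generic “qawanted” cannot do.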

    It also would be nice if we had a short period where all the developers and QA people focused on retriaging, reassigning, closing bad or obsolete bug reports, and otherwise cleaning up the bug reports in b.m.o.

  20. Gerv, while I’m on topic, I have a few random ideas for improvements to b.m.o, if you’re interested.

    (1) Shouldn’t we list Core as one of the products at https://bugzilla.mozilla.org/enter_bug.cgi ? (Or do you think the difference between the front and backend would be too confusing for users and they should file Core bugs under Firefox-General or Seamonkey-General and then have QA triage it appropriately?)

    (2) At https://bugzilla.mozilla.org/describecomponents.cgi?full=1, Core is grouped under “Components” even though it is a Product and not a Component. Maybe “Backend Software” would be a better grouping. (To confuse users more, there is a Component called Core, which is for Rhino).

    (3) Also, https://bugzilla.mozilla.org/describecomponents.cgi?full=1 is missing the AUS product.

    (4) Maybe b.m.o could block users from reporting a bug if their build of FX or SM is too old or not from the trunk (if the branch is too old). Or at least provide some sort of warning or popup dialogue. And there should be a link to a page where the difference between trunk and branch is explained.

    (5) Since many users are going to skip the checking for duplicates step anyway, perhaps we could automatically do a scan based on the summary (similar to “Find a specific bug”) after the user hits “submit”.

    (6) I think it would be simpler to change Status and Resolution to:

    State: Open or Closed
    Status: if Open, then UNCONFIRMED, NEW, READY, ASSIGNED, or REOPENED
    if Closed, then FIXED, INVALID, WONTFIX, DUPLICATE, or WORKSFORME

    This also avoids the current possibility of searching for meaningless combinations like UNCONFIRMED, FIXED. (Although this would probably be a lot to code…).

    (7) Bugs that are regressions should have the option of listing which bug they are a regression of, if known (as with depending and blocking).

    (8) On query page, it’s worth noting that you can ctrl-click on selections to unselect them (at least when help is selected).

    Hope at least one of these is useful.
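
    As a rough sketch of idea (6), the two-level State/Status model could look something like this in Python. The names State, ALLOWED_STATUSES, is_valid and state_for are hypothetical and chosen for the example; this is not how Bugzilla is implemented.

```python
from enum import Enum


class State(Enum):
    OPEN = "Open"
    CLOSED = "Closed"


# Statuses permitted under each state, as listed in idea (6).
ALLOWED_STATUSES = {
    State.OPEN: {"UNCONFIRMED", "NEW", "READY", "ASSIGNED", "REOPENED"},
    State.CLOSED: {"FIXED", "INVALID", "WONTFIX", "DUPLICATE", "WORKSFORME"},
}


def is_valid(state, status):
    """True if the status is one of those permitted for the given state."""
    return status in ALLOWED_STATUSES[state]


def state_for(status):
    """Derive the state from a status; each status belongs to one state."""
    for state, statuses in ALLOWED_STATUSES.items():
        if status in statuses:
            return state
    raise ValueError(f"unknown status: {status}")


print(is_valid(State.OPEN, "READY"))   # an open bug may be READY
print(is_valid(State.OPEN, "FIXED"))   # FIXED only makes sense when closed
print(state_for("WONTFIX").value)
```

    Because every status belongs to exactly one state, a search form built on this table could only ever offer combinations that mean something.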

  21. For me the system worked for getting this one fixed. Responses are usually within days. The patch was only waiting a few days before being processed. When I talked to Axel and Gerv they were already working on it as Zimbra needed the fix. But I didn’t have the idea that the process had stopped. It was just close to the code-freeze.

    It does help to spend time learning bugzilla and, maybe more importantly, the structure of the modules. I had to learn how to get the code, build it, find the troublespot, make improvements, test and learn how to make patches.

    The “soft” part was harder. Who does what and is supposed to do what? The documents describe it in generic terms, without names etc. I ended up writing an email to Darin directly asking how the process should work and what I could do. He filled in the gaps with who to ask for review etc.

    I am happy with how this bug was solved (just in time) for us.

    I am working on a big project porting a (very) rich browser application from IE to Firefox. Some bugs are blocking like this one, but in general the 1.5 release is real good and stable. Just the XML handling is a drama. In IE you can ignore namespaces; FF demands coding according to standards……

  22. Man. Talk about misinformation. “there was a patch with review and super-review, but it hadn’t been checked in because the author didn’t have checkin rights and no-one had been asked to do it” is not exactly what was happening. What WAS happening was:

    2005-10-17 19:08:12 PDT — I mark sr+ on the patch
    2005-10-18 09:00:03 PDT — gerv here cc’s a Zimbra guy on the bug
    2005-10-18 12:45:57 PDT — Darin checks in the patch to trunk

    Now. That patch had reviews for all of one night by the time Gerv came poking. Normally, the person writing the patch would wake up the next morning, see the review comments, address them, and attach the updated patch. Then this patch would be checked in.

    Now I’m glad that through swift action on Gerv’s and Darin’s part we got this bug fixed as much as 12 hours earlier than it might have happened otherwise. But given that there were 4 days left until lockdown at that point, I think the chance of that patch missing Gecko 1.8 was about 0.

    I have no idea what people are doing comparing this bug to bugs where no one even has an idea as to what the issue is and claiming that those should be fixed for 1.8.

  23. Well, I got the impression from Zimbra that they were having trouble getting the patch progressed. If I misunderstood things, apologies.