NEW RULE: bug reports using the word “regression” will be ignored

Here’s a fun game. I’ll describe a problem, and you guess whether it’s a regular bug, or a regression!

  1. “This code worked this way before, but now it works differently!”
  2. “A feature I really need was removed!”
  3. “An undocumented feature changed and now my code is broken!”
  4. “The old program had this feature, but its replacement doesn’t!”
  5. “The old program used to do it differently!”
  6. “This thing was broken in the previous release, and it’s still broken!”
  7. “I’m pretty sure it worked this way before, but now it doesn’t!”

Answers:

  1. none of them
  2. seriously none of them
  3. actually, stop saying “regression” altogether
  4. seriously just stop it

So, starting immediately and continuing through all time and space forevermore (no backsies), bug reports that use the word “regression” incorrectly will be ignored.

Here’s a mini-FAQ that should help you understand how to use “regression” correctly:

Q. When should I report a bug as a “regression”?
A. Never.

Q. But what about-
A. No. Seriously. Stop it. It’s not a “regression”. It’s just a bug.

Q. But this used to work!
A. I know! It’s frustrating when things change!

Q. But it’s really important!
A. So say “it’s really important” in the bug report!

Q. Should I set the “priority” and “severity” of my bug higher if I think it’s a regression?
A. Sure – we ignore those, too!

Q. You’re joking, right?
A. Of course. We’ve always done it this way.

Q. Is there any time the word “regression” actually applies?
A. Okay, fine. There is one way a bug can actually be considered a proper regression. You need three things:

  1. A specification (not documentation! not a blog post!) that the developers actually adhere to when designing and writing their code,
  2. Evidence that a previous version of the code behaves as described by the spec, and
  3. Evidence that the current code does not behave according to the spec.

Q. How often do bugs that claim to be regressions actually turn out to be regressions?
A. This has never happened in the entire recorded history of mankind. Approximately.

Q. Wow. It sounds like “regression” is basically useless in bug reports, and we should stop using it!
A. That’s not a question. But I’ll allow it, because it’s insightful and you’re very handsome.

depcheck: tags and timing

In the previous three blog posts I talked quite a bit about the depcheck test itself. But tests don’t run in a vacuum – there are a lot of moving pieces around the test which have to work properly for the test to provide accurate – and useful – results.

d(repo)/dt

First let’s talk about how builds become updates – that is, the steps between a new package getting built, and a new update appearing in your friendly Software Updater app. Behold the terrifying extent of my Inkscape skills:

New packages get built in Koji, and then when the maintainers are happy with a build, they file an Update Request in Bodhi. Normally they request that the update go into the updates-testing repository, so testers can poke at it for a while and make sure it works, and then the maintainer requests that it go to the official updates repository. And then it shows up in Software Updater.

Note that the maintainer makes the request, and that makes Bodhi tag the package as pending – but only members of the Release Engineering (rel-eng) team can actually push packages into updates-testing or updates. So in the end, it’s up to the Release Engineers to decide when a package is ready for release – based on automated test results, feedback from testers and developers, and their own best judgement.

This also means that depcheck can’t move packages into updates or updates-testing. It can mark them as approved – for example, by providing Bodhi feedback, or keeping test results somewhere – but it can’t actually move packages out of the pending list by itself.

Acceptance

This presents a problem. What happens if a pending update is accepted by depcheck, but before rel-eng pushes it into the wild, another update appears in pending and somehow conflicts with the first one?

For example: let’s say the martini package requires gin and vermouth. But then someone introduces a new package – vodka – which Obsoletes: gin, so installing it would remove gin and leave martini with an unmet dependency. (Yes, I know – this would be wrong and horrible and should obviously be forbidden by Fedora policy and/or the Geneva Convention. But let’s put that aside for now.)

So – what do we do? We could try to go back and revoke our previous test result, which would leave one or both updates unaccepted. We’d need to write a bunch of new code to be able to revoke test results, and that would leave us with fewer packages being accepted. This seems... less than desirable.

Another solution – the one that I prefer – is that once a package is accepted, we should leave it accepted. Basically we’d treat it as if it were already part of the live repos. This makes sense: since the goal is always to have a consistent set of packages, once a package is accepted as being consistent we shouldn’t mess with that. (Plus, it’s probably about to be pushed by rel-eng anyway, so it’s not unreasonable to treat it as if it already has been pushed live.) So in practice, this means the first test result takes priority, and we just don’t revoke accepted packages. This makes the code simpler and it should mean packages get accepted more quickly.

To handle this, we need to be able to split the pending list into two parts: pending and accepted. depcheck treats the packages in accepted like the packages in the live repos, and doesn’t provide test results for them (obviously – they’ve all passed already!). Only the unapproved pending packages actually get test results.
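Here’s a toy sketch of that split, using the martini example from above. To be clear, this is not the real depcheck code: the Package class, the broken_requires and depcheck functions, and the olive package are all made up for illustration, and the dependency check is far cruder than what yum actually does. The point is only that accepted packages join the baseline and never get a new result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    # Hypothetical stand-in for an RPM: just a name plus the package
    # names it requires and obsoletes.
    name: str
    requires: frozenset = frozenset()
    obsoletes: frozenset = frozenset()


def broken_requires(packages):
    """Return {package name: missing requirements} for a set of packages."""
    obsoleted = set().union(*(p.obsoletes for p in packages))
    present = {p.name for p in packages} - obsoleted
    return {
        p.name: sorted(p.requires - present)
        for p in packages
        if p.name in present and not p.requires <= present
    }


def depcheck(live, accepted, pending):
    """Test each pending package against live + accepted.

    Accepted packages are part of the baseline, so they are never
    re-tested and never appear in the results.
    """
    baseline = list(live) + list(accepted)
    return {
        candidate.name: not broken_requires(baseline + [candidate])
        for candidate in pending
    }


live = [Package("gin"), Package("vermouth")]
accepted = [Package("martini", requires=frozenset({"gin", "vermouth"}))]
pending = [Package("vodka", obsoletes=frozenset({"gin"})), Package("olive")]

results = depcheck(live, accepted, pending)
print(results)  # {'vodka': False, 'olive': True}
```

Because martini was accepted first, it stays accepted, and vodka is the one that fails for breaking it.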

Accepting a package doesn’t lock it or anything like that. It can still be removed from pending by the usual means – obsoleted by the maintainer submitting a newer package, dropped because of bad karma in Bodhi, forcibly removed by rel-eng, etc. Being accepted just means that the package has passed AutoQA and is eligible to be pushed by rel-eng – if they see fit.

Timing: more than just the secret to comedy

There’s another wrinkle. The AutoQA system runs all tests independently of each other. This is nice because it means we can run a lot of tests in parallel, but it also means that we can have multiple depcheck tests running at the same time. Which presents a problem: what if there are two depcheck tests running at the same time, and one test marks some packages as accepted while the other test is still running? What should the other test do?

This is a classic concurrency problem, and there are many possible ways to resolve it – usually involving locking or looping or both. We had a few ideas for simple solutions – for example, we could restart the test if the list of accepted packages changed during the test. Except what if we get stuck in a loop? And also this would change the test results in some cases – so even though it’s the same test and the same packages, the results would be different if Test #1 finishes before Test #2 starts. Why should test timing affect the test results?

After a lot of whiteboard sketching and hand-wringing and test code, we realized that the simplest solution is: just don’t run depcheck tests in parallel. (At least, not tests for the same release – we can still run depchecks for Fedora 13 alongside Fedora 14 tests, since they don’t interact at all.) True, this is less efficient, but the current runtime for a depcheck test is something like 50-60 seconds. During our busiest time ever, we pushed 1300 updates through Bodhi in a month. This works out to 43 updates a day – or somewhere between 35 and 45 minutes of depcheck test time daily, on average. Even if we had a huge burst of updates – say 250 updates submitted simultaneously – and for some reason depcheck takes 10x longer than I expect, rel-eng only pushes updates once a day anyway! At 10 minutes per test, by the time of the first rel-eng push a day later we’d have processed ~144 of the updates, and the rest would be done the next day. So even in a worst-case scenario the outcome is: fewer than half of the updates get delayed by one day. That’s it!
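If you want to check my math, here’s the back-of-the-envelope version as a few lines of Python. The assumptions are mine: a 30-day month, one depcheck run per update, “10x longer” meaning 10 times the 60-second upper estimate, and a full 24 hours before the first rel-eng push. The function names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_test_minutes(updates_per_month, seconds_per_test, days_per_month=30):
    """Average minutes of serialized depcheck time needed per day."""
    updates_per_day = updates_per_month / days_per_month
    return updates_per_day * seconds_per_test / 60


def processed_in_one_day(seconds_per_test):
    """How many back-to-back tests fit into 24 hours."""
    return SECONDS_PER_DAY // seconds_per_test


# Normal load: 1300 updates a month at 50-60 seconds per test.
low = daily_test_minutes(1300, 50)
high = daily_test_minutes(1300, 60)
print(f"{low:.1f} to {high:.1f} minutes per day")  # 36.1 to 43.3 minutes per day

# Worst case: 250 updates at once, each test taking 10x the 60-second estimate.
burst = 250
processed = processed_in_one_day(10 * 60)
delayed = burst - processed
print(processed, delayed)  # 144 106
```

So 106 of the 250 updates (about 42%) would slip to the next day’s push.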

In the future we will definitely want to figure out a general strategy for handling tests that want to share information and need some locking/concurrency magic. But this turns out to be unnecessary for depcheck to function correctly and quickly. So we’ll leave it alone. For now.

So what’s left?

Not much! We should be ready to start running depcheck on new updates – in a purely informative manner – in the next couple of days. And once we’re pretty sure the results are right, we’ll start work to make Bodhi only show accepted updates to rel-eng. If it all works as it should, we should be able to use this info to keep broken updates from getting into the live repos ever again. Won’t that be nice?