Software testing and the Deming experiment.

Sep 7, 2018

I recently watched an interesting video of a lecture by Dr. W. Edwards Deming in which he presented his famous Red Bead Experiment. The premise is simple: a factory is represented by a box full of beads, 80% of them white and 20% red. Workers are asked to dip a paddle into the box and bring it out with 50 beads resting in its small holes.

The customer of this factory demands no more than 5 defects (red beads) in each shift (paddle load). To get the whole idea, watch the video "Red Bead Experiment with Dr. W. Edwards Deming".

Since I am a tester, the red beads somehow started to remind me of bugs in a piece of software.

In Deming’s original experiment, there were people at the end of the production line counting the red bead defects and others who recorded them. We are told that the customer will cancel the project if they see too high a number of defects delivered. The experiment shows that even when the process is followed flawlessly, we can still see a lot of variation in the output from day to day.
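To see how much of that variation is pure chance, the experiment can be sketched as a small simulation. The post only gives the 80/20 split, so the box size of 4,000 beads, the function names and the seed below are assumptions made for illustration.

```python
import random


def draw_paddle(rng, red=800, white=3200, holes=50):
    """Count the red beads in one paddle load, drawn without replacement.

    The bead counts are assumed; only the 80/20 ratio comes from the text.
    """
    box = [True] * red + [False] * white  # True marks a red bead
    return sum(rng.sample(box, holes))


def simulate_shifts(n_shifts, seed=0):
    """Simulate n_shifts paddle loads, refilling the box between loads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [draw_paddle(rng) for _ in range(n_shifts)]


counts = simulate_shifts(1000)
mean_red = sum(counts) / len(counts)
# Share of shifts that meet the customer's limit of 5 red beads.
pass_rate = sum(c <= 5 for c in counts) / len(counts)

print(f"fewest red: {min(counts)}, most red: {max(counts)}")
print(f"average red per paddle: {mean_red:.2f}")
print(f"shifts within the limit: {pass_rate:.1%}")
```

With these assumptions the expected count is 10 red beads per paddle (20% of 50), so the limit of 5 is met only on a lucky draw, whoever holds the paddle.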

This is certainly true of software development too. No matter how perfectly we code, there will always be variation in the platforms we use, in the architecture we’ve chosen, in the production environment and even in the habits of the people using our product.

The number of red beads will always have a luck-factor in it.

Of course, the bead experiment is designed to demonstrate the impact of bad management practices on production. That is why it is so frustrating to watch the system at work. In managing teams, there are two faulty assumptions that I’ve seen far too often in traditional testing teams as well. Think about the key performance indicators of a testing team, for example.

The first assumption is that each worker can control his or her performance. Deming estimated that 94 percent of the variation in any system is attributable to the system, not to the people working in it. Yes, the percentage is arguable, but the point remains.
The second assumption is that any system variation will be equally distributed across people or teams. Deming claimed that there is no basis for this assumption in real-life experience.
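A quick sketch makes both assumptions look shaky. Here several workers with exactly the same skill draw from the same box. The number of workers and days, the seed and the function name are hypothetical, and each hole is modelled as an independent 20% chance of a red bead, which is a simplification.

```python
import random


def worker_totals(n_workers=6, n_days=4, p_red=0.2, holes=50, seed=1):
    """Total red beads per worker when every worker has identical odds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(n_workers):
        total = 0
        for _ in range(n_days):
            # One paddle load: each hole holds a red bead with probability p_red.
            total += sum(rng.random() < p_red for _ in range(holes))
        totals.append(total)
    return totals


totals = worker_totals()
best, worst = min(totals), max(totals)

for worker, total in enumerate(totals, start=1):
    print(f"worker {worker}: {total} red beads")
print(f"gap between 'best' and 'worst' worker: {worst - best}")
```

Any ranking built from `totals` reflects nothing but the random draws, because the code gives every worker the same odds. A KPI based on such a ranking would reward and punish luck.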

With these kinds of assumptions at play, people will first fight to improve the process. If bad management keeps that from working, people tend to start playing games to improve the metrics.

Now think about software projects. In the development phase, bugs will appear in the project daily. Instead of trying to manage the efficiency of the people involved, we must change the way we deal with the bugs in the first place.

The number of red beads at the end of the production line always has a luck factor in it. To tilt that luck factor in our favor, there is only one thing to do.

Watching the bead experiment unfold, it’s easy to see how stupid the production process really is. Why would anyone in their right mind count the red beads only at the end of the production line?

It would be a lot easier to count the beads right after the paddle is out of the box and then re-paddle if necessary to meet the required standard. That way everyone could keep their jobs and the customer would be happy too.

But if we are tweaking the process anyway, why not just start handpicking red beads out of the box before paddling? The line of questioning could go on and on.

The point is really simple. The earlier in the process we deal with the red beads, the better the pass rate in the end. In other words, fewer bugs in the final version.
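The effect of moving the check earlier can be estimated with a rough calculation. This sketch treats each of the 50 holes as an independent draw (a binomial approximation, which is an assumption), and the function names and the example red-bead shares are hypothetical.

```python
from math import comb


def pass_probability(p_red, holes=50, max_red=5):
    """Chance that one paddle load holds at most max_red red beads."""
    return sum(
        comb(holes, k) * p_red**k * (1 - p_red) ** (holes - k)
        for k in range(max_red + 1)
    )


def pass_with_retries(p_red, attempts, holes=50, max_red=5):
    """Chance of a passing load when we may re-paddle up to `attempts` times."""
    single = pass_probability(p_red, holes, max_red)
    return 1 - (1 - single) ** attempts


# Counting only at the end of the line: one draw from the original box.
end_of_line = pass_probability(0.20)
# Counting right after paddling and re-paddling when needed (3 tries assumed).
repaddle = pass_with_retries(0.20, attempts=3)
# Handpicking red beads first, leaving an assumed 5% red in the box.
cleaned_box = pass_probability(0.05)

print(f"count at the end:     {end_of_line:.1%}")
print(f"re-paddle up to 3x:   {repaddle:.1%}")
print(f"clean the box first:  {cleaned_box:.1%}")
```

Under these assumptions, counting only at the end passes roughly one paddle in twenty. Re-paddling helps, and removing red beads from the box before paddling helps far more, which is the same argument as catching bugs before they reach the final build.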

Written by Antti Niittyviita