Have you experienced a “bad luck” test failure?

Wednesday, December 7th, 2011 by Robert Cravotta

Despite all of the precautions that the Mythbusters team takes when doing their tests, the team accidentally launched a cannonball into a neighborhood and through a home. The test consisted of firing a 6-inch cannonball out of a homemade cannon to measure the cannonball’s velocity. The cannonball was fired at a sheriff’s bomb disposal range, and it was supposed to hit large containers filled with water. The projectile missed the containers and made an unlucky bounce off a safety beam, which sent it into the nearby neighborhood. Luckily, although the cannonball caused property damage and careened through a house with people sleeping in it, no one was hurt.

This event reminds me of a number of bad luck test failures I have experienced. Two different events involved similar autonomous vehicle tests, but the failures were due to interactions with other groups. In the first case, a test flight failed because of a delay that was itself meant to ensure that the test could complete successfully. In this test, we had a small autonomous vehicle powered by rocket engines. The rocket fuel (MMH and NTO) is very dangerous to work with, so we handled it as little as possible. We had fueled up the system for a test flight when word came down that the launch was going to be delayed because we were using the same kind of launch vehicle that had just experienced three failed flights before our test.

While we waited for the failure analysis to complete, our test vehicle was placed into storage with the fuel still in it (there really was no way to empty the fuel tanks, as the single-test system had not been designed for that). A few months later we got the go-ahead on the test, and we pulled the vehicle out of storage. The ground and flight checkouts passed with flying colors and the launch proceeded. However, during the test, once our vehicle blew its ordnance to allow the fuel to flow through the propulsion system, the seals catastrophically failed and the fuel immediately vented. The failure occurred because the seals were not designed to be in constant contact with the fuel for the months that the vehicle was in storage. The good news was that all of the electronics were operating correctly; the vehicle simply had no fuel to do what it was intended to do.

The other bad luck failure was the result of poor communication about an interface change. In this case, the system had been built around a 100 Hz control cycle. A group new to the project decided to change the inertial measurement unit so that it operated at 400 Hz. The change in sample rate was not communicated to the entire team, and the resulting test flight failed spectacularly, with the vehicle spinning out of control.
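To make the sample-rate problem concrete, here is a minimal Python sketch of one way such a mismatch can bite. It is not the code from that project; the function names (`integrate_rate`, `check_rates`) and the numbers in the usage note are hypothetical, chosen only to illustrate a consumer that assumes 100 Hz while the sensor delivers 400 Hz.

```python
def integrate_rate(samples_deg_per_s, assumed_hz):
    """Integrate angular-rate samples into an angle.

    The time step comes from the rate the consumer *assumes*,
    not from the rate the sensor actually runs at.
    """
    dt = 1.0 / assumed_hz
    return sum(rate * dt for rate in samples_deg_per_s)


def check_rates(imu_hz, control_hz):
    """Hypothetical interface check: refuse to run on a rate mismatch."""
    if imu_hz != control_hz:
        raise ValueError(
            f"IMU runs at {imu_hz} Hz but control loop assumes {control_hz} Hz"
        )
```

For example, one second of a steady 10 deg/s rotation sampled at 400 Hz is `[10.0] * 400`. Integrating with `assumed_hz=400` gives about 10 degrees, but integrating with `assumed_hz=100` gives about 40 degrees, a fourfold error. A check like `check_rates(400, 100)` at startup would raise before the vehicle ever left the ground.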

In most of the bad luck failures I am aware of, the failure occurred because of assumptions that masked or hid the consequences of miscommunication or unexpected decisions made by one group within the entire team. In our case, the tests were part of a series of tests and they mostly cost us precious time, but sometimes such failures are more serious. For example, the Mars Climate Orbiter was lost in 1999 as it tried to enter orbit around Mars. The root cause of that failure was a mismatch in the measurement systems used. One team’s software reported thruster impulse in English units (pound-force seconds) while another team’s software expected metric units (newton-seconds).
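One common guard against this kind of unit mismatch is to stop passing bare numbers across an interface and carry the unit in the type instead. The following Python sketch shows the idea; it is an illustration only, not the orbiter’s software, and the `Impulse` class and `total_impulse` function are hypothetical names.

```python
from dataclasses import dataclass

LBF_TO_N = 4.4482216152605  # newtons per pound-force


@dataclass(frozen=True)
class Impulse:
    """An impulse stored internally in newton-seconds."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value):
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value):
        # Convert at the boundary so everything downstream is metric.
        return cls(float(value) * LBF_TO_N)


def total_impulse(burns):
    """Sum a list of Impulse values; reject bare numbers with no unit."""
    for burn in burns:
        if not isinstance(burn, Impulse):
            raise TypeError(f"expected Impulse, got {type(burn).__name__}")
    return Impulse(sum(burn.newton_seconds for burn in burns))
```

With this in place, a caller has to say which unit a number is in, for example `Impulse.from_pound_force_seconds(1.0)`, and handing `total_impulse` a raw float raises a `TypeError` instead of silently producing a result that is off by a factor of about 4.45.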

I guess calling these bad luck failures is a nice way to say a group of people did not perform all of the checks they should have before starting their tests. Have you ever experienced a “bad luck” failure? What was the root cause for the failure and could a change in procedures have prevented it?


6 Responses to “Have you experienced a “bad luck” test failure?”

  1. C. @ LI says:

    You make your own bad luck.

  2. D. @ LI says:

    “Bad luck” is a managerial euphemism for mismanagement. Don’t ever use it; it provides license to those who make decisions without understanding the implications of those decisions.

    Clearly the test planning team did not do an adequate analysis (a risk assessment of the time delay after fueling). More importantly, however, the test was a success. You learned that seals deteriorate (there clearly were cheaper ways to accomplish this; why those evaluations were not conducted is another question), as NASA learned during the administration of Ronald Reagan, when they decided to launch on a rather cold day.

  3. R. @ LI says:

    Bad luck and trouble’s my only friend. If I didn’t have bad luck, I wouldn’t have no luck at all.

    “Management,” or whoever, cannot be expected to know everything that could occur out there in the real world. To plan perfectly, one would have to know everything, and if you tried to do that you would end up with paralysis by analysis.

    The point is to have “resilience” – the ability to recover when things don’t go according to plan.

  4. J. @ LI says:

    Failures from mismatched units in software have occurred far too often and cannot be blamed on “bad luck”. Some occur because of incomplete or inadequate systems engineering specifications, and some have been due to “software reuse”. Both types of errors can and should be eliminated in early development design/interface reviews.

  5. R. @ LI says:

    1. Plugged in one more card into the test equipment rack. The failure (that took a while to diagnose) was that the power supply was under-specified. The result was anomalies in the test equipment that were seen as anomalies in the UUT (unit under test).

    2. The air-conditioning unit in the ceiling of the lab failed and dripped water right on top of the UUT.

    3. The equipment when delivered was too tall for the doorway of the receiving bay.

    4. Airflow plenums in the UUT were accidentally shaped in a manner as to condense water – right into the UUT.

    5. Levered my finger against a toggle switch. It didn’t move. Pressed harder – it broke. It needed to be “pulled” slightly before pressing.

    6. Plugged a 12v supply into a 6v device. The “adapter” for connecting 6v and 12v was the same.

    These were different projects (mostly). I could go on.

  6. D. @ LI says:

    I agree with Chris – those failures had nothing to do with luck.

    The fuel issue was two different design problems: the first was not having any way to empty a fuelled launcher, and the second was not appreciating the effects that prolonged exposure to the fuel would have on the seals.

    The change of cycle rate for only part of the project was a lack-of-design problem!
