The collision between an Airbus A380 and a Bombardier CRJ-700 this week at John F. Kennedy International Airport in New York City reminded me of some parallels with, and lessons learned from, the time we upgraded a target processor to a faster version. I shared one of the lessons from that upgrade in an article about adding a version control inquiry to the system. A reader pointed out that the solution we used could still suffer from a versioning mismatch and suggested that version identifications also include an automatically calculated date and time stamp of the compilation. In essence, these types of changes to our integration and checkout procedures helped mitigate several sources of human or operator error.
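To make the reader's suggestion concrete, here is a minimal sketch in C of a version inquiry response that carries a compile-time stamp. It is not the code from our system: the `FW_VERSION` label, the `version_inquiry` function, and the response format are all hypothetical names chosen for illustration. The only parts taken from the language itself are the standard `__DATE__` and `__TIME__` macros, which the compiler fills in automatically so that nobody has to remember to update them by hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, manually maintained version label (an assumption for this sketch). */
#define FW_VERSION "1.4.2"

/* __DATE__ ("Mmm dd yyyy") and __TIME__ ("hh:mm:ss") are standard C macros that
   the compiler expands when this file is translated, so the stamp is generated
   automatically rather than typed in by a person. */
static const char build_stamp[] = __DATE__ " " __TIME__;

/* Writes the version inquiry response into buf.
   Returns the number of characters written (excluding the terminating NUL),
   or -1 if the response did not fit in the buffer. */
int version_inquiry(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "%s (built %s)", FW_VERSION, build_stamp);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

One caveat with this approach: the stamp reflects when this particular source file was compiled, so the build would need to recompile it every time for the stamp to track the whole image. Two images that report the same manual version label but different stamps would then be distinguishable during integration and checkout.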
The A380 is currently the world’s largest passenger jet, with a wingspan of 262 feet. The taxiways at JFK Airport are a standard 75 feet wide, but this accident is not purely the result of the plane being too large: an operation plan for handling A380s at JFK Airport has been in successful use since the third quarter of 2008. The collision between the A380 and the CRJ appears to be the result of a series of human errors stacking on top of each other (similar to the version inquiry scenario). Scanning the 36-page operation plan for the A380 provides a sense of how complicated it is to manage ground operations for these behemoths.
Was the A380 too large for the taxiway? Did the CRJ properly clear the taxiway (per the operation plan) before the A380 arrived? Did someone in the control tower make a mistake in directing those two planes to be in those spots at the same time? Should someone have been able to see what was going to happen and stop it in time? Should the aircraft sensors have warned the pilot that a collision was imminent? Was anyone in this process less alert than usual or distracted at the wrong time? A number of air traffic controllers have been caught sleeping on the job within the last few months, with the third instance happening this week.
When you make changes to a design, especially when you add a bigger and better version of a component into the mix, it is imperative that the new component be put through regression testing to make sure no assumptions are broken. Likewise, the change should trigger an effort to ensure that the implied (or tribal knowledge) mechanisms for managing the system accommodate the new ways that human or operator error can affect the operation of the system.
Do you have any anecdotes that highlight how a new, bigger and better component required your team to change other parts of the system or its procedures to mitigate new types of problems?