Entries Tagged ‘Safety’

Do you employ “Brown M&Ms” in your designs?

Wednesday, January 25th, 2012 by Robert Cravotta

I have witnessed many conversations where someone accuses a vendor of forcing customers to use only their own accessories, parts, or consumables as a way to extract the largest amount of revenue out of the customer base. A non-exhaustive list of examples of such products includes parts for automobiles, ink cartridges for printers, and batteries for mobile devices. While there may be some situations where a company is trying to own the entire vertical market around their product, there is often a reasonable and less sinister explanation for requiring such compliance by the user – namely to minimize the number of ways an end user can damage a product and create avoidable support costs and bad marketing press.

The urban legend that the rock band Van Halen employed a contract clause requiring a venue to provide a bowl of M&Ms backstage, but with all of the brown candies removed, is not only true but also provides an excellent example of such a non-sinister explanation. According to the autobiography of David Lee Roth (the band’s lead singer), the bowl of M&Ms with the brown candies removed was a nearly costless way to test whether the people setting up their stage had followed all of the details in the band’s extensive setup and venue requirements. If the band found a single brown candy in the bowl, they ordered a complete line check of the stage before they would agree that the entire stage setup met their safety requirements.

This non-sinister description is consistent with the types of products whose vendors people accuse of merely locking them into consumables for higher revenues. When I examine the details, however, I usually see a machine, such as an automobile, that requires tight tolerances on every part; otherwise, small variations in non-approved components can combine to create unanticipated oscillations in the body of the vehicle. In the case of printers, variations in the ink formula can gum up the mechanical portions of the system across the wide range of temperature and humidity environments in which printers operate. And mobile device providers are very keen to keep the rechargeable batteries in their products from exploding and hurting their customers.

So, do you employ some clever “Brown M&M” in your design that helps to signal when components may or may not play together well? This could be as simple as performing a version check of the software before allowing the system to go into full operation. Or is the concept of “Brown M&Ms” just a story to cover greedy practices by companies?
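As a simple illustration of the version-check idea, here is a minimal sketch in C of how a system might refuse to enter full operation until every subsystem reports an approved firmware revision. All of the names here (read_subsystem_version, the version ranges) are hypothetical, not any particular platform’s API.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical "brown M&M" check: the controller refuses to enter full
     * operation unless every subsystem reports a firmware version inside
     * its approved range. */
    typedef struct {
        uint16_t min_version;   /* oldest approved firmware revision */
        uint16_t max_version;   /* newest approved firmware revision */
    } version_range_t;

    /* Assumed platform hook that queries subsystem 'id' for its version. */
    extern uint16_t read_subsystem_version(int id);

    bool all_subsystems_approved(const version_range_t *approved, int count)
    {
        for (int id = 0; id < count; ++id) {
            uint16_t v = read_subsystem_version(id);
            if (v < approved[id].min_version || v > approved[id].max_version) {
                return false;   /* a brown M&M: halt and inspect the system */
            }
        }
        return true;            /* every component is one we have qualified */
    }

Like the bowl of candy, the check itself is nearly free; its value is that a single failure tells you to stop and inspect everything else before going live.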

How should embedded systems handle battery failures?

Wednesday, November 30th, 2011 by Robert Cravotta

Batteries – increasingly we cannot live without them. We use batteries in more devices than ever before, especially as the trend to make a mobile version of everything continues its relentless advance. However, the investigation and events surrounding the battery fires in the Chevy Volt are yet another reminder that every engineering decision involves tradeoffs. In this case, damaged batteries, especially large ones, can cause fires. This is not the first time we have seen damaged-battery issues – remember the exploding cell phone batteries from a few years ago? That problem has not been completely licked, as there are still reports of exploding cell phones even today (in Brazil).

These incidents remind me of when I worked on a battery charger and controller system for an aircraft. We put a large amount of effort into ensuring that the fifty-plus-pound battery could not and would not explode, no matter what type of failure it might endure. We had to develop a range of algorithms to constantly monitor each cell of the battery and respond appropriately if anything improper started to occur in any of them. One additional constraint on our responses, though, was that the battery had to deliver power whenever the system demanded it, even with parts of the battery damaged or failing.
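Here is a minimal sketch, in C, of the kind of per-cell monitoring loop described above. Every name and threshold is hypothetical; real limits would come from the cell chemistry and the system’s safety analysis.

    #include <stdbool.h>

    #define CELL_COUNT 20

    /* Assumed per-cell limits -- purely illustrative values. */
    #define V_MIN  2.5f     /* volts */
    #define V_MAX  4.2f     /* volts */
    #define T_MAX  60.0f    /* degrees C */

    typedef struct {
        float voltage;
        float temperature;
        bool  isolated;     /* true once the cell has been bypassed */
    } cell_t;

    extern void bypass_cell(int index);   /* assumed hardware hook */

    /* Isolate any cell that leaves its safe envelope, but keep the rest of
     * the string delivering power, per the requirement described above. */
    void monitor_cells(cell_t cells[CELL_COUNT])
    {
        for (int i = 0; i < CELL_COUNT; ++i) {
            if (cells[i].isolated)
                continue;
            if (cells[i].voltage < V_MIN || cells[i].voltage > V_MAX ||
                cells[i].temperature > T_MAX) {
                bypass_cell(i);            /* remove the bad cell from service */
                cells[i].isolated = true;  /* ...while the battery keeps delivering power */
            }
        }
    }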

Keeping a battery operating as well as it can under all conditions may sound like an extreme requirement, but it is not all that extreme when you realize that automobiles, and possibly even cell phones, sometimes demand similar levels of operation. I recall discussing the exploding batteries a number of years ago, and one comment was that the explosions were a system-level design concern rather than just a battery manufacturing issue – in most of the exploding-phone cases at that time, the explosions were the consequence of improperly charging the battery at an earlier time. Adding intelligence to the battery to reject a charging load that was out of specification was a system-level method of minimizing the opportunity to damage the batteries through improper charging.
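A minimal sketch, again with hypothetical names and limits, of what that kind of system-level charge guard might look like in C: the battery refuses any charging request whose parameters fall outside its specification.

    #include <stdbool.h>

    /* Hypothetical pack specification -- real values come from the datasheet. */
    typedef struct {
        float max_charge_current;  /* amps */
        float max_charge_voltage;  /* volts */
        float min_charge_temp;     /* degrees C: no charging below this */
        float max_charge_temp;     /* degrees C: no charging above this */
    } charge_spec_t;

    /* Return false to refuse the charger rather than risk damaging the cells. */
    bool charge_request_allowed(const charge_spec_t *spec,
                                float requested_current,
                                float requested_voltage,
                                float pack_temperature)
    {
        if (requested_current > spec->max_charge_current) return false;
        if (requested_voltage > spec->max_charge_voltage) return false;
        if (pack_temperature < spec->min_charge_temp ||
            pack_temperature > spec->max_charge_temp)     return false;
        return true;
    }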

Given the wide range of applications in which batteries are finding use, what design guidelines do you think embedded systems should follow to provide the safest operation of batteries despite the innumerable ways they can be damaged or fail? Is disabling the system appropriate?

Food for thought on disabling the system is how CFLs (compact fluorescent lights) handle end-of-life conditions, when too much of the mercury has migrated to the other end of the lighting tube – they purposely burn out a fuse so that the controller board is unusable. While this simple approach avoids operating a CFL beyond its safe range, it has caused much concern among users, as more and more people are scared by the burning components in their lamps.

How should embedded systems handle battery failures? Is there a one-size-fits-all approach, or perhaps a tiered approach to handling different types of failures, so that users can confidently use their devices without fear of explosion or fire, while still knowing when there is a problem with the battery system so they can get it fixed before it becomes a major one?
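One possible shape for such a tiered approach, purely as a sketch (the fault categories and responses here are illustrative, not any standard): map each detected fault to an escalating action rather than a single all-or-nothing shutdown.

    /* Illustrative tiered policy: escalate the response to match the
     * severity of the detected battery fault. */
    typedef enum {
        FAULT_NONE,
        FAULT_CAPACITY_DEGRADED,  /* pack aging: warn, keep operating */
        FAULT_CELL_IMBALANCE,     /* reduce load, schedule service */
        FAULT_OVER_TEMPERATURE,   /* stop charging, allow discharge */
        FAULT_INTERNAL_SHORT      /* imminent hazard: disconnect the pack */
    } battery_fault_t;

    typedef enum {
        ACTION_CONTINUE,
        ACTION_WARN_USER,
        ACTION_LIMIT_POWER,
        ACTION_INHIBIT_CHARGING,
        ACTION_DISCONNECT
    } response_t;

    response_t respond_to_fault(battery_fault_t fault)
    {
        switch (fault) {
        case FAULT_CAPACITY_DEGRADED: return ACTION_WARN_USER;
        case FAULT_CELL_IMBALANCE:    return ACTION_LIMIT_POWER;
        case FAULT_OVER_TEMPERATURE:  return ACTION_INHIBIT_CHARGING;
        case FAULT_INTERNAL_SHORT:    return ACTION_DISCONNECT;
        default:                      return ACTION_CONTINUE;
        }
    }

The design point is that the user keeps getting early, low-stakes warnings long before the system ever has to take the CFL-style step of disabling itself.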

Is “automation addiction” a real problem?

Wednesday, August 31st, 2011 by Robert Cravotta

A recent AP article highlights a draft FAA study (I could not find a source link; please add one in the comments if you find it) that finds that pilots sometimes “abdicate too much responsibility to automated systems.” Despite all of the redundancies and fail-safes built into modern aircraft, a cascade of failures can overwhelm pilots who have been trained only to rely on the equipment.

The study examined 46 accidents and major incidents, 734 voluntary reports by pilots and others, as well as data from more than 9,000 flights in which a safety official rode in the cockpit to observe pilots in action. It found that in more than 60 percent of accidents, and 30 percent of major incidents, pilots had trouble manually flying the plane or made mistakes with automated flight controls.

A typical mistake was not recognizing that either the autopilot or the auto-throttle — which controls power to the engines — had disconnected. Others failed to take the proper steps to recover from a stall in flight or to monitor and maintain airspeed.
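One small embedded-systems takeaway from that failure mode is that automation state changes deserve loud, explicit annunciation. Here is a purely illustrative sketch in C (not any real avionics interface) of detecting and announcing a disengagement:

    #include <stdbool.h>

    typedef struct {
        bool autopilot_engaged;
        bool autothrottle_engaged;
    } automation_state_t;

    extern void sound_disconnect_alert(const char *subsystem); /* assumed hook */

    /* Compare this cycle's automation state with the previous cycle's and
     * annunciate any transition from engaged to disengaged, so the
     * operator cannot miss the handoff. */
    void check_automation_handoff(const automation_state_t *prev,
                                  const automation_state_t *now)
    {
        if (prev->autopilot_engaged && !now->autopilot_engaged)
            sound_disconnect_alert("autopilot");
        if (prev->autothrottle_engaged && !now->autothrottle_engaged)
            sound_disconnect_alert("auto-throttle");
    }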

Among the accidents the study examined was a fatal airline crash near Buffalo, New York, in 2009, in which the actions of the captain and co-pilot combined to cause an aerodynamic stall, and the plane crashed into the ground. Another crash two weeks later in Amsterdam involved the plane’s altimeters feeding incorrect information to the plane’s computers; the auto-throttle reduced speed until the plane lost lift and stalled. The flight’s three pilots had not been closely monitoring the craft’s airspeed and experienced “automation surprise” when they discovered the plane was about to stall.

Recently, French crash investigators recommended that all pilots receive mandatory training in manual flying and in handling a high-altitude stall. In May, the FAA proposed that pilots be trained on how to recover from a stall and be exposed to more realistic problem scenarios.

But other new regulations are going in the opposite direction. Today, pilots are required to use their autopilot when flying at altitudes above 24,000 feet, which is where airliners spend much of their time cruising. The required minimum vertical safety buffer between planes has been reduced from 2,000 feet to 1,000 feet. That means more planes flying closer together, necessitating the kind of precision flying more reliably produced by automation than human beings.

The same situation is increasingly common closer to the ground.

The FAA is moving from an air traffic control system based on radar technology to more precise GPS navigation. Instead of time-consuming, fuel-burning stair-step descents, planes will be able to glide in more steeply for landings with their engines idling. Aircraft will be able to land and take off closer together and more frequently, even in poor weather, because pilots will know the precise location of other aircraft and obstacles on the ground. Fewer planes will be diverted.

But the new landing procedures require pilots to cede even more control to automation.

These are some of the challenges the airline industry faces as it relies on more automation. The benefits of that automation are quite significant, but it also enables new kinds of catastrophic situations caused by human error.

The benefits of automation are not limited to aircraft. Automobiles are adopting more automation with each passing generation, and operating heavy machinery can also benefit from it. Implementing automation in control systems enables more people with less skill and experience to operate those systems without necessarily knowing how to recover from anomalous operating conditions.

Is “automation addiction” a real problem, or is it a symptom of system engineering that has not completely addressed all of the system requirements? As automation moves into more application spaces, it becomes more important to answer this question with a sharp edge. Where and how should the line be drawn for recovering from anomalous operating conditions, and how much of that responsibility should the control system shoulder versus the operator?