When I first started developing embedded software, I kept running into an expression that seemed to be the answer to every problem: "It's a software problem." At first, the expression drove me crazy because it was so often blatantly wrong, yet it was the only verdict I ever heard; no one ever said it was a hardware problem. If the polarity on a signal was reversed, it was a software problem. If a hardware sensor's behavior drifted over time, it was a software problem. In short, if it was easier, faster, or cheaper to fix a problem anywhere in the system with a change to the software, it was a software problem.
Within a year of working with embedded designs, I had accepted the position that any problem software could fix or mitigate was, by definition, a software problem, regardless of whether the software did exactly what the design documents specified. I also stopped worrying that management would think the software developers were inept; in the long run, they seemed to understand that a software problem did not necessarily translate into a software developer problem.
I never experienced this type of culture when I worked on application software. There, the demarcation between hardware and software problems was clear. Software problems occurred because the code failed to capture error return codes or did not handle unexpected input from the user. A spurious or malfunctioning input device was clearly a hardware problem, and so was a dying power supply. The developer of the application code was "protected" by a defined set of valid and invalid operating conditions: either a key was pressed or it was not. Inputs and operating modes had a hard binary quality to them. At worst, the application code simply should not act on invalid inputs, as in the sketch below.
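Here is a minimal sketch of that binary world; the function name, menu range, and prompt are hypothetical, not taken from any project mentioned above. An input either parses and falls in range, or it is rejected outright:

```c
#include <stdio.h>

/* Hypothetical application-code sketch: every input is either
 * valid or invalid, and invalid input is simply never acted on. */
static int parse_menu_choice(const char *input, int *choice)
{
    char extra;

    /* sscanf returns the count of converted items; anything other
     * than exactly one integer (no trailing junk) is invalid. */
    if (input == NULL || sscanf(input, "%d %c", choice, &extra) != 1)
        return -1;                 /* invalid: do not act on it */
    if (*choice < 1 || *choice > 4)
        return -1;                 /* out of range: also invalid */
    return 0;                      /* valid */
}

int main(void)
{
    int choice;

    if (parse_menu_choice("3", &choice) == 0)
        printf("acting on choice %d\n", choice);
    else
        fprintf(stderr, "ignoring invalid input\n");
    return 0;
}
```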
In contrast, many embedded systems must operate on continuous real-world sensing that does not always reduce to obvious true/false conditions. Adding to the complexity, a reading that is good in one operating context may indicate a serious problem in another. In a closed-loop control system, it can be impossible to definitively classify every possible input as good or bad; the sketch below shows how the very same reading can be plausible in one mode and alarming in another.
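As a hedged illustration, assume a made-up coolant-temperature sensor and two invented operating modes; none of the names or thresholds come from a real system:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical closed-loop example: whether a sensor reading is
 * "good" depends on the operating context, not on the value alone.
 * Modes and thresholds are invented for illustration only. */
typedef enum { MODE_WARMUP, MODE_RUNNING } op_mode_t;

static bool coolant_temp_plausible(float temp_c, op_mode_t mode)
{
    switch (mode) {
    case MODE_WARMUP:
        /* During warm-up, a cold reading is expected. */
        return temp_c > -40.0f && temp_c < 90.0f;
    case MODE_RUNNING:
        /* At steady state, the same cold reading suggests a stuck
         * or disconnected sensor, not a healthy system. */
        return temp_c > 70.0f && temp_c < 110.0f;
    }
    return false;
}

int main(void)
{
    float reading = 25.0f;  /* identical raw value in both contexts */

    printf("warm-up: %s\n",
           coolant_temp_plausible(reading, MODE_WARMUP) ? "good" : "bad");
    printf("running: %s\n",
           coolant_temp_plausible(reading, MODE_RUNNING) ? "good" : "bad");
    return 0;
}
```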
Was this culture peculiar to the teams I worked on, or is it prevalent across the embedded community? Does it apply to application developers? Is it always a software problem if the software can detect, limit, or fix an undesirable operating condition?