Is assembly language a dead skillset?

Wednesday, February 2nd, 2011 by Robert Cravotta

Compiler technology has improved over the years. So much so that the “wisdom on the street” is that using a compiled language, such as C, is the norm for the overwhelming majority of embedded code that is placed into production systems these days. I have little doubt that most of this sentiment is true, but I suspect the “last mile” challenge for compilers is far from being solved – which prevents compiled languages from completely removing the need for developers that are expert at assembly language programming.

In this case, I think the largest last mile candidate for compilers is managing and allocating memory outside of the processor’s register space. This is a critical distinction because most processors, except the very small and slower ones, do not provide a flat memory space where every memory access possible takes a single clock cycle to complete. The register file, level 1 cache, and tightly coupled memories represent the fastest memory on most processors – and those memories represent the smallest portion of the memory subsystem. The majority of a system’s memory is implemented in slower and less expensive circuits – which when used indiscriminately, can introduce latency and delays when executing program code.

The largest reason for using cache in a system is to hide as much of the latency in the memory accesses as possible so as to be able to keep the processor core from stalling. If there was no time cost for accessing anywhere in memory, there would be no need to use a cache.

I have not seen any standard mechanism in compiled languages to layout and allocate an application’s storage elements into a memory hierarchy. One problem is that such a mechanism would make the code less portable – but maybe we are reaching a point in compiler technology where that type of portability should be segmented away from code portability. Program code could consist of a portable code portion and a target-specific portion that enables a developer to tell a compiler and linker how to organize the entire memory subsystem.

A possible result of this type of separation is the appearance of many more tools that actually help developers focus on the memory architecture and find the optimum way to organize it for a specific application. Additional tools might arise that would enable developers to develop application-specific policies for managing the memory subsystem in the presence of other applications.

The production alternative at this time seems to be systems that either accept the consequences of sub-optimal automated memory allocation or impose policies that prevent loading applications onto the system that have not been run through a certification process that makes sure each program adheres to some set of memory usage rules. Think of running Flash programs on the iPhone (I think the issue of Flash on these devices is driven more by memory issues – which affect system reliability – than by dislike of another company).

Assembly language programming seems to continue to reign supreme for time sensitive portions of code that rely on using a processor’s specialized circuits in an esoteric fashion and/or rely on an intimate knowledge of how to organize the storage of data within the target’s memory architecture to extract the optimum performance from the system from a time and/or energy perspective. Is this an accurate assessment? Is assembly language programming a dying skillset? Are you still using assembly language programming in your production systems? If so, in what capacity?


43 Responses to “Is assembly language a dead skillset?”

  1. Chris Rowen says:

    Robert Cravotta’s piece on the role of assembly programming poses some good hard basic questions. It is already clear that the vast majority of applications, including embedded applications, have largely shifted to higher level programming languages for the bulk of development. And as the compilers get better, processors get faster, and programmers get even more squeezed for productivity, the fraction of all programming in high-level languages will continue to increase.

    So it’s useful to look at the places where lower level programming has held on, and ask the question: “Will these historically-unsuitable-for-compilers tasks continue in assembly code?” I see a couple of last bastions of assembly programming: DSPs, microcontrollers and low-level operating system code. Cravotta also mentions memory layout, but I’ll come back to that in a moment.

    DSPs: Traditional DSPs do have convoluted instruction sets, small partitioned X-Y memory systems, “exposed pipelines”, scant general registers and intense performance requirements – getting the last cycle out of a critical DSP kernel CAN make or break an application. And traditional DSPs have often been saddled with out-of-date compiler technology. Newer DSP architectures, however, have eliminated these architectural problems – with unified 32b address spaces, fully interlocking and bypassing pipelines and large sets of uniform registers. (The intense performance requirements remain ;-) .

    Most importantly, leading edge compilers support a full set of DSP data-types (scalar and vector, real and complex, integer, fractional and floating point) and automatic vectorization, even with unaligned data and embedding of control structures in inner loops. When the DSP architecture has special operations that don’t map readily into C or C++, these compilers provide intrinsic function mapping of the operations, including vectorization of special functions. The compiler can use 100% of the operations of these modern SIMD/VLIW DSPs.

    As a concrete example, over the past two years I’ve worked closely a wide range of Tensilica’s ConnX DSP customers, architected two generations of new DSP cores, overseen countless customer benchmarks and written numerous optimized DSP kernels. In that time, I have not written, nor seen any customer write, even one line of DSP assembly code. This doesn’t mean that DSP assembly code is unimportant. I look at tons of inner loop code in order to see that the compiler is doing the right things, and to verify that the algorithms are written in ways that best leverage the available instruction set. But this doesn’t mean the skills are out of date – quite the contrary – understanding assembly code is often essential to understanding the basic interface between hardware and software. No DSP architect or compiler writer can live without a deep understanding of the cycle by cycle interaction between a stream of assembly instructions and the hardware.

    Microcontrollers: Microcontrollers were long a favorite domain for assembly code, often to save on memory space. The steady march of Moore’s Law has substantially changed the trade-offs. In leading edge process technology, you can already fit almost 512KB of the fastest SRAM into 1mm^2 of silicon area. You can’t afford to waste a lot of memory, but with modern compilers, especially compilers with user control over code size vs. performance tradeoffs, the compiled application can get close to space-optimized assembly code, at much higher productivity. Plus, the microcontroller architectures are getting easier for the compilers to use. They’re shifting towards 32b architectures with uniform register files and orthogonal instruction sets. There will be some role for lower level programming for legacy microcontrollers, but this is fading.

    Low-level OS code: The one place I routinely encounter assembly code development is in very select routines in operating systems, particularly in the initial boot code. In this code, many of the normal assumptions of compilers cannot be guaranteed. Memory and caches are not assumed to work. Registers may not yet be initialized. The stack pointer may not yet point to anywhere useful. A full C context may not have been established. For these transitional sequences, assembly code may still be the most convenient way to ensure only properly initialized resources are used. Even these sequences are rare, isolated to very select circumstances and even then, become less significant as better compilers and more sophisticated hardware manage these boot scenarios directly.

    A quick word on memory layout: Assembly languages typically do have explicit hooks for managing memory layout, but modern embedded compilers have closely analogous control mechanisms, giving the C programmer direct control over alignment of variables, explicit assignment of data structures to memory segments and automatic optimizations (sometimes guided by C pragmas) to disambiguate and efficiently parallelize memory references. All the power of assembly code is accessible from the compiler, particularly in conjunction with good interactive linker control that allows segments in the object to be easily mapped as needed to memory areas in the system memory map.

    My bottom line: The principles of tight code, efficient use of memory and direct access to the optimized processor features remain critical to embedded system development. Historically, programmers needed to understand and write assembly code to build optimal systems. With the progress of compilers and architectures, especially in performance-critical functions like baseband processing on DSPs, writing assembly code has become much less common, but control and optimization of the cycle-by-cycle behavior of these machines is more important than ever. So the thoroughly modern embedded developer must read DSP assembly code, but should never write it.

  2. Jon Titus says:

    Assembly-language programming gives people a better connection with the processor architecture. Some might think that an “old hat” approach, but I find the good programmers know more about processor architecture than others who simply program in a high-level language.

  3. D.T. @ LI says:

    As Robert pointed out, with the exception of a few specialty areas, assembly as a primary skill is indeed dying (dead?). As more and more experts, manufacturers and industries package and sell more and more generic HW and SW solutions to specific problems, there is less and less need for embedded/systems implementers to write assembly. We used to write full apps in assembly, then just libraries, and now it is just ISRs and a few optimization spots that get assembled.

    This is a “good thing” – for development time, portability, and ease of maintenance. Most importantly, assembly is great for what it’s good for, but terrible as a general implementation language. “Development Cogitation Cycles”(TM) are better spent at higher abstraction levels when possible.


  4. B.Z. @ LI says:

    As long as microprocessor architecture continues to rely on classical structures, assembly language will not be dead, but it is exceedingly rare. Experts are found in the compiler and tool companies. The few times in the last twenty years when I really needed my assembly skills were centered around compiler bugs. Mostly they have been unusual combinations that led to optimization miscodings, but one was just the use of zero extension instead of sign extension for a type conversion.

  5. L.R. @ LI says:

    Yes, fewer and fewer engineers know how to program in assembly, but when it comes to bringing up a new custom board or to port an OS to a new architecture, the industry relies on a diminishing work force of experts who can read and write assembly and have a good understanding of both software and hardware disciplines.
    The younger generation seem to take a lot of the low-level stuff for granted.

  6. A.P. @ LI says:

    I agree that assembly is used less and less. I haven’t written an ISR in assembly in…decades, not since the mid-80s probably. Certainly not since I shifted to C. Since then, about the only assembly has been the start-up code but, more and more, I see that done in C as well.

    I still like to debug in mixed-mode so that I can see what code the compiler generated. I sometimes find bugs that way, not only where the code doesn’t do what I intended but, very occasionally, where the compiler itself has a bug. But the younger engineers seem to trust the compilers and don’t want to see what the system is doing at that level. And, given the way optimizing compilers can twist code, I’m not sure I blame them.

  7. Gary Lynch says:

    When I was in school I drove old, unreliable cars, that
    frequently broke down leaving me stranded somewhere,
    sometimes miles from civilization. I kept a basic tool kit
    in the car and relied on my understanding of auto mechanic
    fundamentals (in particular: the fire triangle) to get
    myself out of many seemingly hopeless situations.

    Cars are much more complex now. I can’t use my thumbnail to
    gap the points because my car doesn’t have any. Although my
    grasp of the fundamentals remains solid, cars–as they grow
    more reliable–are growing ever less likely to fail in ways
    that I can fix on the road (Ironically: because of all the
    embedded micros that control them).

    Assembly language parallels this trend nicely. The more
    experience I garnered, the more I realized how writing in
    assembly language would solve some immediate problem, but
    create larger and tougher ones in the process and I shied
    away from it as much as possible.

    Knowing assembly language is still valuable from a
    trouble-shooting standpoint. Even after writing in C for 20
    years I can still make a mistake that only becomes apparent
    after I see what instructions the compiler turns my
    statement into. It also forces me into more intimate contact
    with the hardware so I can debug at that level when nothing
    else works.

    So for me it comes down to how much time you put into
    learning the skill vs the percentage of all the problems you
    will face that this investment will help with. Those who
    don’t see assembly language as worthwhile allow there may be
    (infrequent) bugs they can’t fix and be content to call AAA
    and watch helplessly from the roadside while someone else
    takes over.

    I am educating myself this month on a new DSP architecture
    to be used in our next project. The instruction set is
    downright hideous, but you can bet I’m going to understand
    it before the job is finished.

    Gary Lynch printf(“lynchg%cstacoenergy%ccom”, 55+9, 55-9)

  8. I.B. @ LI says:

    I agree with your points on the memory issues, but this has nothing to do with assembly. Assembly permits you to do low-level local optimizations, but the whole memory issue is global by definition. If your system uses cache, locality is the name of the game. You can achieve data locality by defining your data structures so that data items frequently used together will be close; you can improve code locality by controlling your linker. Both techniques are implemented in C exactly the same way they are in assembly.
    If your system uses NUMA, the answers are essentially the same: use the linker to put your code where you need it; use “memory partitions” mapped to specific memory regions to allocate data memory of the sort you need. Totally unrelated to the assembler, again.
    An additional point: on modern pipelined CPUs it’s very hard to choose the speed-optimal coding manually. As a result, good C compilers produce more efficient code than an assembly programmer can. Just be sure your compiler’s optimization is tuned for your specific processor. This is where the processor vendors’ compilers are worth their cost.
    Just to clarify: I used to program a lot in assembly many years ago. The last time I used this skill was to track down bugs in the assembler :(

  9. P.M. @ LI says:

    I don’t agree with the assertion that assembly is needed for fine-tuning memory allocation; most of my work has been with systems that have both zero-waitstate and tens-of-waitstates memory. The linker allows placement of code and data into specific memory regions. Items can be moved between regions as performance requirements change, and C, C++ and assembler code need never know the difference. Of course, programmers need to know not to reference ‘slow’ code and data in critical sections.

    I spent about three years hand-optimizing C code, looking at the generated assembler and tweaking the C code to get better results. Over that time, I saw the compiler improve to such an extent that most ‘low hanging fruit’ is detected and optimized automatically. I would say that in general, a decent C compiler generates code equivalent to that produced by a good assembler programmer – given the constraints of any C calling standard that an assembler programmer may choose to ignore.

    There are still a few cases where assembler programming is essential. They generally involve taking advantage of instructions that can not be represented directly in C, for example instructions to perform saturated arithmetic. Within an OS kernel, it is often necessary to perform a full-register context switch, including moving from one stack to another – definitely a tricky operation in a high level language.

    Familiarity with assembler is also useful in debugging. Looking at the generated assembler code has shown me more than once that someone has stumbled over the perennial favorite bug for C programmers – pointer arithmetic errors. Adding one to a pointer and reaching the next byte in a sequence is a very special case, not the norm as programmers in a hurry seem to assume. In the asm code, “ADD R0, R0, #512″ sticks out like a sore Thumb (pun intended).

  10. S.T. @ LI says:

    I also find less and less assembly talent. I use both in developing general embedded product. For security related products I prefer assembly for control and speed on embedded systems. Most of us old timers have a library of assembly language functions to handle almost any development.

  11. T.Z. @ LI says:

    No more than basic hardware knowledge. Why is the LED not turning on? Start tracing. Read the datasheets and schematic. Check the code. Even when debugging, if it is in the special code you need to read and understand the assembly.

    I won’t comment much about memory – I can see the issue but need to think about it – but “cache consciousness” was a popular topic at an old conference I attended, mainly for high-performance or precision work. You can use the MOVEM instruction on 68k-derived architectures to move data at nearly DMA speeds, but most memcpys don’t use it. If you need precise timing, you need to use assembly and insert NOPs. (I did an Atmel SD card SPI at the full possible bandwidth by changing the instruction order, which I could not do with a compiler.)

    I will say it is easier to write “the first version” in C, tell the compiler to emit an assembly output, and then hand optimize and refactor the resulting assembly. But you need to know the architecture and registers.

    Compilers are often good at specific optimizations – but cross some boundary and performance falls off a cliff. These boundaries are often hard to know in advance: one more pointer, one more byte of locals, slight changes to the compiler options.

    Conversely, most compilers expose specialized instructions as intrinsics, so _idle() generates one instruction instead of a function call. But then you are programming in assembly, just indirectly.

  12. N.M. @ LI says:

    Great comments that I heartily agree with. Reading the compiler’s assembly output is sooo useful because some bugs show up better in the source code and other bugs show up better in the generated assembly. You may as well use both for debugging.

    When I write assembly these days, it is usually just a fragment encapsulated in a C inline assembly construct. GCC has a great interface for this, for example. Compiler writers can add more and more convoluted additions, but there always seems to be one or two instructions or instruction sequences that you need to write explicitly in assembly.

    It is certainly true that the younger programmers these days hate assembly and don’t even want to look at the compiler’s assembly output, which means I find certain bugs much much faster than they do. I find their attitude to assembly language a bit frustrating but there are lots of things I hate and avoid doing with no better rationale so I can’t complain too much.

    At only 41, I am not so comfortable with the title “old timer” but having 30 years experience with computers probably makes me a dinosaur.

  13. S.T. @ LI says:

    I go back to the Algol 60 days. All those great compilers were initially a great block of assembly language modules. Compilers just standardized the linking with a “common” subset – GCC, the old Aztec C -> MSC, C++, C#, .NET. The old Data General libraries for RDOS and RTOS were a bunch of assembly language libraries, as was DEC RT11, etc. The bottom line is, if a new processor comes along with a new set of assembly mnemonics, new assembly code is required to bring it up to C or any other language. I doubt assembly will ever go away.

  14. Bob Snyder says:

    On many 8-bit and 16-bit processors, it is possible to precisely calculate the amount of time needed to execute a section of assembly code. When writing the equivalent code in C, it is usually not possible to predict the execution time without looking at an assembly listing. And if you upgrade to a newer version of the compiler, there is no guarantee that it will generate the same machine code. When precise timing is important, it seems that knowledge of assembly programming is needed even if you are writing your code in C.

  15. W.M. @ LI says:

    But how many know how to write micro-code for the processor (to make new ASM instructions)?

  16. S.T. @ LI says:

    @W. – good point – I am not sure if any colleges even teach that today.
    It has been so long that I am not sure if any processors today allow micro-coding. Data General and DEC were the two that I understood. Intel implemented RISC sets, but I am not sure they are of any use here.

  17. T.Z. @ LI says:

    non mortuus etiamnunc

    A dead language is usually more precise. ISO won’t come out with a new version for assembly or the roman rite.

    The Time Processing Unit, or TPU, on the Motorola (now Freescale) processors has microcode, and it is not uncommon to find people who alter it. I attended one of the first classes on how to program it, way back when.

    The 68000 family has two sets of microcode but it isn’t accessible.

    On the other hand, VHDL and the other languages used to code FPGAs, PALs, etc. are very similar to microcode (or sections of the design process are), and that is both a common and useful skill.

  18. S.T. @ LI says:

    I agree T.Z. that it is “Not Dead, Still around”. Assembly will never disappear completely it will just be neglected now and then.

    I consider FPGAs and other PALs, as well as VHDL soft processors, different from the actual hard-wired processors from Intel, AMD, etc.
    We all benefit from the new technology. Many of these soft-core processors supply a form of GCC, like the Altera Nios. They do have an assembler, and a demo shows the speed differential between coding in ASM and using the compiler. I am using a Cyclone from Altera and looking at the Nios for an application using the macro assembler.

  19. J.B. @ LI says:

    Tried out the PIC10 microcontroller. How could you carefully manage the tiny RAM using a C compiler? It was simple and fun to use assembly.

    I’m surprised that compilers don’t have a mode for doing worst case timing analysis of code.

    RE microcode:
    The question should be: Is VHDL/Verilog the new assembly? IMHO it is slow process to teach assembly. It is probably more useful to student’s careers to teach them RTL. Some FPGA courses squeeze in the writing of a simple soft core processor. If only it was as simple to write a compiler for a unique processor.

  20. W.M. @ LI says:


    On the worst case timing, one is often left with dumping to a tool that will then take the compiler’s ASM and generate timing data.

    Agree VHDL/Verilog is very useful in a student’s knowledge base — as is a basic understanding of items like synthesis and place and route and resulting physical effects on delays and other operation.

  21. J.N. @ LI says:

    I’m one of those weird people who LIKES programming in assembly. I do higher level languages too, but I enjoy doing asm. It’s especially important when you’re trying to debug and/or optimize because you can understand the code the compiler is generating. It’s REALLY important when working with small processors, the compilers tend to be limited.

    As for the young developers hating it… I’m sure they do, but I don’t often meet other people, young or old, who claim they like asm programming. I’m not sure it’s just youth! Assembly language programming is very detail-oriented and requires a different mindset, the processes aren’t abstracted the way they are in high-level. With a bit of work you can create some key features of some of the higher-level languages, like abstraction, encapsulation, and so on. But it takes work and it takes preparation and you have to do it going in or it will fall apart because you need to be consistent. Yeah, that’s true of any programming but a lot of languages take care of much of the dirty work for you. In assembly you have to do ALL of it.

    When you learn a new processor type you have to grok the entire architecture, not just a few differences. The real fun comes, though, when you work on multiple architectures at the same time and have to keep them all straight. :)

    P.S. I don’t know if assembly is dying or not, but you can have my assembly when you pry it from my cold, dead hands. :)

  22. S.T. @ LI says:

    @J. – I am with you – I have 30 years of libraries in assembly that will handle just about anything.

    I know several engineers around the country that prefer assembly. Several difficulties arise from company policy and compiler preferences where assembly is not one of them.

  23. T.Z. @ LI says:

    For a semi-famous example, Steve Gibson writes nearly everything in assembly. He has a new DNS benchmark that is under 300k (and, interestingly, doesn’t resize horizontally). His other products are all assembly.

    But I can see why companies might not like it. First, documenting or commenting it is high art. Second, it is non-portable – even when a processor is upgraded with new instructions or registers, it can’t be “recompiled” to take advantage. Third, even with things like gas from gcc, the syntax and naming conventions are different from processor to processor (anyone old enough to remember intel copyrighting “mov” so zilog had to use “ld” for the z80?).

  24. P.M. @ LI says:

    Just because you *can* program in assembler doesn’t mean you *should*. T.Z. identified a couple of good reasons for avoiding it. Another is significantly increased costs in development and maintenance. Your assembler code also creates a significant technical debt that will anchor you to a particular hardware platform. I don’t know about you guys, but I don’t want all my work to be thrown away just because a processor becomes obsolete.

    The justification for using assembler can completely disappear when going to a different processor. I had to fit a display glitch filter into the last 120 bytes of an 8051 ROM (the explosive gas sensor power supply injected occasional spikes of up to 80%FSD), but the best I could get out of the C51 compiler was 220 bytes. Three hours later I had the code down to 104 bytes of 8051 asm. Later when I joined ARM I tried compiling the C code just for laughs – it came out as 86 bytes of Thumb code with no extra work, and any halfway competent engineer could read and understand the program.

  25. N.F. @ LI says:

    I think these days a full application can’t be written in ASM, BUT there are still a few places where it is needed (very fast ISRs, configuration of the cache and MMU, and initialization code for cores, for example). But for an application to be ported from one device to another, and for easy and low-cost maintenance, ASM is not the language to choose.

  26. T.Z. @ LI says:

    As I pointed out, Steve Gibson does write full apps in assembler. And they take forever to come out. Good, cheap, quickly: pick 2. In this case they are good, but not cheap or quick given the engineer’s time. You can also hand stipple a huge wall mural too. Because something is possible doesn’t mean it is practical or reasonable.

    Someone noted “(Adobe) Flash is like cilantro” – it is a spice, not an entree. Same with assembly language. It makes existing things much better if used judiciously but is not designed to be the exclusive tool.

    Something similar happens with browser extensions that do complex and time-consuming math that would otherwise be JavaScript. Most browsers provide a way to link to a shared object (dll/dylib/so) which can be called from JavaScript. You don’t want to write or maintain binary plug-ins, but a native compiled performance booster for the innermost loop makes things a lot better. (Example: LastPass.)

  27. M.B. @ LI says:

    Assembly is not a dead skill set at all. Not even a year ago I was writing exception vectors for an ARM7 and I had to use assembly to do some of it. There have been more times than not I’ve also had to start tracing a crash by looking at opcodes. Can you get by without knowing assembly nowadays? Sure. But in my mind knowing assembly and being comfortable working with it is one of the requirements to go from being a good programmer to being a great software engineer.

  28. G.H. @ LI says:

    I don’t know how assembler could ever be dead. I mean, someone, somewhere, at some point has to write the compiler, linker, and low-level code for each new piece of silicon and the parts it needs to interface with, no? In my mind assembly separates the programmer wheat from the chaff. As T.Z. alluded to: Quality, Lead time, Price. Pick two.

    Also, when you need to squeeze the extra performance or program space out of the inexpensive 8 bit micro driving your cost sensitive product, assembler may be the only answer.

  29. T.P. @ LI says:

    Hello to all,

    I’d like to add my comments to some of the things said so far.

    I’ve been programming exclusively in assembly when it comes to micro-controllers, and (I must admit) occasionally on the PC, too. But, I mostly use high-level for my PC programming, so don’t take this the wrong way, I *can* program in high level languages (which is what I use when writing assemblers/compilers, for example), but in the case of MCUs, just because I can doesn’t mean I should. :)

    >Just because you *can* program in assembler doesn’t mean you *should*

    Actually, IF you’re good at it (and that’s a big IF), you should. But some of us are already there (have been for years), so it makes no sense not to use it to our advantage. The product quality is the first thing you should consider. The end customer does not care about what tools were used to create the product, only how that product compares to the competition.

    Better product? How? For example, faster overall performance (in assembly) means more time spent in lower-power modes, so better battery life, so a better product. Same for lower clock speed (so lower power consumption, lower heat dissipation). Smaller overall code also means a lower-cost MCU, so a lower end price. And, let’s not forget, we’re all selling Green lately. (These arguments can’t hold for high-level, no matter how good you are at it, because it’s mostly the compiler that sets the limits, not you.)

    As to the arguments about increased cost of development and maintenance, this is simply not so. If you know how to write good code (in any language) then this is a non-issue. I write just as effortlessly in assembly as in higher-level, so arguments about needing more time to do the same in assembly, in my case at least, don’t apply. (And, I know I’m not the only one out there feeling this way.) Maintenance issues can be equally or more costly in any high-level language app that’s not well written.

    By the way, on the issue of portability, I’ve converted complete and complex applications from one CPU to another practically overnight. But then again, my code does not look like it’s come out of a disassembler (like most code often seen posted here and there on the Net). So, good coding practices do make a difference (for any language).

    In all fairness, though, I should also say that assembly language isn’t ONE language, the way any given higher-level language is. C, BASIC, Pascal, etc. are pretty much the same anywhere. Assembly is a class of languages. So, it DOES make a difference which assembly language you use. Certain architectures and assembly languages have a natural feel that makes you very comfortable programming in them. Others are just plain horrible, and do not lend themselves to readable, well-maintained, straightforward code. I’d stay away from them (I won’t name them for fear of being sued by the respective manufacturers).

    A final thought about the need for assembly. All compilers emit assembly/machine code. People who also write compilers every so often (like me) should have extensive experience with programming in the particular assembly language for which they target their tool, so that they can ‘see’ how to make the compiler produce more efficient code. You can’t expect someone to write good assembly code simply by looking at the assembly language reference sheet of a CPU s/he never actually uses (at the assembly level, that is). Now, if we eventually run out of good assembly programmers (trust me, that won’t happen), the quality of compilers will drop as a consequence, and then people will naturally seek to get better performance and return to assembly (sort of like a vicious circle, isn’t it?).

  30. T.Z. @ LI says:

    @T.: “By the way, on the issue of portability, I’ve converted complete and complex applications from one CPU to another practically overnight.”

    Did you write the original one that you then ported? How many lines of assembly?

    It also depends on the processor – if you are writing code for something with <= 1K of code space and/or a trimmed register set (I'm thinking the ATtiny4/5/9/10), it will be written in assembly.

    It also helps if the product will be immutable (unlike the Blu-ray players that update their firmware quarterly, if not more often).

    And the architecture matters. RISC-like architectures tend to have more efficient compilers, as opposed to something like a 68HC11. And you do want to hand-optimize the innermost core where the processor spends 80%+ of its time (one Palm Pilot app was C except for a Boyer-Moore variant search and a Huffman unpack that I couldn't do well in C because of operators like shift and register usage).

    In another case I was able to hit 95% processor utilization and change a 200Hz rate into a 1K rate by carefully doing timings so I could start an ADC, do another operation, and pull the data after it had to be finished.

    It also seems like you are talking stand-alone and not calling an OS or library.

    The converse of my original statement is also true, "Just because you can program [things like interrupt handlers] in C or other high level language doesn't mean you should".

  31. J.H. @ LI says:

    We wrote part of our operating system in assembly language, as there are some things C cannot do, in addition to the performance reasons.
    Jack Huang

  32. T.P. @ LI says:

    @TZ: Porting: Old code was mine, too. The argument made earlier was that porting with a high-level language is simple, while with assembly you have to throw away all your code. So, my response is: no, if you must change processors, you don’t actually throw away any code, as the old code becomes the basis for your new version (think of it as refactoring). Having a mostly-compatible new architecture certainly helps. Going to a completely different architecture, on the other hand, would be similar to switching from C to Pascal (or, worse, to COBOL), for example. Like I said, there is no ONE assembly language, so switching to a completely different architecture and related assembly language would be a similar proposition to switching from one high-level language to another, so porting times would be increased in a similar manner.

    A full-featured RTOS and various libraries are used in my example (these are also fully written in assembly, again by me, but they weren’t counted in the porting time. Porting the RTOS etc was an independent job — as it affects a whole bunch of applications, so it has to be counted on its own — and by the way that took about a week plus some fine tuning & CPU-specific optimization over a longer period as the new code was actually used in real life).

    I will give you code size, but I think this will take the attention off what really matters, because code size in assembly is not comparable to anything in high-level (C, for example). So, a smaller number (common for assembly apps) will immediately make someone programming in higher-level think this is a trivial application (based on size alone). Not so (for example, I once wrote a complete 6809 and partial OS9 simulator, 100% in 80×86 assembly language and the whole thing was <10KB object uncompressed, and ran several times faster than the actual part, back when PC speeds were counted in two-digit MHz). A "Hello World" C app (for comparison) gives me 51KB object.

    So, with that in mind, back to an example: code size for one app was about 30KB object code (over 90% code), not counting the underlying RTOS and various application-independent libraries. A C version of the same app would not even fit in the 64K (8-bit MCU) code space, so a larger class MCU (16 or 32-bit) would have been required. Regarding updates, I regularly produce new firmware updates, so this is not a fixed app (if this matters). It has been a live app for over 12 years, constantly expanding and/or improving, as most of one's projects.

    One other note, which I forgot to mention in my previous post: I suspect (because I have contrary experience on this) when people say it takes longer to program in assembly, they automatically (or subconsciously) count the time it would take for all the needed (normally, compiler-provided) library code to be written from scratch, given that most, if not all, assemblers come without any standard libraries. In that sense, they are right. But that would be an unfair comparison. Because if I gave one a C compiler (for example) without any library functions (not even a single #include file), and s/he had to write everything from scratch, they would not be as happy about C either. So, one must consider that someone who has been working with assembly has a rather complete library available, so writing new code in assembly should not count the time a beginner might spend setting up or re-inventing libraries, just like writing code in high-level takes a certain set of libraries for granted. Same goes for any OS functionality. Creating such libraries has to be done once, so it should not be counted as time spent on each application.

  33. P.M. @ LI says:

    Interesting conversation. It is certainly true that the answer to the headline question “Is Assembly Language a dead skillset” is a categorical “No,” for two distinct reasons: tools vendors need to provide compilation, debug and trace tools that are intimately tied to the underlying processor instruction set; and there are places (as I said before) where it is categorically impossible to achieve certain functionality without direct access to the processor’s instruction set, because the desired code cannot be directly represented by higher-level languages.

    Several assertions have been made by proponents of 100% assembler coding that may have been true at one time, but with modern compilers and processors are not universally true.

    Assertion: Hello World C app gives 51KB object. Well, maybe – as an application on a desktop computer. In fact if you had said 1MB, I would have been inclined to believe you. I would counter that with a C++ program running on a Cortex M3 that tracks hundreds of simultaneous asynchronous transactions through nine life cycle states, with a dozen different interrupt sources, complex hardware programmable data structures to be set up for data transfers to and from a Host PC, and communications with another on-chip processor – 11K Bytes of code. That code had two ASM functions: WaitForException and SwapDataPacketEndianness.

    Assertion: You have to code in assembler to get performance code. Not always; for general data movement, comparison, and simple arithmetic you will find that it is difficult to improve on the output from a decent compiler. I’m using the ARM SDK3.x compiler as my ‘golden standard’, because that’s the first compiler for which I have to resort to dirty tricks in order to get better code. Areas where in general you will be able to improve on the compiler are places where the compiler has to play safe, but you know stuff that can’t be expressed in C, so you can take shortcuts. In C, if you use a variable, then write to a location through a pointer, then use the first variable again, you may find that the compiled code does a seemingly redundant read from memory. That’s pointer aliasing. The compiler has no idea whether the value you poked into memory overwrote the variable that you’re about to use.
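    The aliasing effect described above can be sketched in a few lines of C (the function names are mine, purely for illustration). C99’s `restrict` qualifier is the standard way to hand the compiler the “I know these don’t alias” guarantee that P.M. says can’t otherwise be expressed:

    ```c
    #include <assert.h>

    /* With plain pointers, the compiler must assume the store to out[0]
     * might have modified *n, so it typically reloads *n from memory
     * before computing out[1] - the "seemingly redundant read". */
    void scale_may_alias(int *out, const int *n)
    {
        out[0] = *n * 2;
        out[1] = *n * 2;   /* *n is usually re-read here */
    }

    /* 'restrict' promises the compiler that out and n never alias,
     * so *n can be read once and kept in a register. */
    void scale_restrict(int *restrict out, const int *restrict n)
    {
        int v = *n * 2;    /* single read, cached in a register */
        out[0] = v;
        out[1] = v;
    }

    int main(void)
    {
        int out[2];
        int n = 3;
        scale_may_alias(out, &n);
        assert(out[0] == 6 && out[1] == 6);
        scale_restrict(out, &n);
        assert(out[0] == 6 && out[1] == 6);
        return 0;
    }
    ```

    Both functions compute the same result; the difference only shows up in the generated instruction stream, which is exactly the kind of thing you verify by reading the compiler’s assembly output.
    
    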

    Ok, one of my assertions: Writing assembler costs more in development and maintenance. Hypothetical situation – given that you’re going to get about the same code anyway, would you rather write ten lines of C, or 100 lines of assembler? [My assumption: it takes an average of ten asm instructions to represent one line of C/C++ - I know this is highly application and processor-specific] Ok, you went with assembler, fair enough. Now you need to add a major new feature. Even though you did a sterling job of compartmentalizing your code, you still have to modify 1000 lines of assembler. Chances are, that equates to less than 100 lines of a high level language that need to be reviewed and edited.

    I’m a big believer in getting code working, then profiling and optimizing it, albeit with a handy set of rules-of-thumb that tend to generate efficient code in the first place. I think this is the fastest route to a working product. Highly optimized code tends to contain more than its fair share of errors. I didn’t invent this saying, but I always tell people to “brag about your cleverest code in the comments. Not only does it make you feel good, but it shows us where to start looking first when debugging”. It seems to me that writing assembler is like committing to 100% premature optimization, and you carry the weight of that optimization for good or ill throughout the life of the product.

    My last point concerns a factor that I think was behind the original question. Good software talent is a scarce resource. If you’re a one-man band writing everything in Assembler then this doesn’t affect you, but try putting together a 50 person team. You’ll find it easier hiring good C/C++ engineers than ASM engineers.

  34. D.K. @ LI says:

    For many years I built RTOS code in assembler, as that was the only way we could do it… until I worked on aircraft cabin software on embedded controllers for a commercial airline. We had 4 different boards based on Motorola architecture. By then we had optimized Ada (I HATED Ada) compilers for even that architecture. As an experiment I taught my software engineer wife how to build kernel-level RTOS software and had her build it in Ada… while I built it in assembler. It took us both about 3 days (not really complicated). Hers worked as well as mine, but mine was 4 bytes! smaller (approximately 2K total). I spent a bunch of time analyzing the differences, and on this simple software set, there were very few. The optimizers in the compiler did about what I could do. I spent a few days trying to optimize my stuff further, but there really was not much I could do. Does that mean that I thought assembly language was dead? No… it meant that I was probably going to use C or C++ in the future where I could (not freaking Ada… that came later for about 5 years, but is a different story). But the smart guys who built that compiler knew assembly, the cache architecture, and how to optimize high-level code. They knew the target architecture inside and out… and they absolutely proved that for some applications assembly is not dead, at least for the guys that build the compilers. I still work on small hand-helds, and the computing power and RAM are way beyond the desktops I worked on not too long ago… so I stay with high-level stuff, as production feeds the bulldog. That said, when I suspect a buggy compiler (and other than FPGAs that has not happened much lately) I’ll revert to machine language debugging. Normally I find I made an error… not the compiler.

  35. A.V. @ TI says:

    I think assembly has its place, especially for real-time and time-sensitive applications. I also think more and more developers are adopting a single set of MCUs, becoming experts with them, and leaving all others by the wayside. It likely takes a lot less time to focus on a single set and figure out its ins and outs than to reinvent the wheel with each new technology.

  36. D.D. @ LI says:

    There are many excellent points in the above comments (for example, I would never touch AVR or PIC assembly again) but there are still realms where hand-optimized ASM will produce the 200-500% performance gains that a C compiler just cannot. From recent investigations: GPU programming, and the new CLA engine in TI’s Piccolo chips.

    But in general, I agree that the realm of ASM programming will be relegated to maintenance of legacy systems and the very occasional new design. As for new engineers being oblivious/dismissive of ASM: good. That’s job security for me.

  37. T.Z. @ LI says:

    In my case I did use AVR in assembler, but for optimal bit-bangs.

    In the first case I was doing an I2C waveform at 400kHz using the USI, so I rewrote all the code paths to take just the right number of clocks (except one closing instance which was slightly longer because I couldn’t optimize it).

    The second case was SPI to an SD card, also at the maximum rate, so I used NOPs to make a tight loop precisely long enough that the next byte was written right as the transfer of the previous one had to be finished.

    But everything else was in C, and even these routines were originally C – I took the compiler output and the instruction timing info, and tuned it by hand.
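    The cycle-padding trick described above can be sketched in AVR-style assembly. This is a hypothetical illustration, not code from the actual project: the register choices, label, and the exact padding count are my assumptions; the per-instruction cycle counts are the usual AVR ones, which is what makes this kind of tuning possible at all:

    ```asm
    ; Illustrative sketch: pad a byte-transfer loop with NOPs so the next
    ; byte is written to the SPI data register (SPDR) exactly when the
    ; previous shift-out completes. Cycle counts per AVR instruction set.
    spi_loop:
        ld   r16, X+        ; 2 cycles: fetch next byte from the buffer
        out  SPDR, r16      ; 1 cycle : start the next transfer
        nop                 ; 1 cycle : padding...
        nop                 ; 1 cycle : ...so the loop body matches the
                            ;           shift time of one byte exactly
        dec  r17            ; 1 cycle : byte counter
        brne spi_loop       ; 2 cycles when the branch is taken
    ```

    The point of the exercise is that the loop’s total cycle count, not its logic, is the spec – which is why this is done by counting instructions rather than trusting the compiler.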

  38. D.H. @ LI says:

    I think you identify one area where assembly must be used. I had a similar challenge with an 80251 chip mask: the C51 compiler would not let me fit all the required functionality into it. It required hand optimization to fit the features into the system resources. As long as cost restricts resources in embedded systems, or tools lag silicon in cutting-edge systems, there will be a requirement for assembler.

    So assembler is not dead, and assembly language programmers may be endangered but they are not extinct ;-)

  39. T.Z. @ LI says:

    I don’t think they are even endangered but assembly programmers are becoming specialists …

    “You are building 10 million of these. I can write it to fit into the $5 chip for $20k, or you can outsource it for $5k, but it will require the $7 chip, do the math”.

    It will be at the edge where the extra high performance or resource usage minimization is required. But that is a very long edge.

    It won’t be for consumer PCs and laptops or maybe even smartphones, yet even there, each extra cycle burns the battery. Even the open-source Ogg/Theora codec had an integer-optimized ARM version that I believe was heavy in assembler, for both speed and power.

    Until WebM is commonly in hardware, I think there will be lots of assembler core routines.

    Or for that matter, specialized processors like DSPs that don’t really program well in C or other high level languages.

  40. C.L. @ LI says:

    Every time I think “I’ll probably never have to do this again”, I end up having to do ‘something’ in assembler. Usually it’s a few instructions embedded in a C function, but I had to hack some assembler in the last system I worked on about 6 months ago.

    C. (8080, 8086, 6809, 68000, DSP-32, 56000, IBM-370, Cray III, PIC, ARM, PPC) L.

  41. T.M. @ LI says:

    Assembly may not be dead, but I don’t want to spend too much time studying one variant of it. I always start with C, and tweak critical routines with assembly when necessary.

  42. O.A.Z. @ TI says:

    Even if you do not program in assembly, you need to understand the processor’s architecture and instruction set to be able to write efficient C code. In addition to pure C-level trimming to the processor architecture, you can also use intrinsics to help the compiler do a better job.
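    As a small illustration of the intrinsics point – assuming a GCC/Clang-style toolchain, where `__builtin_clz` is the count-leading-zeros builtin (other compilers spell it differently, e.g. `_CountLeadingZeros` on some embedded toolchains):

    ```c
    #include <assert.h>

    /* Portable count-leading-zeros: a shift loop the compiler may or
     * may not recognize and optimize. */
    static int clz_portable(unsigned x)
    {
        int n = 0;
        if (x == 0) return 32;                 /* all 32 bits are zero */
        while (!(x & 0x80000000u)) { x <<= 1; n++; }
        return n;
    }

    /* Intrinsic version: __builtin_clz typically compiles to a single
     * instruction (CLZ on ARM, BSR/LZCNT on x86). Note the builtin is
     * undefined for 0, hence the guard. */
    static int clz_intrinsic(unsigned x)
    {
        return x ? (int)__builtin_clz(x) : 32;
    }

    int main(void)
    {
        assert(clz_portable(1) == 31);
        assert(clz_intrinsic(1) == 31);
        assert(clz_portable(0x80000000u) == 0);
        assert(clz_intrinsic(0x40000000u) == 1);
        assert(clz_portable(0) == clz_intrinsic(0));
        return 0;
    }
    ```

    Both versions give identical results; the intrinsic just lets you hand the compiler a hardware instruction the C language itself has no operator for – which is exactly the “help the compiler do a better job” case.
    
    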

    Finally you also need to be able to debug your code – which sometimes means to dig into assembly.

  43. B.B. says:

    The resume of the author is quite impressive. I haven’t seen a concrete refutation yet, so I’m inclined to believe him. Thanks to everyone for sharing.
