Extreme multiprocessing is an interesting topic because it can mean vastly different things to different people depending on what types of problems they are trying to solve.
At one end of the spectrum are multiprocessing designs that maximize the amount of processing work the system performs within a unit of time while staying within an energy budget. These designs, often high-compute parallel processing, workstation, or server systems, deliver higher processing throughput at lower power dissipation than a hypothetical single-core processor running at a significantly faster clock rate could. The processor cores in these systems might operate in the GHz range.
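A first-order model of CMOS dynamic power helps show why several slower cores can beat one fast core on throughput per watt: power scales with voltage squared times frequency, and higher clock rates typically demand higher voltage. The following is a minimal sketch; the capacitance, voltage, and frequency values are illustrative assumptions, not measurements from any particular processor.

```python
def dynamic_power(c_eff, v, f):
    """First-order CMOS dynamic power: P = C_eff * V^2 * f (watts)."""
    return c_eff * v * v * f

# Effective switched capacitance in farads -- an illustrative assumption.
C_EFF = 1e-9

# One fast core: the higher clock rate typically requires a higher voltage.
fast_core = dynamic_power(C_EFF, v=1.2, f=3e9)

# Four slower cores at reduced voltage. Treating 4 x 1 GHz as roughly
# comparable aggregate throughput to 1 x 3 GHz is an optimistic assumption
# (it ignores parallelization overhead), but the V^2 term still dominates.
four_slow_cores = 4 * dynamic_power(C_EFF, v=0.9, f=1e9)

print(f"single 3 GHz core: {fast_core:.2f} W")   # 4.32 W
print(f"four 1 GHz cores:  {four_slow_cores:.2f} W")  # 3.24 W
```

Under these assumed numbers, the four slower cores dissipate about 25% less power while offering comparable aggregate cycles per second, which is the trade these high-throughput designs exploit.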
Although multiprocessing architectures are one approach to increasing processing throughput within an energy budget, for the past few years I have been unofficially hearing from high-performance processor suppliers that some of their customers are asking for faster processors despite the higher energy budget. These designers understand how to build their software systems around a single instruction-stream model, and contemporary programming models and tools fall short in enabling software developers to scale their code across multiple instruction streams. For them, the increased software complexity and risk outweigh the complexity of managing higher thermal and energy thresholds.
At the other end of the spectrum are multiprocessing designs that rely on multiple processor cores to partition the workload among independent resources, minimizing resource dependencies and design complexity. These designs are the meat and potatoes of the embedded multiprocessing world. The processor cores in these systems might operate in the tens to hundreds of MHz range.
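The partitioning approach described above can be sketched in a few lines: split the workload into independent pieces up front so each worker owns its data outright and needs no coordination while it runs. This is a hedged illustration only; the checksum task, the four-way split, and the use of Python's multiprocessing pool stand in for whatever cores and workloads a real embedded design would use.

```python
from multiprocessing import Pool

def checksum(block):
    """Each worker processes only its own block -- no shared state."""
    return sum(block) & 0xFFFF

def partition(data, n):
    """Split the workload into n independent, non-overlapping slices."""
    return [data[i::n] for i in range(n)]

if __name__ == "__main__":
    data = list(range(1_000_000))
    blocks = partition(data, 4)
    # Because the partitions share nothing, the workers run without
    # locks or inter-core communication until the final gather step.
    with Pool(processes=4) as pool:
        results = pool.map(checksum, blocks)
    print(results)
```

The design choice worth noting is that all the coordination happens at partition time and at the final gather; during the work itself, each instruction stream is as independent as a dedicated core servicing its own peripheral.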
Let me clarify how I am using multiprocessing to avoid confusion. Multiprocessing designs use more than a single processing core, working together (even indirectly) to accomplish some system-level function. I do not assume what type of cores the design uses, nor whether they are identical, similar, or dissimilar. I also do not assume that the cores are co-located on the same silicon die, chip package, board, or even chassis, because a primary difference among these implementation options is the energy dissipation and latency of the data flow. The design concepts are similar at each scale as long as the implementation meets the energy and latency thresholds. To further clarify, multicore is a subset of multiprocessing in which the processing cores are co-located on the same silicon die.
I will try to identify the size, speed, energy, and processing-width limits of multiprocessing systems for each of these types of designers. In the next extreme processing article, I will explore how scaling multiprocessing upward might change basic assumptions about processor architectures.