UCSD Turns On the Light on Dark Silicon

Friday, August 27th, 2010 by Max Baron

The session on SoCs at Hot Chips 22 featured only one academic paper among several presentations that combined technical detail with a smidgeon of marketing. The presentation, from a group of researchers at UCSD and MIT and titled “GreenDroid: A Mobile Application Processor for a Future of Dark Silicon,” introduced the researchers’ answer to the growth of dark silicon as chip fabrication moves toward smaller semiconductor technology nodes.

The reference to dark silicon seems to have been picked up by the press in 2009, when Mike Muller, ARM’s CTO, described the increasing limitations that power consumption imposes on driving and using the growing number of transistors provided by technology nodes down to 11nm. As described by the media, Mike Muller’s warning was that power budgets cannot be increased to keep up with the escalating number of transistors provided by smaller geometries.

Why have power budgets? The word “budget” seems to imply that designers can raise power simply by setting a higher budget. Carried to extreme levels, however, power increases will generate temperatures that destroy the chip or drastically reduce its lifetime. Thus, a reference die whose power budget is almost fixed by its fixed dimensions will reach a semiconductor technology node where only a small percentage of its Moore’s Law–predicted transistors can be driven. The remaining transistors are the dark silicon.
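The arithmetic behind this argument can be sketched in a few lines of Python. The model is an assumption of this sketch, not something taken from the presentation: each generation shrinks linear dimensions by a factor `s`, so a fixed-area die holds `s**2` times more transistors, each switching `s` times faster with `1/s` the capacitance, and dynamic power is proportional to count × frequency × capacitance × Vdd². The function name `active_fraction` and its parameters are hypothetical.

```python
def active_fraction(s: float, generations: int, voltage_scales: bool = False) -> float:
    """Fraction of a fixed-area die's transistors that can switch at full
    frequency without exceeding the baseline power budget.

    All quantities are relative to the baseline node (generation 0).
    """
    transistors = s ** (2 * generations)   # more devices fit in the same area
    frequency = s ** generations           # each device can switch faster
    capacitance = s ** (-generations)      # each device is smaller
    # Assumption: supply voltage either scales with feature size or not at all.
    vdd = s ** (-generations) if voltage_scales else 1.0

    # Power needed to drive every transistor at full frequency.
    full_power = transistors * frequency * capacitance * vdd ** 2

    # The budget stays at 1.0, so only this share of the chip can be lit.
    return min(1.0, 1.0 / full_power)


if __name__ == "__main__":
    for gen in range(5):
        lit = active_fraction(1.4, gen)
        print(f"generation {gen}: {lit:.1%} active, {1 - lit:.1%} dark")
```

With `voltage_scales=True` the terms cancel and the whole die stays usable; once supply voltage stops scaling, the usable share under this model falls by `s**2` per generation, and the rest is dark.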

The solution presented at Hot Chips 22 by UCSD cannot increase the power budget of a SoC, but it can put to work silicon that would otherwise remain dark. The basic idea is simplicity itself: instead of employing a large, power-hungry processor that expends a lot of unnecessary energy driving logic that may not be needed for a particular application, why not create a large number of small C-cores (UCSD’s term), each able to execute a very short sequence of the application code very efficiently?

Imagine a processor tile like the one in MIT’s original design, which through further improvement became Tilera’s first tile-configured chip. UCSD envisions a similar partition into tiles, but the tiles are different. The main and comparatively power-hungry processor of UCSD’s tile is still in place, but now, surrounding the processor’s data cache, we see a number of special-purpose, compiler-generated C-cores.

According to UCSD, these miniature Tensilica-like or ARC-like workload-optimized ISA cores can execute the short repetitive code common to a few applications more efficiently than the main processor. The main processor in UCSD’s tile – a MIPS engine – still needs to execute the program sequences that will not gain efficiency if they are migrated to C-cores. We don’t know whether the C-cores should be considered coprocessors to the main processor such as might be created by a Critical Blue approach, or slave processors.

UCSD’s presentation did not discuss the limits that data cache bandwidth imposes on the number of C-cores, which by design cannot communicate with one another and must use the cache to share operands and results of computations. Nor did the presentation discuss the performance degradation and delays related to loading instructions into each and every C-core, or the expected contention in accessing off-chip memory. We would like to see these details made public after the researchers take the next step in their work.

UCSD did present many charts describing the dark silicon problem plus charts depicting an application of C-cores to Android. A benchmark comparison chart was used to illustrate that the C-core approach could show up to 18x better energy efficiency (13.7x on average). The chart would imply that one could run up to 18x more processing tiles on a dense chip that had a large area of dark silicon ready for work, but the presentation did not investigate the resulting performance – we know that in most applications the relationships will not be linear.
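One way to see why the relationship is unlikely to be linear is an Amdahl-style estimate, sketched below in Python. It rests on assumptions that the presentation did not confirm: that the 18x and 13.7x figures apply only to the code regions moved onto C-cores, and that the rest of the program still runs on the main processor at its original energy cost. The function name and the coverage values are hypothetical and chosen only for illustration.

```python
def system_energy_gain(covered_fraction: float, core_gain: float) -> float:
    """Whole-program energy-efficiency gain when only part of the baseline
    energy is spent in code that moves to C-cores.

    covered_fraction: share of baseline energy spent in offloaded code (0..1)
    core_gain:        energy-efficiency improvement on that offloaded code
    """
    if not 0.0 <= covered_fraction <= 1.0:
        raise ValueError("covered_fraction must be between 0 and 1")
    if core_gain <= 0.0:
        raise ValueError("core_gain must be positive")

    # Energy left on the main processor plus the reduced energy on C-cores,
    # both relative to a baseline energy of 1.0.
    remaining = (1.0 - covered_fraction) + covered_fraction / core_gain
    return 1.0 / remaining


if __name__ == "__main__":
    # Hypothetical coverage values; the article gives none.
    for coverage in (0.5, 0.8, 0.95, 1.0):
        gain = system_energy_gain(coverage, 18.0)
        print(f"coverage {coverage:.0%}: {gain:.1f}x overall")
```

Under these assumptions the overall gain reaches 18x only when every joule of the baseline is spent in offloaded code; any residue left on the main processor pulls the whole-chip figure down sharply, before cache bandwidth or memory contention is even considered.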

I liked the result charts and the ideas but was worried that they were not carried out to the level of a complete SoC plus memory to help find the gotchas in the approach. I was disappointed to see that most of the slides presented by the university reminded me of marketing presentations made by the industry. The academic presentation reminded me once more that some universities are looking to obtain patents and trying to accumulate IP portfolios while their researchers may be positioning their ideas to obtain the next year’s sponsors and later, venture capital for a startup.

