Ultra-Scalable 3D Clos Network on Chips


Summary

Many-core chip multiprocessors (CMPs) are likely to consist of tens or even a few hundred processing elements on a single die, presenting many challenges in both design and implementation. Paramount among these is how to implement the on-chip interconnection network for maximum efficiency in terms of energy, delay, and throughput. In the project proposed here, we therefore aim to explore ways in which multiprocessing systems can be interconnected using a multistage network where any two nodes (either memory or processing elements) are only a few “hops” from one another, measured as the number of routers along the network path. On average, the number of hops is far lower than in a mesh network fabric, where the hop count grows with the physical distance between the communicating nodes.
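The hop-count argument can be illustrated with a rough sketch. It assumes uniform random traffic, a k x k mesh with one node per router and minimal routing, and a three-stage Clos network in which every packet crosses an ingress, a middle, and an egress router. The function and constant names are illustrative and do not come from the proposal.

```python
from itertools import product


def mesh_avg_routers(k: int) -> float:
    """Average number of routers on a minimal path between two
    distinct nodes of a k x k mesh (one node attached per router)."""
    nodes = list(product(range(k), repeat=2))
    total = 0
    pairs = 0
    for (x1, y1), (x2, y2) in product(nodes, repeat=2):
        if (x1, y1) == (x2, y2):
            continue  # skip self-traffic
        # Manhattan distance counts links; the path visits one more router.
        total += abs(x1 - x2) + abs(y1 - y2) + 1
        pairs += 1
    return total / pairs


# Assumption: a three-stage Clos path always crosses exactly three routers
# (ingress, middle, egress), regardless of network size.
CLOS_ROUTERS = 3

if __name__ == "__main__":
    for k in (4, 8, 16):
        print(f"{k * k:4d} nodes: mesh avg = {mesh_avg_routers(k):.2f} routers, "
              f"Clos = {CLOS_ROUTERS} routers")
```

Under these assumptions the mesh average grows roughly as 2k/3 + 1 routers, while the Clos path length stays constant. The sketch ignores router radix, link length, and contention, all of which affect real latency and energy.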

A multistage network such as the Clos architecture is also more conducive to a shared-memory parallel processing system. This point motivates the proposed work: many-core processors, as extensions of multi-core processors, can operate with a simpler programming model when implemented as shared-memory systems. We therefore propose the Clos Network-on-Chip (CNOC) architecture for the realization of efficient many-core CMP systems and aim to explore, through this research, the benefits of such systems.

The objective of the research project is to explore, through rigorous simulation and experimentation, the potential of multistage Clos networks as the on-chip interconnection fabric of emerging high-performance chip multiprocessors. Although a number of NOC studies exist, methods for obtaining optimum Clos NOC interconnections that provide efficient CNOC communication do not. A novel contribution of this work will be the resulting power and performance optimization techniques for the CNOC interconnects. In particular, we plan to develop three novel techniques to improve the NOC interconnects: packet switching, low-power circuits, and 3D CNOC interconnect optimization. Together, these three methods are expected to provide an efficient CNOC interconnect solution, and the proposed research plan will focus on their development. Furthermore, a 3D CNOC test vehicle will be fabricated and tested to verify the three proposed techniques.

As part of this project, we aim to study the feasibility of applying today’s advanced packet switching techniques to CMP networks-on-chip, explore circuit-level techniques that ensure power requirements are met while achieving the highest possible performance, and utilize 3D integrated circuit technology to further optimize CMP behavior.