Fault tolerance is a traditional requirement in specific application domains, such as space, automotive, and avionics. Technology scaling has enabled complex system integration and performance increases, but it has also introduced additional reliability threats that need to be addressed even in previously less critical applications.
Introducing fault tolerance into a computing system architecture can result in significant hardware and power overhead and a performance penalty. Therefore, it is important to address this issue in an optimized way. In this talk, several methods at different abstraction levels will be introduced and evaluated. Some of them focus on selective fault tolerance, providing a trade-off between the induced overhead and fault protection. Additionally, dynamic fault-tolerant methods will be presented, which enable fault tolerance features adaptively, only when they are required. The talk will also include practical examples of fault-tolerant chips implemented at IHP.
Prof. Dr. Milos Krstic received the Dr.-Ing. degree in electronics from Brandenburg University of Technology, Cottbus, Germany, in 2006. Since 2001 he has been with IHP Microelectronics, Frankfurt (Oder), Germany, where he leads the team in the Wireless Communication Systems Department. Since 2016 he has also been professor for “Design and Test Methodology” at the University of Potsdam. For the last few years, his work has focused mainly on fault-tolerant architectures and design methodologies for digital systems.
We have developed a single-chip, 64-core, high-performance shared-memory manycore that demonstrates performance better than 95% of peak. A novel task-oriented, PRAM-like programming model is employed. Scheduling is carried out exclusively by an on-chip hardware scheduler. A logarithmic 100x256 network connects all 64 cores (and some IO cores) to the 256 banks of on-chip shared memory, effectively allowing all cores to access memory simultaneously. The programmer does not need to deal with core allocation, locality, or load balancing. A 65 nm version is working flawlessly.
Ran Ginosar received his BSc in EE&CS from the Technion – Israel Institute of Technology in 1978 and his PhD from Princeton University in 1982. He has conducted research at Bell Labs and Intel. He is a full Professor in the EE department at the Technion. His research interests focus on computer architecture, VLSI, asynchronous circuits, and clock synchronization.