Source-synchronous I/O Links using Adaptive Interface Training for High Bandwidth Applications

by Ashok Kumar Jaiswal

Institution: Technische Universit├Ąt Darmstadt
Department: Fachbereich Elektrotechnik und Informationstechnik
Degree: PhD
Year: 2014
Record ID: 1106111
Full text PDF: http://tuprints.ulb.tu-darmstadt.de/4079/


Mobility is the key to the global business which requires people to be always connected to a central server. With the exponential increase in smart phones, tablets, laptops, mobile traffic will soon reach in the range of Exabytes per month by 2018. Applications like video streaming, on-demand-video, online gaming, social media applications will further increase the traffic load. Future application scenarios, such as Smart Cities, Industry 4.0, Machine-to-Machine (M2M) communications bring the concepts of Internet of Things (IoT) which requires high-speed low power communication infrastructures. Scientific applications, such as space exploration, oil exploration also require computing speed in the range of Exaflops/s by 2018 which means TB/s bandwidth at each memory node. To achieve such bandwidth, Input/Output (I/O) link speed between two devices needs to be increased to GB/s. The data at high speed between devices can be transferred serially using complex Clock-Data-Recovery (CDR) I/O links or parallely using simple source-synchronous I/O links. Even though CDR is more efficient than the source-synchronous method for single I/O link, but to achieve TB/s bandwidth from a single device, additional I/O links will be required and the source-synchronous method will be more advantageous in terms of area and power requirements as additional I/O links do not require extra hardware resources. At high speed, there are several non-idealities (Supply noise, crosstalk, Inter- Symbol-Interference (ISI), etc.) which create unwanted skew problem among parallel source-synchronous I/O links. To solve these problems, adaptive trainings are used in time domain to synchronize parallel source-synchronous I/O links irrespective of these non-idealities. In this thesis, two novel adaptive training architectures for source-synchronous I/O links are discussed which require significantly less silicon area and power in comparison to state-of-the-art architectures. First novel adaptive architecture is based on the unit delay concept to synchronize two parallel clocks by adjusting the phase of one clock in only one direction. Second novel adaptive architecture concept consists of Phase Interpolator (PI)-based Phase Locked Loop (PLL) which can adjust the phase in both direction and achieve faster synchronization at the expense of added complexity. With an increase in parallel I/O links, clock skew which is generated by the improper clock tree, also affects the timing margin. Incorrect duty cycle further reduces the timing margin mainly in Double Data Rate (DDR) systems which are generally used to increase the bandwidth of a high-speed communication system. To solve clock skew and duty cycle problems, a novel clock tree buffering algorithm and a novel duty cycle corrector are described which further reduce the power consumption of a source-synchronous system.