
Data Transfer and Management through the IKAROS framework

by Nikolaos Gkikas




Institution: KTH Royal Institute of Technology
Department:
Year: 2015
Keywords: parallel file systems; distributed file systems; IKAROS file system; elastic-transfer; grid computing; storage systems; I/O limitations; exascale; low power consumption; low cost devices; synchronous; blocking; asynchronous; non-blocking; event-driven; JSON; Engineering and Technology; Electrical Engineering, Electronic Engineering, Information Engineering; Communication Systems
Record ID: 1341306
Full text PDF: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-166740


Abstract

Given the current state of input/output (I/O) and storage devices in petascale systems, incremental solutions would be ineffective when implemented in exascale environments. According to "The International Exascale Software Roadmap" by Dongarra et al., existing I/O architectures are not sufficiently scalable, especially because current shared file systems have limitations when used in large-scale environments. These limitations are: bandwidth does not scale economically to large-scale systems; I/O traffic on the high-speed network can affect, and be affected by, other unrelated jobs; and I/O traffic on the storage server can affect, and be affected by, other unrelated jobs. Future applications on exascale computers will require I/O bandwidth proportional to their computational capabilities. To avoid these limitations, C. Filippidis, C. Markou, and Y. Cotronis proposed the IKAROS framework. This thesis project expands the capabilities of the publicly available elastic-transfer (eT) module, which was derived directly from IKAROS. The eT module uses Google’s Gmail service as a utility for efficient meta-data management. Gmail supports the Internet Message Access Protocol (IMAP), and the existing version of the eT framework implements the IMAP client-server connection through the "Inbox" module from the Node Package Manager (NPM) repository for Node.js. This module served as a proof of concept, but in a production environment it undermines the system’s scalability and allocates the system’s resources inefficiently when a large number of concurrent requests arrive at eT’s meta-data server (MDS) at the same time. This thesis solves this problem by adopting an asynchronous, non-blocking, event-driven approach to implementing the IMAP client-server connection. This was done by integrating the "Imap" module from the NPM repository and modifying it to suit the eT framework.
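To illustrate why a non-blocking, event-driven design helps when many requests reach the meta-data server at once, the following is a minimal sketch and not the thesis implementation. It does not use the "Inbox" or "Imap" modules: `fetchMetadata` is a hypothetical stand-in that simulates one IMAP round trip with a timer, and the function names, the 50 ms latency, and the `node-N` locations are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for one IMAP round trip to the mailbox that holds
// the meta-data. A timer simulates network latency; no real IMAP traffic.
function fetchMetadata(fileId, latencyMs = 50) {
  return new Promise((resolve) => {
    setTimeout(
      () => resolve({ fileId, location: `node-${fileId % 3}` }),
      latencyMs
    );
  });
}

// Non-blocking handling: every lookup is started immediately, and the event
// loop resumes each one when its (simulated) reply arrives.
async function handleConcurrently(fileIds) {
  return Promise.all(fileIds.map((id) => fetchMetadata(id)));
}

// One-at-a-time handling, for comparison: each lookup waits for the
// previous one to finish before it is started.
async function handleSequentially(fileIds) {
  const results = [];
  for (const id of fileIds) {
    results.push(await fetchMetadata(id));
  }
  return results;
}

// Small helper that measures how long a handler takes, in milliseconds.
async function timeIt(handler, fileIds) {
  const start = process.hrtime.bigint();
  const results = await handler(fileIds);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { results, elapsedMs };
}
```

Under these assumptions, eight overlapping lookups finish in roughly the time of one round trip, while the one-at-a-time version takes roughly eight times as long. Real behavior depends on the IMAP server and network, which this sketch does not model.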
Additionally, since the JavaScript Object Notation (JSON) format has become one of the most widespread data-interchange formats, eT’s meta-data scheme was modified so that the system’s meta-data can be easily parsed as JSON objects. This gives the framework wider compatibility and interoperability with external systems. The operational behavior of the new module was evaluated through a set of data transfer experiments over a wide area network environment. These experiments were performed to ensure that the changes in the system’s architecture did not affect its performance.
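As a rough illustration of meta-data that can be parsed as JSON objects, the sketch below encodes a record as a JSON string (for example, the body of a stored message) and decodes it again. The abstract does not describe the actual eT meta-data scheme, so the field names (`fileName`, `sizeBytes`, `chunks`), the host names, and the helper functions are hypothetical.

```javascript
// Serialize a meta-data record to a JSON string.
function encodeMetadata(record) {
  return JSON.stringify(record);
}

// Parse a JSON string back into a record and check the two fields this
// sketch assumes are required. The field names are illustrative only.
function decodeMetadata(messageBody) {
  const record = JSON.parse(messageBody);
  if (typeof record.fileName !== "string" || !Array.isArray(record.chunks)) {
    throw new TypeError("meta-data record is missing fileName or chunks");
  }
  return record;
}

// Hypothetical record describing a file split across two storage nodes.
const example = {
  fileName: "run42.dat",
  sizeBytes: 1048576,
  chunks: [
    { index: 0, node: "node-a.example.org" },
    { index: 1, node: "node-b.example.org" },
  ],
};

const body = encodeMetadata(example);
const parsed = decodeMetadata(body);
```

Because the record is plain JSON, any external system with a JSON parser could read it without eT-specific code, which is the interoperability benefit the abstract points to.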