DMS: Data Management Services
Distributed computing in grid-style, wide-area, cross-domain environments presents unique challenges to data management because applications and resources are heterogeneous and dynamic. In these environments, data should be delivered with application-tailored performance optimizations and reliability techniques, and data provisioning should be managed automatically, following application requirements and adapting to changing conditions.
We propose a set of data management services that control and configure application-tailored data sessions over the Grid Virtual File System (GVFS), GridFTP, and Secure Copy (SCP). GVFS employs proxies that virtualize Network File System (NFS) sessions by intercepting and modifying remote procedure calls. In our data management architecture, a File System Service (FSS) runs on every client and server and controls the local file system proxies; a Data Scheduler Service (DSS) provides centralized scheduling and control of data sessions by interacting with the FSSs; and a Data Replication Service (DRS) manages datasets and their replicas for fault tolerance and load balancing. With these services, GVFS-based data sessions can be created dynamically on a per-application basis, and application-tailored customizations can be applied, including the choice of block-based or whole-file data transfer, the configuration of cache parameters and consistency protocols, the use of copy-on-write checkpointing and replication-based failover, and the configuration of security mechanisms. The services interoperate with other grid middleware through WSRF standards and use Web service security standards to provide secure interactions, grid authentication, and access control.
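The division of labor between the DSS and the FSSs can be illustrated with a minimal sketch. It is not the actual service interface, which is exposed through WSRF-based Web services; every class, method, field, host name, and default value below is a hypothetical stand-in chosen for illustration, and the DRS is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class TransferMode(Enum):
    BLOCK = "block"            # fetch individual blocks on demand
    WHOLE_FILE = "whole-file"  # transfer entire files


@dataclass(frozen=True)
class SessionConfig:
    """Hypothetical per-application session settings (names are assumptions)."""
    transfer_mode: TransferMode = TransferMode.BLOCK
    cache_size_mb: int = 512            # illustrative default, not from the source
    consistency: str = "placeholder"    # stands in for a consistency protocol name
    copy_on_write: bool = False         # enable copy-on-write checkpointing
    replicas: tuple = ()                # replica hosts usable for failover
    secure: bool = True                 # enable the session's security mechanisms


class FileSystemService:
    """Stand-in for the FSS on one host; records the proxies it controls."""

    def __init__(self, host):
        self.host = host
        self.proxies = {}

    def start_proxy(self, session_id, role, config):
        # A real FSS would launch and configure a GVFS proxy here.
        self.proxies[session_id] = (role, config)


class DataSchedulerService:
    """Stand-in for the DSS; creates sessions by driving the FSSs."""

    def __init__(self, fss_by_host):
        self.fss_by_host = fss_by_host
        self.next_id = 1

    def create_session(self, client, server, config):
        session_id = f"session-{self.next_id}"
        self.next_id += 1
        # Configure the server-side proxy first, then the client-side proxy.
        self.fss_by_host[server].start_proxy(session_id, "server", config)
        self.fss_by_host[client].start_proxy(session_id, "client", config)
        return session_id


hosts = {name: FileSystemService(name) for name in ("compute1", "storage1")}
dss = DataSchedulerService(hosts)
config = SessionConfig(transfer_mode=TransferMode.WHOLE_FILE, copy_on_write=True)
sid = dss.create_session("compute1", "storage1", config)
```

In this sketch the scheduler holds no file system state of its own. It only tells the FSS on each endpoint which proxy to start and with what settings, mirroring the centralized control described above.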