MRG Grid Features
MRG Grid provides a broad set of features across both High Throughput Computing and High Performance Computing, including:
- Management Tools - Browser-based tools for managing daemons and machines, security, compute jobs, scalability settings, priorities, and more, along with monitoring capabilities.
- Desktop Cycle-Harvesting - Uses the otherwise idle capacity of desktop machines to add processing power to your grid.
- ClassAds - A flexible language for policy and meta-data description.
- Policies - Flexible, customizable policies specified by jobs and resources via ClassAds.
- Virtualization - Allows for submission of a virtual machine (VM) as a user job, supporting migration of the VM.
- Federated Grids/Clusters - A mechanism known as flocking allows independent pools to use each other's resources, controllable by customizable policies.
- Multiple Standards-Based APIs - A Web Service interface provides job submission and management functionality; a highly scriptable CLI with consistent output provides an interface to all functionality.
- Workflow Management - The ability to specify job dependencies, via DAGMan, allows for construction and execution of complex workflows.
- High Availability - The Negotiator and Collector, via HAD, and the Schedd, via Schedd Fail-over, can have their state replicated to allow for graceful fail-over upon service disruption.
- Database Support - All data about jobs and resources can be stored in a database via Quill.
- Compute On-Demand (COD) - The ability for a node or set of nodes to be claimed by a user in such a way that others may use the claimed nodes until the user needs them.
- Dynamic Pool Creation - Through a technology known as Glide-ins, nodes can be dynamically added to a pool to service user jobs.
- Priority-Based Scheduling
  - Priority scheduling is performed at the granularity of a user.
  - Fair-share scheduling can be performed on groups of users.
  - Priority management is controllable by administrators.
- Accounting - User and group resource utilization is tracked and accessible to administrators.
- Security
  - Authentication: multiple mechanisms (Kerberos, SSL, shared secret, claimtobe, filesystem, remote-filesystem)
  - Privacy: network encryption (Blowfish, 3DES)
  - Integrity: of network traffic (MD5)
  - Authorization: through flexible configuration policies
- Account Remapping
  - Allows for execution across administrative domains.
  - Enhances security by using a restricted pool of users to run jobs on execute machines.
- Privilege Separation - Only a single, specialized, audited component requires root/administrator permissions on execute nodes.
- Parallel Universe
  - Provides an extensible framework for running parallel (including MPI) jobs.
  - Co-allocation of compute nodes is done automatically.
  - Framework implementations for MPICH1, MPICH2, and LAM are provided.
- Java Universe - Explicit support of jobs written in Java.
- Time Scheduling for Job Execution (Cron) - Allows a job or multiple jobs to be started at specific times, with customizable policy for failures such as missed deadlines.
- Backfill - Allows otherwise unused nodes to run jobs provided by BOINC.
- File Staging - Support for automatic file staging (e.g. job input) and online file I/O (i.e. file streaming from submit to execute nodes) via Chirp and remote syscalls, in the absence of a shared filesystem.
- Dedicated and Undedicated Node Management - Allows dedicated resources (clusters) to be augmented with otherwise undedicated resources (desktops) using flexible policies.
- Master-Worker (MW) - A C++ framework allowing a single master process to allocate and manage multiple worker processes, which process data based on master-specified policies.
- Condor-C - Allows for jobs in one queue to be moved to another queue.
- Hawkeye - Allows for automated monitoring of one or more pools.
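
To make the Workflow Management item above more concrete, here is a minimal Python sketch of what "job dependencies" imply for execution order. It is not DAGMan's implementation or input format; the job names and the `execution_order` helper are hypothetical, and the ordering comes from Python's standard `graphlib` module (Python 3.9 or later). The only idea it illustrates is that a job may start only after every job it depends on has finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each job maps to the set of jobs it depends on.
dependencies = {
    "prepare": set(),
    "simulate_a": {"prepare"},
    "simulate_b": {"prepare"},
    "collect": {"simulate_a", "simulate_b"},
}


def execution_order(deps):
    """Return one valid order in which every job follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

In this sketch, `prepare` always comes first and `collect` always comes last, while the two `simulate` jobs have no ordering constraint between them, which is what would allow a scheduler to run them at the same time.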