The scheduler is the core of a high-performance computing cluster:
- It allocates computing resources to the jobs that are submitted to the cluster.
- It maintains the status of those jobs to ensure orderly completion or termination.
We provide a scheduler with the Compute Cluster Pack. Our scheduler understands four types of jobs:
- Sequential, typically for set-up operations: The job contains a single task that runs one sequential executable, e.g. to copy some input data.
- Parallel with message-passing, common in engineering and scientific applications: The job contains one task, the MPI start-up program mpiexec, which in turn starts as many instances of a named MPI application as there are cores allocated to the job. For instance:
o job submit /numprocessors:16 mpiexec -wdir \\myshare myapp.exe
will start 16 instances of myapp.exe, a parallel application, on 16 cores across a number of nodes allocated by the scheduler. \\myshare will be the working directory for all of those instances. Most finite element applications (e.g. for car crash or fluid dynamics simulations) have a parallel computing core that uses message-passing.
- Parametric sweeps, common in financial applications: The job contains multiple independent tasks. Typically, the tasks run exactly the same executable with different parameters, hence the name. Monte Carlo simulations used for financial trading are mostly parametric sweeps.
- Task flows, again common in engineering: A sequence of tasks of any kind that automates a particular workflow. Note that we handle only simple dependencies and do not handle exceptions. For instance, a setup task must execute before a parallel computation and a visualization task must execute afterwards.
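The job shapes above can be sketched in a few lines of Python. This is an illustration only, not the scheduler's API: the Task class, the sweep and run_order helpers, and the executable names setup.exe, price.exe and visualize.exe are hypothetical, and only the mpiexec command line is taken from the text. The task flow is ordered with the standard-library graphlib.TopologicalSorter, which is one way to resolve simple dependencies and may differ from what the scheduler does internally.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    """One unit of work: a name, a command line, and the tasks it waits for."""
    name: str
    command: str
    depends_on: list = field(default_factory=list)


def sweep(executable, parameters):
    """Parametric sweep: the same executable, once per parameter value."""
    return [Task(f"sweep-{i}", f"{executable} {p}")
            for i, p in enumerate(parameters)]


def run_order(tasks):
    """Return task names in an order that respects every dependency."""
    graph = {t.name: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


# Task flow from the text: setup, then the parallel computation,
# then visualization. Declared out of order on purpose.
flow = [
    Task("visualize", "visualize.exe", depends_on=["compute"]),
    Task("compute", r"mpiexec -wdir \\myshare myapp.exe", depends_on=["setup"]),
    Task("setup", "setup.exe"),
]

print(run_order(flow))
print([t.command for t in sweep("price.exe", [0.1, 0.2, 0.3])])
```

The sweep tasks carry no dependencies, which is what lets a scheduler place them on any free core in any order. The flow tasks form a chain, so only one order is valid.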
Our scheduler comes with a COM API that makes it simple to use programmatically. For instance, you can write a macro in Excel that calls the COM API and passes a section of the spreadsheet to an executable running on CCS for computing. This makes the computing power of a large number of machines available on the desktop with relatively little effort.
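As a rough illustration of the spreadsheet scenario, the sketch below shows one way a range of rows could be divided into independent pieces of work before being handed to the cluster. It deliberately contains no COM calls, since the text does not name any. The split_rows function and the sample data are hypothetical, and the even split is an assumption about how the work might be divided.

```python
def split_rows(rows, tasks):
    """Split rows into at most `tasks` contiguous, near-equal chunks."""
    size, extra = divmod(len(rows), tasks)
    chunks, start = [], 0
    for i in range(tasks):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when rows < tasks
            chunks.append(rows[start:end])
        start = end
    return chunks


# Ten spreadsheet rows (placeholders) divided among four tasks.
rows = list(range(10))
for chunk in split_rows(rows, 4):
    print(chunk)
```

Each chunk would become the input of one task in a parametric sweep, so the pieces can be computed on different machines and the results written back to the spreadsheet.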
A number of commercial and open-source HPC schedulers offer more advanced features (e.g. they also support cycle stealing), but they may be more difficult to set up and use, or more expensive, than ours:
- Platform Computing’s LSF, which integrates with ours
- Altair’s PBS Pro
- Sun’s Grid Engine.