- Fault Tolerance, Resilience and Redundancy provide high availability of JobScheduler for a number of outage scenarios:
  - High Availability requires the entire system, including JobScheduler, database and storage, to be available, not just a single component.
  - High Availability addresses specific outage scenarios, not every possible failure.
- Master / Agent Resilience includes a number of measures for operational robustness:
  - Master / Agent Reconciliation allows tasks to continue executing in case of a recoverable Network Connection Loss.
  - Master Service Recovery covers the supported measures to take after a Master Service Failure.
  - Database Service Recovery covers the capability to recover from a Database Connection Loss.
- Master / Agent Redundancy includes a number of architecture decisions:
  - Master Clusters provide redundancy of Master instances in a network.
  - Agent Clusters can be used to compensate for the outage of a server that runs an Agent.
- Recovery Strategies provide an overview of the means to restore the scheduling service.