
Introduction

  • An outage of the Controller does not necessarily affect the execution of workflows: Agents continue to execute them. However, if a workflow includes jobs that are executed on different Agents, then the workflow is put on hold, because the Controller is required to switch Agents during execution of the workflow.

  • The Controller holds any workflow-related configuration and orchestrates Agents. At design-time Agents receive the workflow configuration from the Controller and at run-time Agents return execution results and JS7 - Order State Transitions to the Controller.
  • The Controller passes execution results to the JS7 - History Service that updates the JOC Cockpit database.
  • If no connection from the Controller is available then
    • Agents will act autonomously. Execution results are stored in the Agent's journal.
    • The JOC Cockpit will not be updated with workflow execution results.
  • Testing by SOS covers the scenario in which the Controller is unavailable for 24 hours and the Agent executes all scheduled orders. When the Controller is started again, job execution results are added to the JOC Cockpit history and become visible in the GUI.
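
The store-and-forward behavior described above can be illustrated with a small model. This is a minimal sketch and not the JS7 implementation: the class name AgentJournal, its methods and the event layout are hypothetical and only show how results accumulate while the Controller is unreachable and are handed over, oldest first, once it is back.

```python
from collections import deque


class AgentJournal:
    """Toy model (hypothetical) of a journal that buffers events during an outage."""

    def __init__(self):
        self._events = deque()

    def record(self, event):
        # Called for every execution result or order state transition.
        self._events.append(event)

    def forward_all(self, send):
        """Pass events to send() oldest first; drop an event only after send() returned."""
        forwarded = 0
        while self._events:
            send(self._events[0])    # if this raises, the event stays in the journal
            self._events.popleft()
            forwarded += 1
        return forwarded

    def __len__(self):
        return len(self._events)


# Controller outage: the Agent keeps executing and the journal grows.
journal = AgentJournal()
for order_id in ("order-1", "order-2", "order-3"):
    journal.record({"order": order_id, "state": "finished"})
size_during_outage = len(journal)

# Controller is back: events are reported and the journal shrinks.
received_by_controller = []
forwarded_count = journal.forward_all(received_by_controller.append)
```

An event is removed only after it has been handed over, so in this model a connection that drops again during forwarding does not lose results.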

JOC Cockpit Behavior

  • Users do not receive up-to-date information:
    • The JOC Cockpit will not receive updated information about the state of orders and of workflow execution results.
    • The GUI will report the Controller as unreachable and will have no information about the status of Agents.
  • Any interaction with a Controller such as to deploy workflows and to cancel/suspend orders is delayed.
    • This means that such requests are held in memory with the JOC Cockpit Proxy Service that will try to forward the requests when the Controller becomes available.
    • It is not recommended to restart the JOC Cockpit in this situation as pending requests would be lost and deployments would have to be repeated.
  • The JOC Cockpit Proxy Service will try to re-establish the connection to the Controller. When this succeeds, the GUI automatically updates the status of the Controller in its Dashboard.
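
The handling of delayed requests can be pictured with the following sketch. It is an illustration under assumptions, not the Proxy Service's actual code: ProxySketch and FakeController are hypothetical names, and the only properties modeled are the ones stated above: requests are held in memory, forwarded in order once the Controller is reachable, and lost if the holding process is restarted.

```python
from collections import deque


class ProxySketch:
    """Hypothetical model of a proxy that holds requests in memory during an outage."""

    def __init__(self, forward):
        self._forward = forward    # callable raising ConnectionError while the Controller is down
        self._pending = deque()    # in memory only: a restart of the process would discard it

    def submit(self, request):
        self._pending.append(request)
        return self.retry()

    def retry(self):
        """Forward pending requests in order, stop at the first failure, return the number left."""
        while self._pending:
            try:
                self._forward(self._pending[0])
            except ConnectionError:
                break
            self._pending.popleft()
        return len(self._pending)


class FakeController:
    """Stand-in for a Controller that can be switched between unreachable and reachable."""

    def __init__(self):
        self.up = False
        self.delivered = []

    def receive(self, request):
        if not self.up:
            raise ConnectionError("Controller unreachable")
        self.delivered.append(request)


controller = FakeController()
proxy = ProxySketch(controller.receive)

proxy.submit("deploy workflow A")
still_pending = proxy.submit("cancel order 42")   # both requests are held

controller.up = True                              # the Controller becomes available
remaining = proxy.retry()                         # held requests are forwarded in order
```

Because the queue lives only in the object, discarding the proxy object before retry() succeeds would drop both requests, which mirrors the recommendation not to restart the JOC Cockpit during an outage.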

Agent Behavior

  • Workflows are deployed to Agents just once at design-time. At run-time the Agent receives orders for workflows some time in advance, e.g. a week ahead from the JS7 - Daily Plan Service, and can therefore execute them autonomously.
  • In case of a Controller outage
    • Agents will continue to execute workflows for the scheduled date and time.
    • The Agent's journal will grow. The journal holds execution results of workflows and order state transitions.
  • Depending on the load of workflows, the journal files that are stored in the ./state directory can grow to several gigabytes. This does no harm as long as sufficient storage is available.
  • When the connection between Controller and Agent becomes available again, the Agent will report order state transitions and execution results back to the Controller and the Agent's journal will shrink.
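
Since the journal can grow during a longer outage, it may help to watch the size of the directory that holds it. The following is a minimal sketch using only the Python standard library. The function names and the 5 GB limit are assumptions made for illustration, and the location of the ./state directory depends on the individual Agent installation.

```python
from pathlib import Path


def directory_size_bytes(directory):
    """Sum the sizes of all regular files below the given directory."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())


def journal_within_limit(directory, limit_bytes):
    """Return True if the directory's total size does not exceed limit_bytes."""
    return directory_size_bytes(directory) <= limit_bytes


if __name__ == "__main__":
    state_dir = Path("./state")       # assumed to be relative to the Agent's data directory
    limit = 5 * 1024 ** 3             # hypothetical threshold of 5 GB
    if state_dir.is_dir():
        used = directory_size_bytes(state_dir)
        status = "ok" if used <= limit else "above limit"
        print(f"{state_dir}: {used} bytes ({status})")
    else:
        print(f"{state_dir} not found, adjust the path to your installation")
```

A check like this could be run periodically during an outage, and the threshold should be chosen to match the storage actually available on the Agent's host.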

Troubleshooting


