Introduction

  • Initial Operation is performed after installation of the JS7 Controller, Agent and JOC Cockpit as described in the JS7 - Initial Operation article.
  • If problems occur while registering a Controller, the Controller will not be able to submit job-related configurations to Agents. In addition, workflows that include jobs running on a number of Agents cannot be executed, as the switching of Agents during workflow execution is performed by the Controller.

Troubleshooting

After registering the Controller, its status can be checked from the JS7 - Dashboard. A number of misconfigurations may occur when registering a Controller.

Operations to register a Controller are performed from the JOC Cockpit "Manage Controllers/Agents" page which is available from the user menu in the upper right-hand corner of the GUI. Administrative permissions are required to be able to see and to use this page.

User Errors

Same Controller is used as a Standalone Controller and as a member of a Controller Cluster

  • Problem: Assume that the JOC Cockpit uses a Controller registered as a Standalone Controller. If the same Controller is registered as a Controller Cluster member then the JOC Cockpit will throw an error such as:
    • JocObjectAlreadyExistException: com.sos.joc.exceptions.JocObjectAlreadyExistException: Controller(s) with id 'controller' already exists
    • This error message will also be noted in the JOC Cockpit's log file, e.g. in the JS7_JOC_DATA/logs/joc.log file - see the JS7 - Log Files and Locations article.
  • Solution: It is not possible to register the same Controller twice, both as a Standalone Controller and as a member of a Controller Cluster.
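To locate such messages in the log file, a short script can help. The following is a minimal sketch, not part of JS7: the function name find_registration_errors is hypothetical, and the list of exception names is taken from the messages quoted in this article.

```python
from pathlib import Path

# Exception names quoted in this article; extend the tuple as needed.
KNOWN_ERRORS = (
    "JocObjectAlreadyExistException",
    "ControllerInvalidResponseDataException",
)


def find_registration_errors(log_path, names=KNOWN_ERRORS):
    """Return (line_number, line) pairs for log lines that mention a known exception."""
    hits = []
    # errors="replace" keeps the scan going if the log contains undecodable bytes.
    with Path(log_path).open(encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if any(name in line for name in names):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Call the function with the path to the joc.log file in the JS7_JOC_DATA/logs directory of your installation and print the returned pairs.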

Users who wish to migrate their Standalone Controller to a Controller Cluster should observe the following procedure:

  • Check that you have a license key ready as the clustering for Controllers is only available with a commercial license - see the JS7 - License article.
  • Verify that execution of workflows has been completed with the Standalone Controller and that no orders are running.
  • Remove the Standalone Controller from the JOC Cockpit GUI.
  • Shut down the Standalone Controller and remove the Controller's journal files in its JS7_CONTROLLER_DATA/state directory.
  • Follow the steps listed in the JS7 - Initial Operation for Controller Cluster article.
  • Redeploy scheduling objects such as workflows from the JOC Cockpit to the Controller Cluster.
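The removal of files from the state directory can be scripted with a preview step. The sketch below is an illustration only: remove_state_files is a hypothetical helper, and it assumes that every regular file directly inside the given directory is a candidate for removal. By default it performs a dry run that only lists the files. Run it only after the Controller has been shut down.

```python
from pathlib import Path


def remove_state_files(state_dir, dry_run=True):
    """Return the names of regular files in state_dir, deleting them unless dry_run is True.

    Subdirectories are left untouched (assumption: only files need to be removed).
    """
    state = Path(state_dir)
    if not state.is_dir():
        raise NotADirectoryError(f"not a directory: {state}")
    handled = []
    for entry in sorted(state.iterdir()):
        if entry.is_file():
            if not dry_run:
                entry.unlink()  # delete the file
            handled.append(entry.name)
    return handled
```

Calling the function once with the default dry_run=True shows what would be removed. A second call with dry_run=False performs the deletion.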

Controller Instances with different Controller IDs are used as a Controller Cluster

  • Problem: If a Controller Cluster is registered with different Controller IDs for the Controller instances, then this will raise an error:
    • ControllerInvalidResponseDataException:
      com.sos.joc.exceptions.ControllerInvalidResponseDataException: The cluster members must have the same Controller Id: http://<host1>:<port1> -> controller_ID1, http://<host2>:<port2> -> controller_ID2.
    • This error message will also be noted in the JOC Cockpit's log file, e.g. in the JS7_JOC_DATA/logs/joc.log file - see the JS7 - Log Files and Locations article.
  • Solution: It is not possible to register Controller instances with different Controller IDs to work as a Controller Cluster. Instead, check which Controller instance is using the wrong Controller ID and rerun the installation for that instance. During installation the Controller ID can be specified as described in the JS7 - Controller Installation On Premises and JS7 - Controller Installation for Containers articles.
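To see which instance reports which Controller ID, the "URL -> Controller ID" pairs can be extracted from the error message. The following is a minimal sketch that assumes the message format quoted above. The function names are hypothetical and the regular expression is an assumption derived from that single example.

```python
import re

# Matches "<url> -> <controller id>" pairs as they appear in the quoted message.
PAIR = re.compile(r"(\S+?://\S+)\s*->\s*([^\s,]+)")


def controller_ids_by_url(message):
    """Map each Controller URL found in the message to the Controller ID it reports."""
    pairs = {}
    for url, controller_id in PAIR.findall(message):
        # Strip the full stop that ends the sentence in the log message.
        pairs[url] = controller_id.rstrip(".")
    return pairs


def ids_differ(message):
    """Return True if the message lists more than one distinct Controller ID."""
    return len(set(controller_ids_by_url(message).values())) > 1
```

The returned dictionary shows at a glance which URL belongs to the instance that has to be reinstalled with the correct Controller ID.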

Secondary Controller instance is not configured for use with a Cluster

  • Problem: If the Secondary Controller instance is not configured for use with a Cluster but runs as a standalone instance then the Primary Controller instance's logs will include error messages such as:
    • 2022-01-27T00:00:08,669 WARN js7.cluster.ClusterCommon - 'ClusterStartBackupNode' command failed with HTTP 400 Bad Request: POST http://example.com:4444/controller/api/command => ClusterNodeIsNotBackup: The cluster node to be appointed is not configured as a backup node
    • This error message will also be noted in the JOC Cockpit's log file, e.g. in the JS7_JOC_DATA/logs/joc.log file - see the JS7 - Log Files and Locations article.
  • Solution: The Secondary Controller is missing the following setting in its JS7_CONTROLLER_DATA/config/controller.conf file - see the chapter Check Cluster Settings in the JS7 - Initial Operation for Controller Cluster article:
    • js7.journal.cluster.node.is-backup=yes
    • Due to the missing setting, the Secondary Controller is acting as a standalone instance, which is a different operating mode. To initialize the Secondary Controller instance for cluster operation, apply the following steps:
      • Shut down the Secondary Controller instance.
      • Delete the contents of the Secondary Controller instance's JS7_CONTROLLER_DATA/state directory.
      • Start the Secondary Controller instance.
      • In the JOC Cockpit GUI navigate to the User Menu->Manage Controllers/Agents page. Edit the Controller entry, check the connection status from the available buttons and submit. As a result the JOC Cockpit will forward this information to both Controller instances.
      • If the Controller instances are not coupled after approx. 120s then:
        • Shut down the Primary Controller instance.
        • Delete the contents of the Secondary Controller instance's JS7_CONTROLLER_DATA/state directory.
        • Shut down the Cluster Watch Agent.
        • Delete the contents of the Cluster Watch Agent's JS7_AGENT_DATA/state directory.
        • Start the Cluster Watch Agent.
        • Start the Primary Controller instance.
        • Coupling of Controller instances should occur within 60s.
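Before restarting the Secondary Controller instance, it can be useful to verify that the setting is present in its controller.conf file. The sketch below is a simplified line-based check and not a full parser for the configuration format: it assumes the setting is written on a single line in dotted notation, as shown above, and that "#" starts a comment. The function name is_backup_configured is hypothetical.

```python
import re
from pathlib import Path

# Dotted form of the setting, e.g. js7.journal.cluster.node.is-backup=yes
SETTING = re.compile(
    r'^\s*js7\.journal\.cluster\.node\.is-backup\s*[=:]\s*"?(\w+)"?\s*$'
)


def is_backup_configured(conf_path):
    """Return True if the last occurrence of the setting enables backup mode."""
    result = False
    text = Path(conf_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop a trailing "#" comment
        match = SETTING.match(line)
        if match:
            # Assumption: yes/true/on are treated as enabling the setting.
            result = match.group(1).lower() in ("yes", "true", "on")
    return result
```

If the function returns False for the Secondary Controller's configuration file, add the setting before applying the steps above. Settings written in nested block notation are not detected by this check.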

License missing when configuring the Controller Cluster

Further Information

For troubleshooting during ongoing operation see the JS7 - How to troubleshoot Controller journals article.