Introduction

  • Relevant information is also available in the JS7 - Impact of an Agent
  • The outage of an Agent will stop the execution of workflows with that Agent. Workflows running with Agents that remain available are unaffected.
  • Workflows that miss their scheduled execution date will be executed when the Agent becomes available.

Controller Behavior

...

  • the Controller will try to reconnect to the respective Agent.
  • the Controller reports the missing Agent connection to the JOC Cockpit.

...

JOC Cockpit Behavior

  • JOC Cockpit receives information about Agents from a Controller. There is no direct connection to an Agent.
  • In case of outage of an Agent
    • the Controller reports the information to JOC Cockpit, which will display the missing Agent connection in its Dashboard.
    • any interaction with an Agent, such as deployment of workflows or requests for submission/cancellation of orders to that Agent, is delayed. The Controller accepts such requests and will forward them to the Agent when the Agent becomes available. It is therefore not required to repeat such requests as they will be executed once the Agent becomes available.

Agent Behavior

The Agent is the JS7 component that executes workflows, stores all related information in its journal and passes the results to the Controller. The outage of an Agent affects the execution of workflows: workflows are not executed while the Agent is unavailable. All outstanding orders are executed as soon as the Agent becomes available again.

An Agent outage can be handled either by resolving the issue with the current Agent (for example by restarting it) or by moving the /state directory to a running Agent. The Agent's journal is stored in the /state directory. To forward the workflow execution of one Agent to another Agent, the journal of the unavailable Agent has to be copied to the other Agent.

If agent1 suffers an outage and agent2 is running (on the same server or on another server), follow the steps below to forward the execution of orders from agent1 to agent2:

...

Troubleshooting

The JS7 Agent executes workflows, stores information about execution results and order state transitions in its journal and passes results to the Controller. The outage of an Agent prevents the execution of workflows.

Troubleshooting starts with reproducing and locating a problem in order to better understand its nature:

  • As a first step, check the Agent's log files agent.log and watchdog.log, see JS7 - Log Files and Locations.
    • Warnings and errors can be identified by the output qualifiers WARN and ERROR in a log file.
    • Example
      • 2021-10-10T09:53:04,939 WARN js7.base.session.SessionApi - HttpAgentApi(https://apmacwin:4345): HTTP 401 Unauthorized: POST https://apmacwin:4344/controller/api/session => InvalidLogin: Login: unknown user or invalid password
  • Log files are rotated daily: log files from previous days are available in compressed .gz format, see the JS7 - Log Rotation article for details.
    • For Unix, the zcat command can be used to directly access compressed log files.
    • For Windows, compressed files have to be extracted, for example using 7-zip.
  • Note that an Agent instance can report problems related to other products such as Controller instances and the JOC Cockpit. In this situation it is recommended that the log files of the respective product are also checked.
  • Should warnings or error messages not be evident then users should do some research: the Product Knowledge Base and the Change Management System each offer a search box, and browsers offer access to search engines.
  • Having completed analysis of a problem and being certain that the problem is related to a product defect and not to resources of the IT environment then:
  • Should the agent.log file not provide sufficient information to reproduce a problem then users should consider increasing the log level, as described in the JS7 - Log Levels and Debug Options article.
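The first log check described above can be sketched in Python for users who prefer a single script on both Unix and Windows. This is a minimal sketch and not part of the product: the function names, the glob pattern *.log* and the assumption that the output qualifier directly follows the timestamp (as in the example line above) are illustrative and may need adjusting to the actual log layout.

```python
import gzip
import re
from pathlib import Path

# Assumption: the qualifier follows the timestamp, as in
# "2021-10-10T09:53:04,939 WARN js7.base.session.SessionApi - ..."
QUALIFIER = re.compile(r"^\S+\s+(WARN|ERROR)\s")


def open_log(path: Path):
    """Open a plain or gzip-compressed log file in text mode."""
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return path.open("r", encoding="utf-8", errors="replace")


def find_problems(log_dir: Path, pattern: str = "*.log*"):
    """Yield (file name, line number, line) for each WARN or ERROR line.

    The default pattern is an assumption: it matches agent.log and
    watchdog.log as well as rotated files that end in .gz.
    """
    for path in sorted(log_dir.glob(pattern)):
        with open_log(path) as handle:
            for number, line in enumerate(handle, start=1):
                if QUALIFIER.match(line):
                    yield path.name, number, line.rstrip("\n")
```

Calling find_problems(Path("logs")) with the Agent's actual logs directory (the path shown here is hypothetical) returns the matching lines from current and compressed log files without extracting the .gz files first.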

In some situations, for example if computer memory is not sufficient for the heap size of the Agent's Java Virtual Machine, the outage of an Agent instance can be handled by restarting the instance. However, problems indicating insufficient resources typically require better sizing of resources.

If the problem is related to server resources and if operation of the Agent cannot be continued on the same server, then relocation of the Agent instance can be a last resort to resolve an outage. Relocation includes copying or moving the Agent instance's JS7_AGENT_DATA/state directory, which holds the Agent instance's journal, to an Agent instance on a new server. Refer to the JS7 - How to relocate an Agent article for the steps to apply.
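The copy operation at the core of a relocation can be sketched as follows. This is an illustration only and does not replace the steps in the JS7 - How to relocate an Agent article: the function name is hypothetical, both arguments stand for the JS7_AGENT_DATA directories of the old and the new Agent instance, and the sketch assumes that both Agent instances are stopped while files are copied. Refusing to copy into a non-empty state directory is a precaution of this sketch, not a documented product requirement.

```python
import shutil
from pathlib import Path


def copy_agent_state(source_data: Path, target_data: Path) -> Path:
    """Copy the state directory of an Agent to a new data directory.

    source_data and target_data are assumed to be the JS7_AGENT_DATA
    directories of the old and the new Agent instance.
    """
    source_state = source_data / "state"
    target_state = target_data / "state"
    if not source_state.is_dir():
        raise FileNotFoundError(f"no state directory in {source_data}")
    if target_state.exists() and any(target_state.iterdir()):
        # Precaution of this sketch: do not mix journals of two instances.
        raise FileExistsError(f"{target_state} is not empty")
    # copytree uses shutil.copy2 by default, which preserves timestamps.
    shutil.copytree(source_state, target_state, dirs_exist_ok=True)
    return target_state
```

If the new Agent instance runs on a different server, the same copy has to be performed across servers, for example by way of a shared file system or a file transfer tool.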

...

Note: Once agent1 becomes available again, the journals have to be copied back from agent2 to agent1, otherwise the Controller will not be able to couple with the Agent.

SOS has successfully tested a scenario in which an Agent was unavailable for 24 hours: the files from the state directory of agent1 were copied to agent2 and the URL of the Agent was changed in the JOC Cockpit. Following this, agent2 executed all outstanding orders from the previous day as well as new orders.