Introduction
- The outage of an Agent will stop the execution of workflows with that Agent. Workflows running with Agents that remain available are unaffected.
- Workflows that miss their scheduled execution date will be executed when the Agent becomes available.
Controller Behavior
...
- Relevant information is also available in the JS7 - Impact of an Agent outage article.
- For information about the behavior in case of outages see the JS7 - FAQ - What happens to workflows in case of outage of an Agent article.
Troubleshooting
The JS7 Agent executes workflows, stores information about execution results and order state transitions
...
- the Controller will try to reconnect to the respective Agent.
- the Controller reports the missing Agent connection to the JOC Cockpit.
...
JOC Cockpit Behavior
- JOC Cockpit receives information about Agents from a Controller. There is no direct connection to an Agent.
- In case of outage of an Agent
  - the Controller reports this to the JOC Cockpit, which displays the missing Agent connection in its Dashboard.
  - any interaction with the Agent, such as deployment of workflows or requests for submission/cancellation of orders, is delayed. The Controller accepts such requests and forwards them to the Agent when it becomes available. It is therefore not necessary to repeat such requests, as they will be executed once the Agent is available.
Agent Behavior
The Agent is the JS7 component that executes workflows, stores all related information in its journal and passes the results to the Controller. The outage of an Agent affects the execution of workflows: a workflow is not executed while its Agent is unavailable, and all outstanding orders are executed once the Agent becomes available again.
An Agent outage can be handled either by resolving the issue with the current Agent (by restarting it) or by moving the /state directory to a running Agent. The Agent's journals are stored in the /state directory. To forward the workflow execution of an unavailable Agent to another Agent, the journals of the unavailable Agent must therefore be copied to the other Agent.
If agent1 is facing an outage and agent2 is running (on the same server or on another server), follow the steps below to forward the execution of orders from agent1 to agent2:
...
in its journal and passes results to the Controller. The outage of an Agent prevents the execution of workflows.
Troubleshooting starts with reproducing and locating a problem in order to better understand its nature:
- As a first step check the Agent's log files agent.log and watchdog.log, see JS7 - Log Files and Locations.
  - Warnings and errors can be found from the output qualifiers WARN and ERROR in a log file.
  - Example:
    2021-10-10T09:53:04,939 WARN js7.base.session.SessionApi - HttpAgentApi(https://apmacwin:4345): HTTP 401 Unauthorized: POST https://apmacwin:4344/controller/api/session => InvalidLogin: Login: unknown user or invalid password
- Due to log rotation, log files from previous days are available in a compressed .gz format on a daily basis, see the JS7 - Log Rotation article for details.
  - For Unix the zcat command can be used to directly access compressed log files.
  - For Windows compressed files have to be extracted, for example using 7-zip.
- Note that an Agent instance can report problems related to other products such as Controller instances and the JOC Cockpit. In this situation it is recommended that the product's log files are also checked.
- Should warnings or error messages not be evident then users should do some research: the Product Knowledge Base and the Change Management System offer a search box, browsers offer access to search engines.
- Once analysis of a problem is complete and it is certain that the problem is related to a product defect and not to resources of the IT environment:
  - customers with a commercial license should use the Support Resources including the SOS ticket system.
  - users with the open source license are invited to use Community Resources.
- Should the agent.log file not provide sufficient information to reproduce a problem then users should consider increasing the log level, as described in the JS7 - Log Levels and Debug Options article.
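The log checks described in the list above can be scripted. The following is a minimal Python sketch and not part of JS7: it assumes that log lines follow the layout of the example above (timestamp, output qualifier, remainder) and that rotated log files carry a .gz suffix. The function names open_log and find_problems are hypothetical.

```python
import gzip
import re
from pathlib import Path

# Assumed line layout, taken from the example log line above:
# <timestamp> <qualifier> <logger and message>
LINE_PATTERN = re.compile(r"^(\S+)\s+(WARN|ERROR)\s+(.*)$")


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        # Reads the compressed file directly, similar to zcat on Unix.
        return gzip.open(path, mode="rt", encoding="utf-8", errors="replace")
    return open(path, mode="r", encoding="utf-8", errors="replace")


def find_problems(path):
    """Return (timestamp, qualifier, rest) for each WARN or ERROR line."""
    problems = []
    with open_log(path) as log:
        for line in log:
            match = LINE_PATTERN.match(line.rstrip("\n"))
            if match:
                problems.append(match.groups())
    return problems


if __name__ == "__main__":
    import sys

    # Usage: pass one or more log file paths on the command line.
    for log_file in sys.argv[1:]:
        for timestamp, qualifier, rest in find_problems(log_file):
            print(f"{log_file}: {timestamp} {qualifier} {rest}")
```

Because compressed files are read in place, the same call works for the current agent.log and for rotated files from previous days on both Unix and Windows.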
In some situations, for example if computer memory is not sufficient for the heap size of the Agent's Java Virtual Machine, the outage of an Agent instance can be handled by restarting the instance. However, problems indicating insufficient resources typically require better sizing of resources.
If the problem is related to server resources and operation of the Agent cannot be continued on the same server, then relocating the Agent instance can be a last resort for resolving the outage. Relocation includes copying or moving the Agent instance's JS7_AGENT_DATA/state directory, which holds the Agent instance's journal, to an Agent instance on a new server. Refer to the JS7 - How to relocate an Agent article for the steps to apply.
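The copy step mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions and not the documented relocation procedure: the function name copy_agent_journal and the directory arguments are hypothetical, both data directories are assumed to be reachable from one file system, and the sketch assumes that neither Agent instance is running while files are copied. The JS7 - How to relocate an Agent article remains the reference for the actual steps.

```python
import shutil
from pathlib import Path


def copy_agent_journal(source_data_dir, target_data_dir):
    """Copy the state directory (the journal) between two Agent data directories.

    Both arguments stand for the JS7_AGENT_DATA directories of the two
    Agent instances; the concrete paths are assumptions of this sketch.
    """
    source_state = Path(source_data_dir) / "state"
    target_state = Path(target_data_dir) / "state"

    if not source_state.is_dir():
        raise FileNotFoundError(f"no state directory found at {source_state}")

    # Keep an existing target journal as a backup instead of overwriting it.
    if target_state.exists():
        backup = target_state.with_name("state.bak")
        if backup.exists():
            raise FileExistsError(f"backup already exists at {backup}")
        target_state.rename(backup)

    # Copy the complete journal directory to the new Agent instance.
    shutil.copytree(source_state, target_state)
    return target_state
```

The sketch keeps the target's previous state directory as state.bak so that the copy can be reverted; whether that is desirable depends on the environment.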
...
Note: It is required to copy the journals from agent2 back to agent1 once agent1 is available again, otherwise the Controller will not be able to couple with the Agent.
SOS has successfully tested the scenario in which an Agent was unavailable for 24 hours: the files from the state folder of agent1 were copied to agent2 and the URL of the Agent was changed in the JOC Cockpit. agent2 then executed all outstanding orders from the previous day as well as the new orders.