...
- specify the JITL job class and the com.sos.jitl.jobs.monitoring.MonitoringJob Java class name, and add the required arguments.
Example
Download (upload .json): pdmMonitoring.workflow.json
Using the Example
It is recommended to use the example as a starting point and to adjust the parameterization:
Explanation:
- A JS7 - Cycle Instruction is used in order to repeatedly perform health status checks.
- Users should adjust cycles to their monitoring needs.
- A JS7 - Retry Instruction is used in order to retry execution, for example of the included MailJob if e-mail cannot be sent.
- The MonitoringJob is used to perform the health status check.
- The MailJob is used to send notices and alerts by mail. This is an option - users might apply other means to forward notices and alerts.
...
Find a sample report file for download that indicates an alert: monitor.2022-08-17.09-16-44.9Z.alert.json
Code Block | ||||
---|---|---|---|---|
| ||||
{ "controllerStatus" : { "active" : { "id" : 3, "surveyDate" : "2022-08-17T08:57:43.000+00:00", "controllerId" : "testsuite", "title" : "SECONDARY CONTROLLER", "host" : "controller-2-0-secondary", "url" : "https://controller-2-0-secondary:4443", "clusterUrl" : "https://controller-2-0-secondary:4443", "role" : "BACKUP", "isCoupled" : false, "startedAt" : "2022-08-16T18:09:27.000+00:00", "version" : "2.5.0-SNAPSHOT+fd0eb39", "javaVersion" : "17.0.4+8-alpine-r0", "os" : { "name" : "Linux", "architecture" : "amd64", "distribution" : "3.10.0-957.1.3.el7.x86_64" }, "securityLevel" : "MEDIUM" }, "volatileStatus" : { "id" : 2, "surveyDate" : "2022-08-17T09:16:45.064+00:00", "controllerId" : "testsuite", "title" : "PRIMARY CONTROLLER", "host" : "controller-2-0-primary", "url" : "https://controller-2-0-primary:4443", "clusterUrl" : "https://controller-2-0-primary:4443", "role" : "PRIMARY", "isCoupled" : true, "startedAt" : "2022-08-16T18:09:26.004+00:00", "version" : "2.5.0-SNAPSHOT+fd0eb39", "javaVersion" : "17.0.4+8-alpine-r0", "os" : { "name" : "Linux", "architecture" : "amd64", "distribution" : "3.10.0-957.1.3.el7.x86_64" }, "securityLevel" : "MEDIUM", "componentState" : { "severity" : 0, "_text" : "operational" }, "connectionState" : { "severity" : 0, "_text" : "established" }, "clusterNodeState" : { "severity" : 0, "_text" : "active" } }, "permanentStatus" : { "id" : 2, "surveyDate" : "2022-08-16T18:12:47.169+00:00", "controllerId" : "testsuite", "title" : "PRIMARY CONTROLLER", "host" : "controller-2-0-primary", "url" : "https://controller-2-0-primary:4443", "clusterUrl" : "https://controller-2-0-primary:4443", "role" : "PRIMARY", "startedAt" : "2022-08-16T18:09:26.004+00:00", "version" : "2.5.0-SNAPSHOT+fd0eb39", "javaVersion" : "17.0.4+8-alpine-r0", "os" : { "name" : "Linux", "architecture" : "amd64", "distribution" : "3.10.0-957.1.3.el7.x86_64" } } }, "jocStatus" : { "active" : { "id" : 2, "memberId" : 
"joc-2-0-primary:97c88ccc3975703ebd0b7277d394ec8768f88b31775e8df038572d2547c240a0", "title" : "PRIMARY JOC COCKPIT", "current" : true, "host" : "joc-2-0-primary", "url" : "https://joc-2-0-primary:4443", "startedAt" : "2022-08-16T18:10:27.000+00:00", "version" : "2.5.0-SNAPSHOT", "connectionState" : { "severity" : 0, "_text" : "established" }, "componentState" : { "severity" : 0, "_text" : "operational" }, "clusterNodeState" : { "severity" : 0, "_text" : "active" }, "controllerConnectionStates" : [ { "role" : "PRIMARY", "state" : { "severity" : 0, "_text" : "established" } }, { "role" : "BACKUP", "state" : { "severity" : 0, "_text" : "established" } } ], "os" : { "name" : "Linux", "architecture" : "amd64", "distribution" : "3.10.0-957.1.3.el7.x86_64" }, "securityLevel" : "MEDIUM", "lastHeartbeat" : "2022-08-17T09:16:37.000+00:00" }, "passive" : [ { "id" : 1, "memberId" : "joc-2-0-secondary:97c88ccc3975703ebd0b7277d394ec8768f88b31775e8df038572d2547c240a0", "title" : "SECONDARY JOC COCKPIT", "current" : false, "host" : "joc-2-0-secondary", "url" : "https://joc-2-0-secondary.sos:7543", "startedAt" : "2022-08-16T18:10:27.000+00:00", "version" : "2.5.0-SNAPSHOT", "connectionState" : { "severity" : 0, "_text" : "established" }, "componentState" : { "severity" : 0, "_text" : "operational" }, "clusterNodeState" : { "severity" : 1, "_text" : "inactive" }, "controllerConnectionStates" : [ { "role" : "PRIMARY", "state" : { "severity" : 0, "_text" : "established" } }, { "role" : "BACKUP", "state" : { "severity" : 0, "_text" : "established" } } ], "os" : { "name" : "Linux", "architecture" : "amd64", "distribution" : "3.10.0-957.1.3.el7.x86_64" }, "securityLevel" : "MEDIUM", "lastHeartbeat" : "2022-08-17T09:16:37.000+00:00" } ] }, "agentStatus" : [ { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_001", "agentName" : "primaryAgent", "url" : "https://agent-2-0-primary:4443", "version" : "2.5.0-SNAPSHOT", "state" : { "severity" : 0, "_text" : "COUPLED" }, 
"healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 1, "isClusterWatcher" : true, "disabled" : false }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_002", "agentName" : "secondaryAgent", "url" : "https://agent-2-0-secondary:4443", "version" : "2.5.0-SNAPSHOT", "state" : { "severity" : 0, "_text" : "COUPLED" }, "healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_004", "agentName" : "wintestAgent", "url" : "http://192.11.0.146:4245", "version" : "2.4.0", "state" : { "severity" : 0, "_text" : "COUPLED" }, "healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_005", "agentName" : "apmaccsAgent", "url" : "http://192.11.3.3:4449", "state" : { "severity" : 2, "_text" : "UNKNOWN" }, "healthState" : { "severity" : 2, "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : true }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_006", "agentName" : "apmacwinAgent", "url" : "http://192.11.2.2:4245", "state" : { "severity" : 2, "_text" : "UNKNOWN" }, "healthState" : { "severity" : 2, "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : true }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_101", "agentName" : "agent17", "url" : "http://centostest_primary.sos:7775", "version" : "2.4.0-beta.20220714", "state" : { "severity" : 0, "_text" : "COUPLED" }, "healthState" : { "severity" : 0, "_text" : 
"ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_009", "agentName" : "oracleAgent", "url" : "http://minos.sos:4445", "version" : "2.4.0-beta.20220714", "state" : { "severity" : 0, "_text" : "COUPLED" }, "healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "subagents" : [ { "isDirector" : "PRIMARY_DIRECTOR", "agentId" : "agent_cluster_001", "subagentId" : "director_primary_001", "url" : "https://diragent-2-0-primary:4443", "version" : "2.5.0-SNAPSHOT", "state" : { "severity" : 0, "_text" : "COUPLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "isDirector" : "NO_DIRECTOR", "agentId" : "agent_cluster_001", "subagentId" : "subagent_primary_001", "url" : "https://subagent-2-0-primary:4443", "version" : "2.5.0-SNAPSHOT", "state" : { "severity" : 0, "_text" : "COUPLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "isDirector" : "NO_DIRECTOR", "agentId" : "agent_cluster_001", "subagentId" : "subagent_secondary_001", "url" : "https://subagent-2-0-secondary:4443", "version" : "2.5.0-SNAPSHOT", "state" : { "severity" : 0, "_text" : "COUPLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "isDirector" : "NO_DIRECTOR", "agentId" : "agent_cluster_001", "subagentId" : "subagent_third_001", "url" : "https://subagent-2-0-third:4443", "version" : "2.5.0-SNAPSHOT", "state" : { "severity" : 0, "_text" : "COUPLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false } ], "controllerId" : "testsuite", "agentId" : "agent_cluster_001", "agentName" : "AgentCluster001", "healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, 
"orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false }, { "subagents" : [ ], "controllerId" : "testsuite", "agentId" : "agent_014", "agentName" : "winutf8Agent", "url" : "http://192.11.0.146:4445", "version" : "2.4.0", "state" : { "severity" : 0, "_text" : "COUPLED" }, "healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" }, "orders" : [ ], "runningTasks" : 0, "isClusterWatcher" : false, "disabled" : false } ], "orderSnapshot" : { "pending" : 0, "scheduled" : 1262, "inProgress" : 0, "running" : 1, "prompting" : 0, "suspended" : 0, "waiting" : 770, "blocked" : 0, "failed" : 0, "terminated" : 1 }, "orderSummary" : { "failed" : 0 } } |
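A report file like the one above can be evaluated by a script, for example to list the Agents that caused an alert. The following is a minimal sketch, not part of the MonitoringJob: the function name find_problems and the min_severity threshold are hypothetical, and the reading that a higher severity value indicates a worse state is an assumption derived from the sample report (0 for COUPLED, 2 for UNKNOWN).

```python
import json
from typing import Any


def find_problems(report: dict[str, Any], min_severity: int = 1) -> list[str]:
    """Return one line per Agent or Subagent state whose severity is at least min_severity.

    Assumption: higher severity values indicate worse states.
    """
    problems = []
    for agent in report.get("agentStatus", []):
        # Standalone Agents report "state" and "healthState";
        # the Agent Cluster entry in the sample reports "healthState" only.
        for key in ("state", "healthState"):
            status = agent.get(key)
            if status and status.get("severity", 0) >= min_severity:
                problems.append(f'{agent["agentId"]}: {key}={status["_text"]}')
        # Subagents of an Agent Cluster carry their own "state".
        for sub in agent.get("subagents", []):
            status = sub.get("state", {})
            if status.get("severity", 0) >= min_severity:
                problems.append(f'{sub["subagentId"]}: state={status["_text"]}')
    return problems


if __name__ == "__main__":
    # Trimmed excerpt of the sample report shown above.
    sample = json.loads("""
    { "agentStatus" : [
        { "subagents" : [ ], "agentId" : "agent_001",
          "state" : { "severity" : 0, "_text" : "COUPLED" },
          "healthState" : { "severity" : 0, "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED" } },
        { "subagents" : [ ], "agentId" : "agent_005",
          "state" : { "severity" : 2, "_text" : "UNKNOWN" },
          "healthState" : { "severity" : 2, "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED" } }
    ] }
    """)
    for line in find_problems(sample):
        print(line)
```

Only the agentStatus section is inspected here. The Controller and JOC Cockpit sections use the same severity/_text pairs, but a passive cluster member legitimately reports a non-zero severity ("inactive" in the sample), so these sections would need their own rules.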
...
The Job Documentation including the full list of arguments can be found under: https://www.sos-berlin.com/doc/JS7-JITL/MonitoringJob.xml
Authentication
The Job makes use of the JS7 - REST Web Service API that is available from JOC Cockpit.
- The job is executed with an Agent and requires a network connection to JOC Cockpit.
- The job has to authenticate with JOC Cockpit. For the related configuration see JS7 - JITL Common Authentication.
Arguments
The MonitoringJob class accepts the following arguments:
...
Name | Required | Default Value | Purpose | Example |
---|---|---|---|---|
controller_id | no | | Optionally specifies the identification of the Controller to be checked. By default the current Controller is used. | controller_prod |
| yes | | Specifies the directory to which the job will store health status report files (.json). This directory has to exist prior to running the job and has to be in reach of the Agent that runs the job. | |
monitor_report_max_files | yes | | The number of report files created will be limited to this value. Older report files will be removed when this value is exceeded. | 25 |
from | yes | | Specifies the e-mail address that is used to send mail for notices and alerts. The argument is used by the job to create the | js7@example.com |
max_failed_orders | no | | The maximum number of failed orders that are considered acceptable for a health status check. If this number is exceeded then the By default the number of failed orders is not considered for successful/unsuccessful health status checks. | 3 |
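The retention rule described for monitor_report_max_files can be illustrated with a short sketch. This is not the job's implementation, which is not shown on this page: the function name prune_reports is hypothetical, and the sketch assumes that report files end in .json and that their names sort chronologically because they embed a timestamp, as in the sample file names on this page.

```python
from pathlib import Path


def prune_reports(report_dir: Path, max_files: int) -> list[Path]:
    """Delete the oldest *.json files so that at most max_files remain.

    Returns the paths that were removed, oldest first.
    Assumption: file names embed a sortable timestamp, so sorting by name
    orders the files from oldest to newest.
    """
    if max_files < 0:
        raise ValueError("max_files must not be negative")
    reports = sorted(report_dir.glob("*.json"), key=lambda p: p.name)
    # Everything in front of the newest max_files entries is excess.
    excess = reports[: max(0, len(reports) - max_files)]
    for path in excess:
        path.unlink()
    return excess


if __name__ == "__main__":
    # Hypothetical directory; adjust to the report directory in use.
    removed = prune_reports(Path("/var/sos-berlin.com/js7/agent/monitor"), 25)
    print(f"removed {len(removed)} report file(s)")
```

Sorting by name rather than by modification time keeps the result stable if report files are copied or restored. If the naming scheme differs, sorting by `p.stat().st_mtime` would be the obvious alternative.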
Return Variables
The MonitoringJob class returns the following variables for use by subsequent jobs:
Name | Data Type | Purpose | Example |
---|---|---|---|
monitor_report_date | String | The date and time for which the health status check has been performed. The date format is | controller_prod |
monitor_report_file | String | The path to the report file created for the health status check. | /var/sos-berlin.com/js7/agent/monitor/monitor.2022-08-15.17-35-36.5.json |
subject | String | The subject of an e-mail for use with a later MailJob. | JS7 Monitor: Notice from: js7@sos-berlin.com at: 2022-08-15.17-35-36.5 |
body | String | The body of an e-mail for use with a later MailJob, by default the value is the same as for the | JS7 Monitor: Notice from: js7@sos-berlin.com at: 2022-08-15.17-35-36.5 |
result | Number | The number of problems identified during the health status check. A value 0 indicates absence of problems, other values indicate existence of problems. | 0 |
...