...

  • specify the JITL job class and the com.sos.jitl.jobs.monitoring.MonitoringJob Java class name and add the required arguments.

Example

Download (upload .json): pdmMonitoring.workflow.json

Using the Example

It is recommended to use the example as a starting point and to adjust the parameterization.


Explanation:

  • A JS7 - Cycle Instruction is used in order to repeatedly perform health status checks.
    • Users should adjust cycles to their monitoring needs.
  • A JS7 - Retry Instruction is used to retry execution, for example of the included MailJob in case that e-mail cannot be sent.
  • The MonitoringJob is used to perform the health status check.
  • The MailJob is used to send notices and alerts by mail. This is an option - users might apply other means to forward notices and alerts.
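
The interplay of the instructions listed above can be pictured with a short Python sketch. It is an analogy only and not JS7 workflow syntax: the names retry and cycle, the fixed number of repetitions and the rule to notify only when problems are found are assumptions made for illustration.

```python
import time


def retry(action, max_tries=3, delay_seconds=0.0):
    """Call action until it succeeds or max_tries attempts have failed."""
    for attempt in range(1, max_tries + 1):
        try:
            return action()
        except Exception:
            # Give up after the last attempt, otherwise wait and try again.
            if attempt == max_tries:
                raise
            time.sleep(delay_seconds)


def cycle(check, notify, repetitions, interval_seconds=0.0):
    """Repeat the health check and notify (with retries) when problems are found."""
    for _ in range(repetitions):
        problems = check()          # corresponds to the MonitoringJob
        if problems != 0:
            # corresponds to the MailJob wrapped in a Retry Instruction
            retry(lambda: notify(problems))
        time.sleep(interval_seconds)
```

In the workflow the same roles are taken by the Cycle Instruction, the Retry Instruction, the MonitoringJob and the MailJob; intervals and retry counts are configured in the instructions rather than passed as function arguments.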

...

Find a sample report file for download that indicates an alert: monitor.2022-08-17.09-16-44.9Z.alert.json

Sample Report File:
{
  "controllerStatus" : {
    "active" : {
      "id" : 3,
      "surveyDate" : "2022-08-17T08:57:43.000+00:00",
      "controllerId" : "testsuite",
      "title" : "SECONDARY CONTROLLER",
      "host" : "controller-2-0-secondary",
      "url" : "https://controller-2-0-secondary:4443",
      "clusterUrl" : "https://controller-2-0-secondary:4443",
      "role" : "BACKUP",
      "isCoupled" : false,
      "startedAt" : "2022-08-16T18:09:27.000+00:00",
      "version" : "2.5.0-SNAPSHOT+fd0eb39",
      "javaVersion" : "17.0.4+8-alpine-r0",
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM"
    },
    "volatileStatus" : {
      "id" : 2,
      "surveyDate" : "2022-08-17T09:16:45.064+00:00",
      "controllerId" : "testsuite",
      "title" : "PRIMARY CONTROLLER",
      "host" : "controller-2-0-primary",
      "url" : "https://controller-2-0-primary:4443",
      "clusterUrl" : "https://controller-2-0-primary:4443",
      "role" : "PRIMARY",
      "isCoupled" : true,
      "startedAt" : "2022-08-16T18:09:26.004+00:00",
      "version" : "2.5.0-SNAPSHOT+fd0eb39",
      "javaVersion" : "17.0.4+8-alpine-r0",
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM",
      "componentState" : {
        "severity" : 0,
        "_text" : "operational"
      },
      "connectionState" : {
        "severity" : 0,
        "_text" : "established"
      },
      "clusterNodeState" : {
        "severity" : 0,
        "_text" : "active"
      }
    },
    "permanentStatus" : {
      "id" : 2,
      "surveyDate" : "2022-08-16T18:12:47.169+00:00",
      "controllerId" : "testsuite",
      "title" : "PRIMARY CONTROLLER",
      "host" : "controller-2-0-primary",
      "url" : "https://controller-2-0-primary:4443",
      "clusterUrl" : "https://controller-2-0-primary:4443",
      "role" : "PRIMARY",
      "startedAt" : "2022-08-16T18:09:26.004+00:00",
      "version" : "2.5.0-SNAPSHOT+fd0eb39",
      "javaVersion" : "17.0.4+8-alpine-r0",
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      }
    }
  },
  "jocStatus" : {
    "active" : {
      "id" : 2,
      "memberId" : "joc-2-0-primary:97c88ccc3975703ebd0b7277d394ec8768f88b31775e8df038572d2547c240a0",
      "title" : "PRIMARY JOC COCKPIT",
      "current" : true,
      "host" : "joc-2-0-primary",
      "url" : "https://joc-2-0-primary:4443",
      "startedAt" : "2022-08-16T18:10:27.000+00:00",
      "version" : "2.5.0-SNAPSHOT",
      "connectionState" : {
        "severity" : 0,
        "_text" : "established"
      },
      "componentState" : {
        "severity" : 0,
        "_text" : "operational"
      },
      "clusterNodeState" : {
        "severity" : 0,
        "_text" : "active"
      },
      "controllerConnectionStates" : [ {
        "role" : "PRIMARY",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      }, {
        "role" : "BACKUP",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      } ],
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM",
      "lastHeartbeat" : "2022-08-17T09:16:37.000+00:00"
    },
    "passive" : [ {
      "id" : 1,
      "memberId" : "joc-2-0-secondary:97c88ccc3975703ebd0b7277d394ec8768f88b31775e8df038572d2547c240a0",
      "title" : "SECONDARY JOC COCKPIT",
      "current" : false,
      "host" : "joc-2-0-secondary",
      "url" : "https://joc-2-0-secondary.sos:7543",
      "startedAt" : "2022-08-16T18:10:27.000+00:00",
      "version" : "2.5.0-SNAPSHOT",
      "connectionState" : {
        "severity" : 0,
        "_text" : "established"
      },
      "componentState" : {
        "severity" : 0,
        "_text" : "operational"
      },
      "clusterNodeState" : {
        "severity" : 1,
        "_text" : "inactive"
      },
      "controllerConnectionStates" : [ {
        "role" : "PRIMARY",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      }, {
        "role" : "BACKUP",
        "state" : {
          "severity" : 0,
          "_text" : "established"
        }
      } ],
      "os" : {
        "name" : "Linux",
        "architecture" : "amd64",
        "distribution" : "3.10.0-957.1.3.el7.x86_64"
      },
      "securityLevel" : "MEDIUM",
      "lastHeartbeat" : "2022-08-17T09:16:37.000+00:00"
    } ]
  },
  "agentStatus" : [ {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_001",
    "agentName" : "primaryAgent",
    "url" : "https://agent-2-0-primary:4443",
    "version" : "2.5.0-SNAPSHOT",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 1,
    "isClusterWatcher" : true,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_002",
    "agentName" : "secondaryAgent",
    "url" : "https://agent-2-0-secondary:4443",
    "version" : "2.5.0-SNAPSHOT",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_004",
    "agentName" : "wintestAgent",
    "url" : "http://192.11.0.146:4245",
    "version" : "2.4.0",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_005",
    "agentName" : "apmaccsAgent",
    "url" : "http://192.11.3.3:4449",
    "state" : {
      "severity" : 2,
      "_text" : "UNKNOWN"
    },
    "healthState" : {
      "severity" : 2,
      "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : true
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_006",
    "agentName" : "apmacwinAgent",
    "url" : "http://192.11.2.2:4245",
    "state" : {
      "severity" : 2,
      "_text" : "UNKNOWN"
    },
    "healthState" : {
      "severity" : 2,
      "_text" : "NO_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : true
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_101",
    "agentName" : "agent17",
    "url" : "http://centostest_primary.sos:7775",
    "version" : "2.4.0-beta.20220714",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_009",
    "agentName" : "oracleAgent",
    "url" : "http://minos.sos:4445",
    "version" : "2.4.0-beta.20220714",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ {
      "isDirector" : "PRIMARY_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "director_primary_001",
      "url" : "https://diragent-2-0-primary:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    }, {
      "isDirector" : "NO_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "subagent_primary_001",
      "url" : "https://subagent-2-0-primary:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    }, {
      "isDirector" : "NO_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "subagent_secondary_001",
      "url" : "https://subagent-2-0-secondary:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    }, {
      "isDirector" : "NO_DIRECTOR",
      "agentId" : "agent_cluster_001",
      "subagentId" : "subagent_third_001",
      "url" : "https://subagent-2-0-third:4443",
      "version" : "2.5.0-SNAPSHOT",
      "state" : {
        "severity" : 0,
        "_text" : "COUPLED"
      },
      "orders" : [ ],
      "runningTasks" : 0,
      "isClusterWatcher" : false,
      "disabled" : false
    } ],
    "controllerId" : "testsuite",
    "agentId" : "agent_cluster_001",
    "agentName" : "AgentCluster001",
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  }, {
    "subagents" : [ ],
    "controllerId" : "testsuite",
    "agentId" : "agent_014",
    "agentName" : "winutf8Agent",
    "url" : "http://192.11.0.146:4445",
    "version" : "2.4.0",
    "state" : {
      "severity" : 0,
      "_text" : "COUPLED"
    },
    "healthState" : {
      "severity" : 0,
      "_text" : "ALL_SUBAGENTS_ARE_COUPLED_AND_ENABLED"
    },
    "orders" : [ ],
    "runningTasks" : 0,
    "isClusterWatcher" : false,
    "disabled" : false
  } ],
  "orderSnapshot" : {
    "pending" : 0,
    "scheduled" : 1262,
    "inProgress" : 0,
    "running" : 1,
    "prompting" : 0,
    "suspended" : 0,
    "waiting" : 770,
    "blocked" : 0,
    "failed" : 0,
    "terminated" : 1
  },
  "orderSummary" : {
    "failed" : 0
  }
}
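
A report file such as the sample above can be scanned for components that report a non-zero severity. The following Python sketch shows one possible way to do this. The function names and the decision to treat every severity above 0 as noteworthy are assumptions for illustration; they do not describe the logic applied by the MonitoringJob.

```python
import json

# Keys that hold an object with a "severity" value in the sample report.
STATE_KEYS = ("componentState", "connectionState", "clusterNodeState",
              "state", "healthState")


def find_problems(node, path="report"):
    """Recursively collect (path, state key, text) for each state with severity > 0."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in STATE_KEYS and isinstance(value, dict) and "severity" in value:
                if value["severity"] > 0:
                    problems.append((path, key, value.get("_text", "")))
            else:
                problems.extend(find_problems(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_problems(item, f"{path}[{index}]"))
    return problems


def load_report(file_name):
    """Read a report file (.json) from disk."""
    with open(file_name, encoding="utf-8") as handle:
        return json.load(handle)
```

Applied to the sample report, find_problems(load_report(...)) would list the passive JOC Cockpit instance (clusterNodeState "inactive", severity 1) and the two disabled Agents agent_005 and agent_006 (severity 2).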

...

The Job Documentation including the full list of arguments can be found under: https://www.sos-berlin.com/doc/JS7-JITL/MonitoringJob.xml

Authentication

The Job makes use of the JS7 - REST Web Service API that is available from JOC Cockpit. 

  • The job is executed with an Agent and requires a network connection to JOC Cockpit.
  • The job has to authenticate with JOC Cockpit, for the related configuration see JS7 - JITL Common Authentication.
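
To illustrate what authentication with the REST Web Service API involves, the following Python sketch prepares a login request without sending it. It assumes the login endpoint /joc/api/authentication/login with HTTP Basic authentication; the host name, account and function name are placeholders. The job does not require such code, as it authenticates according to the configuration described in JS7 - JITL Common Authentication.

```python
import base64
import urllib.request


def build_login_request(joc_url, account, password):
    """Prepare (but do not send) a login request for JOC Cockpit."""
    credentials = f"{account}:{password}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    # Assumed endpoint path; verify it against the REST Web Service API documentation.
    url = joc_url.rstrip("/") + "/joc/api/authentication/login"
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")
```

Under the same assumption, a successful login returns an access token that is passed with subsequent requests, typically in the X-Access-Token header.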

Arguments

The MonitoringJob class accepts the following arguments:

...



| Name | Required | Default Value | Purpose | Example |
|---|---|---|---|---|
| controller_id | no | | Optionally specifies the identification of the Controller to be checked. By default the current Controller is used. | controller_prod |
| monitor_report_dir | yes | | Specifies the directory to which the job will store health status report files (.json). This directory has to exist prior to running the job and has to be in reach of the Agent that runs the job. An absolute or relative path can be specified. An expression can be used, for example env('JS7_AGENT_DATA') ++ '/monitor' | env('JS7_AGENT_DATA') ++ '/monitor'; /var/sos-berlin.com/js7/agent/monitor; C:\ProgramData\sos-berlin.com\js7\agent\monitor |
| monitor_report_max_files | yes | | The number of report files created will be limited to this value. Older report files will be removed when this value is exceeded. | 25 |
| from | yes | | Specifies the e-mail address that is used to send mail for notices and alerts. The argument is used by the job to create the subject and body return variables. | js7@example.com |
| max_failed_orders | no | | The maximum number of failed orders that are considered acceptable for a health status check. If this number is exceeded then the result return variable will carry a non-zero value indicating a failed health status check. By default the number of failed orders is not considered for successful/unsuccessful health status checks. | 3 |
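
The effect of the monitor_report_max_files argument can be pictured with the following Python sketch. It assumes that report files match the pattern monitor.*.json and that the oldest files are identified by their modification time; both are assumptions for illustration and the job's implementation may differ.

```python
from pathlib import Path


def prune_reports(report_dir, max_files, pattern="monitor.*.json"):
    """Delete the oldest report files so that at most max_files remain."""
    # Sort from oldest to newest by modification time.
    files = sorted(Path(report_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    excess = files[:max(0, len(files) - max_files)]
    for path in excess:
        path.unlink()
    # Return the names of the removed files for logging purposes.
    return [path.name for path in excess]
```

With a value of 25 as in the example column, a directory holding 27 report files would be reduced to the 25 most recent files.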

Return Variables

The MonitoringJob class returns the following variables for use by subsequent jobs:

| Name | Data Type | Purpose | Example |
|---|---|---|---|
| monitor_report_date | String | The date and time for which the health status check has been performed. The date format is yyyy-MM-dd.HH-mm-ss.K, for example 2022-07-31.23-12-59.Z indicating UTC time. | 2022-07-31.23-12-59.Z |
| monitor_report_file | String | The path to the report file created for the health status check. | /var/sos-berlin.com/js7/agent/monitor/monitor.2022-08-15.17-35-36.5.json |
| subject | String | The subject of an e-mail for use with a later MailJob. | JS7 Monitor: Notice from: js7@sos-berlin.com at: 2022-08-15.17-35-36.5 |
| body | String | The body of an e-mail for use with a later MailJob. By default the value is the same as for the subject. | JS7 Monitor: Notice from: js7@sos-berlin.com at: 2022-08-15.17-35-36.5 |
| result | Number | The number of problems identified during the health status check. A value of 0 indicates the absence of problems, other values indicate the existence of problems. | 0 |
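
How a subsequent step might evaluate these return variables can be sketched in Python. The function below builds a mail message from the subject and body variables and the from argument, and it skips the message if result is 0. The recipient address, the function name and the rule to stay silent on success are assumptions for illustration; in the example workflow this task is performed by the MailJob.

```python
from email.message import EmailMessage


def build_alert_mail(variables, recipient):
    """Return an EmailMessage for a failed health status check, or None if result is 0."""
    if int(variables["result"]) == 0:
        return None
    message = EmailMessage()
    message["From"] = variables["from"]      # value of the job's "from" argument
    message["To"] = recipient                # hypothetical recipient address
    message["Subject"] = variables["subject"]
    message.set_content(variables["body"])
    return message
```

The returned message could then be handed to smtplib or any other means of forwarding notices and alerts.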

...