Introduction

  • The JS7 Controller and Agent make use of a journal to store infrastructure information and transactional data.
    • The journal is located in the state sub-directory of the Controller's or Agent's data directory (<data>/state).
    • The journal consists of a number of files.
  • There can be situations when changes to a journal are required:
    • Examples
      • If users find inconsistent journals due to problems with the storage layer.
      • If users find orders holding an inconsistent state that cannot be cancelled from the JOC Cockpit GUI.
      • If users do not properly decommission an Agent but remove the Agent's machine. In this situation the Controller still holds information about the unreachable Agent.
    • Changes to a journal are considered critical and must be performed with care.
  • The procedure includes
    • shutting down the Controller or Agent instance for which the journal should be updated,
    • taking a backup of the journal,
    • applying changes to the journal by use of the Journal Update Utility,
    • starting the Controller or Agent instance.
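
The backup step of this procedure can be sketched as a small shell function. This is a minimal sketch and not part of the utility: the function name backup_journal and the name of the target directory are assumptions, and the Controller or Agent instance is assumed to be shut down before the function is called.

```shell
#!/bin/sh
# Hypothetical helper (not part of update-journal.jar): copies a journal
# directory to a timestamped backup directory before any changes are made.
# Assumes the Controller or Agent instance has already been shut down.
backup_journal() (
    journal_dir=$1      # e.g. /var/sos-berlin.com/js7/controller/state
    backup_root=$2      # e.g. /var/backups/js7 (must exist)

    [ -d "$journal_dir" ] || { echo "no such journal directory: $journal_dir" >&2; exit 1; }
    [ -d "$backup_root" ] || { echo "no such backup directory: $backup_root" >&2; exit 1; }

    # the directory name is an assumption chosen for this sketch
    target="$backup_root/manual-journal-backup.$(date +%Y-%m-%dT%H-%M-%S)"
    mkdir "$target" || exit 1

    # -R copies recursively, -P keeps symlinks as symlinks, -p keeps modes and timestamps
    cp -RPp "$journal_dir"/. "$target"/ || exit 1

    # print the backup location for later use
    echo "$target"
)

# Example:
#   backup_journal /var/sos-berlin.com/js7/controller/state /var/backups/js7
```

The function prints the directory it created, which can be noted down in case the journal has to be restored later.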

Journal Update Utility

The Journal Update Utility is provided for download and can be used to update Controller and Agent journals.

  • The utility is implemented in Java and can be operated on any platform using Java LTS releases starting from Java 1.8.
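
Whether the Java installation found in the PATH meets the stated minimum can be checked before the utility is run. The following is a minimal sketch; the function names are hypothetical, and the check only compares the major release with 8 (Java 1.8), it does not verify that the release is an LTS release.

```shell
#!/bin/sh
# Extracts the quoted version string from "java -version" output read on stdin,
# e.g. 'openjdk version "17.0.2" 2022-01-18' gives 17.0.2
java_version_string() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Maps a version string to its major release: 1.8.0_301 -> 8, 17.0.2 -> 17
java_major() {
    case "$1" in
        1.*) v=${1#1.}; echo "${v%%[._]*}" ;;
        *)   echo "${1%%[.+_-]*}" ;;
    esac
}

# Succeeds if the java found in PATH is release 8 (1.8) or newer
check_java() {
    # java -version writes to stderr, hence the redirection
    version=$(java -version 2>&1 | java_version_string)
    [ -n "$version" ] || { echo "java not found or version not recognized" >&2; return 1; }
    [ "$(java_major "$version")" -ge 8 ]
}

# Example:
#   check_java && java -jar update-journal.jar --help
```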

Download

Product           | Platform | Download URL                                                        | Hash   | Sig | TSR
Controller, Agent | any      | https://download.sos-berlin.com/JobScheduler.2.0/update-journal.jar | sha245 | sig | tsr

Usage

Invoking the Journal Update Utility without arguments displays the usage clause:

Usage
Usage: java -jar update-journal.jar [options] [switches]

  Options:
    --controller-id=<identifier>        | required: Controller ID for which the journal is updated
    --journal-dir=<directory>           | required: directory that holds the journal
    --backup-dir=<directory>            | optional: directory to hold journal backup archives, default: --journal-dir
    --agent-id=<agent-id>[,<agent-id>]  | optional: Agent ID to be removed from a Controller's journal
    --order-id=<order-id>[,<order-id>]  | optional: Order ID to be removed from a Controller's or Agent's journal
    --workflow=<name>[,<name>]          | optional: Workflow to be removed from a Controller's or Agent's journal
    --old-cert=<path>                   | optional: path to the current certificate 
                                              (built-in:signature-certificate-valid-until-2025-10-12.pem; see https://kb.sos-berlin.com/x/QbiKCw)
    --new-cert=<path>                   | optional: path to the new certificate 
                                              (built-in:signature-certificate-valid-until-2030-06-15.pem; see https://kb.sos-berlin.com/x/QbiKCw)
  Switches:
    -h | --help                         | displays usage
    -c | --check                        | optional: the journal remains unchanged. It only returns the number of hits
    -r | --remove                       | optional: removes the indicated Agent ID or Order ID from the journal
    -u | --update-cert                  | optional: replaces --old-cert with --new-cert
    -p | --problem                      | optional: removes problems from a Controller journal
see https://kb.sos-berlin.com/x/R4KmBw for more information.

Options

  • --controller-id
    • Specifies the Controller ID for which changes are applied.
    • The Controller ID must be specified, both when updating a Controller's journal and when updating an Agent's journal.
  • --journal-dir
    • Specifies the directory in which a Controller's or Agent's journal is available. Typically the state sub-directory of the data directory is used.
    • Permissions to read and to write to the directory and files are required.
  • --backup-dir
    • Specifies an existing directory to which backups of journal files will be added. Write permissions for the directory are required.
    • A sub-directory in the indicated directory will be created following the scheme: <backup-directory>/update-journal.<agent|controller>.<host>.<yyyy-MM-ddThh-mm-ss>
    • Example: /var/backups/js7/update-journal.controller.centostest_primary.2023-12-06T02-14-23
  • --agent-id
    • Specifies the Agent ID of an Agent that should be removed from a Controller's journal. More than one Agent ID can be specified, separated by commas.
    • Removing an Agent ID from the journal will also remove any orders and workflows related to the given Agent from the journal.
      • Workflows are removed only if the first job of the workflow is assigned to the Agent to be removed.
      • To remove workflows in which later jobs are assigned to the specified Agent, use the --workflow option.
    • One of the options --agent-id, --order-id, --workflow or the switch --problem has to be specified.
  • --order-id
    • Specifies the Order ID that should be removed from the journal. More than one Order ID can be specified, separated by commas.
    • One of the options --agent-id, --order-id, --workflow or the switch --problem has to be specified.
  • --workflow
    • Specifies the workflow that should be removed from the journal. More than one workflow can be specified, separated by commas.
    • This option expects the name of a workflow, not its path. Regular expression syntax can be used to specify a number of workflows, for example:
      • my-workflow.*   removes any workflows with a name starting with my-workflow followed by any characters (right truncation).
      • .*my-workflow   removes any workflows with a name starting with any characters and ending with my-workflow (left truncation).
      • my-.*workflow   removes any workflows with a name starting with my- followed by any characters and ending with workflow.
    • One of the options --agent-id, --order-id, --workflow or the switch --problem has to be specified.
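  • --old-cert
    • Specifies the path to the current certificate that is replaced when the --update-cert switch is used.
    • The usage clause lists the built-in certificate signature-certificate-valid-until-2025-10-12.pem for this option, see https://kb.sos-berlin.com/x/QbiKCw
  • --new-cert
    • Specifies the path to the new certificate that replaces the current certificate when the --update-cert switch is used.
    • The usage clause lists the built-in certificate signature-certificate-valid-until-2030-06-15.pem for this option, see https://kb.sos-berlin.com/x/QbiKCw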
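
To preview which workflow names an expression for the --workflow option would select, the expression can be tried against a list of names before the utility is run. The sketch below uses grep -E -x, which matches whole lines. It assumes that the utility applies the expression to the complete workflow name, as the truncation examples for --workflow suggest, and that the expression stays within the syntax shared by POSIX extended and Java regular expressions. The function name preview_workflows and the sample names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads workflow names from stdin, one per line, and
# prints those that are fully matched by the given regular expression.
preview_workflows() {
    # -E extended regex, -x whole-line match, -e protects patterns starting with "-"
    grep -E -x -e "$1"
}

# Example:
#   printf '%s\n' my-workflow my-workflow-2 daily-my-workflow other |
#       preview_workflows 'my-workflow.*'
# prints my-workflow and my-workflow-2
```

The expression is passed in single quotes so that the shell does not expand the * character.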

Switches

  • -h | --help
    • Displays usage.
  • -c | --check
    • Checks if the indicated Agent, Order ID or Workflow is available from the journal. The operation does not modify the journal.
    • The --check switch has to be combined with either the --remove switch or the --update-cert switch.
  • -r | --remove
    • Removes the indicated Agent, Order ID or Workflow from the journal.
    • The switch can be combined with the --check switch if no removal should be performed yet and only the existence of the related item in the journal should be checked.
  • -u | --update-cert
    • Is reserved for updating certificates in a Controller's or Agent's journal.
    • The switch can be combined with the --check switch if no replacement of certificates should be performed yet and only the existence of the related certificate in the journal should be checked.
  • -p | --problem
    • Removes problems from the Controller's journal. Such problems can include confirming loss of an instance in a Controller Cluster or loss of an instance in an Agent Cluster.
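
The rules stated for options and switches can be checked by a wrapper script before the utility is started. The following is a minimal sketch of such a pre-flight check. The function name check_update_journal_args is hypothetical, and the check only mirrors the documented rules: --controller-id and --journal-dir are required and one of --agent-id, --order-id, --workflow or --problem has to be specified. It is an assumption of this sketch that the last rule does not apply when --update-cert is used.

```shell
#!/bin/sh
# Hypothetical pre-flight check mirroring the documented rules; it does not
# call the utility and is not the utility's own validation.
check_update_journal_args() (
    controller_id=no journal_dir=no target=no update_cert=no
    for arg in "$@"; do
        case "$arg" in
            --controller-id=?*) controller_id=yes ;;
            --journal-dir=?*)   journal_dir=yes ;;
            --agent-id=?*|--order-id=?*|--workflow=?*|-p|--problem) target=yes ;;
            -u|--update-cert)   update_cert=yes ;;
        esac
    done
    [ "$controller_id" = yes ] || { echo "missing --controller-id" >&2; exit 1; }
    [ "$journal_dir" = yes ]   || { echo "missing --journal-dir" >&2; exit 1; }
    # Assumption: a certificate update needs no Agent, order, workflow or problem target
    if [ "$target" = no ] && [ "$update_cert" = no ]; then
        echo "specify one of --agent-id, --order-id, --workflow or --problem" >&2
        exit 1
    fi
)

# Example:
#   check_update_journal_args "$@" && java -jar update-journal.jar "$@"
```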

Scenarios

In all scenarios a restart of JOC Cockpit is required, as the Proxy Service might hold information about the related Agents and Order IDs in its cache.

Removing an Order or Workflow

In this scenario changes are applied by the Journal Update Utility:

  • to the Controller:
    • If a Controller Cluster is used, then changes have to be applied to both the Active and Standby Controller instances.
  • to the Agent:
    • The same changes as for the Controller have to be applied to Standalone Agents and Cluster Agents.
    • If an Agent Cluster is used, then changes have to be applied to both the Active and Standby Director Agent instances.

Removing an Agent

If an Agent should be decommissioned, then the standard procedure includes the steps explained in the JS7 - How to take an Agent out of Operation article.

If the standard procedure has not been performed while the Agent was running and connected, and the Agent's server or installation directory has been removed instead, then changes to the Controller's journal are required in order to remove the Agent from the JS7 inventory.

In this scenario changes are applied to the Controller by the Journal Update Utility. If a Controller Cluster is used, then changes have to be applied to both the Active and Standby Controller instances.

Removing a Problem

The Controller keeps track of problems such as a missing confirmation for the loss of a node in a Controller Cluster or Agent Cluster. Such confirmations can be provided from the JOC Cockpit GUI. Alternatively, the Journal Update Utility can be used to drop such problems from a Controller's journal.

Removal of a problem should be performed for the journal of the active Controller instance in a Controller Cluster. The journal directory should then be copied to the standby Controller instance.

Fallback

To revert changes made by the Journal Update Utility, apply the following procedure:

  • remove the contents of the state sub-directory that holds the updated journal,
  • copy from the backup directory for a Controller: 
    • cp -P /var/backups/js7/update-journal.controller.centostest_primary.2023-12-06T02-14-23/* /var/sos-berlin.com/js7/controller/state
  • copy from the backup directory for an Agent:
    • cp -P /var/backups/js7/update-journal.agent.centostest_primary.2023-12-06T04-12-53/* /var/sos-berlin.com/js7/agent/state
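
The fallback steps above can be combined in one shell function. This is a minimal sketch under the assumption that the Controller or Agent instance is stopped while the journal is restored. The function name restore_journal is hypothetical; as in the commands above, cp is used with -P to keep symlinks as symlinks.

```shell
#!/bin/sh
# Hypothetical helper: replaces the contents of a state directory with the
# files of a backup directory. Run only while the instance is stopped.
restore_journal() (
    backup_dir=$1   # e.g. /var/backups/js7/update-journal.controller.<host>.<timestamp>
    state_dir=$2    # e.g. /var/sos-berlin.com/js7/controller/state

    [ -d "$backup_dir" ] || { echo "no such backup directory: $backup_dir" >&2; exit 1; }
    [ -d "$state_dir" ]  || { echo "no such state directory: $state_dir" >&2; exit 1; }

    # remove the updated journal but keep the state directory itself;
    # ${state_dir:?} aborts if the variable is empty instead of expanding to /*
    rm -rf "${state_dir:?}"/* || exit 1

    # -P keeps symlinks as symlinks, -R also copies sub-directories if any exist
    cp -RP "$backup_dir"/* "$state_dir"/ || exit 1
)

# Example:
#   restore_journal \
#       /var/backups/js7/update-journal.controller.centostest_primary.2023-12-06T02-14-23 \
#       /var/sos-berlin.com/js7/controller/state
```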

The journal directory holds symlinks:

  • The symlinks controller-journal or agent-journal will be included in backups.
    • The symlinks point to the journal file with the earliest timestamp in its file name.
    • It is important that the cp -P command option is used to preserve symlinks when copying from a backup directory.
  • For your information: should a symlink have to be recreated for the Controller journal, use the commands:
    • cd /var/sos-berlin.com/js7/controller/state
    • ln -s -r controller--1701814807215000.journal controller-journal
  • For your information: should a symlink have to be recreated for the Agent journal, use the commands:
    • cd /var/sos-berlin.com/js7/agent/state
    • ln -s -r agent--1701814807215000.journal agent-journal
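
Recreating a symlink can also be scripted so that the journal file name does not have to be typed by hand. The following is a minimal sketch; the function name relink_journal is hypothetical. Following the description above, it links to the journal file with the earliest timestamp in its file name, and it assumes journal files are named <controller|agent>--<timestamp>.journal as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: recreates the controller-journal or agent-journal
# symlink in a state directory.
relink_journal() (
    state_dir=$1    # e.g. /var/sos-berlin.com/js7/controller/state
    kind=$2         # controller or agent

    cd "$state_dir" || exit 1

    # pick the journal file with the smallest timestamp; the file name splits
    # at "-" into <kind>, an empty field and <timestamp>.journal
    target=$(ls | grep -E -x "$kind--[0-9]+\.journal" | sort -t - -k 3n | head -n 1)
    [ -n "$target" ] || { echo "no $kind journal file found in $state_dir" >&2; exit 1; }

    # relative link created inside the directory; -f replaces an existing link
    ln -s -f "$target" "$kind-journal"
)

# Example:
#   relink_journal /var/sos-berlin.com/js7/agent/state agent
```

Because the link is created from within the state directory, the plain ln -s call yields a relative link without the -r option.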

Examples

The following examples illustrate typical use cases.

Remove Orders

Check if Order is available from Controller and Agent Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --order-id="#2023-11-28#T19891282200-root" \
    --check

# run on Agent Server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/agent/state \
    --order-id="#2023-11-28#T19891282200-root" \
    --check

# checks if the indicated order is available from the Controller's and Agent's journal on the respective servers
# specifies the path to the journal directory and the quoted Order ID

Remove Order from Controller and Agent Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --backup-dir=/var/backups/js7 \
    --order-id="#2023-11-28#T19891282200-root,#2023-11-30#T37072277904-root" \
    --remove

# run on Agent Server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/agent/state \
    --backup-dir=/var/backups/js7 \
    --order-id="#2023-11-28#T19891282200-root" \
    --remove

# removes the indicated orders from the Controller's and Agent's journal on the respective servers
# specifies the path to the journal directory
# creates backups of journal files in the related backup directories
# specifies a quoted, comma separated list of Order IDs to be removed
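
An Order ID starts with the # character, which a shell treats as the start of a comment when it begins an unquoted word, so Order IDs that are handled on their own should be quoted. When several Order IDs are collected, the comma separated value for the --order-id option can be assembled as sketched below; join_by_comma is a hypothetical helper and not part of the utility.

```shell
#!/bin/sh
# Hypothetical helper: joins its arguments into one comma separated string
join_by_comma() (
    # "$*" joins all arguments with the first character of IFS;
    # the subshell keeps the IFS change local to this function
    IFS=,
    printf '%s\n' "$*"
)

# Example:
#   ids=$(join_by_comma "#2023-11-28#T19891282200-root" "#2023-11-30#T37072277904-root")
#   java -jar update-journal.jar \
#       --controller-id=controller \
#       --journal-dir=/var/sos-berlin.com/js7/controller/state \
#       --order-id="$ids" \
#       --check
```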

Remove Workflows

Check if Workflow is available from Controller Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --workflow=some-workflow \
    --check

# checks if the indicated workflow is available from the journal
# specifies the path to the journal directory and the workflow to be removed

Remove Workflow from Controller Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --backup-dir=/var/backups/js7 \
    --workflow=some-workflow,some-other-workflow \
    --remove

# removes the indicated workflows from the Controller's journal
# specifies the path to the journal directory 
# specifies a comma separated list of workflows to be removed
# creates a backup of journal files in the indicated directory

Remove Agent

Check if Agent is available from Controller Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --agent-id=agent_001 \
    --check

# checks if the indicated Agent is available from the journal
# specifies the path to the journal directory and the Agent ID to be removed

Remove Agent from Controller Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --backup-dir=/var/backups/js7 \
    --agent-id=agent_001,agent_002 \
    --remove

# removes the indicated Agents, related orders and workflows from the Controller's journal
# specifies the path to the journal directory 
# specifies a comma separated list of Agent IDs to be removed
# creates a backup of journal files in the indicated directory

Remove Problem

Remove problem from Controller Journal

Example for Update of Journal
# run on Controller server
java -jar update-journal.jar \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --remove \
    --problem

# removes a problem from the Controller's journal
# specifies the path to the journal directory

Resources