Introduction
- The JS7 Controller and Agent make use of a journal to store infrastructure information and transactional data.
- The journal is located in the <data>/state directory of the Controller's or Agent's data directory.
- The journal consists of a number of files.
- There can be situations when changes to a journal are required, for example:
- If users find inconsistent journals due to problems with the storage layer.
- If users find orders holding an inconsistent state that cannot be cancelled from the JOC Cockpit GUI.
- If users do not properly decommission an Agent but instead drop the Agent's machine. In this situation the Controller still holds the information about the unreachable Agent.
- Changes to a journal are considered critical and must be performed with care.
- The procedure includes the following steps:
- shut down the Controller or Agent instance for which the journal should be updated,
- take a backup of the journal,
- apply changes to the journal using the Journal Update Script,
- start the Controller or Agent instance.
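The steps above can be sketched as a small wrapper. This is a minimal sketch and not part of the Journal Update Script: STOP_CMD and START_CMD are placeholders for whatever commands stop and start the instance in a given installation, the paths are example values, and with DRY_RUN=1 (the default here) the commands are only printed. The backup step is delegated to the script's --backup-dir option.

```shell
#!/bin/sh
# Sketch only: STOP_CMD and START_CMD are placeholders (assumptions);
# replace them with the commands used in your installation.
JOURNAL_DIR="${JOURNAL_DIR:-/var/sos-berlin.com/js7/controller/state}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/js7}"
STOP_CMD="${STOP_CMD:-echo stop-instance-placeholder}"
START_CMD="${START_CMD:-echo start-instance-placeholder}"
DRY_RUN="${DRY_RUN:-1}"

# Prints the command in dry-run mode, executes it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# remove_order ORDER_ID: stop the instance, update the journal
# (the script takes the backup), start the instance again.
# If a step fails, later steps are skipped so the state can be inspected.
remove_order() {
    run $STOP_CMD || return 1
    run ./update-journal.sh \
        --controller-id=controller \
        --journal-dir="$JOURNAL_DIR" \
        --backup-dir="$BACKUP_DIR" \
        --order-id="$1" \
        --controller \
        --remove || return 1
    run $START_CMD
}

remove_order "#2023-11-28#T19891282200-root"
```

Setting DRY_RUN=0 executes the commands instead of printing them; the sketch then expects update-journal.sh in the current directory.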
Journal Update Script
The Journal Update Script is provided for download and can be used to update Controller and Agent journals.
- The script is available for Linux, macOS®, AIX® and Solaris® and runs in the POSIX-compatible shells bash, dash, ksh and zsh.
- The script is intended as a baseline example for customization by JS7 users and by SOS within the scope of professional services.
Download
Download: update-journal.sh
Make the script executable using the command: chmod +x ./update-journal.sh
Usage
Invoking the script without arguments displays the usage clause:
Usage: update-journal.sh [Options] [Switches]

  Options:
    --controller-id=<identifier>        | required: Controller ID for which the journal is updated, default: controller
    --journal-dir=<directory>           | required: directory that holds the journal, default: .
    --backup-dir=<directory>            | required: directory to hold journal backups, default: /tmp
    --agent-id=<agent-id>[,<agent-id>]  | optional: Agent ID to be removed from a Controller's journal
    --order-id=<order-id>[,<order-id>]  | optional: Order ID to be removed from a Controller's or Agent's journal
    --workflow=<name>[,<name>]          | optional: Workflow to be removed from a Controller's or Agent's journal

  Switches:
    -h | --help                         | displays usage
    -a | --agent                        | specifies that an Agent's journal will be updated
    -c | --controller                   | specifies that a Controller's journal will be updated
    -k | --check                        | checks if the journal includes the indicated Agent ID and Order ID
    -r | --remove                       | removes the indicated Agent ID or Order ID from the journal
    -p | --problem                      | removes problems from a Controller journal
Options
--controller-id
- Specifies the Controller ID for which changes are applied.
- The Controller ID must be specified. This applies to updates of both Controller and Agent journals.
--journal-dir
- Specifies the directory in which a Controller's or Agent's journal is available. Typically the state sub-directory of the data directory is used.
- Permissions to read and to write to the directory and files are required.
--backup-dir
- Specifies an existing directory to which backups of journal files will be added. Write permissions for the directory are required.
- A sub-directory in the indicated directory will be created following the scheme:
<backup-directory>/update-journal.<agent|controller>.<host>.<yyyy-MM-ddThh-mm-ss>
- Example:
/var/backups/js7/update-journal.controller.centostest_primary.2023-12-06T02-14-23
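To locate a backup later on, for example from a restore job, the name of the sub-directory can be derived from the scheme above. The following is a minimal sketch: the function name backup_subdir_name is hypothetical, and it assumes that the host part is the machine's node name and that the timestamp uses a 24-hour clock. The Journal Update Script may determine both differently.

```shell
#!/bin/sh
# backup_subdir_name BACKUP_DIR KIND [HOST]
# KIND is "agent" or "controller"; HOST defaults to the node name (assumption).
backup_subdir_name() {
    backup_dir=$1
    kind=$2
    host=${3:-$(uname -n)}
    # Timestamp in the form yyyy-MM-ddThh-mm-ss (24-hour clock assumed).
    stamp=$(date +%Y-%m-%dT%H-%M-%S)
    echo "$backup_dir/update-journal.$kind.$host.$stamp"
}

# Prints a path such as
# /var/backups/js7/update-journal.controller.centostest_primary.<timestamp>
backup_subdir_name /var/backups/js7 controller centostest_primary
```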
--agent-id
- Specifies the Agent ID of an Agent that should be removed from a Controller's journal. More than one Agent ID can be specified, separated by commas.
- Removing an Agent ID from the journal will remove any orders and workflows related to the given Agent from the journal.
- Workflows are removed only if the first job of the workflow is assigned the Agent to be removed.
- To remove workflows holding later jobs assigned the specified Agent use the --workflow option.
- One of the options --agent-id, --order-id, --workflow or the switch --problem has to be specified.
--order-id
- Specifies the Order ID that should be removed from the journal. More than one Order ID can be specified, separated by commas.
- One of the options --agent-id, --order-id, --workflow or the switch --problem has to be specified.
--workflow
- Specifies the workflow that should be removed from the journal. More than one workflow can be specified, separated by commas.
- This option expects the name of a workflow, not its path. Regular expression syntax can be used to specify a number of workflows, for example:
- my-workflow.* removes any workflows with a name starting with my-workflow followed by any characters (right truncation).
- .*my-workflow removes any workflows with a name starting with any characters and ending with my-workflow (left truncation).
- my-.*workflow removes any workflows with a name starting with my- followed by any characters and ending with workflow.
- One of the options --agent-id, --order-id, --workflow or the switch --problem has to be specified.
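Before passing a regular expression to the --workflow option it can help to preview which workflow names it would select. The sketch below does this with grep against a list of sample names. It assumes that a pattern has to match the complete workflow name, which is how the three examples above read; the function match_workflows and the sample names are hypothetical, and the script's actual matching may differ.

```shell
#!/bin/sh
# match_workflows PATTERN NAME...
# Prints the names that PATTERN matches as a whole
# (grep -E: extended regular expressions, -x: match the complete line).
match_workflows() {
    pattern=$1
    shift
    printf '%s\n' "$@" | grep -E -x -- "$pattern"
}

names="my-workflow my-workflow-daily other-my-workflow my-big-workflow"

# right truncation: prints my-workflow and my-workflow-daily
match_workflows 'my-workflow.*' $names
# left truncation: prints my-workflow and other-my-workflow
match_workflows '.*my-workflow' $names
# prefix and suffix: prints my-workflow and my-big-workflow
match_workflows 'my-.*workflow' $names
```

Running the same pattern with the --check switch before --remove gives a second opportunity to verify the selection against the real journal.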
Switches
-h | --help
- Displays usage.
-a | --agent
- Specifies that an Agent journal will be updated.
- Consider applying updates to the Agent journal of both Director Agent instances if an Agent Cluster is used.
-c | --controller
- Specifies that a Controller journal will be updated.
- Consider applying updates to the Controller journal of both Controller instances if a Controller Cluster is used.
-k | --check
- Checks if the indicated Agent or Order ID is available from the journal. The operation does not modify the journal.
-r | --remove
- Removes the indicated Agent or Order ID from the journal.
-p | --problem
- Removes problems from the Controller journal. Such problems include a missing confirmation for the loss of an instance in a Controller Cluster or in an Agent Cluster.
Scenarios
For all scenarios a restart of JOC Cockpit is required, as the Proxy Service might hold information about related Agents and Order IDs in its cache.
Remove an Order or Workflow
In this scenario changes from the Journal Update Script are applied:
- to the Controller:
- If a Controller Cluster is used then changes have to be applied to both Active and Standby Controller instances.
- to the Agent:
- The same changes as to the Controller have to be applied to Standalone Agents and Cluster Agents.
- If an Agent Cluster is used then changes have to be applied to both Active and Standby Director Agent instances.
Remove an Agent
If an Agent should be decommissioned then the standard procedure includes the following steps:
- remove any orders from the Agent,
- revoke any deployed objects such as Workflows and File Order Sources from the Agent,
- remove assignments of the Agent to related jobs in any Workflows and File Order Sources,
- delete the Agent from the inventory using the Configuration→Manage Controllers/Agents page.
If the above steps have not been performed and instead the Agent's server or installation directory has been removed then changes to the Controller's journal are required in order to remove the Agent from the JS7 inventory.
In this scenario changes from the Journal Update Script are applied:
- to the Controller:
- If a Controller Cluster is used then changes have to be applied to both Active and Standby Controller instances.
Remove a Problem
The Controller keeps track of problems such as a missing confirmation for the loss of a node in a Controller Cluster or Agent Cluster. Such confirmations can be provided from the JOC Cockpit GUI. Alternatively the Journal Update Script can be used to drop such problems from a Controller's journal.
Removal of a problem should be performed for the journal of the active Controller instance in a Controller Cluster. The journal directory should then be copied to the standby Controller instance.
Fallback
To revert changes made by the Journal Update Script apply the following procedure:
- remove the contents of the state sub-directory that holds the updated journal,
- copy from the backup directory for a Controller:
cp -P /var/backups/js7/update-journal.controller.centostest_primary.2023-12-06T02-14-23/* /var/sos-berlin.com/js7/controller/state
- copy from the backup directory for an Agent:
cp -P /var/backups/js7/update-journal.agent.centostest_primary.2023-12-06T04-12-53/* /var/sos-berlin.com/js7/agent/state
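The fallback procedure can be wrapped in a small function that refuses to empty the state directory unless the backup is present. This is a minimal sketch under stated assumptions: restore_journal is a hypothetical helper, the instance is expected to be stopped, and the state directory is assumed to hold plain files and symlinks only. The demonstration runs against a temporary directory rather than a real installation.

```shell
#!/bin/sh
# restore_journal BACKUP_SUBDIR STATE_DIR
# Replaces the contents of STATE_DIR with the files from BACKUP_SUBDIR.
restore_journal() {
    backup_subdir=$1
    state_dir=$2
    [ -d "$backup_subdir" ] || { echo "backup not found: $backup_subdir" >&2; return 1; }
    [ -d "$state_dir" ] || { echo "state directory not found: $state_dir" >&2; return 1; }
    # Do not touch the state directory if the backup is empty.
    set -- "$backup_subdir"/*
    [ -e "$1" ] || [ -L "$1" ] || { echo "backup is empty: $backup_subdir" >&2; return 1; }
    rm -f "$state_dir"/*
    # -P copies symlinks as symlinks instead of copying their targets.
    cp -P "$backup_subdir"/* "$state_dir"/
}

# Demonstration in a temporary directory with made-up journal content.
demo=$(mktemp -d)
mkdir "$demo/backup" "$demo/state"
echo "original" > "$demo/backup/controller--1701814807215000.journal"
(cd "$demo/backup" && ln -s controller--1701814807215000.journal controller-journal)
echo "modified" > "$demo/state/controller--1701814807215000.journal"

restore_journal "$demo/backup" "$demo/state"
ls -l "$demo/state"
```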
The journal directory holds symlinks:
- The symlinks controller-journal or agent-journal will be included in backups.
- The symlinks point to the journal file with the earliest timestamp in its file name.
- It is important that the command option cp -P is used to preserve symlinks when copying from a backup directory.
- For your information: should a symlink have to be recreated for the Controller journal, use the commands:
cd /var/sos-berlin.com/js7/controller/state
ln -s -r controller--1701814807215000.journal controller-journal
- For your information: should a symlink have to be recreated for the Agent journal, use the commands:
cd /var/sos-berlin.com/js7/agent/state
ln -s -r agent--1701814807215000.journal agent-journal
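The -r option of ln is a GNU extension and may not be available on every platform. The following sketch recreates the symlink without it by creating the link from inside the state directory, and it picks the journal file with the lowest number in its name, in line with the statement above that the symlink points to the file with the earliest timestamp. The function relink_journal is hypothetical and the demonstration uses empty files in a temporary directory.

```shell
#!/bin/sh
# relink_journal STATE_DIR PREFIX
# PREFIX is "controller" or "agent".
relink_journal() {
    state_dir=$1
    prefix=$2
    # File names look like <prefix>--<number>.journal; sort numerically
    # by the part after the double dash and take the first entry.
    earliest=$(cd "$state_dir" && ls "$prefix"--*.journal 2>/dev/null \
        | sort -t- -k3 -n | head -n 1)
    [ -n "$earliest" ] || { echo "no journal files in $state_dir" >&2; return 1; }
    # A link created inside the directory with a bare file name is relative.
    (cd "$state_dir" && rm -f "${prefix}-journal" && ln -s "$earliest" "${prefix}-journal")
    echo "$earliest"
}

# Demonstration with two empty stand-in journal files.
demo=$(mktemp -d)
: > "$demo/controller--1701900000000000.journal"
: > "$demo/controller--1701814807215000.journal"

relink_journal "$demo" controller
ls -l "$demo"
```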
Examples
The following examples illustrate typical use cases.
Remove Orders
Check if Order is available from Controller and Agent Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --order-id="#2023-11-28#T19891282200-root" \
    --controller \
    --check

# run on Agent server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/agent/state \
    --order-id="#2023-11-28#T19891282200-root" \
    --agent \
    --check

# checks if the indicated order is available from the Controller's and Agent's journal on the respective servers
# specifies the path to the journal directory and the quoted Order ID
Remove Order from Controller and Agent Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --backup-dir=/var/backups/js7 \
    --order-id="#2023-11-28#T19891282200-root,#2023-11-30#T37072277904-root" \
    --controller \
    --remove

# run on Agent server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/agent/state \
    --backup-dir=/var/backups/js7 \
    --order-id="#2023-11-28#T19891282200-root" \
    --agent \
    --remove

# removes the indicated orders from the Controller's and Agent's journal on the respective servers
# specifies the path to the journal directory
# creates backups of journal files in the related backup directories
# specifies a quoted, comma separated list of Order IDs to be removed
Remove Workflows
Check if Workflow is available from Controller Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --workflow=some-workflow \
    --controller \
    --check

# checks if the indicated workflow is available from the journal
# specifies the path to the journal directory and the workflow to be checked
Remove Workflow from Controller Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --backup-dir=/var/backups/js7 \
    --workflow=some-workflow,some-other-workflow \
    --controller \
    --remove

# removes the indicated workflows from the Controller's journal
# specifies the path to the journal directory
# specifies a comma separated list of workflows to be removed
# creates a backup of journal files in the indicated directory
Remove Agent
Check if Agent is available from Controller Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --agent-id=agent_001 \
    --controller \
    --check

# checks if the indicated Agent is available from the journal
# specifies the path to the journal directory and the Agent ID to be checked
Remove Agent from Controller Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --backup-dir=/var/backups/js7 \
    --agent-id=agent_001,agent_002 \
    --controller \
    --remove

# removes the indicated Agents, related orders and workflows from the Controller's journal
# specifies the path to the journal directory
# specifies a comma separated list of Agent IDs to be removed
# creates a backup of journal files in the indicated directory
Remove Problem
Remove problem from Controller Journal
# run on Controller server
./update-journal.sh \
    --controller-id=controller \
    --journal-dir=/var/sos-berlin.com/js7/controller/state \
    --controller \
    --remove \
    --problem

# removes a problem from the Controller journal
# specifies the path to the journal directory