- File Watching is used to automatically start workflows in the event of a file arriving in a directory.
- Agents watch directories for incoming files and add a file order for each incoming file.
- File orders are assigned workflows based on configurable rules.
- File Watching is a flexible means for enterprise application integration (EAI):
  - It enables, for example, 3rd-party applications to launch workflows simply by creating files.
  - It allows workflow execution to be triggered on a number of Agents based on the arrival of a file.
- File Watching can be clustered by assigning a JS7 - Agent Cluster.
- The active Director Agent instance performs file watching.
- In case of fail-over or switch-over the active role switches and file watching will be performed by the active Director Agent instance.
Feature Video
This video explains file watching with JS7.
Trigger Files and Data Files
Trigger Files are typically zero-byte files that are created to trigger execution of a workflow. A workflow can remove such files at any point in time.
- Trigger Files can optionally carry variables that are added to the file order in a workflow.
- For details see JS7 - JITL FileOrderVariablesJob.
Data Files are processed by jobs in a workflow, for example to import reporting data into a data warehouse. Such files are moved or removed only after processing by the related jobs.
For both Trigger Files and Data Files the workflow is responsible for moving or removing an incoming file before completion of the workflow.
...
The File Order Source is a configuration object that holds the directory and file name pattern to be watched for by an Agent and that assigns resulting file orders to a workflow.
- The Order ID of file orders is created, for example, like this:
  #<ISO date>#F<Seconds since Jan 1st 1970>-<ID of file order source>:<file name>
  where:
  - #<ISO date># is the date of appearance of the file, e.g. 2021-03-17, enclosed by #. The date is calculated for the time zone assigned to the File Order Source.
  - F is a qualifier to indicate a File Order Source.
  - <Seconds since Jan 1st 1970>- is the number of seconds since January 1st 1970, followed by a hyphen.
  - <ID of file order source>: is the unique identifier of the File Order Source configuration object, followed by a colon.
  - <file name> is the name of the incoming file.
- Such file orders can be considered triggers for workflow execution, which implies that the workflow can be executed by the monitoring Agent or by any other Agent.
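The Order ID layout described above can be illustrated with a short Python sketch. It is not the Agent's implementation: the function name build_file_order_id and its parameters are hypothetical, and the time zone of the File Order Source is passed in as a tzinfo object.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def build_file_order_id(source_id: str, file_name: str,
                        epoch_seconds: int, tz: tzinfo = timezone.utc) -> str:
    """Compose an Order ID following the layout described above (illustrative only)."""
    # The date of appearance is calculated for the time zone of the File Order Source.
    iso_date = datetime.fromtimestamp(epoch_seconds, tz).date().isoformat()
    # '#<ISO date>#' + 'F' + '<seconds since Jan 1st 1970>-' + '<source ID>:' + '<file name>'
    return f"#{iso_date}#F{epoch_seconds}-{source_id}:{file_name}"


if __name__ == "__main__":
    # Hypothetical source ID and file name.
    print(build_file_order_id("my_source", "test.txt", 1615939200))
    # The same instant can fall on a different date in another time zone.
    print(build_file_order_id("my_source", "test.txt", 1615935600,
                              timezone(timedelta(hours=2))))
```

The second call shows why the time zone matters: a file arriving at 23:00 UTC is dated on the following day for a File Order Source that uses a UTC+2 time zone.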
File Order Sources are managed in the Configuration -> Inventory view like this:
Explanation:
- Workflow Name: A workflow is assigned to which an order is added per incoming file.
- Agent: An Agent is assigned that performs file watching. Standalone Agents and Agent Clusters can be assigned. In an Agent Cluster the active Director Agent performs file watching, Subagents do not perform file watching.
- Directory: The directory that the Agent watches for incoming files. This setting expects JS7 - Expressions for Variables, if a string is specified it should be quoted using single quotes.
  - Unix: A path can be specified such as /tmp/incoming
  - Windows: A path can be specified with backslashes or forward slashes, C:\tmp\incoming and C:/tmp/incoming are equivalent.
  - Unix, Windows: OS environment variables can be used that are known to the Agent, for example from its Instance Start Script. Environment variables and constant strings can be concatenated using the ++ operator and considering quoting for constant strings like this:
    - Unix: env('HOME') ++ '/incoming'
    - Windows: env('TMP') ++ '/incoming'
- Pattern: The pattern to match an incoming file. The file name pattern is not a wildcard expression such as *.csv, instead it represents a Java Regular Expression.
  - The pattern has to match the path of an incoming file (feature availability: starting from release 2.1.0).
  - The pattern has to match the name of an incoming file (feature availability: starting from release 2.5.0).
  - Consider the following examples:
    - match any files:
      .*
    - match files with a .csv extension:
      .*\.csv$
    - match files that end with a date in yyyy-mm-dd format such as 2021-03-27:
      .*\d{4}-\d{2}-\d{2}$
- Delay: The delay in seconds for which a file is checked to be stable and does not change its size or timestamp. This guarantees that only files are picked up that have been completely written.
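The example patterns can be tried out with a short script. The sketch below uses Python's re module instead of Java's regular expression engine, on the assumption that these simple patterns behave the same in both. It also assumes that the whole file name has to match the pattern (re.fullmatch). The file names are made up for illustration.

```python
import re

# Patterns taken from the examples above.
PATTERNS = {
    "any file": r".*",
    "csv extension": r".*\.csv$",
    "ends with yyyy-mm-dd": r".*\d{4}-\d{2}-\d{2}$",
}


def matching_patterns(file_name: str) -> list:
    """Return the labels of all example patterns that match the whole file name."""
    return [label for label, pattern in PATTERNS.items()
            if re.fullmatch(pattern, file_name)]


if __name__ == "__main__":
    for name in ("report.csv", "export_2021-03-27", "report.csv.tmp"):
        print(name, "->", matching_patterns(name))
```

A name such as report.csv.tmp matches only the catch-all pattern, which is why a client application can use a temporary extension while a file is still being written.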
Using Files in Workflows
Using File Paths
For incoming files the JS7 File Order Source provides a built-in variable named file that holds the path to the incoming file.
This variable is declared with the workflow like this:
If the file variable is mandatory for a workflow, its declaration cannot be omitted.
The variable can be used in subsequent jobs, for example to create an environment variable that is used from a job script:
- The below example assigns the environment variable FILE the value of the built-in file variable.
- This configuration step is the same for jobs executed with Unix or Windows.
In the job script the environment variable FILE can be used like this:
- The below example shows a Windows job script using %FILE%.
- For Unix the environment variable is referenced as $FILE or ${FILE}.
Using File Names
The built-in file variable holds the full path of the incoming file, for example /tmp/incoming/test.txt.
Users who want to use a file base name such as test.txt instead of the full path can configure their workflow like this:
Explanation:
- Two workflow variables are declared:
  - The built-in file variable is declared without a default value.
  - A new filename variable is declared (an arbitrary name can be chosen for the variable).
    - The assigned value is a generic expression that extracts the file name from the incoming file's full path:
      replaceAll( $file, '^.*[/\\]([^/\\]+)$', '$1' )
- The assignment of environment variables for a job can make use of both declared variables.
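To see what the regular expression in the replaceAll() expression does, the same pattern can be applied in Python. This is only an illustration of the regular expression, not of the JS7 expression language: the helper name file_base_name is hypothetical, and Python writes the group reference as \1 where the expression above uses $1.

```python
import re

# Same regular expression as in the replaceAll() expression above:
# everything up to the last slash or backslash is dropped, the rest is captured.
BASENAME_PATTERN = r"^.*[/\\]([^/\\]+)$"


def file_base_name(path: str) -> str:
    """Return the part of the path after the last slash or backslash."""
    return re.sub(BASENAME_PATTERN, r"\1", path)


if __name__ == "__main__":
    print(file_base_name("/tmp/incoming/test.txt"))
    print(file_base_name(r"C:\tmp\incoming\test.txt"))
```

Because the character class accepts both separators, the same expression works for Unix and Windows paths. A value without any separator is returned unchanged.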
In the job script the environment variables FILE and FILENAME can be used like this:
- The below example shows a Windows job script using %FILE% and %FILENAME%.
- For Unix the environment variables are referenced as $FILE or ${FILE} and $FILENAME or ${FILENAME}.
File Handling
Checking for Incoming Files
Timing
The JS7 Agent applies the following procedure for timely acceptance of incoming files:
- For Linux environments the Inotify interface is used. This notifies the JS7 Agent in near real-time about incoming files.
- Consider that Inotify cannot reliably report file events for NFS-mounted directories: it requires kernel support, which works for files that are written by the local system to the NFS mount and does not work for files written by remote systems.
- In addition to receipt of file events the Agent performs polling by default every 60s.
- For other Unix environments including macOS, AIX etc. the Agent performs polling every 10s based on the implementation provided with the relevant Java Virtual Machine.
- For Windows environments the relevant API is used, which notifies the JS7 Agent in near real-time.
- In addition to receipt of file events the Agent performs polling by default every 60s.
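The polling part of this procedure can be pictured as a periodic directory scan. The sketch below is a simplified Python illustration and not the Agent's implementation: the names poll_once and watch are hypothetical, matching the file name with re.fullmatch is an assumption, and file events (Inotify, Windows API) are not modelled.

```python
import os
import re
import tempfile
import time


def poll_once(directory: str, pattern: str, seen: set) -> list:
    """Return names of matching files that were not reported by earlier polls."""
    new_files = []
    for entry in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, entry)
        if entry in seen or not os.path.isfile(full_path):
            continue
        if re.fullmatch(pattern, entry):
            seen.add(entry)
            new_files.append(entry)
    return new_files


def watch(directory: str, pattern: str, interval: float, rounds: int) -> None:
    """Poll the directory a fixed number of times, sleeping between polls."""
    seen = set()
    for _ in range(rounds):
        for name in poll_once(directory, pattern, seen):
            print("new file:", name)  # an Agent would add a file order at this point
        time.sleep(interval)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as incoming:
        open(os.path.join(incoming, "a.csv"), "w").close()
        watch(incoming, r".*\.csv$", interval=0.1, rounds=2)
```

The sketch also shows the trade-off mentioned for the polling interval: every poll lists the whole directory, so short intervals on many directories add up to noticeable system load.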
Users who want to modify the polling interval can adjust the following setting in the Agent's <agent-data>/config/agent.conf file. To apply changes to Agent settings a restart of the Agent is required.
js7.filewatch.poll-timeout = 10s
It is possible to use smaller values, however, this might increase system load if polling in short intervals is applied to a larger number of directories.
Clustering
A JS7 - Agent Cluster can be assigned to a File Order Source for file watching.
This offers clustering for file watching:
- When the active Director Agent instance fails over or switches over, the newly active Director Agent instance picks up file watching.
- Director Agent instances perform file watching regardless of whether the included Subagent is enabled or disabled.
Steady State Check
When an incoming file arrives in a directory then the file might not have been completely written at the point in time of appearance.
- In Windows environments, files that are being written by a process cannot be accessed by jobs.
- In Unix environments, parallel read and write operations to files by jobs and processes are possible - but not desired, as results are unpredictable.
The best way for a client application to handle this situation is to make a file appear in an atomic transaction, for example to use a temporary file name and to rename the file after it has been completely written. Agents detect files by file name patterns and therefore will become aware of a file only when the file path matches the pattern.
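A client application could implement the temporary-name approach along the following lines. This is a minimal Python sketch under stated assumptions: the function name write_atomically is hypothetical, and the watched pattern (for example .*\.csv$) is assumed not to match the temporary .tmp name.

```python
import os
import tempfile


def write_atomically(target_path: str, data: bytes) -> None:
    """Write data under a temporary name, then rename it to the final name."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # The temporary file is created in the target directory so that the
    # rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        # The file shows up under its final name only after it is complete.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as incoming:
        write_atomically(os.path.join(incoming, "data.csv"), b"a;b\n")
        print(os.listdir(incoming))
```

Writing the temporary file to a different file system would turn the rename into a copy, which would bring back the problem of partially written files.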
Should it not be possible for a client process that writes a file to rename the file after completion then the Agent implements a delay to check the steady state of an incoming file.
- An interval (default: 2 seconds) can be specified. The Agent will wait for this time and then check if the time stamp or the size of the file has changed.
- If the file has changed then the Agent will wait for the next interval before applying the same check.
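The steady state check described above can be sketched as a loop that compares file size and modification time across intervals. This Python sketch is illustrative only and not the Agent's implementation: the function name wait_until_stable and the max_checks limit are assumptions added to keep the example bounded.

```python
import os
import tempfile
import time


def wait_until_stable(path: str, delay: float = 2.0, max_checks: int = 30) -> bool:
    """Return True once size and time stamp stay unchanged for one delay interval."""
    def snapshot():
        stat = os.stat(path)
        return stat.st_size, stat.st_mtime_ns

    previous = snapshot()
    for _ in range(max_checks):
        time.sleep(delay)
        current = snapshot()
        if current == previous:
            return True      # no change during the interval: considered complete
        previous = current   # file changed: wait for the next interval
    return False             # hypothetical limit reached while the file kept changing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as incoming:
        path = os.path.join(incoming, "data.csv")
        with open(path, "w") as f:
            f.write("a;b\n")
        print(wait_until_stable(path, delay=0.1))
```

A writer that pauses for longer than the delay would still pass this check, which is why renaming the file after completion remains the more reliable approach.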
Ghost Appearance
If a file appears then the Agent will create a file order. If the file disappears later on then this has no impact on the file order. However, if a file with the same name appears once again while the current file order is in progress then the Agent will prepare a new file order that will be visible only after the initial file order has completed the workflow.
It is not considered good practice for a client application to make files with the same name appear and disappear in short sequence. However, the situation will be handled by the Agent, which creates additional file orders.
...
If an Agent was not available when a file has been added to a directory then after its start the Agent will pick up the file and will create a file order.
If the Controller is not available at the point in time when a file order is created then the Agent:
- can process a workflow for the file order provided that all jobs in this workflow are assigned to this Agent,
- has to wait for the Controller to become available in order to forward the file order.
Moving Files and Removing Files on Workflow Completion
JS7 does not move or remove files; it is the workflow's responsibility to guarantee that on completion of a workflow the incoming file is no longer present.
- A job in a workflow can:
- move an incoming file to an archive location that is not subject to file watching,
- remove an incoming file.
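A job that cleans up the incoming file could call a small helper like the one below. This is a hedged Python sketch: the function name archive_incoming_file and the ARCHIVE_DIR environment variable are hypothetical, while FILE is the environment variable assigned from the built-in file variable as described earlier.

```python
import os
import shutil


def archive_incoming_file(file_path: str, archive_dir: str) -> str:
    """Move the incoming file to an archive directory and return its new path."""
    # The archive directory should not be watched by a File Order Source.
    os.makedirs(archive_dir, exist_ok=True)
    destination = os.path.join(archive_dir, os.path.basename(file_path))
    return shutil.move(file_path, destination)


if __name__ == "__main__" and "FILE" in os.environ:
    # FILE is expected to be set by the job's environment variable assignment.
    archived = archive_incoming_file(
        os.environ["FILE"],
        os.environ.get("ARCHIVE_DIR", "/tmp/archive"),  # hypothetical default
    )
    print("archived to", archived)
```

To remove the file instead of archiving it, the job would call os.remove on the path held in FILE.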
If an incoming file is still present on completion of the workflow then the file order is moved to a COMPLETED state.
- The COMPLETED state does not indicate that previous jobs in the workflow were not successful. It indicates that the incoming file is still available after the file order has passed all instructions of the workflow.
- Users have to move the remaining file to some other location and make file orders in the COMPLETED state leave the workflow.