...

  • File Watching is used to automatically start workflows when a file arrives in a directory.
    • Agents watch directories for files and add a file order for each incoming file.
    • File orders are assigned workflows based on configurable rules.
  • File Watching is a flexible means for enterprise application integration (EAI):
    • It enables, for example, 3rd-party applications to launch workflows simply by creating files.
    • It allows workflow execution to be triggered on a number of Agents based on the arrival of a file.
  • File Watching can be clustered by assigning a JS7 - Agent Cluster.
    • The active Director Agent instance performs file watching.
    • In case of fail-over or switch-over the active role switches and file watching will be performed by the active Director Agent instance.

Feature Video

This video explains file watching with JS7.

...

Trigger Files are typically zero-byte files that are created to trigger execution of a workflow. A workflow can remove such files at any point in time.

Data Files are processed by jobs in a workflow, for example to import reporting data into a data warehouse. Such files are moved or removed only after they have been processed by the related jobs.

For both Trigger Files and Data Files the workflow is responsible for moving or removing an incoming file before completion of the workflow.

...

File Order Sources are managed in the Configuration -> Inventory view like this:

Image Modified


Explanation:

  • Workflow Name: A workflow is assigned to which an order is added per incoming file.
  • Agent: An Agent is assigned that performs file watching. Standalone Agents and Agent Clusters can be assigned. In an Agent Cluster the active Director Agent performs file watching; Subagents do not.
  • Directory: The directory that the Agent watches for incoming files. This setting expects JS7 - Expressions for Variables; if a string is specified, it should be quoted using single quotes.
    • Unix: A path can be specified such as /tmp/incoming
    • Windows: A path can be specified with backslashes or forward slashes: C:\tmp\incoming and C:/tmp/incoming are equivalent.
    • Unix, Windows: OS environment variables can be used that are known to the Agent, for example from its Instance Start Script. Environment variables and constant strings can be concatenated using the ++ operator and considering quoting for constant strings like this:
      • Unix: env('HOME') ++ '/incoming'
      • Windows: env('TMP') ++ '/incoming'
  • Pattern: The pattern to match an incoming file is not a wildcard expression such as *.csv, instead it represents a Java Regular Expression.
    • Starting from release 2.1.0: The pattern has to match the path of an incoming file including the directory hierarchy, not just the file name.
    • Starting from release 2.5.0: The pattern has to match the name of an incoming file.
    • Consider the following examples:
      • match any files: .*
      • match files with a .csv extension: .*\.csv$
      • match files that end with a date in yyyy-mm-dd format: .*\d{4}-\d{2}-\d{2}$
  • Delay: The delay in seconds for which a file is checked to be stable and does not change its size or timestamp. This guarantees that only files are picked up that have been completely written.
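
The pattern examples above can be tried out with a short sketch. It uses Python's re module as a stand-in for Java Regular Expressions (the three patterns use only syntax that both engines share) and assumes that the pattern is applied to the file name, as described for release 2.5.0. The helper name matching_files and the sample file names are hypothetical.

```python
import re

# Patterns copied from the examples above.
PATTERNS = {
    "any": r".*",
    "csv": r".*\.csv$",
    "dated": r".*\d{4}-\d{2}-\d{2}$",
}

def matching_files(pattern, file_names):
    # Hypothetical helper: keep the file names that the pattern matches.
    regex = re.compile(pattern)
    return [name for name in file_names if regex.search(name)]

# Hypothetical sample file names.
samples = ["report.csv", "report.csv.tmp", "export-2024-01-31", "notes.txt"]

for label, pattern in PATTERNS.items():
    print(label, matching_files(pattern, samples))
```

A wildcard such as *.csv is not a valid regular expression because the leading * has nothing to repeat, which is why the pattern .*\.csv$ is used instead.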

Using Files in Workflows

Using File Paths

For incoming files the JS7 File Order Source provides a built-in variable file that holds the path to the incoming file.

This variable is declared with the workflow like this:

Image Added


If the file variable is mandatory for a workflow, its declaration cannot be omitted. 

The variable can be used in subsequent jobs, for example to create an environment variable that is used from a job script:

  • The below example assigns the environment variable FILE the value of the built-in file variable, referenced as $file.
  • This configuration step is the same for jobs executed with Unix or Windows.

...

  • The below example shows a Windows job script using %FILE%.
  • For Unix the environment variable is referenced as $FILE or ${FILE}.

Using File Names

The built-in file variable holds the full path of the incoming file, for example /tmp/incoming/test.txt.

Users who want to use a file base name such as test.txt instead of the full path can configure their workflow like this:

Image Added


Explanation:

  • Two workflow variables are declared:
    • The built-in file variable is declared without a default value.
    • A new filename variable is declared (an arbitrary name can be chosen for the variable).
      • The assigned value is a generic expression that extracts the file name from the incoming file's full path:
      • replaceAll( $file, '^.*[/\\]([^/\\]+)$', '$1' )
  • The assignment of environment variables for a job can make use of both declared variables.

Image Added
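
The effect of the replaceAll expression above can be checked outside of JS7 with a small sketch. It assumes that Python's re.sub behaves like the expression for this pattern; the helper name file_name_of is hypothetical, and Java's '$1' group reference is written as \1 in Python.

```python
import re

def file_name_of(path):
    # Mimics replaceAll( $file, '^.*[/\\]([^/\\]+)$', '$1' ):
    # everything up to the last slash or backslash is dropped and
    # only the captured file name is kept.
    return re.sub(r"^.*[/\\]([^/\\]+)$", r"\1", path)

print(file_name_of("/tmp/incoming/test.txt"))
print(file_name_of(r"C:\tmp\incoming\test.txt"))
```

Because the character class contains both / and \, the same expression works for Unix and Windows paths. A value without any directory part does not match and is returned unchanged.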


In the job script the environment variables FILE and FILENAME can be used like this:

  • The below example shows a Windows job script using %FILE% and %FILENAME%.
  • For Unix the environment variables are referenced as $FILE or ${FILE} and $FILENAME or ${FILENAME}.

Image Added

File Handling

Checking for Incoming Files

...

The JS7 Agent applies the following procedure for the timely acceptance of incoming files:

  • For Linux environments the Inotify interface is used. This notifies the JS7 Agent in near real-time about incoming files.
    • Consider that Inotify requires kernel support: for NFS-mounted directories it reports events for files written by the local system to the NFS mount, but not for files written by remote systems.
    • In addition to receipt of file events the Agent performs polling by default every 60s.
  • For other Unix environments including MacOS, AIX etc. the Agent performs polling every 10s based on the implementation provided with the relevant Java Virtual Machine.
  • For Windows environments the relevant API is used, and notifies the JS7 Agent in near real-time.
    • In addition to receipt of file events the Agent performs polling by default every 60s.
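
The polling that complements file events can be pictured as repeatedly listing the directory and comparing the result with the previous listing. The following is a minimal sketch of that idea and not the Agent's implementation; the function name new_files and the seen bookkeeping are assumptions made for illustration.

```python
import os
import tempfile

def new_files(directory, seen):
    # Return the names that appeared since the last pass and update `seen`.
    current = set(os.listdir(directory))
    added = sorted(current - seen)
    # Forget files that were removed, so that a file re-created under
    # the same name is reported again.
    seen.intersection_update(current)
    seen.update(added)
    return added

# Demonstration in a temporary directory standing in for a watched directory.
with tempfile.TemporaryDirectory() as incoming:
    seen = set()
    first_pass = new_files(incoming, seen)    # directory is still empty
    open(os.path.join(incoming, "a.csv"), "w").close()
    second_pass = new_files(incoming, seen)   # a.csv has arrived
    third_pass = new_files(incoming, seen)    # nothing new since the last pass
```

A real poller would sleep for the polling interval between passes, so each watched directory costs one directory listing per interval.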

Users who want to modify the polling interval can adjust the following setting in the Agent's <agent-data>/config/agent.conf file. To apply changes to Agent settings, a restart of the Agent is required.

Agent Configuration Item:
js7.filewatch.poll-timeout = 10s

It is possible to use smaller values. However, this might increase system load if polling at short intervals is applied to a larger number of directories.

Clustering

A JS7 - Agent Cluster can be assigned to a File Order Source for file watching.

This offers clustering for file watching:

  • When the active Director Agent instance fails over or switches over, the new active Director Agent instance picks up file watching.
  • Director Agent instances perform file watching regardless of whether the included Subagent is enabled or disabled.

Steady State Check

When an incoming file arrives in a directory, it might not have been completely written at the point in time that it appears.
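
One way to picture a check for stable files is to compare a file's size and modification time before and after a waiting period. This is a sketch under the assumption that "stable" means both values are unchanged across the delay; the function names snapshot and is_stable are hypothetical, and the Agent's actual implementation may differ.

```python
import os
import tempfile
import time

def snapshot(path):
    # Size and modification time are the two values compared between checks.
    info = os.stat(path)
    return (info.st_size, info.st_mtime_ns)

def is_stable(path, delay_seconds):
    # The file counts as stable if nothing changed during the delay.
    before = snapshot(path)
    time.sleep(delay_seconds)
    return snapshot(path) == before

# Demonstration in a temporary directory standing in for a watched directory.
with tempfile.TemporaryDirectory() as incoming:
    path = os.path.join(incoming, "data.csv")
    with open(path, "w") as handle:
        handle.write("a;b\n")
    before = snapshot(path)
    with open(path, "a") as handle:
        handle.write("c;d\n")             # the file is still being written
    changed = snapshot(path) != before    # the size differs, so not yet stable
    stable = is_stable(path, 0.1)         # no writer is active any more
```

A longer delay lowers the risk of picking up a file whose writer merely paused, at the cost of starting the workflow later.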

...

  • The COMPLETED state does not indicate that previous jobs in the workflow were not successful. It indicates that the incoming file is still available after the order has passed all instructions of the workflow.
  • Users have to move the remaining file to some other location and make file orders in the COMPLETED state leave the workflow.
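
Moving an incoming file to some other location, whether by a job before the workflow completes or as a cleanup step afterwards, could look like the following sketch. It assumes that the path of the incoming file is available in the FILE environment variable as configured for the jobs above. The function name archive_incoming_file and the archive directory are hypothetical and not part of JS7.

```python
import os
import shutil
import tempfile

def archive_incoming_file(archive_dir):
    # Move the file named by the FILE environment variable to archive_dir.
    source = os.environ["FILE"]
    target = os.path.join(archive_dir, os.path.basename(source))
    shutil.move(source, target)
    return target

# Demonstration with temporary directories standing in for real locations.
with tempfile.TemporaryDirectory() as incoming, tempfile.TemporaryDirectory() as archive:
    source = os.path.join(incoming, "test.txt")
    open(source, "w").close()
    os.environ["FILE"] = source    # in a job this comes from the job configuration
    target = archive_incoming_file(archive)
    moved = os.path.exists(target) and not os.path.exists(source)
```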