Problem

Within a job chain, file_order_source starts an order as soon as a file is created, not when the file is complete.

In some file transfer scenarios the receiver of a file does not know when the sender creates the file or when the sender finishes writing it. With a large file, the receiver may try to read the file before the sender has finished writing. If the receiver uses the file at that moment, it will get a corrupted, incomplete file.

Solutions

Using file_order_source

There are three ways to use file_order_source:

  • The sender creates a file named abc.txt~. After the transfer is completed, the sender renames the file to abc.txt. You would use a regular expression such as ^.*\.txt$ to check for the presence of files.
  • The sender creates a file named abc.txt. When the file is ready, the sender creates a second, zero-byte file named abc.txt.trigger. Here, you would use a regular expression such as ^.*\.txt\.trigger$. The disadvantage of this approach is that the name of the trigger file is listed under scheduler_file_path, not the name of the file that should be processed.
  • Set up a job chain in which the first node checks the file size, and carry out a setback if the file size is still changing. This can be done with the JobSchedulerExistsFile job.
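
The first two approaches depend on the sender following a naming convention. The following is a minimal sketch of the sender side in Python, not part of JobScheduler itself. The function names (publish_by_rename, publish_with_trigger, data_file_for_trigger) are hypothetical and chosen for illustration only; the two regular expressions are the ones described above.

```python
import os
import re

# Patterns a file_order_source could use for the two conventions described above.
READY_PATTERN = re.compile(r"^.*\.txt$")
TRIGGER_PATTERN = re.compile(r"^.*\.txt\.trigger$")


def publish_by_rename(directory, name, data):
    """Write to '<name>~' first, then rename to '<name>' once the write is complete."""
    tmp_path = os.path.join(directory, name + "~")
    final_path = os.path.join(directory, name)
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the content is on disk before renaming
    # Assumption: both paths are on the same file system, so the rename
    # does not turn into a copy that could be observed half-finished.
    os.replace(tmp_path, final_path)
    return final_path


def publish_with_trigger(directory, name, data):
    """Write the data file, then create an empty '<name>.trigger' file."""
    data_path = os.path.join(directory, name)
    with open(data_path, "wb") as f:
        f.write(data)
    trigger_path = data_path + ".trigger"
    open(trigger_path, "wb").close()  # zero-byte trigger file
    return trigger_path


def data_file_for_trigger(trigger_path):
    """Derive the data file path from a trigger file path."""
    suffix = ".trigger"
    if not trigger_path.endswith(suffix):
        raise ValueError("not a trigger file: " + trigger_path)
    return trigger_path[: -len(suffix)]
```

With the trigger file approach, a job that receives the trigger file name in scheduler_file_path would need a step like data_file_for_trigger to find the file it should actually process.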

See also: Directory Monitoring with File Orders

Using the JobSchedulerExistsFile job

  • This job has the advantage over file_order_source solutions that it allows the use of parameters, for example for the name of the target directory, and it allows you to configure the polling rate.
  • The JobSchedulerExistsFile job also checks whether the file size is constant, that is, whether the file is still being written, and will only proceed if the file size is not changing.
  • The JobSchedulerExistsFile job has three parameters that control the steady-state check:
    • check_steady_state_of_files: If true, the job checks the steady state.
    • check_steady_state_interval: The interval in seconds between two checks.
    • steady_state_count: If set, this is the maximum number of intervals. If the maximum is reached, the task is terminated with an error.
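
The idea behind a steady-state check can be sketched in a few lines of Python. This is an illustration of the technique only, not the implementation of the JobSchedulerExistsFile job. The function name wait_for_steady_state is hypothetical, and its interval and max_intervals arguments only loosely mirror check_steady_state_interval and steady_state_count.

```python
import os
import time


def wait_for_steady_state(path, interval=1.0, max_intervals=30,
                          sleep=time.sleep, get_size=os.path.getsize):
    """Return the file size once it is unchanged across one interval.

    Raises TimeoutError if the size is still changing after
    max_intervals checks.
    """
    last_size = get_size(path)
    for _ in range(max_intervals):
        sleep(interval)
        size = get_size(path)
        if size == last_size:
            return size  # size did not change during the interval
        last_size = size
    raise TimeoutError(
        "%s still changing after %d intervals" % (path, max_intervals)
    )
```

A constant size over one interval is only a heuristic: a sender that pauses for longer than the interval will look finished, so the interval should be chosen with the slowest expected transfer in mind.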

See also: Job JobSchedulerExistsFile

Related Downloads

You can download example files covering both the file_order_source and the JobSchedulerExistsFile job solutions.

