How to organise secure file transfer when files become available on a remote host

page under construction

Note: the use of <process> is deprecated; the examples on this page still have to be reworked.

Organising secure file transfer with the JobScheduler

Starting point: We have an explicit need to invoke an FTP get file operation when a file becomes available on a remote server, assuming that:

- the remote server operating system is not supported by the JobScheduler.
- we don't wish to rely on a program that uses the JobScheduler API package running on the remote node to send us a notification (we can't maintain programs running on this remote node).

Question: Is there any way to monitor a remote directory or folder so that the main JobScheduler receives a notification and starts the FTP get file program or script?

Proposed Solution 1: Signalling

If the remote host is to be responsible for signalling that a file is ready for transfer, there is no need to be afraid of using the JobScheduler API on the remote host. It can be as simple as this:

 telnet [scheduler_host] [scheduler_port]
 <start_job job="ftp_get_files"><params/></start_job>

telnet is most likely available on the remote host, and the approach is simple: over a TCP connection you send an XML command that tells the JobScheduler what to do, e.g. to start a job. Major parts of the API are available as XML commands that can be sent to the JobScheduler via TCP or UDP. You do not have to use telnet; it is just an example that works at the command line.
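The same signal can be sent from a few lines of script instead of an interactive telnet session. The following is a minimal sketch in Python, assuming the JobScheduler accepts the XML command as plain text on its TCP port. The function names `build_start_job_command` and `send_command_tcp` are hypothetical helpers introduced for this illustration, and the sketch does not read any reply the JobScheduler may send.

```python
import socket
from xml.sax.saxutils import quoteattr


def build_start_job_command(job_name: str) -> str:
    """Return a <start_job> XML command for the given job name.

    quoteattr() adds the surrounding quotes and escapes characters
    such as & or < that are not allowed in an XML attribute value.
    """
    return f"<start_job job={quoteattr(job_name)}><params/></start_job>"


def send_command_tcp(host: str, port: int, xml_command: str,
                     timeout: float = 5.0) -> None:
    """Open a TCP connection, send one XML command, then close the connection.

    Assumption: the command is sent as UTF-8 text and any reply is ignored.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(xml_command.encode("utf-8"))
```

A call could look like `send_command_tcp("scheduler.example.com", 4444, build_start_job_command("ftp_get_files"))`, where host and port are placeholders for your own JobScheduler. Because TCP is used, the call raises an exception if the JobScheduler cannot be reached, and the remote host has to decide how to handle that.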

Use the JobScheduler API without installation

We deliver Java classes that could be used without a JobScheduler being installed, e.g. on the command line like this:

 java  -cp ./sos.scheduler.jar:./sos.util.jar sos.scheduler.SOSSchedulerCommand [scheduler_host] [scheduler_port] [xml command]

Java is available for most operating systems, so the remote host could use this command line to send any XML command to the JobScheduler without the need to install a JobScheduler on the remote host. A standard JRE and the above .jar archives are sufficient on the remote host.

Use an alternative TCP/UDP stack

You could use any programming or scripting language that supports TCP or UDP to send XML commands.
The above use cases do not require you to maintain your own scripts or programs on the remote host, as standard operating system utilities could be used for this.
We recommend UDP for the above solution as this protocol is connectionless and does not block, i.e. the remote host will not suffer (and have to recover) from errors due to network downtimes. However, UDP does not guarantee delivery, so a signal can be lost without notice. Signalling should therefore be an optional addition to the communication between remote host and JobScheduler that speeds up processing; it is no replacement for polling.
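To illustrate why UDP suits a best-effort signal, here is a minimal sketch in Python. It assumes the JobScheduler is configured to accept XML commands on a UDP port. The function name `signal_udp` is hypothetical and not part of any JobScheduler package.

```python
import socket


def signal_udp(host: str, port: int, xml_command: str) -> bool:
    """Send one XML command as a single UDP datagram (fire and forget).

    No connection is set up and no reply is awaited, so the caller is not
    held up by a JobScheduler that is down. True only means that the
    datagram was handed to the local network stack; it does not confirm
    delivery. Local errors (e.g. no route, name lookup failure) yield False.
    """
    payload = xml_command.encode("utf-8")
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
        return True
    except OSError:
        return False
```

Since the sender never learns whether the datagram arrived, the receiving side still needs a polling job as a safety net.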

Proposed solution 2: Polling

We recommend this solution as it is safe and easy to implement. There are two ways to check whether a file exists on a remote host:

Check via FTP

See the standard jobs in the chapter for FTP processing at JITL - JobScheduler Integrated Template Library and the job documentation in JobScheduler FTP Receive.
The existing code could be adapted to return distinct exit codes that signal if files are available or absent in a remote directory.
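The idea of mapping a directory listing to distinct exit codes can be sketched as follows. This is not the code of the JITL FTP jobs but an independent illustration in Python using the standard `ftplib` module. The exit code values, the function names and the rule of skipping names ending in `~` are assumptions made for this example.

```python
import fnmatch
from ftplib import FTP, error_perm

EXIT_FILES_AVAILABLE = 0  # assumption: 0 signals that files are ready
EXIT_NO_FILES = 1         # assumption: 1 signals that no files were found


def select_ready_files(names, pattern="*"):
    """Return the sorted names that match the pattern.

    Names ending in '~' are skipped, assuming they mark files that are
    still being written.
    """
    return sorted(
        name for name in names
        if fnmatch.fnmatchcase(name, pattern) and not name.endswith("~")
    )


def list_remote_files(host, user, password, directory):
    """Return the names found in a remote directory via FTP."""
    with FTP(host, timeout=30) as ftp:
        ftp.login(user, password)
        ftp.cwd(directory)
        try:
            return ftp.nlst()
        except error_perm:
            # Some servers answer with a 550 error for an empty directory.
            return []


def exit_code_for(names, pattern="*"):
    """Map a directory listing to one of the two exit codes above."""
    if select_ready_files(names, pattern):
        return EXIT_FILES_AVAILABLE
    return EXIT_NO_FILES
```

A polling script would end with something like `sys.exit(exit_code_for(list_remote_files(host, user, password, directory), "*.csv"))`, where all arguments are placeholders.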

Check via SSH

This solution requires a shell script or operating system command on the remote host that is launched via SSH by a job in order to check if a file is ready to be transferred. The script or command could be implemented to return a distinct exit code which signals that files are ready for transfer.
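As an illustration of such a check, here is a minimal sketch in Python, assuming Python is available on the remote host; a shell one-liner would serve equally well. The script name `check_ready.py`, the exit code values (0 = files ready, 1 = no files, 2 = usage error or missing directory) and the rule of ignoring names ending in `~` are assumptions for this example.

```python
import sys
from pathlib import Path


def files_ready(directory: str, pattern: str = "*") -> bool:
    """Return True if the directory holds at least one regular file that
    matches the pattern and does not end in '~' (assumed temporary name)."""
    return any(
        path.is_file() and not path.name.endswith("~")
        for path in Path(directory).glob(pattern)
    )


def main(argv) -> int:
    """Return 0 if files are ready, 1 if not, 2 on usage or directory errors."""
    if len(argv) not in (2, 3):
        print("usage: check_ready.py DIRECTORY [PATTERN]", file=sys.stderr)
        return 2
    directory = argv[1]
    pattern = argv[2] if len(argv) == 3 else "*"
    if not Path(directory).is_dir():
        print(f"not a directory: {directory}", file=sys.stderr)
        return 2
    return 0 if files_ready(directory, pattern) else 1
```

To use this as a script, end the file with `sys.exit(main(sys.argv))`. ssh passes the exit status of the remote command back to the caller, so the job that launched ssh sees the same exit code.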

A (pseudo) configuration for the job could be:

 <job>
   <!-- ssh expects the login name and the host as separate arguments: -l [user] [host] -->
   <process file="ssh" param="-l [user] [host] [script_or_command]"/>

   <!-- repeat the check after errors: wait 60s at first, 600s from the 10th error, stop at the 50th -->
   <delay_after_error error_count="1" delay="60"/>
   <delay_after_error error_count="10" delay="600"/>
   <delay_after_error error_count="50" delay="stop"/>

   <!-- exit code 0: files are ready, start the transfer job -->
   <commands on_exit_code="success">
     <start_job job="ftp_get_files">
       <params/>
     </start_job>
   </commands>

   <!-- other exit codes are automatically handled as errors, otherwise you could start individual jobs to handle errors -->
   <commands on_exit_code="1"/>
 </job>

Starting successor jobs based on exit codes is not restricted to SSH or to the use of <process file=""/>; you could configure any job to start arbitrary successor jobs and orders (from JobScheduler release 1.2.7 on).

Concurrency Issues

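A simple way to avoid concurrency issues would be to have the client write each file under a temporary name such as filename~ and rename it to filename once it is complete. The file then appears atomically under its final name, so a polling job never picks up a partially written file.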
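The write-then-rename pattern can be sketched as follows in Python, assuming the files are written on a local file system where a rename within the same directory is atomic (as on POSIX systems). The function name `publish_atomically` is hypothetical.

```python
import os


def publish_atomically(final_path: str, data: bytes) -> None:
    """Write data to '<final_path>~' and rename it to final_path when done.

    The temporary file lives in the same directory as the final file, so
    the rename stays on one file system. Readers see either no file or the
    complete file under its final name.
    """
    temp_path = final_path + "~"
    with open(temp_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the content is on disk first
    os.replace(temp_path, final_path)
```

The polling side then only has to ignore names that end in `~`.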

Alternative solutions

More sophisticated solutions are required in the following cases:

- What should happen if one input file cannot be fetched for a longer period and subsequent files are provided by the remote host?
- How should multiple input files be handled that are to be processed in a single transaction by the client, e.g. for import to a database?

We implemented commercial solutions for these issues based on traffic light rules. If this is of interest to you, please contact us.
