18 November 1999.

  RxRsync ver 1.01: Rexx procedures for the rsync differencing protocol 

Absract:
   RxRsync contains several OS/2 classic REXX procedures that implement
   the rsync "differencing" protocol. This document describes their use.

                        ---------------------

I) Introduction

The Rsync protocol is a "client/server" differencing protocol that 
does not require that both parties have the same copy of a prior 
version.  Instead, the client sends a synopsis of a prior 
version, information which the server can use to create a difference 
file (it uses this synopsis in lieu of the actual contents of the 
prior version).

Thus, rsync trades off some extra exchange of information (the synopsis
the client sends to the server), in return for removing the 
need for both sides of the transaction having identical copies of 
the prior version. Of course, the efficiency of rsync is a function 
of how close the prior version (on the client side) is to the current
version (on the server side).  Nevertheless, in even the worst case
(comparison of two different random files), the penalty is small; wheras
in the best case (slight changes), 10 to 1 reductions in total (both ways)
message size are not uncommon.

The rsync protocol has several steps. Assume a client wants a current
version of a file; and that the client has a prior version of the
file available. Then:

1) A "client" creates a "synopsis" of it's prior version.  
2) The client requests the "server" for the current version, and
   sends a copy of this synopsis along with this request
3) The server creates a "difference" file by comparing the synopsis
   to it's copy of the current version.
4) The server send this difference file back to the client
5) The client combines the difference file with it's copy of the prior
   version to create an exact duplicate of the new version.

Note that the server is NOT expected to have a copy of the prior version!

The procedures in RxRsync can be used to implement this chain of events.
About all you need to do is worry about the communication steps (steps
2 and 4).

                        ---------------------

II) Installing RxRsync

First, unzip RXRSYNC.ZIP to an empty temporary directory.

  1) Copy RXRsync.DLL to your LIBPATH (say, copy it to x:\OS2\DLL).
     You will also need a copy of REXXUTIL, but that's part of
     most OS/2 installations.
     
  2) Write a REXX program, and  either:
       a)include a copy of RxRSYNC.REX (say, put it at the end of your 
         rexx program file)
       b) Load RxRSYNC.RXL into "macrospace", and load a few dlls.
       See the notes in section IV below for the details.

  3) Within your REXX programs, call the procedures.


Alternatively, you can use the RXSYNC.CMD program -- it's an "all rexx"
implementation of rsync.  It's very slow, but it should work on
any REXX system.

                        ---------------------

III) Description of procedures
     
There are three  REXX procedures:

     Rsync_Synopsis: creates a synopsis of an old version.
     Rsync_Gdiff: uses this synopsis, and the new version, to create
                   a gdiff-formatted difference file
     rsync_ungdiff: uses a gdiff-formatted difference file, and the old 
                    version, to build a copy of the new version

Rsync_Synopsis: create a synopsis of an old version

   Syntax:

     status=Rsync_Synopsis(oldver_file,synopsis_file,comment,quiet,blocksize)

   where:
       oldver_file: a fully qualified file name (the old version)
       synopsis_file: a fully qualified file name (the synopsis file)
       comment: an optional comment 
       quiet: set to 1 to suppress runtime status messages
              This is an optional parameter (the default is 0).
       blocksize: blocksize to be used when creating synopsis file.
                  Sizes between 500 and 1000 seem to work best.
                  This is an optional parameter (the default is 500)

   and
       status: A status message of the form:
                 stat multi-word message
               where
                  stat= OK for success
                        ERROR for failure

   Notes:
     * Examples of status returned values:
           ERROR no such old version
           OK 5151 bytes written to  C1FILE.RSY
     *  The synopsis_file will be created in "overwrite mode" (prior
        versions of this file will first be deleted).
     *  The comment can be up to 80 characters long. If not specified,
        a timestamp is used

Rsync_Gdiff: use a synopsis to create a GDIFF-formatted difference file
   
   Syntax:

      status=Rsync_Gdiff(synopsis_file,newver_file,diff_file,quiet)

    where:
        synopsis_file: a synopsis file 
        newver_file: a fully qualified file name (the new version)
        diff_file: a fully qualified file name  (the difference file)
        quiet: set to 1 to suppress runtime status messages
            This is an optional parameter (the default is 0).
     and
        status: A status message

        The status message is either:
          OK  md4_value
        or
          ERROR Error message

        The md4_value is a 32 hex character MD4 hash of the newver_file.
        See Rsync_unGdiff for an example of how it can be used.

   Notes:
      * as with the synopsis_file, the diff_file is created in overwrite mode

rsync_unGdiff:  create a duplicate of a new file from a difference file

   Syntax:

         status=rsync_unGdiff(oldver_file,diff_file,newver_file,amd4,quiet)

   where:
        oldver_file: a fully qualified file name (the old version)
        diff_file: a fully qualified file name  (the difference file)
        newver_file: a fully qualified file name (the "duplicate" new ver)
        amd4: (optional) md4 of the "server's new" version of the file.
        quiet:(optional) set to 1 to suppress runtime status messages
              The default is 0).

   and
        status: A status message

   Notes:
      * as with the synopsis_file, the newver_file is created in 
        overwrite mode
      * the status has the same structure as status in Rsync_Synopsis

      * if you specify amd4, and the md4 hash of the newver_file (that
        is created) does NOT match amd4, then an error message is
        generated (and newver_file is NOT created).


     * rsync_unGdiff can be used for ANY "gdiff-formatted" difference
       file -- not just "gdiff-formatted" difference files produced by
       rsync_Gdiff.

                        ---------------------

IV) The rxRsync.Dll dynamic link library.

Since REXX is very slow at repetitive math, the above rexx procedures 
use several procedures in rxRsync.dll. 

RxRsyncLoad: Loads the rxRsync procedures.

  For example:

  if rxfuncquery('rx_md4')=1  then do
      call RXFuncAdd 'RXRsyncLoad', 'RXRSYNC', 'RxRsyncLoad'
      call RxRsyncLoad
  end
  if rxfuncquery('rx_md4')=1  then do
     return "ERROR could not load  RxRsync.DLL"
  end

RxRsyncDrop: unload the rxRsync procedures
   
   For example:
        call RxRsyncDrop

RX_RSYNC32: Compute a 32 bit rolling checksum of a string
  
   For example: 
       csum32=rx_rsync32('some kind of string of any length')
       csum32 will be an 8 character hex number

RX_MD4: Compute an MD4 hash of a string
   For example: 
       amd4=rx_md4('some kind of string of any length')
       amd4 will be a 32 character hex number

RX_RSYNC32_MD4: Compute a 32 bit rolling checksum, and an md4 hash
   For example: 
       csum32_md4=rx_rsync32_md4('some kind of string of any length')
       csum32 will be an 20 characters. The first 4 are the rolling
       checksum, characters 5 to 20 are the MD4. 
       Thus:  csum32=c2x(substr(csum_32_md4),1,4)
              amd4=c2x(substr(csum_32_md4),5,16)
        
RX_Rsync_Gdiff: Compute a gdiff-formatted "difference", given a
                "current instance and a "synopsis".

    status=rx_Rsync_Gdiff(newverfile,synopsis,outfile,use4)
    where:
        newverfile: a filename, pointing to the "current instance"
        synopsis: the string containing the  "synopsis" of the "old 
                  instance" (as may be produced by Rsync_Synopsis)
        outfile: a file name, the "difference" file will be written
                 to this file name (in overwrite mode)
        use4: Optional. If set to 1, then only the first 4 characters of
              the md4 checksum are used to verify. This is useful
              when using the http version of rsync (smaller request
              headers, with some small risk of an incorrect "undifferencing",
              which may necessitate a re-request).
     status is a status message. It can be:
         OK md4_value
           or
        ERROR error message
        The md4_value is a 32 hex character md4 hash of newverfile

                        ---------------------

V) Notes and disclaimer

  * RSYNCtst.CMD demonstrates the use of these procedures as text inclusions.

  * RSyncts2.CMD  demonstrates their use as a macrospace library.

  * !!! If you use the "macrospace" version, you MUST be sure to 
    load several dlls, and to load the macrospace library.  

    RsyncTs2 contains a simple procedure (LOAD_LIBS) that will do this.

  * The Rsync protocol was invented by Andrew Tridgell. For more information,
    see http://samba.org.au/rsync/  

  * A description of the GDIFF format can be found at:
        http://www.w3.org/TR/NOTE-gdiff-19970901.html

  * Users of the SRE-http web server (http://www.srehttp.org) can use
    the sreRsync "pre-reply procedure", and the DoGET.CMD http requester,
    as an http implementation of rsync.

  * Structure of a synopsis file
     Comment  -- 80 characters (i.e.; a requested file name)
     1 space
     Blocksize  -- 6 digit integer (i.e; 500)
     1 space
     #Blocks    -- 8 digit character (N)
     1 space
     md4        -- 32 digit md4
     3 spaces
     chksum1||md41||..||chksumN||md4N -- chksum and md4 values 
                                         (machine integer format, 
                                         high order bytes first)

     Note: this is subject to change (it may be standardized to a
           be compatible with unix implementations of rsync)

  * Contents of rxRsync.zip

        read.me       -- a small read.me file
        RxRSYNC.RXL   -- REXX "macrospace" version of the three REXX procedures
        rxrsync.rex   -- REXX code version of the three REXX procedures
        rxsync.cmd    -- An all REXX implementation of Rsync (for demo purposes)
        rsyncts2.cmd  -- Demo of rsync, using RxRSYNC.DLL
        rsynctst.cmd  -- Demo of rsync, using (local copy of) rxrsync.rex
        rxrsync.doc   -- this documentation file
        RxRsync.dll   -- a Rexx callable dll containing several procedures
        dllsrc.zip    -- Source code (watcom fortran, and rexx) used to create
                         RxRsync.dll and RxRsync.rxl

                        ---------------------

Disclaimer:

   This is freeware that is to be used at your own risk -- the 
   author and any potentially affiliated institutions disclaim all 
   responsibilties for any consequence arising from the use, misuse, or abuse 
   of this software (or pieces of this software).

   You may use this (or subsets of this) program as you see fit,    
   including for commercial purposes; so long as  proper attribution
   is made, and so long as such use does not in any way preclude 
   others from making use of this code.

Contact:
   Daniel Hellerstein (danielh@crosslink.net or danielh@econ.ag.gov)