23 November 1999.

GDIFF: compute the difference between two files

Abstract:
GDIFF is an OS/2 program that computes the differences between two files
and saves the results to a "GDIFF formatted" difference file. GDIFF can
also "undifference": recreate a "new" file, given an "original" file and
a GDIFF formatted difference file.

GDIFF works on any type of file; it is NOT limited to text files.

Bonus: rxGDIFF.DLL -- a REXX callable procedure!

-------------------
Installation:

Just copy GDIFF.EXE to a directory in your path (say, to x:\OS2\APPS,
where x: is your boot drive). GDIFF.EXE does not require EMX!

If you want to call GDIFF from REXX, then copy rxGdiff.dll to a directory
in your libpath (say, to x:\os2\dll).

-------------------
Syntax of the executable:

The basic syntax is:
    x:>GDIFF oldfile newfile diff_file
diff_file will be created, and will contain the "GDIFF formatted"
differences between oldfile and newfile.

Example:
    x:>GDIFF version1.doc version2.doc version2.dif

To recreate newfile from oldfile and diff_file, and save the results to
newfile_dup, use:
    x:>GDIFF -u oldfile diff_file newfile_dup
Example:
    x:>GDIFF -u version1.doc version2.dif version2.new

The more complete syntax is:
    x:>GDIFF oldfile newfile [out_file] [-options]
If out_file is not specified, output is written to stdout.

GDIFF recognizes the following options. Options may appear anywhere on
the command line (they do not have to come after the filenames), and
there must be no space between the - and the option:

  -MD4   -- just compute an MD4 of oldfile and newfile (or just the oldfile).
            Example:
                x:>GDIFF -md4 bigprog.exe

  -u     -- undifference. Newfile should be a GDIFF difference file.
            Note that GDIFF can be used with ANY "GDIFF formatted"
            difference file, not just difference files computed by this
            version of GDIFF.
  -b=nnn -- use a blocksize of nnn (0

    x:>GDIFF -v bigdoc.new bigdoc.old bigdoc.dif
    x:>GDIFF -u bigdoc.new bigdoc.dif > bigdoc.nu2
    x:>GDIFF bigdoc.old bigdoc.new bigdoc.df2 -b=200
    x:>GDIFF -u -q bigdoc.new bigdoc.df2 bigdoc.nu3
    x:>GDIFF -md4 bigdoc.new bigdoc.nu3

Notes:

  * GDIFF uses the RSYNC algorithm to compute differences. It is possible,
    though VERY unlikely, for the RSYNC algorithm to produce an incorrect
    difference file. Where even a microscopic chance of failure matters,
    we recommend using the -v option.

  * The specification of the GDIFF "difference file" format can be found at:
       http://www.w3.org/TR/NOTE-gdiff-19970901.html

  * Information on the RSYNC algorithm can be found at
       http://www.samba.org/rsync/

  * GDIFF has been tested on two versions of a 36M file (same contents,
    but with the contents rearranged). It took about a minute (on a
    Pentium 333).

  * GDIFF does not work well with compressed files (such as .ZIP files) --
    these tend to change globally even if only a small set of the archived
    files are different.

  * GDIFF.FOR contains the GDIFF source code (Fortran 77+, compiled with
    Watcom 11.0b).

  * REXX callable RSYNC procedures can be found in the rxRsync package
    (http://www.srehttp.org/apps/rxrsync/).

---------------------
Using rxGDIFF.DLL

As an aid to REXX programmers, an rxGdiff procedure can be used to compute
GDIFF difference files. rxGdiff is contained in the rxGdiff.DLL file.

rxGdiff is called as:
    stat=rxGdiff(oldfile,newfile,outfile,opt1,opt2,...,optn)
Where:
    oldfile: the "original file"
    newfile: the "new file", or the "difference file"
    outfile: output file to create; either a difference file or an
             "undifference" (a duplicate of the newfile)
    opt1 ... optn: options (as described above)
    stat : status code.
           0 means success, otherwise an integer error code
           (as described below)

Examples:
    stat=rxGdiff("info.old","info.new","info.dif","-q")
    stat=rxGdiff("info.old","info.dif","info.new","-u","-q")
    stat=rxGdiff("info.new","-md4")

In order to use rxGdiff, you need to load it. You can use the following:

    if rxfuncquery('rxGdiff')=1 then do
       call RXFuncAdd 'RXGdiffLoad', 'RXGDIFF', 'RxGdiffLoad'
       call RxGdiffLoad
    end
    if rxfuncquery('rxGdiff')=1 then do
       say "ERROR could not load RxGdiff.DLL"
       exit
    end

See RxGdiff.CMD for an example of how to use rxGdiff.

---------------------
Error codes:

On completion, GDIFF (and rxGdiff) will set a completion code, with values:

     0 - success
     1 - failed to specify input file
     2 - problem computing md4 (oldfile)
     3 - problem computing md4 (newfile)
     4 - bad -blocksize option
     5 - verification requires an output file
    31 - could not open old file
    32 - problem rewinding old file
    33 - old file is empty
    34 - unable to read entire oldfile into memory
    35 - unable to allocate memory while creating synopsis
    36 - verification failure
    37 - unable to open temporary file
    41 - problem allocating memory to create diff file
    42 - problem opening newfile
    43 - internal write problem
    46 - error writing to output file
    47 - problem allocating memory to read newfile
    48 - problem reading newfile (possibly eof)
    49 - unable to open output file
    51 - unable to open "difference file"
    52 - not a gdiff formatted difference file
    53 - unable to read from difference file
    54 - unimplemented "large move" gdiff command encountered
    55 - error in ungdiff procedure
    56 - illegal gdiff code

For example, you could invoke GDIFF.EXE through REXX, using:
    oldfile='old_version_of_a_file'
    newfile='new_version_of_a_file'
    dif_file='difference_file_to_create'
    address cmd 'GDIFF 'oldfile' 'newfile' 'dif_file
    if rc<>0 then say "GDIFF failure with code= "rc
or, more elegantly (using rxGdiff):
    stat=rxgdiff(oldfile,newfile,dif_file)
    if stat<>0 then say "rxGDIFF failure with code= "stat
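
The completion codes listed above can be turned into readable messages
with a small REXX helper. The following is only a minimal sketch, assuming
rxGdiff.DLL is on your libpath: the script name gdifmsg.cmd and the
GdiffMessage routine are hypothetical (they are not part of the GDIFF
package), and the message texts are simply copied from the table above.

```rexx
/* gdifmsg.cmd -- sketch: create a difference file and report the result. */
/* GdiffMessage is a hypothetical helper, NOT part of the GDIFF package;  */
/* its messages are copied from the error code table in this document.    */
parse arg oldfile newfile dif_file .
if dif_file = '' then do
   say 'Usage: gdifmsg oldfile newfile dif_file'
   exit 1
end

/* Load rxGdiff if it is not registered yet (same steps as shown earlier) */
if rxfuncquery('rxGdiff') = 1 then do
   call RXFuncAdd 'RXGdiffLoad', 'RXGDIFF', 'RxGdiffLoad'
   call RxGdiffLoad
end
if rxfuncquery('rxGdiff') = 1 then do
   say 'ERROR could not load RxGdiff.DLL'
   exit 1
end

stat = rxGdiff(oldfile, newfile, dif_file)
if stat <> 0 then
   say 'rxGdiff failed with code' stat':' GdiffMessage(stat)
else
   say 'Created' dif_file
exit stat

/* Map a completion code to the description listed in this document */
GdiffMessage: procedure
   parse arg code
   select
      when code = 0  then return 'success'
      when code = 1  then return 'failed to specify input file'
      when code = 2  then return 'problem computing md4 (oldfile)'
      when code = 3  then return 'problem computing md4 (newfile)'
      when code = 4  then return 'bad -blocksize option'
      when code = 5  then return 'verification requires an output file'
      when code = 31 then return 'could not open old file'
      when code = 32 then return 'problem rewinding old file'
      when code = 33 then return 'old file is empty'
      when code = 34 then return 'unable to read entire oldfile into memory'
      when code = 35 then return 'unable to allocate memory while creating synopsis'
      when code = 36 then return 'verification failure'
      when code = 37 then return 'unable to open temporary file'
      when code = 41 then return 'problem allocating memory to create diff file'
      when code = 42 then return 'problem opening newfile'
      when code = 43 then return 'internal write problem'
      when code = 46 then return 'error writing to output file'
      when code = 47 then return 'problem allocating memory to read newfile'
      when code = 48 then return 'problem reading newfile (possibly eof)'
      when code = 49 then return 'unable to open output file'
      when code = 51 then return 'unable to open "difference file"'
      when code = 52 then return 'not a gdiff formatted difference file'
      when code = 53 then return 'unable to read from difference file'
      when code = 54 then return 'unimplemented "large move" gdiff command encountered'
      when code = 55 then return 'error in ungdiff procedure'
      when code = 56 then return 'illegal gdiff code'
      otherwise return 'unknown code' code
   end
```

Usage (hypothetical script name):
    x:>gdifmsg version1.doc version2.doc version2.dif
A select statement keeps the table easy to compare against the list above;
codes that are not in the list fall through to the "unknown code" branch.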
---------------------
Disclaimer:

This is freeware that is to be used at your own risk -- the author
and any potentially affiliated institutions disclaim all responsibilities
for any consequence arising from the use, misuse, or abuse of this
software (or pieces of this software).

You may use this (or subsets of this) program as you see fit, including
for commercial purposes, so long as proper attribution is made, and so
long as such use does not in any way preclude others from making use of
this code.

Contact: Daniel Hellerstein (danielh@crosslink.net or danielh@econ.ag.gov)