File differencing for numerical output

(2014-11-10)

In software unit tests we have the problem that the usual diff command fails when comparing output files generated by numerical algorithms. Tolerable numerical differences in the output are still hard differnces. Numdiff may help.

It is free software and it source code can be downloaded from http://savannah.nongnu.org/download/numdiff.

Among other features it allows to compare numbers in output files with a given tolerance.

Example

Assume we have two files: a.txt

x=3.150303334
y=4.59993e-3

and b.txt

x=3.15030333
y=4.59994e-3

We assume that the differences are tolerably small, and we want to have a differencing tool which acknowledges this. But this is the output of the system diff command:

$ if diff a.txt b.txt ; then echo EQUAL ; else echo DIFFER ; fi
1,2c1,2
< x=3.150303334
< y=4.59993e-3
---
> x=3.15030333
> y=4.59994e-3
DIFFER

With numdiff we can do the following:

$ numdiff a.txt b.txt
----------------
##1       #:1   <== x=3.150303334
##1       #:1   ==> x=3.15030333
@                                                     @@
----------------
##2       #:1   <== y=4.59993e-3
##2       #:1   ==> y=4.59994e-3
@                                                     @@

+++  File "a.txt" differs from file "b.txt"

Ok, we had this. So let us introduce the (relative) tolerance:

$ numdiff -r 1.0e-3 a.txt b.txt
----------------
##1       #:1   <== x=3.150303334
##1       #:1   ==> x=3.15030333
@                                                     @@
----------------
##2       #:1   <== y=4.59993e-3
##2       #:1   ==> y=4.59994e-3
@                                                     @@

+++  File "a.txt" differs from file "b.txt"

Ups? Ok. The manual says:

-s, --separators=IFS

   Specify the set of characters  to use as delimiters while splitting
   the  input lines  into fields  (The  default set  of delimiters  is
   space, tab and newline).  If IFS is prefixed with 1: or 2: then use
   the given  delimiter set only for  the lines from the  first or the
   second file respectively

So we should add = to the delimiter list:

$ numdiff -s "\n ="  -r 1.0e-5 a.txt b.txt 

+++  Files "a.txt" and "b.txt" are equal

This is what we expected.

A stronger tolerance gives:

$ numdiff -s "\n ="  -r 1.0e-16 a.txt b.txt 
----------------
##1       #:2   <== 3.150303334
##1       #:2   ==> 3.15030333
@ Absolute error = 4.0000000000e-9, Relative error = 1.2697190019e-9
----------------
##2       #:2   <== 4.59993e-3
##2       #:2   ==> 4.59994e-3
@ Absolute error = 1.0000000000e-8, Relative error = 2.1739461253e-6

+++  File "a.txt" differs from file "b.txt"

Very informative. Note that the return code from numdiff can be used as with diff (see above) for reacting on the various results.