This issue addresses differences in results between builds of the `SHARP` branch compiled without explicit double-precision flags. The case with double-precision flags will be addressed in a separate issue. I used three compilers on hydro-c1 at UCAR with the following flags:
- ifort (IFORT) 15.0.2 20150121
  - `sac_77`: `ifort -O3 -f77rtl`
  - `snow19`: `ifort -O3`
  - `UTIL`: `ifort -O3 -warn all -check all`
- GNU Fortran (Debian 4.9.2-10) 4.9.2
  - `sac_77`: `gfortran -O3 -fno-align-commons -ffree-line-length-none` (`flags77`)
  - `snow19`: `gfortran -O3 -fno-align-commons -ffree-line-length-none` (`flags2`)
  - `UTIL`: `gfortran -O3 -fno-align-commons -ffree-line-length-none`
- PGI 15.7-0 64-bit target on x86-64 Linux -tp sandybridge
  - `sac_77`: `pgf77 -O3`
  - `snow19`: `pgf90 -O3`
  - `UTIL`: `pgf90 -O3 -Kieee`

Note that the `Makefile` in this commit does not use any flags for PGI because the `FC` and `FC77` variables do not match `pgf90` or `pgf77` as expected in the `if` statements that turn on these flags.

These compilers produced different results, particularly in the wintertime.
![swe_orig_diff](https://cloud.githubusercontent.com/assets/5132040/23090619/a4360f78-f556-11e6-9d82-e7316d5b0dc0.png)
Fig. 1. SWE timeseries: The lower set of lines shows SWE in mm, plotted on the left axis. The upper set of lines shows the difference in SWE in mm between the simulation using `PGI` and the other two compilers.

@anewman89 suggested checking a change he had made to include a `tiny` offset in `aesc19.f` for when the precision of an ASCII restart file caused problems in the comparison. `tiny` is an intrinsic function: `tiny(x)` returns the smallest positive number that can be represented on the current computer for real argument `x`. In the current commit, however, `tiny` is not called with an argument `x`; it is treated as an uninitialized variable.
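
For a sense of scale, the sketch below computes the value that `tiny(x)` would return if the default real kind is IEEE 754 single precision. That kind is an assumption, since the source does not state it, and Python is used here only for illustration. The function name `smallest_normal_float32` is hypothetical.

```python
import struct


def smallest_normal_float32():
    """Return the smallest positive normalized IEEE 754 single-precision value.

    Assumption: the Fortran default real is IEEE single precision, in which
    case this is the value the intrinsic tiny(x) returns for a default real x.
    """
    # Bit pattern: sign 0, biased exponent 1, mantissa 0 -> 2**-126.
    return struct.unpack("<f", struct.pack("<I", 0x00800000))[0]


if __name__ == "__main__":
    # The value is on the order of 1e-38, far below any physical SWE value.
    print(smallest_normal_float32())
```

An offset of this size is negligible next to any SWE value, whereas an uninitialized variable can hold an arbitrary value.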

When I tried to print the value of `tiny` to the screen, the `PGI`-compiled code produced different results than the same code compiled with `PGI` without the print statement. This suggests a memory error. I ran the program with `valgrind`, which reported the use of an uninitialised value at the line where `tiny` is used:
==40656== Conditional jump or move depends on uninitialised value(s)
==40656== at 0x40421E: aesc19_ (aesc19.f:25)
==40656== by 0x402857: pack19_ (PACK19.f:218)
==40656== by 0x40C252: exsnow19_ (exsnow19.f:142)
==40656== by 0x40F2ED: MAIN_ (multi_driver.f90:282)
==40656== by 0x401CC3: main (in ~/NWS_hydro_models/Snow17SacUH/src_bin/Snow17SacUH.exe)
The values printed for `tiny` were inconsistent. Here is a subset of the values printed to the screen under `valgrind`:
tiny= -1.7005811E+38
tiny= 0.000000
tiny= -1.7005811E+38
tiny= 8.5123166E-02
tiny= -1.7005811E+38
tiny= 8.3787210E-02
tiny= -1.7005811E+38
tiny= 0.1242791
tiny= -1.7005811E+38
tiny= 0.000000
tiny= -1.7005811E+38
tiny= 0.1852178
tiny= -1.7005811E+38
tiny= 0.000000
Three independent fixes each brought the `PGI`-compiled output into agreement with that of the other compilers (bringing the black line down to match the red and blue lines in Fig. 1, and reproducing simulated streamflow to within +/-0.01 cfs):
- Use the `-Msave` flag on `f90` and `f77`: this treats all local variables as if they had the `SAVE` attribute, which in this build leaves otherwise uninitialized variables at zero.
- Set `tiny = 0.0` before line 15 in `aesc19.f`.
- Remove `tiny` completely.

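That these fixes make the `PGI` results consistent with `ifort` and `gfortran` suggests that `tiny` ends up with the value 0.0 in the `ifort` and `gfortran` builds, although an uninitialized variable is not guaranteed to hold any particular value.
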
@andywood noted that the precision of the ASCII state files has been increased since @anewman89 ran into the issue that required `tiny` in the first place. I am doing frequent restarts in my application, so I tested that a version of the code without `tiny` still gives exact restarts. I did this by running two simulations from 2005-10-01 to 2006-09-30:
- a continuous run for the entire period
- a run that stops and restarts from a state file every day.

I compared the output files from the two scenarios, and they are equivalent. This suggests that, for better reproducibility, `tiny` can and should be removed. Doing so also brings the code closer to the original.
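
For anyone who wants to repeat the comparison, here is a minimal sketch of one way to do it. It is not the script used for the test described here. It assumes the model output is whitespace-delimited text with the same layout in both files. The function name `max_abs_diff` and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def max_abs_diff(path_a, path_b):
    """Return the largest absolute difference between matching numeric fields.

    Assumption: both files are whitespace-delimited text with the same layout.
    Fields that do not parse as numbers (dates, headers) must match exactly.
    """
    tokens_a = Path(path_a).read_text().split()
    tokens_b = Path(path_b).read_text().split()
    if len(tokens_a) != len(tokens_b):
        raise ValueError("files have different numbers of fields")

    worst = 0.0
    for ta, tb in zip(tokens_a, tokens_b):
        try:
            diff = abs(float(ta) - float(tb))
        except ValueError:
            # Non-numeric field: require an exact textual match.
            if ta != tb:
                raise ValueError(f"non-numeric fields differ: {ta!r} vs {tb!r}")
            continue
        worst = max(worst, diff)
    return worst
```

With hypothetical file names, `max_abs_diff("continuous.txt", "daily_restart.txt") == 0.0` would indicate exact restarts at the printed precision, and `max_abs_diff(...) <= 0.01` corresponds to the +/-0.01 cfs streamflow tolerance used for the compiler comparison.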