Different behavior of stifbs: AIX versus Solaris


David Buchan
10-30-2004, 12:00 PM
Hi guys,

I have been using stifbs, driven by odeint, to solve a set of stiff equations. I wrote the program on AIX in FORTRAN and it compiles (XLF 8.1) and runs great.

But recently, I tried to port the code over to a SUN station, and although it compiles and runs, the behavior in stifbs is somewhat different.

What I see is, stifbs does not increase the stepsize on the SUN station as it does on the AIX platform. Also, the SUN version ends up running odeint up to 10000 iterations (MAXSTP) and I get a PAUSE: too many steps in odeint. These are likely related phenomena.

My equations are extremely stiff because the first and last equation differ in rates by many many orders of magnitude.

Also, and this may be part of the problem, I put everything into real(16), because I'm carrying a lot of digits. I did put all the exponents as 'Q' instead of 'E' and converted all the intrinsic functions over to real(16).

This is likely not enough information for anyone to suggest a fix, but does anybody have some idea where to look?

I checked for differences in conditional precedences because the F77 code given in NR has implicit precedences - especially in stifbs where it decides on step changes. I put in all the brackets so there is no doubt that the conditionals are being evaluated in the same order on the AIX and SUN platforms.

*** addendum to post ***

We did try dialling down the aggressiveness of the SUN's FORTRAN optimizer, but to no avail.

Also, my book says it's correct to V2.10, but I think my NR disk is 2.08. I don't believe there are any corrections relevant to me though.

Finally, I'm using rzextr instead of pzextr - I have a fondness for rational approximation.

*** end of addendum ***

Thanks,
Dave Buchan

David Buchan
11-05-2004, 12:02 AM
The short answer:

Two things:

1) In XL FORTRAN (on AIX), the max function handles arithmetic exceptions differently than max in Solaris FORTRAN.

AIX: max(some real value, NaNQ) results in some real value

Solaris: max(some real value, NaNQ) results in NaNQ

Now THAT was hard to find!!

2) In extended precision (real(16)), Solaris FORTRAN can handle values up to something like 1E4096, while XLF can only go up to roughly 1E301. I don't have my notes in front of me at home here, but you get the idea.

My program was set up in such a way that arithmetic exceptions would necessarily occur due to overflow, but I was catching them by the old trick of:

if (value.ne.value) then exception has occurred

Unfortunately on Solaris, it happily chugged along because it took a lot longer to reach an overflow condition. odeint ran through all 10000 of max iterations.

Yes, the whole concept was a little messy. I thought of a way around all this now that I know what is wrong.

Long answer:

The problem is a nuclear criticality problem. Keffective, the measure of proximity to criticality, can be very close to one, thus all the sig figs I'm carrying in extended precision. I think double precision may in fact be acceptable for most problems I have though.

Background: if Keff is one, reactor is critical and power is constant*. With Keff less than one, power decreases exponentially. Greater than one and power increases exponentially.

I have to find a value of Keff that will give me a desired power at some time in the future. That's not such a problem when the future is a short time away, but if I want to predict 3 years into the future, the integration coming out of odeint can easily overflow (literally meltdown!).

I find an appropriate keff through simple bisection because it is so robust. But of course, I will get overflows when keff is too large. I have an adaptive timestep routine outside odeint as well, and the timestep grows to very large values as time progresses because some other physics effects settle down to allow it. With a large timestep, I can easily reach an overflow with keff still very close to one.

The solution is rediculously simple: In my physical situation, I don't allow reactor powers over 1.0 full power, so in odeint, I just check to see if power has exceeded, say 1.2, and if so, set t to final time, and return.

End result: no messy overflow checks and a lot faster code, since I don't have to run odeint through all the way to overflow condition.

*This is an oversimplification for the sake of illustration. In reality, reactor power can change up or down, even if keff is exactly one. The reason is due to the existence of delayed neutron precursors. These are fission fragments that spit out a neutron at some time after fission. There's a wide variety of these precursors, with some emitting a neutron very quickly, and some that take ages...thus the extremely stiff set of equations. Typically, you group these species according to decay rate, assigning a representative decay rate for each group.

Bill Press
11-06-2004, 10:26 PM
David, thanks for posting both the query and the solution. This may be helpful to others in the future.
Cheers,
Bill P.

sunosx
10-02-2007, 03:28 PM
stifb certainly behave differently on AIX and Solaris. I recently got a new Sun Fire v210 (http://www.anysystem.com/sun-fire-v210.html) setup, your solution will be very helpful. Thanks