DrFessor
08-12-2013, 03:12 AM
NR3's ranq1 constructor initializes v=4101842887655102017, and then computes v^j, and finally calls v=uint64().
I wonder what all this is about. Why not initialize v=j and done? I further wonder about the case j=4101842887655102017 which finally yields v=0 which never changes because v=0 corresponds to a cycle of length 1, i.e. in this case the numbers are not random at all.
Further on, I wonder if there are more numbers than v=0 that yield very short cycles. How should one avoid getting caught in such a cycle?
Does ranq1 cycle through all values between Min=1 and Max=2^64-1=18446744073709551615?
vigna
04-12-2014, 09:57 AM
I think the authors want to avoid that the generator is started in a state made mostly of zeroes (e.g., if the user passes 0 or 1 as a seed). This because the first iterations of a generator based on linear recurrences on a seed with many zeroes tend to have many zeroes. The Mersenne Twister, for instance, started from a seed with just a single bit set to one needs ~100,000 iterations to get back to normality.
In this case the preoccupation is slightly excessive, as with 64 bits of state and the multiplication scrambling the state bits you need less than a dozen iterations to get back to normality.
The solution, in particular, does not seem good to me as it leaves the (really unlikely) possibility that the generator starts from 0, thus returning always zero, which you really don't want to happen.
This generator will return all 2^64-1 distinct nonzero values started from any nonzero state (see the Marsaglia paper on xorshift to understand why).
Personally, I use a different strategy:
1) If the seed is zero, I replace it with ULLONG_MAX (this is a value that the user will very unlikely put in; you can choose another one).
2) I scramble the seed using murmurhash3's avalanching step:
x ^= x >> 33;
x *= 0xff51afd7ed558ccdULL;
x ^= x >> 33;
x *= 0xc4ceb9fe1a85ec53ULL;
x ^= x >> 33;
This is an invertible function in which every bit of the output depends on every bit of the input modulo a very small bias. The probability that you get a seed with many zeroes is very low (the user should supply the inverse of such a seed, which is most unlikely).
In this way, *whichever* seed you pass the generator will have period 2^64-1, and initial states with many zeroes will be extremely unlikely. There's of course the undesirable side effect that 0 and ULLONG_MAX give the same sequence, but it seems a minor problem to me.