How to Generate a Power Law Distribution. [Archive]

Alamino

02-18-2008, 09:43 AM

Does anybody know how to generate a variable distributed as a power law? More specifically, I need to generate a discrete variable x = 1,...,N following Zips's law

P(x) = \frac{x^{-s}}{Z},

with s a constant and Z the normalization.

davekw7x

02-23-2008, 07:21 PM

...P(x) = \frac{x^{-s}}{Z},

// Test program using Numerical Recipes 3 Ran class to
// generate pseudo-random samples from Power-Law (Discrete Zipf)
// distribution.
//
// davekw7x
// February 2008
//

// However you get to your nr3 stuff
//
#include "../code/nr3.h"
#include "../code/ran.h"

typedef Doub (*udPoint)();// pointer to function that returns double

//
// Discrete Zipf distribution:
// p(k) is proportional to (v+k)**(-s) where s > 1, k >= 0.
//
// An implementation based on
//
// W.Hormann, G.Derflinger:
// "Rejection-Inversion to Generate Variates
// from Monotone Discrete Distributions"
// http://eeyore.wu-wien.ac.at/papers/96-04-04.wh-der.ps.gz
//
//
// The values generated are in the range of [0, imax]
//
class Zipf {
public:
// The constructor requires a pointer to a function that returns a
// uniformly destributed random double on [0,1]
//
// The second parameter is the Zipf distribution parameter
//
// The optional third parameter is the "v" in the distribution
// function (v+k)**(-s). The default value is 1.
//
// The optional fourth parameter defines the upper end of the range
// of generated values. The default value is the largest possible
// unsigned integer in the system.
//
// The value of s must be greater than 1; v must be greater
// than or equal to one. If the user supplies invalid values,
// the setup function informs that they are being changed.
// On general principles, I just don't like constructors that
// can fail. On the other hand, I didn't want to make the user
// call a separate setup function to initialize the object.
// Chacun a son gout!
//
//
Zipf(udPoint ran,
Doub s,
int vv = 1,
unsigned long maxi= static_cast<unsigned long>(-1)):
imax(maxi), v(vv), q(s), urand(ran) {setup();}

//
// return a discrete random number from the discrete Zipf
// distribution in the range of [0, imax]
//
unsigned long next()
{
while (true) {
Doub ur = hxm + urand() * (hx0MinusHxm);
Doub x = Hinv(ur);

unsigned long k = static_cast<unsigned long>(x + 0.5);
if (k - x <= s) {
return k;
}
else if (ur >= (H(k + 0.5) - exp(-log(k + v) * q))) {
return k;
}
}
}
Doub get_s() const {return s;}
Doub get_v() const {return v;}

private:
Doub imax;
Doub v;
Doub q, oneMinusQ, oneMinusQ_Inverse, hx0MinusHxm, hxm, s;

udPoint urand; // pointer to function that returns uniformly
// distributed double on [0,1]

void setup()
{
if (q <= 1.0) {
cout << "Input value for s is " << q
<< ", which is invalid. changing to 2.0" << endl;
q = 2.0;
}
if (v < 1) {
cout << "Input value for v is " << v
<< ", which is invalid. changing to 1" << endl;
v = 1;
}

oneMinusQ = 1.0 - q;
oneMinusQ_Inverse = 1.0 / oneMinusQ;
hxm = H(imax + 0.5);
hx0MinusHxm = H(0.5) - exp(log(v) * (-q)) - hxm;
s = 1 - Hinv(H(1.5) - exp(-q * log(v + 1.0)));

}

Doub H(const Doub & x) const
{
return (exp(oneMinusQ * log(v + x)) * oneMinusQ_Inverse);
}

Doub Hinv(const Doub & x) const
{
return exp(oneMinusQ_Inverse * log(oneMinusQ * x)) - v;
}

};

Doub riemannZeta(Doub a);
Doub calcZetaProbs(Doub s, Doub *p, int n);

#define NUMPROBS 101

//
// A function that uses nr3 Ran to generate a
// uniformly distributed double on [0,1]
// A pointer to this function is used in
// the constructor for Zipf
//
Doub uDoub()
{
static Ran ran(time(0));
return ran.doub();
}

int main()
{
Doub sample;
Doub minval = 99999;
Doub maxval = -1.0;
Doub prob[NUMPROBS];
int count[NUMPROBS] = {0}; // intitalize to all zeros

Doub s = 3.0; // the zipf distribution parameter

Doub excess = calcZetaProbs(s, prob, NUMPROBS);

Zipf dz(uDoub, s);

int numsamples = 1000000;

cout << "For s = " << s;

cout << ", number of samples = " << numsamples << endl << endl;

cout << scientific;

int iexcess = 0;
int n;

for (int i = 0; i < numsamples; i++) {
sample = dz.next() + 1.0; //<--dz() returns numbers starting at zero
n = static_cast<int>(sample);
if (n < NUMPROBS) {
++count[n];
}
else {
++iexcess;
}
if (sample < minval) {
minval = sample;
}
if (sample > maxval) {
maxval = sample;
}
}
cout << "Min val = " << minval
<< ", Max val = " << maxval
<< endl << endl;
cout << setw(8) << "Number"
<< setw(13) << "Actual"
<< setw(13) << "Expected"
<< endl;
for (int i = 0; i < 34; i++) { // sum of above field widths
cout << '_';
}
cout << endl;

cout << fixed << setprecision(2) << showpoint;
for (int i = 1; i < NUMPROBS; i++) {
cout << setw(8) << i
<< setw(13) << count[i]
<< setw(13) << prob[i]*numsamples << endl;
}

if (iexcess) {
cout << setw(8) << "Outliers"
<< setw(13) << iexcess
<< setw(13) << excess*numsamples << endl
<< endl
<< "Note:" << endl
<< " \"Outliers\" indicates the number of sample values "
<< "greater than " << NUMPROBS-1 << endl;
}
cout << endl;

return 0;
}

//
// Quick and dirty (well, not that quick) approximation
// of the Riemann Zeta function.
// It's used in this test program to generate probabilities
// for comparison with data from the discrete Zipf distribution
// function. As we all know, it is notoriously slow to converge
// near s == 1, but it works "good enough" for simple
// testing of the sample generator function.
//
Doub riemannZeta(Doub s)
{
Doub term;
Doub sum = 0;
Doub i;
for (i = 1; i < 1000000; i++) {
term = pow(i, -s);
sum += term;
if (term < 1.0e-15*sum) {
break;
}
}
return sum;
}

//
// Calculates probabilities using Zeta distribution.
// Return value represents the probability tail beyond
// the number of probabilities in the given array
//
Doub calcZetaProbs(Doub s, Doub *p, int n)
{
int i;
Doub excess = 1.0;
Doub zeta = riemannZeta(s);
p[0] = 0;
for (i = 1; i < n; i++) {
p[i] = 1.0/(pow(i,s)*zeta);
excess -= p[i];
}
return excess;
}

Output:

For s = 3, number of samples = 1000000

Min val = 1.000000e+00, Max val = 1.255000e+03

Number Actual Expected
__________________________________
1 832225 831907.37
2 103745 103988.42
3 30769 30811.38
4 13007 12998.55
5 6729 6655.26
6 3758 3851.42
7 2324 2425.39
8 1607 1624.82
9 1220 1141.16
10 811 831.91
11 642 625.02
.
.
.

Regards,

Dave

legalrights

08-28-2009, 10:48 AM

Hi,

I am trying to simulate a mobile network (within a rectangular area) where
nodes move around following power law distribution, i.e., a node moves to a
specific position A with probability c*(1/d)^m, where c and m are the
constant and exponent of the power law distribution, respectively. d is the
distance from position A to the node's initial location.

Currently I have two possible approaches to do this job. None of them is
perfect. In the first approach, the whole area is divided into small cells
so that I can calculate the probability that a node can drop in each cell.
Then a series of uniformly distributed random number can be used to
approximate the power law distribution. However, in order to obtain quality
approximation, the cell must be very small. Thus, the computation complexity
and the memory requirement can be extremly high, if the network size or node
number increase.Following this approach, I can get a series of random number following power
law distribution. This number, however, is actually the distance to a node's
initial location. How to locate a position using this distance is also a
problem since there are multiple positions (actually it may be multiple
curves) with the same distance. An intuitive way is to generate another
uniformly distributed random number B (for example, the angle) so that one
of the positions can be picked up. The problem is how to determing the
upper/lower bounds of this uniformly distributed random number B. This may
also dramatically increase the compexity.

There are also some articles discussing how to generate network with node's
out degree following power law distribution. I don't know if/how I can use
them for my purpose.

Does anyone know about any ready software package doing this job? I really
appreciate if you can share some useful information with me. Please also
kindly let me know if I have any misunderstanding.