Fit a Gaussian function to data [Archive]

weeklens

10-20-2010, 07:55 AM

Hi there,

I am new in the C++ business and also in NR. I need to fit a Gaussian function to my data.

My data consist of 2 vectors x and y, where y(x) is (in a very good
approximation) a Gaussian.

Can anyone send me an example of fitting functions?

Thanks in advance

davekw7x

10-20-2010, 10:26 AM

...I need to fit a Gaussian function to my data...

Here's a test program that generates noisy samples from a known Gaussian function and shows how to use Fitmrq to estimate the parameters:

//
// Demonstration of use of Numerical Recipes Fitmrq.
//
// The original is a Gaussian function, and the data
// points are corrupted by additive noise, similar to
// "measurement noise" that is sometimes encountered.
//
// The Numerical Recipes distribution conveniently supplies
// a function that can be fed to Fitmrq to try to model data
// that consists of samples from sum of any number of
// Gaussian-type functions.
//
//
// davekw7x
//

// Always include nr3.h first
#include "../code/nr3.h"

// For Normaldev (to generate measurement noise)
#include "../code/ran.h"
#include "../code/gamma.h"
#include "../code/deviates.h"

// For Fitmrq
#include "../code/gaussj.h"
#include "../code/fitmrq.h"

// The fgauss() function is in the file fit_examples.h
#include "../code/fit_examples.h"

//
// Simplify appearance of expressions with x*x by using an in-line function
//
inline Doub sqr(const Doub & x)
{
return x * x;
}

int main()
{
int i;

//
// Try with different values, say 0.1 or some such thing.
// See how it affects the fit and see how it affects
// Chi-squared and the error estimate.
//
Doub noise = 0.01;// Standard deviation of generated noise

//
// For debugging, use the same random number seed every time
// so that you can see the results of changing different program
// parameters such as noise and initial guess.
//
Normaldev ran(0, noise, 12345678); // Gaussian noise generator

//
// For repeated runs with different noise values, seed
// the random number generators with something that gives
// a different value each time (assuming that the
// runs are more than a second apart).
//
//Normaldev ran(0, noise, time(0)); // Gaussian noise generator

//
// Try with different numbers of points to see how good
// the fit is. For a "good" fit, the vaoue of Chi-squared
// should be something on the order of the number of points.
//
Int num_points = 101;

// In order to use fgauss on K Gaussians, we need
// some vectors with size 3*K. So, for one Gaussian
// we have size = 3
//
// The three parameters contain amplitude, mean,
// and standard deviation of a Gaussian function that is
// used to generate the data points for testing.
//
// This is the actual array of parameters for the Gaussian
// function:
//
const Doub actual_d[] = { 5.0, 2.0, 3.0};

//
// This is the "guess" that we supply to fitmrq:
//
// Try it with a "pretty good" guess:
const Doub guess_d[] = {4.8, 2.2, 2.8};

//
// Just for kicks, you might want to try it with a "not so pretty
// good" guess. Maybe you will get lucky; maybe not.
// The point is that Levenberg-Marquardt requires an initial
// guess of some kind. If you are trying to fit a bunch
// of points to a guassian function without having any clue
// as to the actual parameters, well...
//
//const Doub guess_d[] = {1.0, 2.5, 1.0};
//const Doub guess_d[] = {0.1, 0.1, 0.1};

//
// The number of elements in an array is needed to initialize
// VecDoub objects.
//
const Int ma = sizeof(actual_d) / sizeof(actual_d[0]);

VecDoub actual(ma, actual_d);
VecDoub guess(ma, guess_d);
VecDoub x(num_points);
VecDoub y(num_points);
VecDoub sd(num_points);

//
// Fill array y with values for the gaussian function, corrupted
// by additive noise.
//
// The sd array contains estimates of the standard deviations
// of the elements of y. We use a "measurement error" factor
// equal to the standard deviation of the noise distribution.
//
Doub xstart = 0.0;
Doub xmax = 10.0;
Doub deltax = xmax / (num_points - 1);
Doub xx;
Doub r;
cout << fixed << showpos;
for (i = 0, xx = xstart; i < num_points; i++, xx += deltax) {
x[i] = xx;
r = ran.dev();
//
// This is the actual value with given parameters
//
y[i] = actual[0]*exp(-sqr((xx - actual[1]) / actual[2])) +
actual[3]*exp(-sqr((xx - actual[4]) / actual[5]));
//
// Add an amount of noise proportional to the value
// of the actual datq point.
//
y[i] += r*y[i];
//
// If you don't know what the noise is, you can try
// somethihng like sd[i] = y[i]; or even sd[i] = 1.0;
//
// These may (or may not) give an acceptable fit, but
// the value of Chi-squared won't be helpful in deciding
// whether it's a "good" fit.
//
// But really...are you trying just to fit a bunch
// of noisy values to a gaussian curve of some sort
// without knowing anything about the actual function
// or the measurement noise? Really? Oh, well...
//
sd[i]= noise*y[i];
//
// If you want to see what the program is working on,
// uncomment the following statement:
//
//cout << "y[" << i << "] = " << y[i] << ", r = " << r
//<< ", sd[" << i << "] = " << sd[i] << endl;
}

//
// Use the constructor to bind the data
//
Fitmrq mymrq(x, y, sd, guess, fgauss); // Use default tolerance 1.0e-3

//
// Now do the fit
//
mymrq.fit();

cout << "Initial guess: ";
for (int i = 0; i < guess.size(); i++) {
cout << guess[i] << " ";
}
cout << endl;
cout << "noise : " << noise << endl << endl;
cout << scientific << setprecision(5);
cout << setw(10) << "Actual"
<< setw(16) << "Fitted"
<< " "
<< setw(13) << "Err. Est."
<< endl;

for (i = 0; i < actual.size(); i++) {
cout << setw(13) << actual[i]
<< setw(16) << mymrq.a[i]
<< " +- " << sqrt(mymrq.covar[i][i])
<< endl;
}
cout << endl;

cout << fixed;
cout << "Number of points = " << num_points
<< ", Chi-squared = " << mymrq.chisq << endl;

return 0;
}

Output:

Initial guess: +4.800000 +2.200000 +2.800000
noise : +0.010000

Actual Fitted Err. Est.
+5.00000e+00 +4.99186e+00 +- +7.13650e-03
+2.00000e+00 +1.99912e+00 +- +3.85874e-03
+3.00000e+00 +3.00091e+00 +- +1.76862e-03

Number of points = +101, Chi-squared = +108.19790

Regards,

Dave

[/begin edit]
It has been pointed out that my calculation of the actual values was erroneous.
Here's the buggy part:

//
// This is the actual value with given parameters
//
y[i] = actual[0]*exp(-sqr((xx - actual[1]) / actual[2])) +
actual[3]*exp(-sqr((xx - actual[4]) / actual[5]));

Change that to

//
// This is the actual value with given parameters
//
y[i] = actual[0]*exp(-sqr((xx - actual[1]) / actual[2]));

[/end edit]

davekw7x

10-20-2010, 10:52 AM

...fast reply...
Why are you summing 2 Gaussians?
Oops. It my response was a little too fast (and sloppy).

That's a mistake, left over from a previous test program with data from the sum of two Gaussian functions.

Anyhow...

The calculation for the "actual" value should be

//
// This is the actual value with given parameters
//
y[i] = actual[0]*exp(-sqr((xx - actual[1]) / actual[2]));

I regret the error. (Such things can cause program crashes, not to mention wrong answers.)

Try it again.

Regards,

Dave

weeklens

10-20-2010, 10:54 AM

Thanks Dave!

Thanks for the fast replies and for keeping this forum so up to date! :D

davekw7x

10-20-2010, 11:40 AM

Thanks Dave!It was my pleasure.

Sometimes I have already worked on examples that people ask about. Sometimes I have to dig a little.

Anyhow...

Note that I have no connection with the authors of the book or the code or with the maintainers of the web site. Examples that I give and opinions I express are not approved by anyone but me. (See Footnote.)

I like to learn stuff by trying to see problems from other people's point of view. That's why I hang around (from time to time).

Regards,

Dave

Footnote:
"The opinions expressed here are not necessarily my own.
It's these dang voices in my head."
---davekw7x