evgeny
06-20-2006, 01:42 AM
Hi,
I suggest a few updates in svdcmp() function to speed up it and do more reliable:
1) replace all a[i][j] expressions on *(*(a+i)+j) to allow direct using of pointers on matrix rows and rewrite all row-oriented operations (especially in right-hand reduction code part or row normalization) such that internal loop runs on j with introducing a temporary pointer variable *(a+i) before each internal loop.
For example:
pr_tmp1 = *(a + i);
for (k = l; k < n; k++){
scale += fabs(*(pr_tmp1+k));
}
2) point order of operations in complex multiplication:
f = (c * g) + (s * y);
x = (c * y) - (s * g);
3) do this complex multiplication on assembler
4) point order of operations in below operator:
v[j][i] = (u[i][j]/u[i][l]) / g;
5) calculate, where it is possible, instead of division an inverse value out of loop:
uInv = 1.0 / u[i][l];
and than do loop on j
v[j][i] = (u[i][j] * uInv) / g;
6) point order of operations in below operator:
f = (r_s / a[i][i]) * g;
7) introduce more simple pythag1() function which calculates
sqrt(1+a*a) only instead of sqrt(a*a+b*b)
8) do initialization at start to avoid compiler complans:
g=0.0;
x=0.0;
anorm=0.0;
l = 1;
9) do abort() if dimensions of input matrix less than 2
Evgeny
I suggest a few updates in svdcmp() function to speed up it and do more reliable:
1) replace all a[i][j] expressions on *(*(a+i)+j) to allow direct using of pointers on matrix rows and rewrite all row-oriented operations (especially in right-hand reduction code part or row normalization) such that internal loop runs on j with introducing a temporary pointer variable *(a+i) before each internal loop.
For example:
pr_tmp1 = *(a + i);
for (k = l; k < n; k++){
scale += fabs(*(pr_tmp1+k));
}
2) point order of operations in complex multiplication:
f = (c * g) + (s * y);
x = (c * y) - (s * g);
3) do this complex multiplication on assembler
4) point order of operations in below operator:
v[j][i] = (u[i][j]/u[i][l]) / g;
5) calculate, where it is possible, instead of division an inverse value out of loop:
uInv = 1.0 / u[i][l];
and than do loop on j
v[j][i] = (u[i][j] * uInv) / g;
6) point order of operations in below operator:
f = (r_s / a[i][i]) * g;
7) introduce more simple pythag1() function which calculates
sqrt(1+a*a) only instead of sqrt(a*a+b*b)
8) do initialization at start to avoid compiler complans:
g=0.0;
x=0.0;
anorm=0.0;
l = 1;
9) do abort() if dimensions of input matrix less than 2
Evgeny