Spearman's Rank Problem


txhughes
07-08-2007, 06:19 PM
Hi, I am trying to calculate the rank correlation coefficient between two data sets and have used the spear subroutine for Fortran 77 (together with the other required routines: betai, crank, erfcc, and sort2). Upon testing, I discovered that the rs value never falls below 0.96, with a probability of 0.0, even for datasets that obviously have no correlation or a strong anticorrelation. I have double-checked my typing of the subroutines and the values passed to them but cannot locate an error.

Can anyone point me in the right direction as to where the cause of the problem might lie? Or any suggestions for solutions?

Any help would be greatly appreciated!

Regards, Tom

txhughes
07-21-2007, 04:58 AM
I have found the problem: the method used to calculate Spearman's rank coefficient is wrong. Rather than taking the difference of the ranks with the data in their original (unsorted) order, the subroutine takes the difference of the ranks of the sorted data. As a result, rs is always near 1.0, because the differences between the sorted ranks are very small, and the subroutine reports a strong positive correlation. This also explains why test datasets with known negative correlation or no correlation generated the same result as datasets with known positive correlation.

To rectify this problem, the subroutine needs to determine the ranks by sorting the data and then assign those ranks back to the positions of the elements in the unsorted data array. This can be done by comparing the unsorted data with the sorted data: wherever an element of the unsorted array equals an element of the sorted array, the rank of that sorted element is assigned to the position of the unsorted element.
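
For anyone who wants to cross-check their numbers, here is a minimal sketch of that idea in Python rather than Fortran 77. It is not the code of spear or crank; the function names (midranks, pearson, spearman_rs) are made up for this illustration. It assumes tied values should share their mean rank, and it computes rs as the ordinary linear correlation of the two rank arrays, which is not necessarily the formula spear uses internally.

```python
import math


def midranks(values):
    """Return 1-based ranks in the ORIGINAL order of `values`.

    Tied values all receive the mean of the ranks they span.
    """
    n = len(values)
    # Indices that would sort the data; the data itself is left untouched.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values equal to the one at sorted position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        # Write the rank back to each element's original position.
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Linear correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    # Note: this divides by zero if either input is constant.
    return cov / math.sqrt(var_a * var_b)


def spearman_rs(x, y):
    """Spearman rank correlation as the linear correlation of the ranks."""
    return pearson(midranks(x), midranks(y))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(spearman_rs(x, [10.0, 8.0, 6.0, 4.0, 2.0]))  # perfectly anticorrelated
    print(spearman_rs(x, [2.0, 4.0, 6.0, 8.0, 10.0]))  # perfectly correlated
```

A perfectly anticorrelated test set like the first one is a quick sanity check: any routine that returns an rs near +1 for it is not pairing the ranks correctly.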

I have changed the routine to use the correct method and it is working fine now (all datasets of known correlation generate the expected results). I hope I have explained this correctly and that it may help someone trying to use these routines in the future.

Tom

Bill Press
07-21-2007, 01:46 PM
Hello, Tom.
We can't reproduce this bug. The routine sort2 definitely moves the elements in the two arrays in lockstep as pairs, so the difference of ranks is the correct one. It sounds like you are somehow using a wrong sorting routine, not sort2, or perhaps there is some other bug in your code.
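
To illustrate what "in lockstep" means, here is a minimal sketch in Python. It is not the actual sort2 or spear code; the function names are hypothetical, and it assumes data without ties so that the simple formula rs = 1 - 6*d/(n^3 - n) applies. The second function shows one way the pairing could be lost (sorting each array on its own), purely as an illustration and not as a diagnosis of any particular program.

```python
def spearman_lockstep(x, y):
    """Rank both arrays while keeping each (x, y) pair together (no ties)."""
    n = len(x)
    # Sort by x, carrying each y along with its partner.
    pairs = sorted(zip(x, y), key=lambda p: p[0])
    # Replace x by its rank; each rank stays attached to its original y.
    pairs = [(rank, p[1]) for rank, p in enumerate(pairs, start=1)]
    # Sort by y, carrying the x rank along.
    pairs.sort(key=lambda p: p[1])
    # Replace y by its rank; each pair now holds (rank of x, rank of y).
    pairs = [(p[0], rank) for rank, p in enumerate(pairs, start=1)]
    d = sum((rx - ry) ** 2 for rx, ry in pairs)
    return 1.0 - 6.0 * d / (n ** 3 - n)


def spearman_independent_sort(x, y):
    """Deliberately wrong: each array is sorted alone, so the pairing is lost."""
    n = len(x)
    # After independent sorts, position i holds rank i + 1 in BOTH arrays.
    rank_x = [i + 1 for i, _ in enumerate(sorted(x))]
    rank_y = [i + 1 for i, _ in enumerate(sorted(y))]
    d = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * d / (n ** 3 - n)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [10.0, 8.0, 6.0, 4.0, 2.0]  # perfectly anticorrelated with x
    print(spearman_lockstep(x, y))
    print(spearman_independent_sort(x, y))
```

With the pairs kept together, the anticorrelated example gives the expected negative coefficient; with independent sorts the rank differences vanish regardless of the input.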
Cheers,
Bill P.