Fanning Software Consulting

Are CURVEFIT Sigma Values Screwy?

QUESTION: Folks, I'm at my wits' end here and don't know what's going on.

I need to do a chi-square minimization curve fitting of a nonlinear parametrized function to noisy data. Using the routine CURVEFIT works admirably for the fitting itself, and the chi-square comes out okay, but the values returned for the standard deviations (in the optional Sigma keyword) on the fitted coefficients just don't make sense to me.

Looking at the source code (IDL 5.3) and comparing it with the Levenberg-Marquardt algorithm in Numerical Recipes (Section 15.5), I am still unable to pinpoint the error (whether in my code or in my admittedly lacking understanding of chi-square statistics).

So here is some code I wrote (FTEST.PRO) for testing CURVEFIT against a routine I know works, LINFIT. It generates a noisy straight line, fits a line using both LINFIT and CURVEFIT, and plots both fits, together with the fits offset by one sigma on the coefficients.
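
The original FTEST.PRO is not reproduced here, but a minimal sketch along the same lines might look like the following. The procedure name LINEFUNC, the starting guesses, and the particular line parameters and noise level are illustrative assumptions, not the original code.

PRO LINEFUNC, x, a, f, pder
   ; Straight line y = a[0] + a[1]*x, with analytic partial derivatives.
   f = a[0] + a[1]*x
   IF N_PARAMS() GE 4 THEN pder = [[REPLICATE(1.0, N_ELEMENTS(x))], [x]]
END

PRO FTEST
   n = 25
   x = FINDGEN(n)
   y = -14.0 + 3.0*x + 3.4*RANDOMN(seed, n)      ; noisy straight line (illustrative values)

   ; Fit with LINFIT (no MEASURE_ERRORS keyword).
   lcoeff = LINFIT(x, y, SIGMA=lsigma, CHISQR=lchisq)
   PRINT, 'LINFIT parameters, sigma, and chi-square : '
   PRINT, lcoeff & PRINT, lsigma & PRINT, lchisq

   ; Fit with CURVEFIT, using unit weights.
   weights = REPLICATE(1.0, n)
   a = [0.0, 1.0]                                ; starting guess
   yfit = CURVEFIT(x, y, weights, a, csigma, $
      FUNCTION_NAME='LINEFUNC', CHISQ=cchisq)
   PRINT, 'CURVEFIT parameters, sigma, and chi-square : '
   PRINT, a & PRINT, csigma & PRINT, cchisq

   ; Plot the data and both fits, each offset by one sigma on the coefficients.
   PLOT, x, y, PSYM=4
   OPLOT, x, lcoeff[0] + lcoeff[1]*x
   OPLOT, x, (lcoeff[0]+lsigma[0]) + (lcoeff[1]+lsigma[1])*x, LINESTYLE=1
   OPLOT, x, a[0] + a[1]*x
   OPLOT, x, (a[0]+csigma[0]) + (a[1]+csigma[1])*x, LINESTYLE=2
END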

You can see the discrepancy in the one-sigma lines. Can someone tell me what's up with the sigma values returned from CURVEFIT, and how I can make them conform?

ANSWER: This answer comes from a series of IDL newsgroup articles, entitled "CURVEFIT.PRO standard deviations?", started by Ralf Flicker on 10 May 2002. As usual, I paraphrase and edit for (what I hope is) clarity. You might want to search for the original articles for additional insight.

First, more clues from Andrew Noymer.

I'm not sure exactly what's going on, but I take an interest in this because I use these sorts of procedures.

I fed your code into IDL and modified it so the data were also written out. I then fed the points into Stata (www.stata.com).

Here is what I found:

LINFIT parameters, sigma, and chi-square : 
     -13.7844      2.91336
      1.32243    0.0944590
      266.783
CURVEFIT parameters, sigma, and chi-square : 
     -13.7839      2.91333
     0.388221    0.0277362
      11.5993

So I conclude that the parameters are the same (for all intents and purposes) between the two procedures, but the chi-square and sigma values are different. Here's what Stata gives me:


      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  1,    23) =  951.27
       Model |  11034.0003     1  11034.0003           Prob > F      =  0.0000
    Residual |  266.782989    23  11.5992604           R-squared     =  0.9764
-------------+------------------------------           Adj R-squared =  0.9754
       Total |  11300.7833    24   470.86597           Root MSE      =  3.4058

------------------------------------------------------------------------------
          v1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        var2 |   2.913364    .094459    30.84   0.000     2.717961    3.108768
       _cons |  -13.78436   1.322426   -10.42   0.000       -16.52   -11.04871
------------------------------------------------------------------------------

Stata's coefficients are (of course) the same. What CURVEFIT calls chi-square, Stata calls the residual MS (mean squared error). And it looks like LINFIT gives the standard errors of the coefficients that I would use if I were you. The LINFIT chi-square is what Stata calls the residual SS (sum of squared errors).
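
In fact, the numbers bear this out: the LINFIT chi-square divided by the 23 residual degrees of freedom, 266.783 / 23 = 11.599, reproduces both the CURVEFIT chi-square and Stata's residual MS.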

I'm not sure how the CURVEFIT sigma values are calculated but I would not use them if I were you, without knowing exactly where they come from.

Ralf Flicker replies.

No, you're right, I can't use these CURVEFIT sigma values as they come, but I think I've found the culprit. Just for the sake of it, I also included LMFIT in the test above, and it returned exactly the same numbers as CURVEFIT. Now, I'm very much inclined to think that I just don't understand the numbers, rather than that the procedures are doing something spurious.

A closer inspection reveals something patently absurd about the CURVEFIT and LMFIT sigmas: they don't vary with the noise in the data (i.e., they are always the same). This means that they must have been normalized to the current chi-square (whoever would come up with such an idea ought to be flogged in public). Furthermore, the chi-square values sometimes are and sometimes are not divided by the number of degrees of freedom (as, indeed, the final value should be). So, comparing the three procedures LINFIT, CURVEFIT, and LMFIT, I surmise that the following transformations need to be applied to have them return the same numbers (which hopefully are the "right" ones too):

LINFIT :   chisq = chisq_0/nfree

CURVEFIT : sigma = sigma_0*sqrt(chisq_0)

LMFIT :    chisq = chisq_0/nfree
           sigma = sigma_0*sqrt(chisq)

where chisq_0 and sigma_0 denote the values returned by the procedure, and nfree is the number of degrees of freedom:

nfree = n_elements(data)-n_elements(parameters)
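
In IDL terms, and assuming the fits were done with unit weights as in FTEST.PRO, the corrections amount to something like the following sketch (the variable names are illustrative; each holds the raw value returned by the corresponding routine):

nfree = N_ELEMENTS(y) - N_ELEMENTS(params)       ; degrees of freedom

chisq_lin = chisq_lin / nfree                    ; LINFIT: reduce the chi-square
sigma_cf  = sigma_cf * SQRT(chisq_cf)            ; CURVEFIT: scale the sigmas
chisq_lm  = chisq_lm / nfree                     ; LMFIT: reduce the chi-square...
sigma_lm  = sigma_lm * SQRT(chisq_lm)            ; ...then scale the sigmas by it

The numbers above check out: scaling the CURVEFIT sigma by the square root of its chi-square, 0.0277362 * SQRT(11.5993) = 0.0945, recovers the LINFIT value of 0.0944590.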

One would think that they could bloody well inform you of these seemingly arbitrary variations in normalization in the preamble of the online help descriptions.

Next, some badly needed clarification from the IDL Math Guy, Craig Markwardt.

Ralf, these are not necessarily patently absurd. The "sigma" values reported by a fitting program are typically meant to give the confidence limits on a parameter corresponding to a change in the chi-squared of +1.

Since you left your fit unweighted (i.e., WEIGHTS=1), the resulting chi-squared value is too high. You have weighted your data so highly that the fit parameters are hypersensitive to any fluctuations in the data. Hence the confidence region is artificially too small. What you need to do is decrease the weights, corresponding to increasing the individual uncertainties, until you achieve a reduced chi-squared of order unity. Then the parameter errors reported by CURVEFIT or MPFITFUN / MPCURVEFIT will be appropriate. This is more or less described in the MPFIT.PRO documentation for the PERROR keyword.
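
As a hedged illustration of what Craig describes, here is one way to rescale the weights after the fact when the individual uncertainties are unknown, reusing the hypothetical LINEFUNC procedure from the sketch above. Note that the CHISQ keyword of CURVEFIT already returns the reduced chi-squared.

; First pass: unit weights, just to measure the scatter about the fit.
weights = REPLICATE(1.0, N_ELEMENTS(y))
a = [0.0, 1.0]                                   ; starting guess
yfit = CURVEFIT(x, y, weights, a, sigma, $
   FUNCTION_NAME='LINEFUNC', CHISQ=chisq)

; With unit weights the reduced chi-squared estimates the per-point
; variance, so dividing the weights by it (i.e., inflating the individual
; uncertainties) drives the reduced chi-squared toward unity.
weights = weights / chisq
yfit = CURVEFIT(x, y, weights, a, sigma, $
   FUNCTION_NAME='LINEFUNC', CHISQ=chisq)

; Now CHISQ is of order one and SIGMA agrees with the LINFIT errors.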

Note that Craig's wonderful curve fitting routines, including MPCURVEFIT, are available on-line and are often preferred to IDL's own.

Finally, confirmation from Bill Thompson that weights are needed to make sense of all this.

I tested this with one of my own programs, under IDL/v5.4, and got the following:

LINFIT parameters, sigma, and chi-square : 
     -7.67345      2.36052
      11.8136     0.843827
      21290.1
CURVEFIT parameters, sigma, and chi-square : 
     -7.67969      2.36077
     0.388235    0.0277367
      925.657
LSTSQR parameters, sigma, and chi-square (w/o): 
      -7.6734443       2.3605204
       11.813374      0.84381246
       925.65655
LSTSQR parameters, sigma, and chi-square (with): 
      -7.6734443       2.3605204
      0.38828358     0.027734542
       925.65655

My chi-squared values agree with those from CURVEFIT. I can replicate either the LINFIT or CURVEFIT errors depending on how I define the weights to be applied to the data. The two cases are described below.

Case 1: No errors passed

In the LINFIT program, as in my own, the errors are passed in through an optional keyword; for LINFIT, this is the MEASURE_ERRORS keyword. When the keyword is omitted (my "w/o" case above), the LINFIT program tries to estimate the errors in the data from the reduced chi-squared value. This behavior in LINFIT is clearly discussed in the documentation:

;        Note: if MEASURE_ERRORS is omitted, then you are assuming that the
;        linear fit is the correct model. In this case,
;        SIGMA is multiplied by SQRT(CHISQ/(N-M)), where N is the
;        number of points in X. See section 15.2 of Numerical Recipes
;        in C (Second Edition) for details.

Apparently it manages to do this correctly, even though it reports a completely bogus chi-squared. I believe that someone mentioned that the chi-squared problem is fixed in v5.5?

Case 2: Error bars passed

In the CURVEFIT program, the error bars are passed in through the WEIGHTS parameter in the calling sequence. There doesn't seem to be any way to make this optional. When you put in a WEIGHTS array of all ones, you're saying that the error in each point is +/- 1.0. If you put in more realistic weights, you'd get a more realistic error and chi-squared. With realistic weights, you should get a chi-squared of about one.
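
For instance, if the one-sigma measurement error of each point is known, the usual Gaussian prescription is WEIGHTS = 1/error^2 for CURVEFIT and MEASURE_ERRORS = error for LINFIT. A short sketch, where err is a hypothetical array of per-point errors and LINEFUNC is the illustrative procedure from earlier:

; err is assumed to hold the one-sigma measurement error of each point.

; LINFIT takes the errors themselves ...
lcoeff = LINFIT(x, y, MEASURE_ERRORS=err, SIGMA=lsigma, CHISQR=lchisq)

; ... while CURVEFIT takes statistical weights, 1/err^2 for Gaussian noise.
weights = 1.0 / err^2
a = [0.0, 1.0]
yfit = CURVEFIT(x, y, weights, a, csigma, $
   FUNCTION_NAME='LINEFUNC', CHISQ=cchisq)

; With realistic error bars the two routines report comparable sigmas, and
; CURVEFIT's reduced chi-squared comes out near one.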

I was able to replicate your CURVEFIT results in my own program by passing in an array of error bars of unity. The same happens with LINFIT (except for the chi-squared) when MEASURE_ERRORS is passed with all ones.

In short, LINFIT returned the correct SIGMA values, but the wrong CHISQR. CURVEFIT would have returned the correct sigma and chi-squared values if an appropriate WEIGHTS array had been passed.
