# Are CURVEFIT Sigma Values Screwy?

**QUESTION:** Folks, I'm at wits' end here, don't know what's going on.

I need to do a chi-square minimization curve fitting of a nonlinear
parametrized function to noisy data. Using the routine **CURVEFIT**
works admirably for the fitting itself, and the chi-square comes out
okay, but the values returned for the standard deviations (in the
optional **Sigma **keyword) on the fitted coefficients just don't make sense
to me.

Looking at the source code (idl 5.3) and comparing with the Levenberg-Marquardt algorithm in Numerical Recipes (chapter 15.5), I am still unable to pinpoint the error (whether in my code or in my admittedly lacking understanding of chi-square statistics).

So here's a code I wrote (FTEST.PRO) for testing **CURVEFIT **versus a
routine I know works, **LINFIT**. It generates a noisy straight line,
fits a line using both **LINFIT **and **CURVEFIT**, and plots both fits
together with the fits offset by one sigma on the coefficients.

You can see the discrepancy in the one-sigma lines. Can someone tell
me what's up with the sigma returned from **CURVEFIT**, and how I can
make them conform?

** ANSWER:** This answer comes from a series of IDL newsgroup
articles, entitled "CURVEFIT.PRO standard deviations?", started
by Ralf Flicker on 10 May 2002. As usual, I paraphase and edit for
(what I hope is) clarity. You might want to search for the original
articles for additional insight.

First, more clues from Andrew Noymer.

I'm not sure exactly what's going on, but I take an interest in this because I use these sorts of procedures.

I fed your code into IDL and modified it so the data were also written-out. I then fed the points into Stata (www.stata.com).

Here is what I found:

LINFIT parameters, sigma, and chi-square : -13.7844 2.91336 1.32243 0.0944590 266.783 CURVEFIT parameters, sigma, and chi-square : -13.7839 2.91333 0.388221 0.0277362 11.5993So I conclude that the parameters are the same (for all intents and purposes) between the two procedures, but the chi-square and sigma values are different. Here's what Stata gives me:

`Source | SS df MS Number of obs = 25 -------------+------------------------------ F( 1, 23) = 951.27 Model | 11034.0003 1 11034.0003 Prob > F = 0.0000 Residual | 266.782989 23 11.5992604 R-squared = 0.9764 -------------+------------------------------ Adj R-squared = 0.9754 Total | 11300.7833 24 470.86597 Root MSE = 3.4058 ------------------------------------------------------------------------------ v1 | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- var2 | 2.913364 .094459 30.84 0.000 2.717961 3.108768 _cons | -13.78436 1.322426 -10.42 0.000 -16.52 -11.04871 ------------------------------------------------------------------------------`

Stata's coefficient's are (of course) the same. What

CURVEFITcalls chi-square, Stata callsResidual MSE (mean square error). And it looks likeLINFITgives the Standard Errors of the coefficients that I would use if I were you. TheLINFITchi-square is what Stata calls theResidual SSE (sum square error).I'm not sure how the

CURVEFITsigma values are calculated but I would not use them if I were you, without knowing exactly where they come from.

Ralf Flicker replies.

No, you're right I can't use these

CURVEFITsigma values as they come, but I think I've found the culprit. Just for the sake of it, I also includedLMFITin the test above, which returned exactly the same numbers asCURVEFIT. Now, I'm very inclined to think that I just don't understand the numbers rather than that the procedures are doing something spurious.A closer inspection reveals something patently absurd with the

CURVEFITandLMFITsigmas: they don't vary with the noise in the data (i.e., they are always the same). This means that they must have been normalized to the current chi-square (whoever would come up with such an idea ought to be flogged in public). Futhermore, they sometimes are and sometimes are not divided by the number of degrees of freedom (as, indeed, the final value should be). So comparing the three proceduresLINFIT,CURVEFITandLMFIT, I surmise the following transformations that need to be applied to have them return the same numbers (which hopefully are the "right" ones too):LINFIT : chisq = chisq_0/nfree CURVEFIT : sigma = sigma_0*sqrt(chisq_0) LMFIT : chisq = chisq_0/nfree sigma = sigma_0*sqrt(chisq)where

chisq_0andsigma_0denote the values returned by the procedure, andnfreeis the number of degrees of freedom:nfree = n_elements(data)-n_elements(parameters)One would think that they could bloody well inform you of these seemingly arbitrary variations in normalization in the preamble of the online help descriptions.

Finally, some badly needed clarification from the IDL Math Guy, Craig Markwardt.

Ralph, these are not necessarily patently absurd. The "sigma" values of a fitting program typically mean to find the confidence limits on a parameter based on a change in the chi-squared of +1.

Since you left your fit unweighted (i.e.,

WEIGHTS=1), the resulting chi-squared value is too high. You have weighted your data so highly that the fit parameters are hypersensitive to any fluctuations in the data. Hence the confidence region is artificially too small. What you need to do is decrease the weights, corresponding to increasing the individual uncertainties, until you achieve a reduced chi-squared of order unity. Then the parameter errors reported by CURVEFIT or MPFITFUN / MPCURVEFIT will be appropriate. This is more or less described in the MPFIT.PRO documentation for the PERROR keyword.

Note that Craig's wonderful curve fitting routines,
including **MPCURVEFIT**, are available
on-line and are often preferred to IDL's own.

Finally, confirmation that weights are needed to make sense of all this from Bill Thompson.

I tested this with one of my own programs, under IDL/v5.4, and got the following:

LINFIT parameters, sigma, and chi-square : -7.67345 2.36052 11.8136 0.843827 21290.1 CURVEFIT parameters, sigma, and chi-square : -7.67969 2.36077 0.388235 0.0277367 925.657 LSTSQR parameters, sigma, and chi-square (w/o): -7.6734443 2.3605204 11.813374 0.84381246 925.65655 LSTSQR parameters, sigma, and chi-square (with): -7.6734443 2.3605204 0.38828358 0.027734542 925.65655My chi-squared values agree with those from

CURVEFIT. I can replicate either theLINFITorCURVEFITerrors depending on how I define the weights to be applied to the data. The two cases are described below.## Case 1: No errors passed

In the

LINFITprogram, as in my own, the errors are passed in through an optional keyword. (This is my "w/o" case.) ForLINFIT, this is done through the keywordMEASURE_ERRORS. When this keyword is omitted, theLINFITprogram tries to estimate the errors in the data based on the reduce chi-squared values. This behavior inLINFITis clearly discussed in the documentation:; Note: if MEASURE_ERRORS is omitted, then you are assuming that the ; linear fit is the correct model. In this case, ; SIGMA is multiplied by SQRT(CHISQ/(N-M)), where N is the ; number of points in X. See section 15.2 of Numerical Recipes ; in C (Second Edition) for details.Apparently it manages to do this correctly, even though it reports a completely bogus chi-squared. I believe that someone mentioned that the chi-squared problem is fixed in v5.5?

## Case 2: Error bars passed.

In the

CURVEFITprogram, the error bars are passed in through theWEIGHTSparameter in the calling sequence. There doesn't seem to be any way to make this optional. When you put in aWEIGHTSarray of all ones, you're saying that the error in each point is +/- 1.0. If you put in more realistic weights, you'd get a more realistic error and chi-squared. With realistic weights, you should get a chi-squared of about one.I was able to replicate your

CURVEFITresults in my own program by passing in an array of error bars of unity. The same happens withLINFIT(except for chi-squared), when theMEASURE_ERRORis passed with all ones.In short,

LINFITreturned the correctSIGMAAvalues, but the wrongCHISQR.CURVEFITwould have returned the correctSIGMAAandCHISQRvalues if an appropriateWEIGHTSarray had been passed.

Copyright © 2003 David W. Fanning

Last Updated 26 September 2003