Is the use of weighted least squares regression appropriate for solid and hazardous waste measurements of cyanide and phenol instead of linear regression, specifically as it relates to EPA 9012B and EPA 9066 sample analyses?

Ordinary least squares (OLS) regression is a special case of weighted least squares (WLS) regression in which every weighting factor is assigned a value of one. OLS is straightforward, especially when applied to a linear model, and its rationale and underlying mathematics are easily understood. However, OLS implicitly assumes that the variance of the response is constant across the entire calibration range. This assumption is often not valid, because the absolute variability of the response commonly increases with concentration. As a result, unweighted OLS regression tends to generate calibration curves that fit the higher-concentration points more closely than the lower-concentration points.
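The relationship between the two approaches can be illustrated with a minimal sketch. It assumes a straight-line model and uses made-up calibration values; the function name wls_line and the data are hypothetical and do not come from any EPA method. The function minimizes the weighted sum of squared residuals, and with all weights set to one it should reproduce an ordinary unweighted fit.

```python
import numpy as np

def wls_line(x, y, w):
    """Fit y = b0 + b1*x by minimizing sum(w_i * (y_i - b0 - b1*x_i)**2)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    xbar = np.sum(w * x) / np.sum(w)   # weighted mean of x
    ybar = np.sum(w * y) / np.sum(w)   # weighted mean of y
    b1 = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical calibration standards (concentration) and instrument responses.
conc = np.array([0.01, 0.02, 0.05, 0.10, 0.20, 0.40])
resp = np.array([0.012, 0.021, 0.049, 0.103, 0.196, 0.405])

# WLS with every weight equal to one ...
b0_unit, b1_unit = wls_line(conc, resp, np.ones_like(conc))

# ... compared with NumPy's unweighted (OLS) straight-line fit.
slope_ols, intercept_ols = np.polyfit(conc, resp, 1)

print("unit-weight WLS:", b0_unit, b1_unit)
print("OLS (polyfit):  ", intercept_ols, slope_ols)
```

The two printed lines should agree to within floating-point rounding, which is the sense in which OLS is WLS with unit weights.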

Weighted least squares (WLS) regression is an attractive alternative and, in principle, yields a more representative fit of the experimental data when properly applied to a sufficient number of data points. However, estimating appropriate weighting factors is challenging, and poor estimates may significantly degrade the fit of the calibration model to the empirical data.

For example, estimating weighting factors directly from variances requires that replicate calibration standards be analyzed at each concentration level. This is generally not a practical approach for a production laboratory. SW-846 Method 8000 discusses approaches to calibration for gas chromatography methods and provides example weighting factors: the reciprocal of the concentration, the reciprocal of the concentration squared, or the reciprocal of the variance (if available). Each approach has advantages as well as limitations.
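As a rough illustration of the reciprocal-concentration weighting schemes mentioned above, the following sketch fits the same hypothetical calibration data three ways and reports the relative error of the back-calculated concentration at the lowest standard. The data values and the helper name fit are assumptions made for illustration. Note that numpy.polyfit applies its w argument to the unsquared residuals, so the square root of each weighting factor is passed.

```python
import numpy as np

# Hypothetical calibration standards and responses (illustrative values only).
conc = np.array([0.01, 0.02, 0.05, 0.10, 0.20, 0.40])
resp = np.array([0.012, 0.021, 0.049, 0.103, 0.196, 0.405])

def fit(weights):
    """Straight-line fit minimizing sum(weights * residual**2)."""
    # np.polyfit multiplies each unsquared residual by w, so pass sqrt(weights).
    slope, intercept = np.polyfit(conc, resp, 1, w=np.sqrt(weights))
    return slope, intercept

fits = {
    "unweighted": fit(np.ones_like(conc)),
    "1/x": fit(1.0 / conc),
    "1/x^2": fit(1.0 / conc**2),
}

for name, (slope, intercept) in fits.items():
    # Back-calculate each standard's concentration from its response.
    back = (resp - intercept) / slope
    rel_err = 100.0 * (back - conc) / conc
    print(f"{name:>10}: slope={slope:.4f} intercept={intercept:.5f} "
          f"relative error at lowest standard={rel_err[0]:+.1f}%")
```

With real calibration data, the same comparison could be made at every level to judge whether a given weighting scheme is supported by the data, rather than assumed in advance.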

WLS may be an appropriate calibration modeling approach for cyanide and phenol if supported by the empirical data, the characteristics of the methods, and other program-specific needs. Any model applied to actual data should do more than just “fit the data”: it should be supported by the analytical methodology, the detection characteristics, and the physicochemical behavior of the analyte within the measurement system. For example, in optical absorption spectroscopy, we have a well-defined physical model, the Beer-Lambert Law, which describes the linear relationship between analyte concentration and absorption of incident radiation. If we measured calibration sample responses and found that a quadratic model yielded an acceptable representation of those data, use of that model would contradict what we know about the absorption of light and would not be valid (even though it “worked”).

You can find a summary overview of various EPA programs’ uses and needs for calibration curve modeling in the following document:

http://www2.epa.gov/sites/production/files/2014-05/documents/calibration-guide-ref-final-oct2010.pdf

Other Category: Detection & Quantitation