SNW Data Analysis
Your Data. Your Direction.

Selected Past Projects


Ruby Fluorescence Decay

A friend of Sarah’s who works in academia had developed a laboratory exercise and was preparing to present it in a workshop at a conference. They had collected data and fit the known model in a spreadsheet, but wanted a more professional plot and an improved fit. By applying a least-squares fit using the Gauss-Newton method built into the nls() function in R, Sarah was able to find an improved fit. In just a couple of hours, she fit the curve, made a professional plot, and taught her friend how to do the same. The plot is below, along with some statistics that indicate the statistical significance of the fit and give 95% confidence limits for the parameters and for values predicted from the fit.

Plot shows exponential decay of voltage vs. time in ruby fluorescence decay data. Many data points are shown along an exponentially decaying curve from about 0.25 V near t = 0 s to about 0.01 V around t = 0.012 s. The spread in the data is about 0.02 V at any time. The data are fit to the function V(t) = V_0 exp(-t/tau) + B for the range shown, 0 s to about 0.016 s. The parameters matching the best fit are V_0 = 0.2328983, tau = 0.003700199, and B = 0.007836809.
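The fitting step can be illustrated with a minimal Python sketch. This is not the R code used in the project: it is a hand-written Gauss-Newton loop with simple step halving as a safeguard, run on synthetic data generated from parameters close to those reported above. The noise level, starting guess, and all function names are assumptions made for illustration.

```python
import math
import random

def model(t, v0, tau, b):
    """Exponential decay with constant background: V(t) = V0*exp(-t/tau) + B."""
    return v0 * math.exp(-t / tau) + b

def sse(ts, vs, p):
    """Sum of squared residuals for parameters p = [V0, tau, B]."""
    return sum((v - model(t, *p)) ** 2 for t, v in zip(ts, vs))

def solve3(a, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for i in reversed(range(3)):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x

def gauss_newton(ts, vs, p0, max_iter=100, tol=1e-12):
    """Least-squares fit of the decay model by Gauss-Newton with step halving."""
    p = list(p0)
    for _ in range(max_iter):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for t, v in zip(ts, vs):
            e = math.exp(-t / p[1])
            # Partial derivatives of the model with respect to V0, tau, B.
            jac = [e, p[0] * e * t / p[1] ** 2, 1.0]
            r = v - model(t, *p)
            for i in range(3):
                jtr[i] += jac[i] * r
                for j in range(3):
                    jtj[i][j] += jac[i] * jac[j]
        step = solve3(jtj, jtr)  # normal equations: (J^T J) step = J^T r
        current = sse(ts, vs, p)
        scale, accepted = 1.0, False
        while scale > 1e-4:
            trial = [pi + scale * si for pi, si in zip(p, step)]
            if trial[1] > 0:  # keep the time constant positive
                trial_sse = sse(ts, vs, trial)
                if trial_sse <= current:
                    accepted = True
                    break
            scale /= 2  # halve the step until the fit stops getting worse
        if not accepted:
            break
        p = trial
        if current - trial_sse <= tol * current:
            break
    return p

# Synthetic data (hypothetical, not the measured ruby data).
random.seed(0)
true_params = (0.233, 0.0037, 0.0078)
ts = [i * 0.016 / 399 for i in range(400)]
vs = [model(t, *true_params) + random.gauss(0.0, 0.005) for t in ts]

v0_hat, tau_hat, b_hat = gauss_newton(ts, vs, p0=[0.2, 0.005, 0.0])
print(f"V0 = {v0_hat:.4f}, tau = {tau_hat:.5f}, B = {b_hat:.4f}")
```

In practice a statistics package handles the Jacobian and convergence checks automatically; the loop is spelled out here only to show what the method does at each iteration.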

The estimated variance can be used to calculate an approximate F statistic in the large-sample or asymptotic limit.

equation expressing variance: sigma^2 = 2.51 x 10^-5

Because of the large number of samples (n = 1642), this asymptotic result should be valid.

equation showing F-statistic: 
F = MSmodel/MSerror = 152666

For this large value of the approximate F statistic, P < .001, and the probability of obtaining such a high value purely due to chance is very small.

95% confidence limits can be computed for the parameters and for values predicted using the best-fit equation:

95% confidence limit of V_0:
0.232 <= V_0 <= 0.234
95% confidence limit of tau:
0.00366 <= tau <= 0.00374
95% confidence limit of B:
0.0073 <= B <= 0.0084
Plot featuring the 95% confidence limit for points predicted with the best fit parameters and the regression model. Over the range of the plot from 0 to about 0.016 s there is a nearly constant confidence band of about +/- 0.0098 V along the best fit curve.
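The width of the band is consistent with the estimated variance quoted earlier. As a rough check in Python, and assuming the band is the large-sample normal approximation in which parameter uncertainty is negligible next to the scatter of individual points (an assumption, not a statement of how the plot was produced), the half-width is about 1.96 standard deviations:

```python
import math
from statistics import NormalDist

sigma_squared = 2.51e-5              # estimated variance reported above
z = NormalDist().inv_cdf(0.975)      # two-sided 95% normal quantile, about 1.96
half_width = z * math.sqrt(sigma_squared)

print(f"95% half-width: +/- {half_width:.4f}")  # prints +/- 0.0098
```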

PTP Analysis for ASTM D02.07

Background of work:

The ASTM D02.07 Subcommittee on Flow Properties of Petroleum Products is charged with conducting activities related to promoting the use and improvement of standards that characterize flow properties of petroleum products, liquid fuels, and lubricants except for grease. One of the ways the subcommittee monitors its standard test methods is through the use of the ASTM’s Proficiency Test Program (PTP). Samples are sent to participating labs several times each year, and the labs run a number of tests according to standard test methods. The officers of ASTM D02.07 monitor results for a number of methods, including ASTM D2983 Standard Test Method for Low-Temperature Viscosity of Automatic Transmission Fluids, Hydraulic Fluids, and Lubricants Using a Rotational Viscometer, to see if they are performing as expected.

Over the past several years, a lab had raised concerns over the PTP results for a particular gear oil sample, so the subcommittee began to closely track the Test Performance Index (TPI) for each PTP cycle.  The TPI is a measure of how closely the reproducibility of the PTP sample matches the reproducibility published in the method. TPI values of 1.0 indicate the reproducibility of the sample matches that of the method.  Lower TPI values may indicate that the sample had worse reproducibility than was published. As the chair of the subcommittee section overseeing ASTM D2983, Sarah Nuss-Warren of SNW Data Analysis, LLC volunteered to further examine the question of why there were low TPI values for this test method and whether any corrective action was needed. The subcommittee recommended suppressing statistics while the matter was investigated.  The presentation below was shared with the D02.07 Subcommittee at the ASTM June 2022 Committee Week.

Work product:

A slide with the title, "A closer look into D2983, June 2022." It contains a plot titled "TPI (for A, B, C) over historical cycles for ATF and GO" that shows TPI values plotted from cycles 1600 to 2201. The values range from about 0.1 to 1.1. Most values are between the guide lines at 0.6 and 1. Two points are above 1, and quite a few are below 0.6. There seem to be more points below 0.6 for cycles from 2018 and later.

Values of 0.6 or less are typically statistically significant given the number of measurements included in most PTP cycles for this method. There have been a number of TPI values below 0.6 since 2018 (cycle numbers 1800 and greater).

A slide titled "A closer look into D2983 Statistical Analysis (ANOVA)" is shown. It contains the text: 
To get enough data points, combine values from all programs by using the z-score; this effectively eliminates sample-to-sample variation
Assume all factors are independent
Over 5 years, most labs have run 2, 3, or even 4 procedures
Combined D with other procedures where run and used GESD  to remove outliers
Same A, B, C outliers were removed as in PTP reports
Adding/removing Procedure D did not change the key outcomes of the analysis

Note that lab number did not stay consistent throughout the analysis, so any lab effect will not be detected.

A slide titled "A closer look into D2983 Checking Assumptions: Normality" is shown. It contains a histogram showing relative frequency vs. Z-score. A Gaussian curve with mean 0 and standard deviation 1 is overlaid. The data mostly track the Gaussian, except that a few points near the mean have a higher frequency than expected from the Gaussian distribution. The highest relative frequency in the Gaussian is about 0.045 while the highest relative frequency for a data point is about 0.065.

The transformation to z-scores put the data on a common scale with a mean of 0 and a standard deviation of 1, so the combined data set can be compared with a standard normal distribution. While this is expected to hide any program effects, it lets all the data be combined.
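The pooling idea can be sketched in a few lines of Python. The grouping by cycle, the use of the ordinary mean and sample standard deviation, and the data values are all assumptions for illustration; PTP reports may use robust estimates instead.

```python
from statistics import mean, stdev

# Hypothetical viscosity results (mPa·s) for two cycles on very different scales.
results = {
    "ATF_cycle": [9800.0, 10150.0, 10400.0, 9650.0, 10000.0],
    "GO_cycle": [148000.0, 152500.0, 139000.0, 161000.0, 150500.0],
}

def z_scores(values):
    """Standardize one cycle's results to mean 0 and standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# After standardizing within each cycle, results can be pooled into one data set.
pooled = [z for values in results.values() for z in z_scores(values)]
print(len(pooled), "pooled z-scores")
```

Because each cycle is centered on its own mean, any difference in level between samples disappears from the pooled data, which is the trade-off noted above.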

A slide titled "A closer look into D2983 Analysis of Variation Results" is shown.  An Analysis of Variation table including the factors Program, Lab, Speeds, and Procedures in addition to two-way, three-way and four-way interactions is shown. Speeds and procedures both have a P-value of 0 to three decimal places. The Program-Speed interaction has a P-value of .006.  The Program-Lab-Speeds interaction has a P-value of .004.  The Program-Speeds-Procedures interaction has a P-value of .008. The Program-Lab-Speeds-Procedures interaction has a P-value of .004.  No other factors or interactions had P-values below .05. The total degrees of freedom were 859 with 502 attributed to error.

An analysis of variance (ANOVA) indicates strong effects from speed and procedure, with speed giving the strongest effect. This is not surprising. The method indicates that measurements at different speeds are not to be combined. Many of the problematic TPI values from the previous years occurred when samples were taken at multiple speeds. Interaction effects are much smaller but still significant any time Program and Speeds are combined. This suggests that there may be a sample-specific effect where different speeds give different results for certain samples.

A slide with the title "A closer look into D2983 Variation Due to Speed and Procedure" is shown.  It contains two plots side-by-side.  One shows Z-score vs Speed.  In it a number of points at speeds .6, 3, 6, 12, 30, 60, and 120 rpm show lower mean z-score and slightly reduced z-score variance as the speed increases. The other plot shows Z-score vs Procedure.  A line of points is shown for Procedures A, B, C, and D.  Not much difference in mean z-score is shown, though it is plausible that Procedures B and D have lower means compared to A and C. The effect is slightly more pronounced for D.  There is not much difference in variance, though D appears to have a slightly smaller variance.

This is a visual check to make sure the analysis is believable.  There really does seem to be a change in Z-score with speed. A difference due to procedure is plausible from the graph.

This shows a slide with the title "A closer look into D2983 Speed-Procedure Interactions?" Four plots are shown, one for A, B, C, and D.  Each is a plot of z-score vs. speed.  All show the pattern of decreasing mean z-score with speed.  Again speeds of 0.6, 3, 6, 12, 30, 60, and 120 rpm are shown. The plots look slightly different in that certain speeds have more variance or slightly higher or lower means for one speed compared to another plot, but all show the same pattern.

For each procedure, speed continues to have a clear effect. Some differences between the speed effects for each procedure are plausible.

A slide with the title, "A closer look into D2983 Analysis of Variation Results: 1 Year at a Time" is shown. It contains a plot that shows the P-value of the F-test from an analysis of variance for the factors Labs, Speeds, and Procedures for program cycles aggregated over a year from 2017 to 2021. A red line indicates statistical significance with a P-value of .05. Labs are statistically significant for 2021. Procedures are statistically significant for 2019. Speeds are statistically significant for all years except 2021, and the point for 2021 has a P-value barely above .05. There is also a black line indicating the number of poor TPI values out of 6 for each year. The black line shows no poor TPI results in 2017, 2 in 2018 and 2019, 1 in 2020, and 3 in 2021.

This plot shows an attempt to understand how the factors evaluated seem to be impacting TPI over time.  The black line shows the number of poor TPI values in the 6 PTP reports every year.  A higher value on the line indicates poorer reproducibility evident among the PTP samples. The points show the P-value for the ANOVA factor that year. The red line indicates a P-value of .05, which is the standard value chosen to indicate statistical significance.  For most years analyzed, speed was an important factor.  Procedure and lab were only occasionally important factors in the variation among results. Even though lab is not stable over many cycles, lab number is likely to be consistent across a single year.  However, the number of poor TPI values in a year does not obviously correlate to any one of these factors.

This shows a slide with the title, "F/F critical for variation from speed in each procedure." It contains a table of Cycle, Procedure A, B, and C, and TPI with the ratio of F/F critical from speed shown for each Procedure in each cycle. The following ratios of F-to-F-critical are highlighted: 1.15 for B in GO1612; 1.985 for C in ATF1907; 1.636 for B in GO1908; 1.008 for B in GO1912; 1.343 for B and 1.741 for C in ATF2007; 8.337 for B in GO2008; 1.837 for C in ATF2011; 9.537 for B in ATF2103; and 1.179 for B in ATF2203. The TPI is also shown for each cycle. The following TPIs are highlighted: 0.43 for GO1804; 0.52 for ATF1807; 0.48 for GO1908; 0.32 for GO1912; 0.09 for GO2008; 0.51 for GO2104; and 0.29 for ATF2111.

This table gives a closer look at the speed effect across any given procedure for each particular program cycle, which is tied to the sample.  The columns A, B, and C show the F statistic computed for that procedure in that cycle divided by the critical F-value.  A number greater than 1 indicates a statistically significant result, which is highlighted in yellow.  Low TPI values are also highlighted in yellow. 
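The ratio in the table can be illustrated with a small Python sketch. It assumes a one-way ANOVA across speeds within a single procedure and cycle, which may be simpler than the design actually used, and the z-scores below are hypothetical. The critical value is taken from a standard F table for 2 and 6 degrees of freedom at the .05 level.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic for a one-way ANOVA: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical z-scores for one procedure in one cycle, grouped by speed in rpm.
by_speed = {
    6: [0.9, 1.2, 0.6],
    12: [0.1, -0.2, 0.4],
    30: [-0.8, -1.1, -0.5],
}

F_CRITICAL = 5.14  # tabulated F(0.05; 2, 6) for 3 groups and 9 observations

f_stat = one_way_f(list(by_speed.values()))
ratio = f_stat / F_CRITICAL
print(f"F = {f_stat:.2f}, F/F critical = {ratio:.2f}")
```

A ratio above 1 means the spread between speed groups is larger than chance alone would explain at the .05 level, which corresponds to the highlighted cells in the table.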

A slide titled "A closer look into D2983 Conclusions?" is shown. It contains the text:
Procedure and speed are the biggest contributors to the variation
In a given cycle, variation due to procedures for a given speed or speeds for a given procedure may or may not be statistically significant
Significant variation due to speeds or procedures does not necessarily correspond with poor TPI values, but it does sometimes
The PTP data may indicate we have underestimated the R in our round robin
Combining speeds is not accounted for in the method precision
Eliminating PTP statistics is probably not ideal long-term

Work outcomes:

Based on the analysis and presentation, the Subcommittee was able to determine the method itself did not appear to be performing poorly and to make appropriate recommendations to labs regarding how to best use D2983 results from PTP for quality control purposes.


The following extension of a textbook example shows one approach Sarah might use for selecting a preferred model when many possible regressors are available. Sarah completed this example using R in a Jupyter notebook.

Asphalt Example