Selected Past Projects

Ruby Fluorescence Decay

A friend of Sarah’s who works in academia had developed a laboratory exercise and was preparing to present it in a workshop at a conference. They had collected data and fit the known model in a spreadsheet, but wanted a more professional plot and an improved fit. By applying a least-squares fit using the Gauss-Newton method built into the nlm() function in R, she was able to find an improved fit. In just a couple hours, she was able to fit the curve, make a professional plot, and teach her friend how to do the same. The plot is below, along with some interesting statistics giving an indication of the statistical significance of the fit and 95% confidence levels for the parameters and values predicted from the fit.

The estimated variance can be used to calculate an approximate F statistic in the large-sample or asymptotic limit.

Because of the large number of samples (n = 1642), this asymptotic result should be valid.

For this large value of the approximate F statistic, P < .001, and the probability of obtaining such a high value purely due to chance is very small.

95% confidence levels can be computed for the parameters and values predicted using the best fit equation:

95% confidence limit of V_0:
0.232 <= V_0 <= 0.234

95% confidence limit of tau:
0.00366 <= tau <= 0.00374

95% confidence limit of B:
0.0073 <= B <= 0.0084

Plot featuring the 95% confidence limit for points predicted with the best fit parameters and the regression model. Over the range of the plot from 0 to about 0.16 s there is a nearly-constant confidence band of about +/- 0.0098 along the best fit curve.

PTP Analysis for ASTM D02.07

Background of work:

The ASTM D02.07 Subcommittee on Flow Properties of Petroleum Products is charged with conducting activities related to promoting the use and improvement of standards that characterize flow properties of petroleum products, liquid fuels, and lubricants except for grease. One of the ways the subcommittee monitors its standard test methods is through the use of the ASTM’s Proficiency Test Program (PTP). Samples are sent to participating labs several times each year, and the labs run a number of tests according to standard test methods. The officers of ASTM D02.07 monitor results for a number of methods, including ASTM D2983 Standard Test Method for Low-Temperature Viscosity of Automatic Transmission Fluids, Hydraulic Fluids, and Lubricants Using a Rotational Viscometer, to see if are performing as expected.

Over the past several years, a lab had raised concerns over the PTP results for a particular gear oil sample, so the subcommittee began to closely track the Test Performance Index (TPI) for each PTP cycle. The TPI is a measure of how closely the reproducibility of the PTP sample matches the reproducibility published in the method. TPI values of 1.0 indicate the reproducibility of the sample matches that of the method. Lower TPI values may indicate that the sample had worse reproducibility than was published. As the chair of the subcommittee section overseeing ASTM D2983, Sarah Nuss-Warren of SNW Data Analysis, LLC volunteered to further examine the question of why there were low TPI values for this test method and whether any corrective action was needed. The subcommittee recommended suppressing statistics while the matter was investigated. The presentation below was shared with the D02.07 Subcommittee at the ASTM June 2022 Committee Week.

Work product:

A slide with the title, "A closer look into D2983, June 2022." It contains plot titled "TPI (for A, B, C) over historical cycles for ATF and GO" that shows TPI vales plotted from cycles 1600 to 2201. The values range from about 0.1 to 1.1. Most values are below the guide lines of 0.6 and 1. Two points are above 1, and quite a few are below 0.6. There seem to be more points below 0.6 for cycles from 2018 and later.

Values of 0.6 or less are typically statistically significant given the number of measurements included in most PTP cycles for this method. There have been a number of TPI values below 0.6 since 2018 (cycle numbers 1800 and greater).

A slide titled "A closer look into D2983 Statistical Analysis (ANOVA)" is shown. It contains the text:
To get enough data points, combine values from all programs by using the z-score; this effectively eliminates sample-to-sample variation
Assume all factors are independent
Over 5 years, most labs have run 2, 3, or even 4 procedures
Combined D with other procedures where run and used GESD to remove outliers
Same A, B, C outliers were removed as in PTP reports
Adding/removing Procedure D did not change the key outcomes of the analysis

Note that lab number did not stay consistent throughout the analysis, so any lab effect will not be detected.

A slide titled "A closer look into D2983 Checking Assumptions: Normality" is shown. It contains a histogram showing relative frequency vs. Z-score. A gaussian curve with mean 0 and standard deviation 1 is overlaid. The data mostly tracks the Gaussian except the data show a few points near the mean having higher frequency than expected based on the Gaussian distribution. The highest relative frequency in the Gaussian is about 0.045 while the highest relative frequency for a data point is about 0.065.

The transformation to z-score effectively mapped the data set onto a normal distribution with a mean of 0 and a standard deviation of 1. While this is expected to hide any program effects, it lets all the data be combined.

A slide titled "A closer look into D2983 Analysis of Variation Results" is shown. An Analysis of Variation table including the factors Program, Lab, Speeds, and Procedures in addition to two-way, three-way and four-way interactions is shown. Speeds and procedures both have a P-value of 0 to three decimal places. The Program-Speed interaction has a P-value of .006. The Program-Lab-Speeds interaction has a P-value of .004. The Program-Speeds-Procedures interaction has a P-value of .008. The Program-Lab-Speeds-Procedures interaction has a P-value of .004. No other factors or interactions had P-values below .05. The total degrees of freedom were 859 with 502 attributed to error.

An analysis of variation (ANOVA) indicates strong effects in speeds and procedure with speeds giving the strongest effect. This is not surprising. The method indicates that measurements at different speeds are not to be combined. Many of the problematic TPI values from the previous years occurred when samples were taken at multiple speeds. Interaction effects are much smaller but still significant any time Program and Speeds are combined. This suggests that there may be a sample-specific effect where different speeds give different results for certain samples.

A slide with the title "A closer look into D2983 Variation Due to Speed and Procedure" is shown. It contains two plots side-by-side. One shows Z-score vs Speed. In it a number of points at speeds .6, 3, 6, 12, 30, 60, and 120 rpm show lower mean z-score and slightly reduced z-score variance as the speed increases. The other plot shows Z-score vs Procedure. A line of points is shown for Procedures A, B, C, and D. Not much difference in mean z-score is shown, though it is plausible that Procedures B and D have lower means compared to A and C. The effect is slightly more pronounced for D. There is not much difference in variance, though D appears to have a slightly smaller variance.

This is a visual check to make sure the analysis is believable. There really does seem to be a change in Z-score with speed. A difference due to procedure is plausible from the graph.

This shows a slide with the title "A closer look into D2983 Speed-Procedure Interactions?" Four plots are shown, one for A, B, C, and D. Each is a plot of z-score vs. speed. All show the pattern of decreasing mean z-score with speed. Again speeds of 0.6, 3, 6, 12, 30, 60, and 120 rpm are shown. The plots look slightly different in that certain speeds have more variance or slightly higher or lower means for one speed compared to another plot, but all show the same pattern.

For each procedure, speed continues to have a clear effect. Some difference between the speed effects for each procedure are plausible.

A slide with the title, "A closer look into D2983 Analysis of Variation Results: 1 Year at a Time" is shown. It contains a plot that shows bot P-value of the F-test from an analysis of variation for factors Labs, Speeds, and Procedures for program cycles aggregated over a year from 2017 to 2021. A red line indicates statistical significance with a P-value of .05. Labs are statistically significant for 2021. Procedures are statistically significant for 2019. Speeds are statistically significant for all years except 2021, and the point for 2021 has a P-value barely above .05. There is also a black line indicating the number of Poor TPI values out of 6 for each year. The black line shows no poor TPI results in 2017, 2 in 2018 and 2019, 1 in 2020, and 3 in 2021.

This plot shows an attempt to understand how the factors evaluated seem to be impacting TPI over time. The black line shows the number of poor TPI values in the 6 PTP reports every year. A higher value on the line indicates poorer reproducibility evident among the PTP samples. The points show the P-value for the ANOVA factor that year. The red line indicates a P-value of .05, which is the standard value chosen to indicate statistical significance. For most years analyzed, speed was an important factor. Procedure and lab were only occasionally important factors in the variation among results. Even though lab is not stable over many cycles, lab number is likely to be consistent across a single year. However, the number of poor TPI values in a year does not obviously correlate to any one of these factors.

This shows a slide with title, "F/F critical for variation from speed in each procedure." It contains a table of Cycle, Procedure A, B, and C, and TPI with the ratio of F/F critical from speed shown for each Procedure in each cycle. The following rations of F-to-F-critical are highlighted: 1.15 for B in GO1612; 1.985 for C in ATF1907; 1.636 for B in GO1908; 1.008 for B in GO1912; 1.343 for B and 1.741 for C in ATF2007; 8.337 for B in G2008; 1.837 for C in ATF2011; 9.537 for B in ATF2103; and 1.179 for B in ATF2203. The TPI is also shown for each cycle. The following TPIs are highlighted: .43 for FO1804; .52 for ATF1807; 0.48 for GO1908; .32 for GO1912; .09 for GO2008; .51 for GO2104; and .29 for ATF2111.

This table gives a closer look at the speed effect across any given procedure for each particular program cycle, which is tied to the sample. The columns A, B, and C show the F statistic computed for that procedure in that cycle divided by the critical F-value. A number greater than 1 indicates a statistically significant result, which is highlighted in yellow. Low TPI values are also highlighted in yellow.

A slide titled "A closer look into D2983 Conclusions?" is shown. IT contains the text:
Procedure and speed are the biggest contributors to the variation
In a given cycle, variation due to procedures for a given speed or speeds for a given procedure may or may not be statistically significant
Significant variation due to speeds or procedures does not necessarily correspond with poor TPI values, but it does sometimes
The PTP data may indicate we have underestimated the R in our round robin
Combining speeds is not accounted for in the method precision
Eliminating PTP statistics is probably not ideal long-term

Work outcomes:

Based on the analysis and presentation, the Subcommittee was able to determine the method itself did not appear to be performing poorly and to make appropriate recommendations to labs regarding how to best use D2983 results from PTP for quality control purposes.

Preferred Regression Model

The following extension of a textbook example shows one approach Sarah might use for selecting a preferred model when many possible regressors are available. Sarah completed this example using R in a Jupyter notebook.

Asphalt Example

The data below from the 1966 J. W. Gorman and R. J. Toman paper "Selection of variables for fitting equation sto data" in Technometrics, 8, pp. 27 – 51 gives rut depth of 31 asphalt pavements prepared under different conditions. An indicator variable separates the data into two sets of runs.
$y$: rut depth per million whell passes
$x_1$: viscosity of asphalt
$x_2$: percentage of asphalt in surface course
$x_3$: percentage of asphalt in base course
$x_4$: run
$x_5$:percentage of fines in the surface course
$x_6$: percentage of voids in the surface course
The goal is to determine what regressor variables most effectively model the given data. Based on subject matter knowldege regarding viscosity, it makes most sense to use the log of viscosity.

In [240]:

df_asphalt <- read.table("Table9_6.csv", sep=",", header=TRUE)
df_asphalt

A data.frame: 31 × 7
yi	xi1	xi2	xi3	xi4	xi5	xi6
<dbl>	<dbl>	<dbl>	<dbl>	<int>	<dbl>	<dbl>
6.75	2.80	4.68	4.87	0	8.4	4.916
13.00	1.40	5.19	4.50	0	6.5	4.563
14.75	1.40	4.82	4.73	0	7.9	5.321
12.60	3.30	4.85	4.76	0	8.3	4.865
8.25	1.70	4.86	4.95	0	8.4	3.776
10.67	2.90	5.16	4.45	0	7.4	4.397
7.28	3.70	4.82	5.05	0	6.8	4.867
12.67	1.70	4.86	4.70	0	8.6	4.828
12.58	0.92	4.78	4.84	0	6.7	4.865
20.60	0.68	5.16	4.76	0	7.7	4.034
3.58	6.00	4.57	4.82	0	7.4	5.450
7.00	4.30	4.61	4.65	0	6.7	4.853
26.20	0.60	5.07	5.10	0	7.5	4.257
11.67	1.80	4.66	5.09	0	8.2	5.144
7.67	6.00	5.42	4.41	0	5.8	3.718
12.25	4.40	5.01	4.74	0	7.1	4.715
0.76	88.00	4.97	4.66	1	6.5	4.625
1.35	62.00	4.01	4.72	1	8.0	4.977
1.44	50.00	4.96	4.90	1	6.8	4.322
1.60	58.00	5.20	4.70	1	8.2	5.087
1.10	90.00	4.80	4.60	1	6.6	5.971
0.85	66.00	4.98	4.69	1	6.4	4.647
1.20	140.00	5.35	4.76	1	7.3	5.115
0.56	240.00	5.04	4.80	1	7.8	5.939
0.72	420.00	4.80	4.80	1	7.4	5.916
0.47	500.00	4.83	4.60	1	6.7	5.471
0.33	180.00	4.66	4.72	1	7.2	4.602
0.26	270.00	4.67	4.50	1	6.3	5.043
0.76	170.00	4.72	4.70	1	6.8	5.075
0.80	98.00	5.00	5.07	1	7.2	4.334
2.00	35.00	4.70	4.80	1	7.7	5.705

The R library "olsrr" contains different algortihms that test different combinations of parameters to find a preferred set of regressors for the model.

In [241]:

library("olsrr")

In [242]:

rut <- df_asphalt$yi
lnvisc<-log(df_asphalt$xi1)
surf <- df_asphalt$xi2
base <- df_asphalt$xi3
run <- df_asphalt$xi4
fines <- df_asphalt$xi5
voids <-df_asphalt$xi6
df_asphalt_xform <- data.frame("rut"=rut, "ln.vis"=lnvisc, "surf"=surf, "base"=base, 
                               "run"=run, "fines"=fines, "voids"=voids)

The full model would use all regressors and provides an input to the "all combinations" algorithm and a basis against which to measure model effectiveness.

In [243]:

asphalt_fm <-lm(rut ~ ln.vis + surf + base + run + fines + voids, data=df_asphalt_xform, x=TRUE)
summary(asphalt_fm)

Call:
lm(formula = rut ~ ln.vis + surf + base + run + fines + voids, 
    data = df_asphalt_xform, x = TRUE)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1211 -1.9075 -0.7175  1.6382  9.5947 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) -14.9592    25.2881  -0.592   0.5597   
ln.vis       -3.1515     0.9194  -3.428   0.0022 **
surf          3.9706     2.4966   1.590   0.1248   
base          1.2631     3.9703   0.318   0.7531   
run           1.9655     3.6472   0.539   0.5949   
fines         0.1164     1.0124   0.115   0.9094   
voids         0.5893     1.3244   0.445   0.6604   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.324 on 24 degrees of freedom
Multiple R-squared:  0.806,	Adjusted R-squared:  0.7575 
F-statistic: 16.62 on 6 and 24 DF,  p-value: 1.743e-07

At least one variable is significant, and the $R^2=.80$, which indicates the model explains a reasonable portion of the variation in the rut depth. It appears that only $\ln(\mathrm{viscosity})$ is significant, but this could be misleading as all variables are included in the model at this point.
Consider the residuals:

In [244]:

qqnorm(asphalt_fm$resid); qqline(asphalt_fm$resid)
plot(asphalt_fm$fitted,asphalt_fm$resid,main="Residuals of full model", 
     xlab=expression(hat(y[i])), ylab=expression(e[i]))

There appear to be one or two significant outliers in the tails, and the residuals show a curved relationship with the most negative values around 5 and most positive values at the ends. This indicates that the model is inadequate.
Plot the residuals against each regressor to see where the full model fails.

In [245]:

plot(df_asphalt_xform$ln.vis, asphalt_fm$resid, xlab="log of viscosity",ylab="residuals")
plot(df_asphalt_xform$surf, asphalt_fm$resid, xlab="% asphalt in surface",ylab="residuals")
plot(df_asphalt_xform$base, asphalt_fm$resid, xlab="% asphalt in base",ylab="residuals")
plot(df_asphalt_xform$run, asphalt_fm$resid, xlab="run",ylab="residuals")
plot(df_asphalt_xform$fines, asphalt_fm$resid, xlab="fines in surface",ylab="residuals")
plot(df_asphalt_xform$voids, asphalt_fm$resid, xlab="voids in surface",ylab="residuals")

The log of viscosity vs. the residuals shows a definite curve, similar to the overall residuals plot. There may also be a different relationship for low and high viscosity samples. The plot of residuals in the runs show more variability for the first run (represented by 0), which is distinctly influenced by an apparent outlier. When residuals are plotted against the other regressors, the trend is not so obvious, but there does appear to be at least one outlier.
Transforming the response variable, rut depth, should stabilize the errors so that statistical techniques for assessing the fit will be reliable.

In [246]:

ln.rut <- log(df_asphalt_xform$rut)
df_asphalt_xform2 <- df_asphalt_xform
df_asphalt_xform2$ln.rut<-ln.rut
asphalt2_fm <- lm(ln.rut ~ ln.vis + surf + base + run + fines + voids, x=TRUE, 
                  data=df_asphalt_xform2)
summary(asphalt2_fm)

Call:
lm(formula = ln.rut ~ ln.vis + surf + base + run + fines + voids, 
    data = df_asphalt_xform2, x = TRUE)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.53072 -0.18563 -0.00003  0.20017  0.55079 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.23294    2.34970  -0.525    0.605    
ln.vis      -0.55769    0.08543  -6.528 9.45e-07 ***
surf         0.58358    0.23198   2.516    0.019 *  
base        -0.10337    0.36891  -0.280    0.782    
run         -0.34005    0.33889  -1.003    0.326    
fines        0.09775    0.09407   1.039    0.309    
voids        0.19885    0.12306   1.616    0.119    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3088 on 24 degrees of freedom
Multiple R-squared:  0.961,	Adjusted R-squared:  0.9512 
F-statistic: 98.47 on 6 and 24 DF,  p-value: 1.059e-15

Notice that at least one regressor is still important, the $R^2$ is very good (though it cannot be compared directly because the response was trasnformed), and there now appear to be at least 3 coefficients with significant values.
Try the residual plots again.

In [247]:

qqnorm(asphalt2_fm$resid); qqline(asphalt2_fm$resid)
plot(asphalt2_fm$fitted,asphalt2_fm$resid,
     main="residuals of model with response log of rut depth",xlab=expression(hat(y[i])),
     ylab=expression(e[i]))

In [248]:

plot(df_asphalt_xform2$ln.vis, asphalt2_fm$resid, main="Residuals for ln(rut depth)",
     xlab="log of viscosity",ylab="residuals")
plot(df_asphalt_xform2$surf, asphalt2_fm$resid, main="Residuals for ln(rut depth)", 
     xlab="% asphalt in surface",ylab="residuals")
plot(df_asphalt_xform2$base, asphalt2_fm$resid, main="Residuals for ln(rut depth)", 
     xlab="% asphalt in base",ylab="residuals")
plot(df_asphalt_xform2$run, asphalt2_fm$resid,main="Residuals for ln(rut depth)", 
     xlab="run",ylab="residuals")
plot(df_asphalt_xform2$fines, asphalt2_fm$resid, main="Residuals for ln(rut depth)", 
     xlab="fines in surface",ylab="residuals")
plot(df_asphalt_xform2$voids, asphalt2_fm$resid, main="Residuals for ln(rut depth)", 
     xlab="voids in surface",ylab="residuals")

The residuals look much better, though there is still a difference in variability between the runs. There no longer seems to be an outlier.
I will use an all-regressions approach to selecting which parameters can best be used for this model, assessing the best 10 according to the adjusted $R^2$ and Mallow’s $C_p$, which are known to be effective criteria for assessing regressions.

In [249]:

asphalt_all <- ols_step_all_possible(asphalt2_fm)
top.10.by.adjr<-sort.list(asphalt_all$result$adjr, decreasing=TRUE, method="auto")[1:10]
selection.by.adjr<-asphalt_all$result[top.10.by.adjr,1:length(asphalt_all$result)]
selection.by.adjr
top.10.by.cp<-sort.list(asphalt_all$result$cp, method="auto")[1:10]
selection.by.cp<-asphalt_all$result[top.10.by.cp,1:length(asphalt_all$result)]
selection.by.cp

A data.frame: 10 × 15
	mindex	n	predictors	rsquare	adjr	rmse	predrsq	cp	aic	sbic	sbc	msep	fpe	apc	hsp
	<int>	<int>	<chr>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>
25	22	3	ln.vis surf voids	0.9578631	0.9531812	0.2823055	0.9360162	2.906603	19.55872	-66.91993	26.72866	2.844547	0.1033101	0.05462192	0.003519356
60	57	5	ln.vis surf run fines voids	0.9608365	0.9530038	0.2721629	0.9277106	5.078511	21.29020	-63.39979	31.32811	2.864141	0.1096275	0.05796203	0.003827085
46	42	4	ln.vis surf run voids	0.9591996	0.9529226	0.2777924	0.9291244	4.084907	20.55954	-65.16782	29.16347	2.864498	0.1068488	0.05649290	0.003680348
47	43	4	ln.vis surf fines voids	0.9589206	0.9526007	0.2787403	0.9333344	4.256398	20.77075	-65.03275	29.37468	2.884081	0.1075793	0.05687911	0.003705509
44	44	4	ln.vis surf base voids	0.9579092	0.9514337	0.2821510	0.9326247	4.878267	21.52480	-64.54857	30.12872	2.955094	0.1102281	0.05827960	0.003796747
63	63	6	ln.vis surf base run fines voids	0.9609642	0.9512052	0.2717188	0.9218579	7.000000	23.18896	-60.87204	34.66085	2.978924	0.1168992	0.06180674	0.004146312
59	58	5	ln.vis surf base fines voids	0.9593265	0.9511918	0.2773599	0.9309323	6.006861	22.46294	-62.75020	32.50085	2.974569	0.1138542	0.06019676	0.003974639
58	59	5	ln.vis surf base run voids	0.9592078	0.9510493	0.2777644	0.9225907	6.079855	22.55330	-62.69971	32.59121	2.983251	0.1141865	0.06037247	0.003986241
45	45	4	ln.vis surf run fines	0.9565107	0.9498200	0.2868001	0.9350426	5.738098	22.53806	-63.89300	31.14198	3.053279	0.1138906	0.06021600	0.003922897
24	23	3	ln.vis surf fines	0.9534561	0.9482845	0.2967013	0.9377030	5.616119	22.64236	-64.67926	29.81230	3.142053	0.1141151	0.06033471	0.003887437

A data.frame: 10 × 15
	mindex	n	predictors	rsquare	adjr	rmse	predrsq	cp	aic	sbic	sbc	msep	fpe	apc	hsp
	<int>	<int>	<chr>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>
25	22	3	ln.vis surf voids	0.9578631	0.9531812	0.2823055	0.9360162	2.906603	19.55872	-66.91993	26.72866	2.844547	0.1033101	0.05462192	0.003519356
46	42	4	ln.vis surf run voids	0.9591996	0.9529226	0.2777924	0.9291244	4.084907	20.55954	-65.16782	29.16347	2.864498	0.1068488	0.05649290	0.003680348
47	43	4	ln.vis surf fines voids	0.9589206	0.9526007	0.2787403	0.9333344	4.256398	20.77075	-65.03275	29.37468	2.884081	0.1075793	0.05687911	0.003705509
44	44	4	ln.vis surf base voids	0.9579092	0.9514337	0.2821510	0.9326247	4.878267	21.52480	-64.54857	30.12872	2.955094	0.1102281	0.05827960	0.003796747
60	57	5	ln.vis surf run fines voids	0.9608365	0.9530038	0.2721629	0.9277106	5.078511	21.29020	-63.39979	31.32811	2.864141	0.1096275	0.05796203	0.003827085
7	7	2	ln.vis surf	0.9508647	0.9473550	0.3048489	0.9376697	5.209344	22.32198	-65.49655	28.05793	3.194137	0.1128471	0.05966428	0.003810740
24	23	3	ln.vis surf fines	0.9534561	0.9482845	0.2967013	0.9377030	5.616119	22.64236	-64.67926	29.81230	3.142053	0.1141151	0.06033471	0.003887437
45	45	4	ln.vis surf run fines	0.9565107	0.9498200	0.2868001	0.9350426	5.738098	22.53806	-63.89300	31.14198	3.053279	0.1138906	0.06021600	0.003922897
23	24	3	ln.vis surf run	0.9530103	0.9477892	0.2981188	0.9332149	5.890199	22.93786	-64.46277	30.10780	3.172147	0.1152081	0.06091259	0.003924671
59	58	5	ln.vis surf base fines voids	0.9593265	0.9511918	0.2773599	0.9309323	6.006861	22.46294	-62.75020	32.50085	2.974569	0.1138542	0.06019676	0.003974639

The highest $R^2_{Adj}$ is an indicator of a preferred model. I will plot this parameter versus the number of parameters to visualize the gains made by adding additional parameters to the model.

In [250]:

plot(selection.by.adjr$n,selection.by.adjr$adjr,xlab="p",ylab=expression(R[Adj]^2),
     xlim=c(3,7),col="blue")
text(selection.by.adjr$n+.27,selection.by.adjr$adjr-.0001,labels=selection.by.adjr$predictors)

Based on the $R^2_{Adj}$ the model using ln(viscosity), % asphalt in surface, run, and voids in surface, may be preferred.

Mallow’s $C_p$ is a statistic related to the sean square error of a fitted value. For a model with negligable bias, the expectation value of $C_p$ is $p$, the number of parameters. So the lowest value close to the line $C_p=p$ indicates a lower bias of the fitted value. I will plot $C_p$ versus $p$ to visualize this relationship.

In [251]:

plot(selection.by.cp$n,selection.by.cp$cp,xlab="p",ylab=expression(C[p]),xlim=c(2,6),ylim=c(2,6),
     col="blue")
text(selection.by.cp$n+.25,selection.by.cp$cp-.1,labels=selection.by.cp$predictors)
lines(selection.by.cp$n,selection.by.cp$n)

The smallest $C_p$, which is close to the $C_p=p$ line, and below it is ln(viscosity), %asphalt in the surface, and voids in the surface.
Therefore, this seems to be a particularly promising model.

In [252]:

suggested <- lm(ln.rut~ln.vis+surf+voids,data=df_asphalt_xform2,x=TRUE)
summary(suggested)

Call:
lm(formula = ln.rut ~ ln.vis + surf + voids, data = df_asphalt_xform2, 
    x = TRUE)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.53548 -0.20181 -0.01702  0.16748  0.54707 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.02079    1.36430  -0.748   0.4608    
ln.vis      -0.64649    0.02879 -22.458   <2e-16 ***
surf         0.55547    0.22044   2.520   0.0180 *  
voids        0.24479    0.11560   2.118   0.0436 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3025 on 27 degrees of freedom
Multiple R-squared:  0.9579,	Adjusted R-squared:  0.9532 
F-statistic: 204.6 on 3 and 27 DF,  p-value: < 2.2e-16

The regression is significant, and each of the regressors also appears to be statistically significant with $p<.05$. The adjusted R-squared indicates that the model explains about 95% of the variation in the rut depth, even accounting for the additional parameters included in the model.

Due diligence requires making sure this is actually a good model. First, check the residuals to ensure the assumptions of a linear regression and the statistical measures of goodness of regression are met.

In [253]:

qqnorm(suggested$resid);qqline(suggested$resid)
plot(suggested$fitted, suggested$resid, xlab="predicted values", ylab="residuals")

In [254]:

#residuals by regressors
plot(df_asphalt_xform2$ln.vis, suggested$resid, xlab="ln(viscosity)", ylab="residuals")
plot(df_asphalt_xform2$surf, suggested$resid, xlab="% asphalt in surface", ylab="residuals")
plot(df_asphalt_xform2$voids, suggested$resid, xlab="% voids in surface", ylab="residuals")

The residuals appear to be relatively normal, and there is no trend in the residuals when plotted against the predicted values or regressors, so we can consider the errors relatively constant. The statistical assumptions appear to be valid for this regression.

The PRESS statistic is the sum of the residuals when each fitted point is left out of the calculation to generate its fit and is generally considered to be a measure of how well a model is likely to predict new data, assuming the data generation process has not changed. A small value of the PRESS statistic is preferable.

In [255]:

asphalt2_PRESS <-sum(rstandard(suggested,type="pred")^2)
asphalt2_PRESS

3.75152181233515

An $R^2_{prediction}=1-\frac{\mathrm{PRESS}}{SS_T}$ can be calculated to estimate how much of the variabilty in predictions can be explained by the model.

In [256]:

SST <- sum(anova(suggested)["Sum Sq"])
Rsq.pred <- 1- asphalt2_PRESS/SST
Rsq.pred

0.936016216310302

Finally, check for influential observations, using a number of statistics.

In [257]:

#check for influential points
sug.DFBETAS <- dfbetas(suggested)
#sug.DFBETAS
obs_influence <- data.frame("residual"=suggested$resid, "Rstudent"=rstudent(suggested),"hii"=hatvalues(suggested),"COVRATIO"=covratio(suggested),"Cook"=cooks.distance(suggested),"DFFITS"=dffits(suggested),"DFBETAS"=sug.DFBETAS)
obs_influence

A data.frame: 31 × 10
	residual	Rstudent	hii	COVRATIO	Cook	DFFITS	DFBETAS..Intercept.	DFBETAS.ln.vis	DFBETAS.surf	DFBETAS.voids
	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>
1	-0.207020660	-0.70576134	0.07716063	1.1680732	1.060900e-02	-0.204076350	-0.0768560893	0.116717571	0.0960249500	-0.026075463
2	-0.196604175	-0.68261694	0.11137436	1.2189481	1.489486e-02	-0.241663052	0.1181669254	0.133181810	-0.1427830578	-0.044636072
3	-0.050340501	-0.17644761	0.14237719	1.3495420	1.340252e-03	-0.071893204	0.0199809283	0.057440562	-0.0056221636	-0.048851682
4	0.441408916	1.53664439	0.05275796	0.8671436	3.130054e-02	0.362649335	0.0079638130	-0.223043165	-0.0229832713	0.082808242
5	-0.149863137	-0.53914638	0.17779940	1.3529136	1.613867e-02	-0.250716516	-0.1568714344	0.008405062	0.0893144917	0.197539157
6	0.133982003	0.45592728	0.08392031	1.2296880	4.904526e-03	0.137994730	-0.0421271850	-0.040168599	0.0710196881	-0.019645783
7	-0.017017293	-0.05667237	0.05100291	1.2248480	4.480739e-05	-0.013138217	-0.0015591434	0.007533843	0.0021386531	-0.002357769
8	0.021637965	0.07288572	0.07229544	1.2525558	1.074553e-04	0.020346657	-0.0002626004	-0.015007477	-0.0006582427	0.005281403
9	-0.347062555	-1.22409327	0.10526642	1.0387986	4.327344e-02	-0.419867803	-0.0402121301	0.336406115	0.0760109013	-0.116703187
10	-0.056956335	-0.19932273	0.13938954	1.3430821	1.668033e-03	-0.080217405	0.0044150702	0.034118110	-0.0232087571	0.025079850
11	-0.418101761	-1.50631695	0.11846143	0.9440012	7.280511e-02	-0.552184473	-0.0903539337	0.281287925	0.2295967575	-0.285754755
12	0.160993608	0.54843906	0.08266286	1.2107409	6.956219e-03	0.164633838	0.1003748000	-0.059636667	-0.1102204580	-0.014454652
13	0.097998379	0.33920436	0.11772149	1.2950462	3.968128e-03	0.123904322	-0.0066007515	-0.074413772	0.0263978164	-0.017744946
14	0.010113811	0.03482754	0.11248118	1.3100958	3.990781e-05	0.012398635	0.0019529281	-0.008873054	-0.0041495648	0.005055671
15	0.295685995	1.12570714	0.23853276	1.2625268	9.826766e-02	0.630048273	-0.1032340605	0.120092917	0.3139119750	-0.342791899
16	0.547065980	1.94888166	0.04961981	0.7092528	4.492036e-02	0.445311916	-0.1209803715	-0.188283742	0.1605797275	0.047232430
17	-0.251932423	-0.86594058	0.08354090	1.1325473	1.724823e-02	-0.261445469	-0.0359677314	-0.188413652	-0.0208449235	0.137164535
18	0.543288805	2.65687799	0.44049012	0.7952221	1.134712e+00	2.357413788	2.1150712927	0.542259966	-2.2282360013	-0.881293443
19	0.101404291	0.34894568	0.10710733	1.2783385	3.774304e-03	0.120855884	0.0421144764	0.076795202	-0.0081160680	-0.090602555
20	-0.017862416	-0.06142864	0.11003123	1.3059748	1.211017e-04	-0.021599390	0.0156665985	-0.005099435	-0.0166065838	-0.006647633
21	-0.102719201	-0.36381402	0.15681515	1.3515051	6.358420e-03	-0.156896062	0.0739228342	0.007235259	-0.0317356661	-0.126049977
22	-0.336938545	-1.16322216	0.07107183	1.0219923	2.554689e-02	-0.321751320	-0.0272310074	-0.213208381	-0.0420366897	0.154632498
23	0.173968579	0.63552900	0.19917308	1.3653508	2.568020e-02	0.316943140	-0.2376549031	0.107325373	0.2580763864	0.080142661
24	-0.269229569	-1.00194106	0.21079983	1.2663765	6.702625e-02	-0.517825831	0.3662536890	-0.079906575	-0.2768255465	-0.366230789
25	0.482814188	1.81731199	0.16284224	0.8610369	1.479844e-01	0.801510790	-0.2914672760	0.274021695	0.1280330729	0.456758555
26	0.261282603	0.91896278	0.12162475	1.1650758	2.940264e-02	0.341954924	-0.0500982902	0.226732573	0.0250327008	0.058833748
27	-0.445690716	-1.66286622	0.16359128	0.9280503	1.269094e-01	-0.735407603	-0.4870032936	-0.532962391	0.3814422075	0.493012813
28	-0.535480749	-1.98303819	0.11657957	0.7494093	1.170250e-01	-0.720374902	-0.3287467441	-0.536080072	0.3066572006	0.248401867
29	0.202467248	0.69300775	0.08513846	1.1814374	1.139274e-02	0.211408915	0.0768102687	0.146654474	-0.0739972777	-0.052536665
30	-0.076486031	-0.26711913	0.13479790	1.3294844	2.878166e-03	-0.105435723	-0.0273290777	-0.077036995	-0.0014893891	0.074619760
31	0.005193694	0.01779541	0.10357264	1.2972551	9.498850e-06	0.006048852	-0.0012598435	-0.001080200	-0.0004417806	0.004360873

In [258]:

#concerning limit for hii
2*3/31
# suggested limit for DFFITS
2*sqrt(3/31)
#suggested limit for DFBETAS
2/sqrt(31)
#suggested COVRATIO Upper Limit
1+3*3/31
#suggested COVRATIO Lower Limit
1-3*3/31

0.193548387096774

0.622171016838255

0.35921060405355

1.29032258064516

0.709677419354839

Observation 18 has the largest value of hii, indicating it has most leverage. It also happens to be the largest outlier, per its RStudent of 2.66, but it is less than 3 standard deviations from the predicted value. It’s Cook’s Distance is over 1 and its DFFITS value is higher than the ideal limit of $2\sqrt{\frac{p}{n}}$, so it definitely has influence. Per the COVRATIO, it is affecting the precision of the fit. The DFFITS values indicate it is primarily influencing the value of the coefficients for the intercept (which is not significant) and the % of asphalt in the surface course. While this could be of concern if there are questions about the accuracy of observation 18, the model on the whole seems acceptable.
Next, check for multicollinearity, or regressors that share a significant portion of the same information.

In [259]:

# calculate VIFs
sug.lnvis <- summary(lm(ln.vis~surf+voids,data=df_asphalt_xform2))$r.squared
sug.surf <- summary(lm(surf~ln.vis+voids,data=df_asphalt_xform2))$r.squared
sug.voids <- summary(lm(voids~ln.vis+surf,data=df_asphalt_xform2))$r.squared
sug.VIF <- 1-1/c(sug.lnvis,sug.surf,sug.voids)
sug.VIF

-3.49436861369904
-6.25279456551605
-2.15776390772964

There is some multicollinearity, as we would prefer to have VIF less than 3, but it does not appear to be severe and probably will not cause problems in the model.

To use this model, it would be helpful to know what the simultaneous 95% confidence level for the coefficients and for future predictions would be.

In [260]:

#calculate the simultaneous interval estimates using the Bonferroni method
cat("lower CI\n")
summary(suggested)$coefficients[1:4,1]-qt(.05/(2*4),27,lower.tail=FALSE)*summary(suggested)$coefficients[1:4,2]
cat("upper CI\n")
summary(suggested)$coefficients[1:4,1]+qt(.05/(2*4),27,lower.tail=FALSE)*summary(suggested)$coefficients[1:4,2]

lower CI

(Intercept): -4.67201398747139
ln.vis: -0.723531032935349
surf: -0.0344899544010845
voids: -0.0645768360959533

upper CI

(Intercept): 2.6304249985205
ln.vis: -0.569451150865691
surf: 1.14542709920293
voids: 0.554163704243565

In [261]:

#prediction 95% confidence interval
sug.ci.int <- function(vis,surf,voids){
    input <- c(1,log(vis),surf,voids)
    t(input)%*%solve(t(suggested$x)%*%suggested$x)%*%input
}
sug.ci.ll <- function(y,vis, surf, voids) {
    output <- y
    for (i in 1:length(y)) {
        output[i] <- y[i] - qt(.05/2,27,lower.tail=FALSE)*sqrt(summary(suggested)$sigma^2*(1+sug.ci.int(vis[i],surf[i],voids[i])))
    }
    output
}

sug.ci.ul <- function(y,vis, surf, voids) {
    output <- y
    for (i in 1:length(y)) {
        output[i] <- y[i] + qt(.05/2,27,lower.tail=FALSE)*sqrt(summary(suggested)$sigma^2*(1+sug.ci.int(vis[i],surf[i],voids[i])))
    }
    output
}
newdata<-data.frame("ln.vis"=log(seq(1,500)),surf=rep(5,500),voids=rep(5,500))
newys <- predict(suggested,newdata)
plot(log(seq(1,500)),newys, xlab="ln(viscosity)", ylab="predicted ln(rut depth)", main="predicted values with changing viscosity\nand % asphalt in surface and voids set to 5%",type="l")
points(log(seq(1,500)),sug.ci.ll(newys,seq(1,500),rep(5,500),rep(5,500)),type="l",lty=3)
points(log(seq(1,500)),sug.ci.ul(newys,seq(1,500),rep(5,500),rep(5,500)),type="l",lty=3)
legend("topright",legend=c("predicted values","95% CI"),lty=c(1,3))

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.