An experiment was conducted to determine the optimum conditions for extruding plastic film.
Three responses, tear
resistance, film gloss
and film opacity
were measured in relation to two factors, rate
of extrusion and amount of an additive
,
both of these being set to two values, High and Low. The data set comes from
Johnson & Wichern (1992).
Multivariate tests
We begin with an overall MANOVA for the two-way MANOVA model. In all these analyses, we use
car::Anova()
for significance tests rather than stats::anova()
, which only provides
so-called “Type I” (sequential) tests for terms in linear models.
In this example, because each effect has 1 df, all of the multivariate statistics
(Roy’s maximum root test, Pillai and Hotelling trace criteria, Wilks’ Lambda)
are equivalent, in that they give the same \(F\) statistics and \(p\)-values.
We specify test.statistic="Roy"
to emphasize that Roy’s test has
a natural visual interpretation in HE plots.
plastic.mod <- lm(cbind(tear, gloss, opacity) ~ rate*additive, data=Plastic)
Anova(plastic.mod, test.statistic="Roy")
#>
#> Type II MANOVA Tests: Roy test statistic
#> Df test stat approx F num Df den Df Pr(>F)
#> rate 1 1.619 7.55 3 14 0.003 **
#> additive 1 0.912 4.26 3 14 0.025 *
#> rate:additive 1 0.287 1.34 3 14 0.302
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
For the three responses jointly, the main effects of rate
and additive
are significant, while their interaction is not.
In some approaches to testing effects in multivariate linear models (MLMs),
significant multivariate tests are often followed by univariate tests on each
of the responses separately to determine which responses contribute to each
significant effect.
In R, univariate analyses are conveniently performed
using the update()
method for the mlm
object plastic.mod
, which re-fits the model with
only a single outcome variable.
Anova(update(plastic.mod, tear ~ .))
#> Anova Table (Type II tests)
#>
#> Response: tear
#> Sum Sq Df F value Pr(>F)
#> rate 1.74 1 15.8 0.0011 **
#> additive 0.76 1 6.9 0.0183 *
#> rate:additive 0.00 1 0.0 0.9471
#> Residuals 1.76 16
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(update(plastic.mod, gloss ~ .))
#> Anova Table (Type II tests)
#>
#> Response: gloss
#> Sum Sq Df F value Pr(>F)
#> rate 1.301 1 7.92 0.012 *
#> additive 0.612 1 3.73 0.071 .
#> rate:additive 0.544 1 3.32 0.087 .
#> Residuals 2.628 16
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(update(plastic.mod, opacity ~ .))
#> Anova Table (Type II tests)
#>
#> Response: opacity
#> Sum Sq Df F value Pr(>F)
#> rate 0.4 1 0.10 0.75
#> additive 4.9 1 1.21 0.29
#> rate:additive 4.0 1 0.98 0.34
#> Residuals 64.9 16
The results above show significant main effects for tear
,
a significant main effect of rate
for gloss
,
and no significant effects for opacity
, but they don’t shed light on the
nature of these effects.
Traditional univariate plots of the means for each variable separately
are useful, but they don’t allow visualization of the
relations among the response variables.
HE plots
We can visualize these
effects for pairs of variables in an HE plot, showing the “size” and
orientation of hypothesis variation (\(\mathbf{H}\)) in relation to error
variation (\(\mathbf{E}\)) as ellipsoids.
When, as here, the model terms have 1 degree of freedom, the
\(\mathbf{H}\) ellipsoids degenerate to a line.
In HE plots, the \(\mathbf{H}\) ellipses can be scaled relative to the
\(\mathbf{E}\) to show significance of effects (size="evidence"
),
or effect size (size="effect"
). In the former case, a model term
is significant (using Roy’s maximum root test) iff the
\(\mathbf{H}\) projects anywhere outside the \(\mathbf{E}\) ellipse.
This plot overlays those for both scaling, using thicker lines for the
effect scaling.
## Compare evidence and effect scaling
colors = c("red", "darkblue", "darkgreen", "brown")
heplot(plastic.mod, size="evidence",
col=colors, cex=1.25,
fill=TRUE, fill.alpha=0.1)
heplot(plastic.mod, size="effect",
add=TRUE, lwd=5, term.labels=FALSE, col=colors)
The interpretation can be easily read from the plot, at least for the two response variables
(tear
and gloss
) that are shown in this bivariate view. The effect of rate
of extrusion is
highly significant: high rate shows greater tear
compared to low rate. The effect of amount of
additive is not significant in this view, but high level of additive has greater tear
and gloss
.
With effect scaling, both the \(\mathbf{H}\) and \(\mathbf{E}\) sums of squares and products
matrices are both divided by the error df, giving multivariate analogs of univariate
measures of effect size, e.g., \((\bar{y}_1-\bar{y}_2) / s\).
With significance scaling, the \(\mathbf{H}\) ellipse is further divided by
\(\lambda_\alpha\), the critical value of Roy’s largest root statistic.
This scaling has the property that an \(\mathbf{H}\) ellipse will protrude somewhere
outside the \(\mathbf{E}\) ellipse iff the
multivariate test is significant at level \(\alpha\).
Figure 2.2 shows both scalings, using a thinner line for significance scaling.
Note that the (degenerate) ellipse for additive
is significant, but
does not protrude outside the \(\mathbf{E}\) ellipse in this view.
All that is guaranteed is that it will protrude somewhere in the 3D space of
the responses.
By design, means for the levels of interaction terms are not shown in the HE plot,
because doing so in general can lead to messy displays.
We can add them here for the term rate:additive
as follows:
# Compare evidence and effect scaling
colors = c("red", "darkblue", "darkgreen", "brown")
heplot(plastic.mod, size="evidence",
col=colors, cex=1.25,
fill=TRUE, fill.alpha=0.05)
heplot(plastic.mod, size="effect",
add=TRUE, lwd=5, term.labels=FALSE, col=colors)
## add interaction means
intMeans <- termMeans(plastic.mod, 'rate:additive', abbrev.levels=2)
points(intMeans[,1], intMeans[,2], pch=18, cex=1.2, col="brown")
text(intMeans[,1], intMeans[,2], rownames(intMeans),
adj=c(0.5, 1), col="brown")
lines(intMeans[c(1,3),1], intMeans[c(1,3),2], col="brown")
lines(intMeans[c(2,4),1], intMeans[c(2,4),2], col="brown")
The factor means in this plot (Figure 2.2 have a simple interpretation:
The high rate
level yields greater tear
resistance but lower gloss
than the low level.
The high additive
amount produces greater tear
resistance and greater gloss
.
The rate:additive
interaction is not significant overall, though it
approaches significance for gloss
.
The cell means for the combinations
of rate
and additive
shown in this figure suggest an explanation,
for tutorial purposes:
with the low level of rate
, there is little difference in gloss
for the levels of additive
. At the high level of rate
, there is
a larger difference in gloss
. The \(\mathbf{H}\) ellipse for the interaction
of rate:additive
therefore “points” in the direction of gloss
indicating that this variable contributes to the interaction in the
multivariate tests.
In some MANOVA models, it is of interest to test sub-hypotheses
of a given main effect or interaction, or conversely to test composite
hypotheses that pool together certain effects to test them jointly.
All of these tests (and, indeed, the tests of terms in a given model)
are carried out as tests of general linear hypotheses in the MLM.
In this example, it might be useful to test two composite hypotheses:
one corresponding to both main effects jointly, and another corresponding
to no difference among the means of the four groups (equivalent to
a joint test for the overall model). These tests are specified in terms
of subsets or linear combinations of the model parameters.
plastic.mod
#>
#> Call:
#> lm(formula = cbind(tear, gloss, opacity) ~ rate * additive, data = Plastic)
#>
#> Coefficients:
#> tear gloss opacity
#> (Intercept) 6.30 9.56 3.74
#> rateHigh 0.58 -0.84 -0.60
#> additiveHigh 0.38 0.02 0.10
#> rateHigh:additiveHigh 0.02 0.66 1.78
Thus, for example, the joint test of both main effects tests the parameters
rateHigh
and additiveHigh
.
linearHypothesis(plastic.mod, c("rateHigh", "additiveHigh"),
title="Main effects") |>
print(SSP=FALSE)
#>
#> Multivariate Tests: Main effects
#> Df test stat approx F num Df den Df Pr(>F)
#> Pillai 2 0.71161 2.7616 6 30 0.029394 *
#> Wilks 2 0.37410 2.9632 6 28 0.022839 *
#> Hotelling-Lawley 2 1.44400 3.1287 6 26 0.019176 *
#> Roy 2 1.26253 6.3127 3 15 0.005542 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linearHypothesis(plastic.mod, c("rateHigh", "additiveHigh", "rateHigh:additiveHigh"),
title="Groups") |>
print(SSP=FALSE)
#>
#> Multivariate Tests: Groups
#> Df test stat approx F num Df den Df Pr(>F)
#> Pillai 3 1.14560 3.2948 9 48.000 0.003350 **
#> Wilks 3 0.17802 3.9252 9 34.223 0.001663 **
#> Hotelling-Lawley 3 2.81752 3.9654 9 38.000 0.001245 **
#> Roy 3 1.86960 9.9712 3 16.000 0.000603 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correspondingly, we can display these tests in the HE plot by specifying these tests in the
hypothesis
argument to heplot()
, as shown in Figure 2.3.
heplot(plastic.mod,
hypotheses=list("Group" = c("rateHigh", "additiveHigh", "rateHigh:additiveHigh ")),
col=c(colors, "purple"),
fill = TRUE, fill.alpha = 0.1,
lwd=c(2, 3, 3, 3, 2), cex=1.25)
heplot(plastic.mod,
hypotheses=list("Main effects" = c("rateHigh", "additiveHigh")),
add=TRUE,
col=c(colors, "darkgreen"), cex=1.25)
Finally, a 3D HE plot can be produced with heplot3d()
, giving Figure 2.4.
This plot was rotated interactively to a view that shows both main effects
protruding outside the error ellipsoid.
colors = c("pink", "darkblue", "darkgreen", "brown")
heplot3d(plastic.mod, col=colors)