Effect Sizes with ART

Introduction

The aligned-rank transform (ART) allows for non-parametric analyses of variance. But how do we derive effect sizes from ART results?

NOTE: Before embarking down the path of calculating standardized effect sizes, it is always worth asking if that is what you really want. Carefully consider, for example, the arguments of Cummings (2011) against the use of standardized effect sizes and in favor of simple (unstandardized) effect sizes. If you decide you would rather use simple effect sizes, you may need to consider a different procedure than ART, as the ranking procedure destroys the information necessary to calculate simple effect sizes.

Contents

  1. Test Dataset: The test data we will use to compare a linear model against ART
  2. Partial eta-squared: Calculation of partial eta-squared (effect size for F tests)
  3. Cohen’s d: Calculation of standardized mean differences (Cohen’s d; effect size for t tests), including confidence intervals

Libraries needed for this

library(dplyr)      #%>%
library(emmeans)    #emmeans
library(DescTools)  #EtaSq
library(car)        #sigmaHat
library(ARTool)     #art, artlm
library(ggplot2)    #ggplot, stat_..., geom_..., etc

Test dataset

Let’s load the test dataset from vignette(“art-contrasts”):

data(InteractionTestData, package = "ARTool")
df = InteractionTestData    #save some typing

Let’s fit a linear model:

#we'll be doing type 3 tests, so we want sum-to-zero contrasts
options(contrasts = c("contr.sum", "contr.poly"))
m.linear = lm(Y ~ X1*X2, data=df)

Now with ART:

m.art = art(Y ~ X1*X2, data=df)

Partial eta-squared

Note that for Fixed-effects-only models and repeated measures models (those with Error() terms) ARTool also collects the sums of squares, but does not print them by default. We can pass verbose = TRUE to print() to print them:

m.art.anova = anova(m.art)
print(m.art.anova, verbose=TRUE)
## Analysis of Variance of Aligned Rank Transformed Data
## 
## Table Type: Anova Table (Type III tests) 
## Model: No Repeated Measures (lm)
## Response: art(Y)
## 
##         Df Df.res  Sum Sq Sum Sq.res F value     Pr(>F)    
## 1 X1     1    294 1403842     845192  488.33 < 2.22e-16 ***
## 2 X2     2    294  984215    1265239  114.35 < 2.22e-16 ***
## 3 X1:X2  2    294 1119512    1129896  145.65 < 2.22e-16 ***
## ---
## Signif. codes:   0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can use the sums of squares to calculate partial eta-squared:

m.art.anova$eta.sq.part = with(m.art.anova, `Sum Sq`/(`Sum Sq` + `Sum Sq.res`))
m.art.anova
## Analysis of Variance of Aligned Rank Transformed Data
## 
## Table Type: Anova Table (Type III tests) 
## Model: No Repeated Measures (lm)
## Response: art(Y)
## 
##         Df Df.res F value     Pr(>F) eta.sq.part    
## 1 X1     1    294  488.33 < 2.22e-16     0.62420 ***
## 2 X2     2    294  114.35 < 2.22e-16     0.43754 ***
## 3 X1:X2  2    294  145.65 < 2.22e-16     0.49769 ***
## ---
## Signif. codes:   0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can compare the above results to partial eta-squared calculated on the linear model (the second column below):

EtaSq(m.linear, type=3)
##          eta.sq eta.sq.part
## X1    0.3562872   0.5991468
## X2    0.1890921   0.4423595
## X1:X2 0.2162503   0.4756719

The results are comparable.

Cohen’s d

We can derive Cohen’s d (the standardized mean difference) by dividing estimated differences by the residual standard deviation of the model. Note that this relies somewhat on the assumption of constant variance across levels (aka homoscedasticity).

in the linear model (for comparison)

As a comparison, let’s first derive pairwise contrasts for all levels of X2 in the linear model:

x2.contrasts = summary(pairs(emmeans(m.linear, ~ X2)))
## NOTE: Results may be misleading due to involvement in interactions

Then divide these estimates by the residual standard deviation to get an estimate of d:

x2.contrasts$d = x2.contrasts$estimate / sigmaHat(m.linear)
x2.contrasts
##  contrast estimate    SE  df t.ratio p.value       d
##  C - D     -1.9121 0.142 294 -13.428  <.0001 -1.8991
##  C - E     -1.8530 0.142 294 -13.013  <.0001 -1.8403
##  D - E      0.0592 0.142 294   0.415  0.9093  0.0588
## 
## Results are averaged over the levels of: X1 
## P value adjustment: tukey method for comparing a family of 3 estimates

Note that this is essentially the same as the unstandardized estimate for this model; that is because this test dataset was generated with a residual standard deviation of 1.

in ART

We can follow the same procedure on the ART model for factor X2:

m.art.x2 = artlm(m.art, "X2")
x2.contrasts.art = summary(pairs(emmeans(m.art.x2, ~ X2)))
## NOTE: Results may be misleading due to involvement in interactions
x2.contrasts.art$d = x2.contrasts.art$estimate / sigmaHat(m.art.x2)
x2.contrasts.art
##  contrast estimate   SE  df t.ratio p.value       d
##  C - D     -123.13 9.28 294 -13.272  <.0001 -1.8769
##  C - E     -119.81 9.28 294 -12.914  <.0001 -1.8263
##  D - E        3.32 9.28 294   0.358  0.9319  0.0506
## 
## Results are averaged over the levels of: X1 
## P value adjustment: tukey method for comparing a family of 3 estimates

Note how standardization is helping us now: The standardized mean differences (d) are quite similar to the estimates of d from the linear model above.

Confidence intervals

We can also derive confidence intervals on these effect sizes. To do that, we’ll use the d.ci function from the psych package, which also requires us to indicate how many observations were in each group for each contrast. That is easy in this case, as each group has 100 observations. Thus:

x2.contrasts.ci = confint(pairs(emmeans(m.linear, ~ X2))) %>%
    mutate(d = estimate / sigmaHat(m.linear)) %>%
    cbind(d = plyr::ldply(.$d, psych::d.ci, n1 = 100, n2 = 100))
## NOTE: Results may be misleading due to involvement in interactions
x2.contrasts.ci
##   contrast    estimate        SE  df   lower.CL   upper.CL          d
## 1    C - D -1.91212883 0.1423941 294 -2.2475590 -1.5766987 -1.8990660
## 2    C - E -1.85296777 0.1423941 294 -2.1883979 -1.5175376 -1.8403091
## 3    D - E  0.05916106 0.1423941 294 -0.2762691  0.3945912  0.0587569
##      d.lower   d.effect    d.upper
## 1 -2.2317001 -1.8990660 -1.5630634
## 2 -2.1697674 -1.8403091 -1.5075275
## 3 -0.2185548  0.0587569  0.3359243

And from the ART model:

x2.contrasts.art.ci = confint(pairs(emmeans(m.art.x2, ~ X2))) %>%
    mutate(d = estimate / sigmaHat(m.art.x2)) %>%
    cbind(d = plyr::ldply(.$d, psych::d.ci, n1 = 100, n2 = 100)) 
## NOTE: Results may be misleading due to involvement in interactions
x2.contrasts.art.ci
##   contrast estimate       SE  df   lower.CL   upper.CL           d    d.lower
## 1    C - D  -123.13 9.277428 294 -144.98434 -101.27566 -1.87694379 -2.2083738
## 2    C - E  -119.81 9.277428 294 -141.66434  -97.95566 -1.82633505 -2.1550486
## 3    D - E     3.32 9.277428 294  -18.53434   25.17434  0.05060873 -0.2266816
##      d.effect    d.upper
## 1 -1.87694379 -1.5421619
## 2 -1.82633505 -1.4943094
## 3  0.05060873  0.3277681

And plotting both, to compare (red dashed line is the true effect):

rbind(
        cbind(x2.contrasts.ci, model="linear"), 
        cbind(x2.contrasts.art.ci, model="ART")
    ) %>%
    ggplot(aes(x=model, y=d, ymin=d.lower, ymax=d.upper)) +
    geom_pointrange() +
    geom_hline(aes(yintercept = true_effect), 
      data = data.frame(true_effect = c(-2, -2, 0), contrast = c("C - D", "C - E", "D - E")), 
      linetype = "dashed", color = "red") +
    facet_grid(contrast ~ .) + 
    coord_flip()