The factiv package allows analysts to address noncompliance in \(2^K\) factorial experiments using instrumental variables methods. The package implements the methods of Blackwell and Pashley (2021). A \(2^K\) factorial experiment randomly assigns units to \(2^K\) possible treatment combinations of \(K\) binary factors, and allows analysts to estimate the main and interactive effects of the factors, which we call factorial effects.
With noncompliance, the factorial effects of treatment uptake are not identified since noncompliance might be related to the outcomes. Blackwell and Pashley (2021) introduce a set of complier-average factorial effects that factiv is designed to estimate.
In factorial experiments, several factors are randomized at the same time. The newhaven data in the factiv package gives one example: a get-out-the-vote experiment in which households were randomly assigned to receive (a) in-person canvassing or not, and (b) phone canvassing or not. Thus, in this experiment (which we call a 2x2 factorial design), there are four possible treatment assignments corresponding to the four combinations of the two binary treatments.
library(factiv)
data(newhaven)
table(
  `In-Person` = newhaven$inperson_rand,
  `Phone` = newhaven$phone_rand
)
##          Phone
## In-Person    0    1
##         0 5645  633
##         1 1445  142
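The four cells in this table are the \(2^K\) treatment combinations with \(K = 2\). As a minimal sketch in base R, the combinations for any number of binary factors can be enumerated with expand.grid(). The first call mirrors the newhaven assignment columns; the third factor in the second call (mail_rand) is hypothetical and is not part of the newhaven data.

```r
# Enumerate the 2^K assignment combinations for K binary factors.
K <- 2
cells <- expand.grid(inperson_rand = 0:1, phone_rand = 0:1)
cells

# The number of rows equals 2^K.
nrow(cells) == 2^K

# A hypothetical third factor (mail_rand is NOT in newhaven) doubles the cells.
cells3 <- expand.grid(inperson_rand = 0:1, phone_rand = 0:1, mail_rand = 0:1)
nrow(cells3)
```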
In factorial experiments, there are two types of quantities of interest we may want to estimate: main effects and interactions. A main effect is the effect of one factor, marginalizing over the assignment of the other factor. For in-person canvassing in the newhaven experiment, this is the average of the effect of in-person canvassing when assigned to phone contact and the effect when assigned to no phone contact. An interaction is how the effect of one factor changes as a function of another factor. The in-person/phone interaction, then, is the difference between the effect of in-person canvassing when assigned to phone contact and the effect when not assigned to phone contact. These quantities of interest can be estimated as simple functions of the sample means within each treatment combination.
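To make these definitions concrete, here is a minimal base R sketch that computes a main effect and an interaction from four cell means. The cell means are made-up numbers for illustration, not estimates from newhaven, and the interaction follows the definition in the text (a plain difference of effects); some references scale it by one half.

```r
# Hypothetical mean outcomes by assignment cell (rows: in-person, cols: phone).
# These values are invented for illustration only.
ybar <- matrix(
  c(0.40, 0.44,   # in-person = 0: phone = 0, phone = 1
    0.46, 0.48),  # in-person = 1: phone = 0, phone = 1
  nrow = 2, byrow = TRUE,
  dimnames = list(inperson = c("0", "1"), phone = c("0", "1"))
)

# Effect of in-person assignment at each level of phone assignment.
eff_no_phone <- ybar["1", "0"] - ybar["0", "0"]
eff_phone    <- ybar["1", "1"] - ybar["0", "1"]

# Main effect: average of the two conditional effects.
main_inperson <- (eff_phone + eff_no_phone) / 2

# Interaction: how the in-person effect changes with phone assignment.
interaction_ip <- eff_phone - eff_no_phone

c(main = main_inperson, interaction = interaction_ip)
```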
Noncompliance occurs when respondents do not take their assigned treatment, so that treatment uptake differs from treatment assignment. We can see this in the newhaven experiment with phone canvassing:
table(
  `Phone Assignment` = newhaven$phone_rand,
  `Phone Uptake` = newhaven$phone
)
##                 Phone Uptake
## Phone Assignment    0    1
##                0 7090    0
##                1  626  149
Only 149 of the 775 households assigned to phone canvassing actually received it. Much of this noncompliance occurs because households do not answer their phones or hang up once they hear that the caller is a canvasser. Because the decision to comply (answer the phone) is likely correlated with the ultimate outcome, voting, comparing outcomes by treatment uptake rather than by assignment could lead to biased estimates.
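As a small worked check, the compliance rate for phone canvassing can be computed directly from the counts in the table above. This sketch retypes those counts into a matrix so that it runs without the package.

```r
# Counts copied from the assignment-by-uptake table above.
tab <- matrix(
  c(7090,   0,
     626, 149),
  nrow = 2, byrow = TRUE,
  dimnames = list(assigned = c("0", "1"), uptake = c("0", "1"))
)

# Share of households assigned to phone canvassing that received it.
compliance_rate <- tab["1", "1"] / sum(tab["1", ])
compliance_rate

# Nobody assigned to control received a call (one-sided noncompliance).
tab["0", "1"] == 0
```

With the package loaded, the same rate should be obtainable as mean(newhaven$phone[newhaven$phone_rand == 1]), assuming both columns are coded 0/1 as the table suggests.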
To overcome these issues, factiv takes an instrumental variables approach and focuses on estimating factorial effects among compliers, or those respondents who would comply with their assigned treatment. Blackwell and Pashley (2021) identified two types of complier effects in the factorial setting: marginalized complier average factorial effects (MCAFEs) and perfect complier average factorial effects (PCAFEs). These effects are simplest to understand for main effects.
For instance, in the newhaven data, the MCAFE for phone canvassing is the main factorial effect of (actual) phone canvassing among those who would pick up a phone call when called, regardless of how they would comply with in-person canvassing. The PCAFE would be the same effect for respondents who comply with both factors.
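The logic of focusing on compliers can be illustrated with a simulated single-factor example. The sketch below is not the factiv estimator: it uses invented data and the textbook ratio (Wald) estimator for one binary treatment, purely to show why comparing units by uptake is biased when compliers differ from non-compliers, while the ratio of assignment effects recovers the effect among compliers. All parameter values are assumptions chosen for illustration.

```r
set.seed(1)
n <- 50000

# Simulated (invented) data: 30% of units are compliers.
complier <- rbinom(n, 1, 0.3)
z <- rbinom(n, 1, 0.5)       # random assignment
d <- z * complier            # uptake: only assigned compliers are treated

# Compliers have a higher baseline outcome; the true effect of uptake is 0.1.
p <- 0.3 + 0.3 * complier + 0.1 * d
y <- rbinom(n, 1, p)

# Naive comparison by uptake mixes the effect with complier/non-complier
# differences in baseline outcomes.
naive <- mean(y[d == 1]) - mean(y[d == 0])

# Ratio estimator: effect of assignment on the outcome divided by the
# effect of assignment on uptake.
itt_y <- mean(y[z == 1]) - mean(y[z == 0])
itt_d <- mean(d[z == 1]) - mean(d[z == 0])
wald  <- itt_y / itt_d

c(naive = naive, wald = wald, truth = 0.1)
```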
For interactions, we refer to the set of factors being interacted as the active factors. Then we can define our quantities for interactions:
Finally, we note that these interpretations of the quantities of interest depend on several instrumental variable assumptions detailed in Blackwell and Pashley (2021). These include monotonicity of the effects and two exclusion restrictions. The first, the outcome exclusion restriction, requires treatment assignment to affect the outcome only through treatment uptake. The second, the treatment exclusion restriction, requires assignment of each factor to affect only the uptake of that factor and not of any others. For more details on these assumptions and the interpretation of these quantities of interest when the assumptions do not hold, see Blackwell and Pashley (2021).
factiv provides two ways of estimating the complier effects that differ in how they treat uncertainty. The first is a finite-population (or finite-sample) approach that treats outcomes as fixed and views treatment assignment as the only source of variation.
The iv_finite_factorial() function provides these estimates, along with confidence intervals. To specify the model, provide a formula whose right-hand side has two parts separated by |: the treatment uptake variables for each factor (on the left of |) and the treatment assignment variables for each factor (on the right of |). For the newhaven data, we have:
out <- iv_finite_factorial(turnout_98 ~ inperson + phone | inperson_rand +
  phone_rand, data = newhaven)
summary(out)
##
## Call:
## iv_finite_factorial(formula = turnout_98 ~ inperson + phone |
## inperson_rand + phone_rand, data = newhaven)
##
## Marginalized-complier factorial effects:
## Estimate 95% Confidence Interval
## inperson 0.0997 (-0.0538, 0.2523)
## phone -0.2114 (-0.4940, 0.0352)
##
## Perfect-complier factorial effects:
## Estimate 95% Confidence Interval
## inperson 0.0820 (-0.6656, 0.9921)
## phone 0.0932 (-0.6157, 1.0967)
## inperson:phone -0.1045 (-0.7908, 0.5747)
factiv also provides methods for accessing tidy versions of the output from its estimation function via the broom package.
tidy(out)
## # A tibble: 5 x 7
## term estimand estimate ci_1_lower ci_1_upper ci_2_lower ci_2_upper
## * <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 inperson MCAFE 0.0997 -0.0538 0.252 NA NA
## 2 phone MCAFE -0.211 -0.494 0.0352 NA NA
## 3 inperson PCAFE 0.0820 -0.666 0.992 NA NA
## 4 phone PCAFE 0.0932 -0.616 1.10 NA NA
## 5 inperson:phone PCAFE -0.105 -0.791 0.575 NA NA
In this tidy output, there are columns for two sets of confidence intervals. This is because the approach to confidence intervals we use, which we call the Fieller method, can sometimes produce disjoint (or infinite-length) confidence intervals if compliance is very low. In the output above, the second set is NA for every term.
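To see how disjoint or unbounded intervals can arise, here is a minimal sketch of a generic Fieller confidence set for a ratio of two approximately normal estimates. It is a textbook construction, not necessarily what factiv does internally; the function name fieller_ci, its arguments, and the numeric inputs are all hypothetical. The set collects every ratio value \(\theta\) for which num - \(\theta\) den is not significantly different from zero, which reduces to a quadratic inequality in \(\theta\). The sketch ignores the boundary case where the leading coefficient is exactly zero.

```r
# Generic Fieller confidence set for theta = num / den (illustrative sketch).
fieller_ci <- function(num, den, v_num, v_den, cov_nd = 0, level = 0.95) {
  z2 <- qnorm(1 - (1 - level) / 2)^2
  # Coefficients of qa * theta^2 + qb * theta + qc <= 0.
  qa <- den^2 - z2 * v_den
  qb <- -2 * (num * den - z2 * cov_nd)
  qc <- num^2 - z2 * v_num
  disc <- qb^2 - 4 * qa * qc
  if (qa > 0) {
    # Denominator clearly nonzero: an ordinary bounded interval.
    roots <- sort((-qb + c(-1, 1) * sqrt(max(disc, 0))) / (2 * qa))
    list(type = "bounded", lower = roots[1], upper = roots[2])
  } else if (disc > 0) {
    # Weak denominator: the set is two rays, (-Inf, r1] and [r2, Inf).
    roots <- sort((-qb + c(-1, 1) * sqrt(disc)) / (2 * qa))
    list(type = "disjoint", upper_of_left = roots[1], lower_of_right = roots[2])
  } else {
    # Nothing can be ruled out: the whole real line.
    list(type = "unbounded")
  }
}

strong <- fieller_ci(num = 0.02,  den = 0.2,   v_num = 1e-4, v_den = 1e-4)
weak   <- fieller_ci(num = 0.05,  den = 0.01,  v_num = 1e-4, v_den = 1e-4)
none   <- fieller_ci(num = 0.001, den = 0.001, v_num = 1e-4, v_den = 1e-4)
c(strong$type, weak$type, none$type)
```

In this sketch the denominator plays the role of the estimated compliance effect, so a denominator that is small relative to its standard error corresponds to the very low compliance case described above.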
The second estimation approach factiv implements is based on a superpopulation approach to inference. In this setting, we consider our sample as a random sample from an infinite superpopulation and entertain variation from both treatment assignment and that sampling process. The iv_factorial() function implements this approach:
out_sp <- iv_factorial(turnout_98 ~ inperson + phone | inperson_rand +
  phone_rand, data = newhaven)
summary(out_sp)
##
## Main effects among perfect compliers:
## tval pval
## inperson 0.08204 0.35517 0.23099 0.817
## phone 0.09320 0.36317 0.25664 0.797
## inperson:phone -0.10452 0.29398 -0.35553 0.722
##
## Estimated prob. of perfect compliers: 0.07969 SE = 0.02311
The superpopulation approach also has a tidy method:
tidy(out_sp, conf.int = TRUE)
## # A tibble: 5 x 6
## term estimand estimate std.error conf.low conf.high
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 inperson MCAFE 0.0997 0.0774 -0.0519 0.251
## 2 phone MCAFE -0.211 0.132 -0.471 0.0478
## 3 inperson PCAFE 0.0820 0.355 -0.614 0.778
## 4 phone PCAFE 0.0932 0.363 -0.619 0.805
## 5 inperson:phone PCAFE -0.105 0.294 -0.681 0.472
Here, the confidence intervals are based on the usual large-sample variance estimates from the delta method and so will never be disjoint.
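As a quick check of this description, the intervals in the tidy output above can be reproduced (up to rounding) as estimate plus or minus a normal quantile times the standard error. The sketch below retypes two rows of that output and also includes a generic first-order delta-method standard error for a ratio; the helper delta_se_ratio and its inputs are hypothetical and are not taken from the package.

```r
# Estimates and standard errors copied (rounded) from the tidy output above.
est <- c(inperson = 0.0997, phone = -0.211)
se  <- c(inperson = 0.0774, phone = 0.132)

# Normal-approximation 95% intervals; these are always a single interval.
z  <- qnorm(0.975)
ci <- cbind(conf.low = est - z * se, conf.high = est + z * se)
round(ci, 3)

# Generic delta-method standard error for a ratio num / den (hypothetical
# helper for illustration; v_* are variances, cov_nd their covariance).
delta_se_ratio <- function(num, den, v_num, v_den, cov_nd = 0) {
  theta <- num / den
  sqrt(v_num - 2 * theta * cov_nd + theta^2 * v_den) / abs(den)
}
se_example <- delta_se_ratio(num = 0.02, den = 0.2, v_num = 1e-4, v_den = 1e-4)
se_example
```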
One downside to estimating complier effects is that we generally cannot identify which respondents are or are not part of each group. Fortunately, Blackwell and Pashley (2021) showed how to estimate profiles of the different complier groups in terms of covariate means. factiv implements this approach in the compliance_profile() function. You can pass this function a formula similar to those used in the estimation functions above, with an additional third part, separated by a |, that indicates which covariates you would like to use in the profile:
cov_prof <- compliance_profile(
  ~ inperson + phone | inperson_rand + phone_rand |
    age + maj_party + turnout_96,
  data = newhaven)
cov_prof
## $raw_table
## term overall inperson phone inperson:phone
## 1 age 49.0173656 56.3512346 61.7818218 48.6963536
## 2 maj_party 0.7298688 0.7943939 0.7470757 0.6262926
## 3 turnout_96 0.4543350 0.4893304 0.6148287 0.6951197
##
## $std_table
## term overall inperson phone inperson:phone
## 1 age 49.0173656 0.36819497 0.64083618 -0.01611632
## 2 maj_party 0.7298688 0.14530850 0.03874941 -0.23325020
## 3 turnout_96 0.4543350 0.07028004 0.32231383 0.48355935
This function returns data frames with averages of each covariate within each marginal complier group and the perfect complier group (which is the group associated with the highest-order interaction). For example, in the newhaven output, we see that the marginal compliers for phone canvassing are estimated to be on average roughly 61.8 years old, compared with 49 in the sample overall. The std_table reports the difference between each group's mean and the overall mean, in units of the overall sample's standard deviation.
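The relationship between the two tables can be checked by hand. The sketch below retypes the age row of both tables and backs out the standard deviation implied by each standardized difference; if the standardization works as described, all three groups should imply the same overall standard deviation of age. The numbers are copied from the output above and nothing else about the function's internals is assumed.

```r
# Age row copied from raw_table and std_table above.
overall <- 49.0173656
raw <- c(inperson = 56.3512346, phone = 61.7818218,
         `inperson:phone` = 48.6963536)
std <- c(inperson = 0.36819497, phone = 0.64083618,
         `inperson:phone` = -0.01611632)

# If std = (group mean - overall mean) / sd, then each column implies an sd.
implied_sd <- (raw - overall) / std
implied_sd

# Rebuild the standardized differences using the average implied sd.
rebuilt <- (raw - overall) / mean(implied_sd)
round(rebuilt, 3)
```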