Unifying instrumental variable and inverse probability weighting approaches for inference of causal treatment effect and unmeasured confounding in observational studies

Abstract

Confounding is a major concern when using data from observational studies to infer the causal effect of a treatment. Instrumental variables, when available, have been used to construct bound estimates on population average treatment effects when outcomes are binary and unmeasured confounding exists. With continuous outcomes, meaningful bounds are more challenging to obtain because the domain of the outcome is unrestricted. In this paper, we propose to unify the instrumental variable and inverse probability weighting methods, together with suitable assumptions in the context of an observational study, to construct meaningful bounds on causal treatment effects. The contextual assumptions are imposed in terms of the potential outcomes that are partially identified by data. The inverse probability weighting component incorporates a sensitivity parameter to encode the effect of unmeasured confounding. The instrumental variable and inverse probability weighting methods are unified using the principal stratification. By solving the resulting system of estimating equations, we are able to quantify both the causal treatment effect and the sensitivity parameter (i.e. the degree of the unmeasured confounding). We demonstrate our method by analyzing data from the HIV Epidemiology Research Study.

Keywords

Causal treatment effect, identification bounds, instrumental variable, inverse probability weighting, unmeasured confounding

1 Introduction

Observational studies offer an important alternative to randomized clinical trials when randomly assigning treatments to study subjects is unethical or practically impossible.¹ Analyzing data from such studies, however, confronts a difficulty that a direct comparison between the treated and untreated subjects does not necessarily reflect the causal effect of the treatment due to confounding.²

To control for the confounding effect, study investigators typically collect a rich set of covariates with the hope that these covariates capture all background differences between treated and untreated subjects. If all background differences between the comparison groups have been correctly measured, the confounding bias can be corrected by adjustments such as multivariable regressions, stratified analyses, propensity score matching, and inverse probability weighting (IPW).^3–11 Absence of unmeasured confounding (i.e. ignorability) is a strong and untestable assumption, and often implausible in observational studies.

When ignorability cannot be assumed, estimating the causal effect of a treatment and sometimes quantifying the degree of unmeasured confounding are imperative. The former is often the primary research objective, while the latter becomes important when the robustness of analyses assuming ignorability needs to be objectively assessed, or when the impact of unmeasured confounding needs to be evaluated, e.g., for planning future studies in similar settings. These two objectives are typically not achievable in a single observational study, but become possible when an instrumental variable (IV) is available.

Instrumental variable methods can be traced back to the 1920s,^12,13 and have been extensively implemented in econometrics and, more recently, in biomedical research. Loosely speaking, an IV can be envisioned as a “randomizer” which (1) varies independently of confounders, (2) has a causal effect on treatment received, but (3) has no direct effect on the outcome of interest. These three conditions are conventionally referred to as the exogeneity, monotonicity, and exclusion restriction assumptions, respectively.¹⁴ An IV allows for drawing causal inference about treatment effect despite the existence of unmeasured confounding. However, without additional assumptions, the IV estimate of the treatment effect applies only to a non-identifiable subpopulation of those whose treatment can be changed by the IV.^15–17

The population average treatment effect (ATE) is generally of broad interest in public health and epidemiology. To infer the ATE, IVs (if available) have been used to construct bound estimates in simple settings (e.g. when both treatment and outcome are binary).^18–23 In those settings, the uncertainty of unmeasured confounding effect is accounted for by a bound estimate instead of a point estimate. With continuous outcomes, obtaining bounds on ATE becomes a challenge because the domain of the outcome is typically unrestricted. In this case, additional properties of data may be implemented to construct contextually proper constraints on the observed and counterfactual data so as to identify meaningful bounds on the ATE. In this paper, we use the HIV Epidemiology Research Study (HERS)^24,25 as an example to explore such constraints. Our interest is to estimate the ATE of highly active antiretroviral therapy (HAART) on patients’ CD4+ T lymphocytes (CD4) count and to quantify the degree of unmeasured confounding in the HERS.

The HIV Epidemiology Research Study was conducted when HAART first became available to HIV-infected patients. It was a cohort study, and prescription of HAART to study participants was not random. One of the investigators’ interests was the initial-stage causal effect of HAART on patients’ CD4 count, an immunological marker for immune system function and disease stage. The study had collected an extensive set of covariates, but, as in many observational studies, unmeasured confounding might still exist^26,27 and its impact was unclear. To give a sense of the possible unmeasured confounding: many HIV-positive individuals in the early HAART era were reluctant to initiate therapy for fear of adverse side effects and toxicity, and physicians at the time tended to prescribe HAART to patients in poor health, particularly those with low CD4 counts. These prognostic factors were not fully measured and possibly confounded the HAART effect in a non-negligible way by affecting both treatment decisions and outcomes.²⁵

In this paper, we describe an approach that unifies the IV and IPW methods to simultaneously quantify (1) the ATE and (2) the degree of unmeasured confounding. To account for measured confounding, we use the IPW method⁴ to “restore” the balance on measured covariates between treated and untreated subjects. To capture the unmeasured confounding, a sensitivity parameter is incorporated into the IPW estimating equations using the approach of Robins et al.²⁸ The sensitivity parameter is defined as the systematic difference in outcomes that would remain between the treated and untreated patients if they were hypothetically exposed to the same treatment condition, after the measured confounding has been balanced out. This sensitivity parameter has been previously used to conduct sensitivity analyses to assess the robustness of estimated causal treatment effects to unmeasured confounding.^25,29 In this paper, we assume that an IV is available. The HERS was conducted at two types of study sites: academic medical centers and community health clinics. This motivates us to consider using the type of HERS study site as an instrumental variable. Instead of conducting a sensitivity analysis, we take advantage of the additional information provided by the IV to estimate the sensitivity parameter. We propose to unify the IV and IPW estimating equations with a constraint imposed by principal stratification³⁰ and contextually suitable assumptions. By solving the resulting system of estimating equations, we obtain causal bound estimates on both the ATE and the sensitivity parameter for unmeasured confounding.

The rest of the paper is organized as follows: More details about the HERS and motivations of using HERS site as an IV are provided in Section 2. Notations and models are elaborated in Section 3. In Section 4, we review the IV and IPW methods, and then introduce a unified system of estimating equations derived from them. In Section 5, we present three sets of constraints and assumptions in the context of HERS, and develop bounds on the initial-stage ATE of HAART on CD4 count and bounds on the degree of unmeasured confounding. In Section 6, we analyze the HERS data, and finally in Section 7, we offer some points for discussion.

2 The HIV Epidemiology Research Study (HERS)

2.1 Study overview

The HERS was conducted from 1993 to 2001 to investigate the natural history of HIV progression in women. Details of the study have been reported previously.²⁴ The study enrolled a total of 871 HIV-infected women at four study sites: Detroit, Providence, Baltimore, and New York City. Clinical outcomes (e.g. CD4 count) of each participant were recorded about every six months since enrollment. Starting around 1996, HAART became the recommended treatment regimen for HIV-infected people, especially for those with low CD4 counts.³¹ In this paper, we used data extracted for 201 HERS participants who completed both their seventh and eighth visits and met the following two conditions: (a) they were HAART naive before their seventh visit, and (b) they had a low CD4 count of less than 350 cells/mm³ before their eighth visit, which indicated a deteriorating immune system. Some of them were prescribed HAART after the seventh visit. Their CD4 counts at the eighth visit were used as the outcome. The study had collected a rich set of covariates, but unmeasured confounding might still exist.

Table 1 summarizes the key demographic and clinical characteristics of the 201 women. In brief, 46 women (23%) had initiated the HAART. Those receiving HAART had a higher CD4 count on average than those not on HAART, but this “as-received” treatment effect³² was not statistically significant (standard normal z statistic = 0.58) and was certainly a biased estimate of HAART causal effect.

Table 1.

Summary of patient demographic characteristics by HAART receipt status and study site.

	Received HAART?		Comparison
	Yes	No	Statistic
Number of patients (n)	46	155	–
Average CD4 counts (cell/mm $^{3}$ )	229 (19)	216 (11)	Z $= 0.58$
Candidate confounders
Race:
Black; white; others	46; 28; 26%	61; 15; 24%	$χ_{2}^{2} = 5.1$
ARV med. receipt rate
At enrollment (%)	50% (7.4)	39% (3.9)	Z $= 1.2$
At previous visit (%)	74% (6.5)	57% (4.0)	Z $= 1.9$
Presence of HIV symptoms (%)	26% (6.5)	37% (3.9)	Z $= 1.2$
HIV RNA ($\log_{10}$ copies/mm$^{3}$)
At enrollment (average)	3.2 (.15)	3.1 (.07)	Z $= .78$
At previous visit (average)	3.7 (.15)	3.4 (.09)	Z $= 1.5$
Intravenous drug use
Recent (%)	22% (6.1)	25% (3.5)	Z $= .19$
Lifetime (%)	61% (7.2)	63% (3.9)	Z $= .04$
Aware of HIV status (%)	83% (5.6)	81% (3.2)	Z $= .08$
	The HERS study site
	Academic centers	Community clinics
Number of patients (n)	93	108	–
HAART received, n (%)	26 (28%)	20 (18%)	Z $= 1.4$
Average CD4 (cell/mm $^{3}$ )	230 (14)	210 (12)	Z $= 1.0$

Note: The numbers inside parentheses are standard errors. Z denotes the statistic of a standard normal test for comparing two sample means, and $χ_{2}^{2}$ the statistic of Pearson’s chi-squared test with two degrees of freedom. HAART: highly active antiretroviral therapy.

Ko et al.²⁵ analyzed the same data set and identified several candidate confounders, which are listed in the upper panel of Table 1. Notably, we found that patients receiving HAART were (a) more likely to be aware of their HIV status and to be on antiretroviral medicines at their enrollment and at the previous visit; (b) presenting fewer HIV symptoms and less likely to be drug users; (c) having higher viral loads (HIV RNA) at their enrollment and at the previous visit; and (d) more often white and less often black. As pointed out earlier, other confounders likely existed and were not fully captured by the study.

2.2 HERS study site as IV

The HERS was a multi-center study designed to recruit participants from two types of study sites for increased generalizability. The study sites in Detroit and Providence were academic medical centers, while the other two study sites in Baltimore and New York City were community health clinics. The two types of study sites differed in many aspects. For example, the HERS investigators have noted that the academic sites had higher referral rates to the HERS by physicians and study clinic nurses and had higher HAART uptake rates among their participants.²⁴ Generally speaking, compared with community health clinics, academic medical centers tended to be more actively involved in research on cutting-edge therapies and innovative HIV treatments in addition to routine patient care. As a result, physicians at the academic medical centers were more likely to be aware of the latest breakthroughs in HIV treatment, and hence, when HAART first became available, they were more likely to prescribe it to HIV patients.

These differences motivate us to consider using the type of HERS study site as an IV. Using different characteristics of hospitals or physicians as IVs has been explored in other studies.^27,33,34 As noted by the authors of these studies, differences in health care facilities or providers can be a reasonable but not a perfect IV. In the following section, we formalize the assumptions that are needed to use HERS study site as an IV. Potential violations of these assumptions are pointed out and their impacts are discussed later in Section 7.

3 Notations and definitions

3.1 Notation

We use Z to denote an IV (in the HERS, $Z = 1$ if the study site is an academic medical center and $Z = 0$ if it is a community health clinic) and $A_{z}$ the potential treatment status that an individual would receive should Z be set to z. This notation implies that each individual has a pair of potential treatments $(A_{1}, A_{0})$ ^35,36 that she would potentially receive if seen by doctors at the two types of study site, and the actual treatment received is $A = A_{Z} = A_{1} Z + A_{0} (1 - Z)$, where $A = 1$ means that the individual receives HAART, and $A = 0$ otherwise. With the Stable Unit Treatment Value Assumption,³⁶ we use $Y_{z} (a)$ to denote the potential outcome for an individual should we hypothetically set the IV to z and her treatment to a. The actual outcome observed is therefore $Y = Y_{Z} (A) = Y_{Z} (A_{Z})$. In the following, we denote all confounders by a vector X and the measured confounders by $V \subseteq X$. The observed data consist of n independent and identically distributed copies of $\{V_{i}, Z_{i}, A_{i}, Y_{i}\}, i = 1, 2, \dots, n$.

We assume that the conventional IV assumptions (exogeneity, exclusion restriction, and monotonicity)^15,16 are satisfied. The exclusion restriction implies that $Y (a) \equiv Y_{1} (a) = Y_{0} (a)$, i.e. the IV has no direct effect on the outcome beyond its impact on the individual’s treatment. The monotonicity assumption states that $\Pr (A_{1} \geq A_{0}) = 1$, i.e. an individual who would not be prescribed HAART at an academic medical center would not be prescribed it at a community health clinic either. The exogeneity assumption states that Z is jointly independent of the potential outcomes and treatments, $Z ⊥ (A_{0}, A_{1}, Y (0), Y (1))$.

3.2 Definitions of causal treatment effect

Using potential outcomes, the causal effect of a treatment can be defined at different levels. The ATE $= E \{Y (1) - Y (0)\}$ is the treatment effect defined over the entire population, which is the interest of this paper. With an IV, the local average treatment effect (LATE)¹⁵ is defined as the average treatment effect among the subpopulation whose treatment assignment can be changed by the IV, i.e. $LATE = E \{Y (1) - Y (0) | A_{0} = 0, A_{1} = 1\}$. For a given IV, the LATE can be estimated consistently even in the presence of unmeasured confounding. However, a limitation of LATE estimates is that the subpopulation is not fully identified and the interpretation of the LATE depends on the choice of IV, which poses a significant drawback for generalizing results to the general population and to other settings.

The relationship between the ATE and LATE can be expressed using principal stratification.³⁰ For a binary instrument and a binary treatment, principal stratification partitions the population into four mutually exclusive subpopulations based on the potential treatments each individual would have. In HERS, $P_{00} = \{(A_{0}, A_{1}) = (0, 0)\}$ is the subpopulation who would never receive HAART; $P_{01} = \{(A_{0}, A_{1}) = (0, 1)\}$ is the subpopulation who would receive HAART only at academic medical centers; $P_{10} = \{(A_{0}, A_{1}) = (1, 0)\}$ is the subpopulation who would receive HAART only at community health clinics; and $P_{11} = \{(A_{0}, A_{1}) = (1, 1)\}$ is the subpopulation who would always receive HAART. The monotonicity assumption implies that the subpopulation $P_{10}$ is an empty set.

Let us denote the estimands of ATE and LATE by $β^{ATE}$ and $β^{LATE}$, respectively. Then the relationship between ATE and LATE is expressed as

$$β^{ATE} = π_{01} β^{LATE} + π_{00} \{μ_{00} (1) - μ_{00} (0)\} + π_{11} \{μ_{11} (1) - μ_{11} (0)\} \tag{1}$$

where $π_{jk} = \Pr (P_{jk})$ and $μ_{jk} (a) = E \{Y (a) | P_{jk}\}$. We will use this relationship to unify the IPW and IV estimation methods.

4 Review of estimation methods

4.1 The IPW method

Putting aside the covariates for the moment, the potential outcomes can be rewritten using a marginal structural mean model^6,37,38

$$E [Y (a)] = β_{0} + β^{ATE} a, \quad a = 0, 1 \tag{2}$$

When unmeasured confounding is absent (i.e. $Y (a) ⊥ A | V$), $β^{ATE}$ can be consistently estimated by the solution ${\hat{β}}_{IPW}$ to the following IPW estimating equations⁴

$$U_{1} (β_{IPW}) := \sum_{i = 1}^{n} (1, A_{i})^{⊤} W_{1i} (Y_{i} - β_{1} - A_{i} β_{IPW}) = 0,$$

where $W_{1i} = A_{i} / e (V_{i}; γ) + (1 - A_{i}) / \{1 - e (V_{i}; γ)\}$, and $e (V; γ) = \Pr (A = 1 | V)$ is the propensity score³⁹ with an l-dimensional parameter γ. We assume that $0 < e (V; γ) < 1$ so that the estimating equations are well defined. This condition is referred to as the positivity assumption in the causal inference literature. It assumes that all individuals have positive probabilities of receiving the treatment and positive probabilities of not receiving the treatment. In other words, for any subpopulation on the support of $V$, we have information available about the distributions of both Y(0) and Y(1).
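As an illustration of how $U_{1}$ can be solved, the following is a minimal sketch in Python, assuming the propensity scores have already been estimated and are passed in as plain numbers. The function name `ipw_estimate` and the toy inputs are hypothetical and are not taken from the HERS analysis. Because the treatment is binary, the two equations in $U_{1}$ reduce to inverse-probability-weighted means within each treatment group.

```python
from typing import Sequence, Tuple


def ipw_estimate(
    y: Sequence[float], a: Sequence[int], e: Sequence[float]
) -> Tuple[float, float]:
    """Solve U_1 = 0 for (beta_1, beta_IPW) with a binary treatment.

    y: observed outcomes Y_i
    a: treatment indicators A_i (0 or 1)
    e: propensity scores e(V_i; gamma), assumed already estimated
    """
    if any(not 0.0 < ei < 1.0 for ei in e):
        raise ValueError("positivity requires 0 < e(V; gamma) < 1")

    # W_1i = A_i / e_i + (1 - A_i) / (1 - e_i)
    w = [ai / ei + (1 - ai) / (1 - ei) for ai, ei in zip(a, e)]

    # With binary A, U_1 = 0 is solved by weighted means in each arm.
    sw1 = sum(wi for wi, ai in zip(w, a) if ai == 1)
    sw0 = sum(wi for wi, ai in zip(w, a) if ai == 0)
    if sw1 == 0 or sw0 == 0:
        raise ValueError("need both treated and untreated subjects")
    mean1 = sum(wi * yi for wi, yi, ai in zip(w, y, a) if ai == 1) / sw1
    mean0 = sum(wi * yi for wi, yi, ai in zip(w, y, a) if ai == 0) / sw0

    # beta_1 is the weighted untreated mean; beta_IPW is the contrast.
    return mean0, mean1 - mean0


# Toy data (hypothetical, for illustration only).
beta_1, beta_ipw = ipw_estimate(
    [10.0, 20.0, 30.0, 40.0], [1, 0, 1, 0], [0.5, 0.5, 0.25, 0.75]
)
```

The sketch uses the unstabilized weights $W_{1i}$ exactly as defined above; stabilized weights or a doubly robust augmentation would require changes that are not shown here.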

The IPW method has several properties that are worth mentioning. First, the efficiency of the resulting estimator can be improved by using stabilized weights to replace $W_{1i}$.⁴⁰ Second, the estimating equations can be augmented to achieve double robustness if we further specify an outcome regression model of Y on A and V.⁸ Finally, if γ is unknown, ${\hat{β}}_{IPW}$ remains consistent when γ is replaced by a consistent estimator $\hat{γ}$ that solves

$$U_{2} (γ) := \sum_{i = 1}^{n} W_{2i} \{A_{i} - e (V_{i}; γ)\} = 0$$

where $W_{2i}$ is an appropriate weight function; for example, $W_{2i} = \partial e (V_{i}; γ) / \partial γ$ if a logistic model can describe the relationship between A and V.

When unmeasured confounding exists, $U_{1} (β_{IPW})$ is biased (i.e. $E \{U_{1} (β_{IPW})\} \neq 0$). In this case, Robins et al.³⁸ proposed to introduce a sensitivity parameter τ and estimate $β^{ATE}$ by the solution ${\hat{β}}_{MIPW} (τ)$ to the following modified IPW estimating equations

$$U_{3} (β_{MIPW}, τ) := \sum_{i = 1}^{n} (1, A_{i})^{⊤} W_{1i} \{Y_{i}^{*} - β_{2} - A_{i} β_{MIPW}\} = 0$$

where $Y_{i}^{*} = Y_{i} - τ \{A_{i} - e (V_{i}; γ)\}$ is the “outcome” corrected for the selection bias due to unmeasured confounding. For binary treatments, the sensitivity parameter can be simplified as the contrast of the potential outcomes between the treated and untreated conditional on V,²⁵

$$τ = (a - a') [E \{Y (a) | A = a, V\} - E \{Y (a) | A = a', V\}]$$

with $a = 1 - a'$. In the context of the HERS, $τ > 0$ means that HAART might be preferentially given to those with higher CD4 counterfactuals Y(a), while $τ < 0$ means the opposite; when $τ = 0$, no unmeasured confounding is implied and the resulting estimator satisfies ${\hat{β}}_{MIPW} (0) = {\hat{β}}_{IPW}$.

Without additional assumptions, the parameter τ is not identified by the data. The resulting estimator ${\hat{β}}_{MIPW} (τ)$ is typically used to conduct a sensitivity analysis: estimate $β^{ATE}$ using ${\hat{β}}_{MIPW} (τ)$ as if τ were known, and examine the sensitivity of ${\hat{β}}_{MIPW} (τ)$ to unmeasured confounding by varying the value of τ over its plausible range.^25,29 In this paper, we assume that an IV is available. Instead of conducting a sensitivity analysis, we propose to use the extra information provided by the IV to quantify τ (i.e. the degree of unmeasured confounding).
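The conventional sensitivity analysis described above can be sketched as follows, assuming known propensity scores and a user-chosen grid of τ values. The names `mipw_estimate` and `sensitivity_curve` and the toy data are hypothetical. The sketch applies the same weighted-mean solution as for $U_{1}$, but to the corrected outcome $Y^{*} = Y - τ \{A - e (V; γ)\}$.

```python
from typing import List, Sequence, Tuple


def mipw_estimate(
    y: Sequence[float], a: Sequence[int], e: Sequence[float], tau: float
) -> float:
    """Solve U_3 = 0 for beta_MIPW at a fixed sensitivity parameter tau."""
    # Corrected outcome Y* = Y - tau * (A - e(V; gamma)).
    y_star = [yi - tau * (ai - ei) for yi, ai, ei in zip(y, a, e)]
    # Same weights W_1i as in the unmodified IPW equations.
    w = [ai / ei + (1 - ai) / (1 - ei) for ai, ei in zip(a, e)]
    sw1 = sum(wi for wi, ai in zip(w, a) if ai == 1)
    sw0 = sum(wi for wi, ai in zip(w, a) if ai == 0)
    mean1 = sum(wi * ys for wi, ys, ai in zip(w, y_star, a) if ai == 1) / sw1
    mean0 = sum(wi * ys for wi, ys, ai in zip(w, y_star, a) if ai == 0) / sw0
    return mean1 - mean0


def sensitivity_curve(
    y: Sequence[float], a: Sequence[int], e: Sequence[float],
    taus: Sequence[float],
) -> List[Tuple[float, float]]:
    """Evaluate beta_MIPW(tau) over a grid of plausible tau values."""
    return [(t, mipw_estimate(y, a, e, t)) for t in taus]


# Toy data (hypothetical): constant propensity score of 0.5.
curve = sensitivity_curve(
    [10.0, 20.0, 30.0, 40.0], [1, 0, 1, 0], [0.5] * 4, [-4.0, 0.0, 4.0]
)
```

At τ = 0 the function returns the ordinary IPW estimate, in line with ${\hat{β}}_{MIPW} (0) = {\hat{β}}_{IPW}$; in this toy example the estimate falls as τ increases.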

4.2 The IV method

The IV methods have been widely used in econometric research.⁴¹ In the just-identified case with a single binary IV and a binary treatment, the standard IV estimating equations are

$$U_{4} (β_{IV}) := \sum_{i = 1}^{n} (1, Z_{i})^{⊤} (Y_{i} - β_{3} - β_{IV} A_{i}) = 0$$

When the IV assumptions given in Section 3.1 are satisfied, the solution

$${\hat{β}}_{IV} = \frac{\overline{Y Z} / \bar{Z} - \overline{Y (1 - Z)} / \overline{(1 - Z)}}{\overline{A Z} / \bar{Z} - \overline{A (1 - Z)} / \overline{(1 - Z)}} \tag{3}$$

where an overbar denotes a sample average, is consistent for $β^{LATE}$.^15,16,42 An important property of the IV estimator is that ${\hat{β}}_{IV}$ remains consistent despite the existence of unmeasured confounding.
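Equation (3) is a ratio of two differences in group means, which the following minimal Python sketch computes directly. The function name `wald_iv` and the toy data are hypothetical; covariates and standard errors are deliberately left out.

```python
from typing import Sequence


def wald_iv(y: Sequence[float], a: Sequence[int], z: Sequence[int]) -> float:
    """Ratio-of-differences solution (3) to U_4 = 0 for a binary IV."""
    n1 = sum(1 for zi in z if zi == 1)
    n0 = sum(1 for zi in z if zi == 0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both IV levels must be observed")

    # Mean outcome and mean treatment uptake at each IV level.
    y1 = sum(yi for yi, zi in zip(y, z) if zi == 1) / n1
    y0 = sum(yi for yi, zi in zip(y, z) if zi == 0) / n0
    a1 = sum(ai for ai, zi in zip(a, z) if zi == 1) / n1
    a0 = sum(ai for ai, zi in zip(a, z) if zi == 0) / n0

    if a1 == a0:
        raise ValueError("the IV does not change treatment uptake")
    return (y1 - y0) / (a1 - a0)


# Toy data (hypothetical, for illustration only).
beta_iv = wald_iv(
    [30.0, 30.0, 30.0, 10.0, 30.0, 10.0, 10.0, 10.0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
)
```

The denominator is the difference in treatment uptake between the two IV levels, so a weak instrument (a small difference) makes the ratio unstable.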

Under the framework of the generalized method of moments, the IV estimating equations can be readily solved using the two-stage least squares method.^14,43,44 The IV estimating equations also can incorporate a weight matrix to allow for heteroskedastic or correlated residuals, and be generalized to deal with multiple IVs and non-continuous outcomes.^41,45

5 A unified system of estimating equations

We propose to use principal stratification and the resulting constraint (1) to unify the IV and IPW methods. Specifically, we propose to jointly solve the following system of constrained estimating equations

$$(U_{2} (γ), U_{3} (β_{MIPW}, τ), U_{4} (β_{IV}))^{⊤} = 0 \tag{4}$$

and use the solution ${\hat{β}}_{MIPW}$ to estimate the ATE and $\hat{τ}$ to estimate the degree of unmeasured confounding.

One problem with using the constraint (1) is that $μ_{11} (0)$ and $μ_{00} (1)$ in the equation are the averages of unobserved potential outcomes and hence not identified, while all other parameters are identified because $π_{11} = \Pr (A = 1 | Z = 0)$, $π_{00} = \Pr (A = 0 | Z = 1)$, $π_{01} = 1 - π_{00} - π_{11}$, $μ_{11} (1) = E (Y | A = 1, Z = 0)$, and $μ_{00} (0) = E (Y | A = 0, Z = 1)$.¹⁶ So to implement the above constrained estimating equations system, additional prior information on $μ_{11} (0)$ and $μ_{00} (1)$ is needed.

In the following, we explore and present three sets of assumptions in the context of the HERS. Each allows us to identify bounds on the ATE and the unmeasured confounding parameter τ. In Sections 5.1 to 5.3, we first assume that the sample size n is sufficiently large such that the sampling variation of the estimating equations (4) can be ignored. Then in Sections 5.4 and 5.5, we discuss inferences on the sampling uncertainty of bound estimates for a finite n.

5.1 Assumption on the upper limits of $μ_{11} (0)$ and $μ_{00} (1)$

The outcome variable of our interest is CD4 count, so $μ_{00} (1)$ and $μ_{11} (0)$ must be non-negative. In our first set of assumptions, we simply assume that both quantities are bounded above by known constants:

Assumption (A): $0 \leq μ_{00} (1) \leq ξ_{1}$, $0 \leq μ_{11} (0) \leq ξ_{0}$, with known $ξ_{0}$ and $ξ_{1}$.

Assumption (A) leads to a simplified version of the Robins-Manski type bound on the ATE.^18,19,23 It is straightforward to show that with known $ξ_{0}$ and $ξ_{1}$, the ATE falls within the interval

$$[b (ξ_{0}, 0), b (0, ξ_{1})]$$

where, to emphasize the unidentifiable parameters in equation (1), we define

$$b (μ_{11} (0), μ_{00} (1)) = π_{01} \times LATE + π_{11} \{μ_{11} (1) - μ_{11} (0)\} + π_{00} \{μ_{00} (1) - μ_{00} (0)\}$$

The bound on τ can be inferred by finding the values of τ such that the corresponding solutions to equation (4) are consistent with the above bound on ATE. It is straightforward to verify that for a given $β^{ATE}$, the solution of τ is

$${\hat{τ}}_{n} (β^{ATE}, γ) = \frac{\overline{W_{1} A} \cdot \overline{W_{1} (Y - β^{ATE} A)} - {\bar{W}}_{1} \cdot \overline{W_{1} A (Y - β^{ATE} A)}}{\overline{W_{1} A} \cdot \overline{W_{1} \{A - e (V; γ)\}} - {\bar{W}}_{1} \cdot \overline{W_{1} A \{A - e (V; γ)\}}}$$

which is a non-increasing function of $β^{ATE}$. So the unmeasured confounding parameter τ is bounded by

$$[τ (b (0, ξ_{1}), γ), τ (b (ξ_{0}, 0), γ)]$$

where $τ (β^{ATE}, γ) \equiv {\hat{τ}}_{\infty} (β^{ATE}, γ)$.
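The closed-form expression for ${\hat{τ}}_{n}$ follows from eliminating the intercept $β_{2}$ from the two equations in $U_{3}$. A minimal Python sketch of the formula is given below, assuming the propensity scores are supplied as numbers; the function name `tau_hat` and the toy data are hypothetical.

```python
from typing import Sequence


def _mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


def tau_hat(
    y: Sequence[float], a: Sequence[int], e: Sequence[float], beta_ate: float
) -> float:
    """Value of tau that solves U_3 = 0 when beta_MIPW is fixed at beta_ate."""
    w = [ai / ei + (1 - ai) / (1 - ei) for ai, ei in zip(a, e)]  # W_1i
    r = [yi - beta_ate * ai for yi, ai in zip(y, a)]             # Y - beta*A
    d = [ai - ei for ai, ei in zip(a, e)]                        # A - e(V)

    m_w = _mean(w)
    m_wa = _mean([wi * ai for wi, ai in zip(w, a)])
    m_wr = _mean([wi * ri for wi, ri in zip(w, r)])
    m_war = _mean([wi * ai * ri for wi, ai, ri in zip(w, a, r)])
    m_wd = _mean([wi * di for wi, di in zip(w, d)])
    m_wad = _mean([wi * ai * di for wi, ai, di in zip(w, a, d)])

    numerator = m_wa * m_wr - m_w * m_war
    denominator = m_wa * m_wd - m_w * m_wad
    return numerator / denominator


# Toy data (hypothetical): map two candidate ATE values to tau.
y, a, e = [10.0, 20.0, 30.0, 40.0], [1, 0, 1, 0], [0.5] * 4
tau_at_upper = tau_hat(y, a, e, -10.0)
tau_at_lower = tau_hat(y, a, e, -14.0)
```

Because ${\hat{τ}}_{n}$ is non-increasing in $β^{ATE}$, the upper limit of the ATE bound maps to the lower limit of the τ bound and vice versa, which is why the arguments appear in reversed order in the interval for τ.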

Assumption (A) alone is sufficient to identify the bounds on ATE and τ, but the two upper limits $ξ_{0}$ and $ξ_{1}$ need to be sufficiently large, which makes the two bounds too wide to be of practical value. In practice, contextually plausible constraints, such as that the average treatment effect among $P_{11}$ (those always receiving treatment) is higher than in other subpopulations, are often imposed to tighten the bounds.^22,46–48 This motivates us to consider making assumptions on the relative magnitudes of the unidentified and identified quantities.

5.2 Constraints on relationships between $μ_{11} (0)$ and $μ_{00} (1)$ and identified quantities

Assumption (B): We assume that

1. The average treatment effect among $P_{11}$ is no less than a known $δ_{11}$

$$E \{Y (1) - Y (0) | P_{11}\} = μ_{11} (1) - μ_{11} (0) \geq δ_{11}$$

A plausible choice for $δ_{11}$ is zero; that is, we assume that on average, the subpopulation $P_{11}$ (who would always receive HAART) would benefit from HAART. We make this assumption because during the HERS, HAART was becoming a recommended therapy, particularly for HIV patients with low CD4 count.

2. The average treatment effect among $P_{00}$ is no less than a known $δ_{00}$

$$E \{Y (1) - Y (0) | P_{00}\} = μ_{00} (1) - μ_{00} (0) \geq δ_{00}$$

A negative value of $δ_{00}$ implies that HAART can potentially be harmful for those who would never receive HAART at either site, while setting $δ_{00} = 0$ implies that HAART is also beneficial for them on average.

3. The difference in $E \{Y (0)\}$ between $P_{11}$ and $P_{00}$ is bounded above by $δ_{y0}$

$$E \{Y (0) | P_{11}\} - E \{Y (0) | P_{00}\} = μ_{11} (0) - μ_{00} (0) \leq δ_{y0}$$

In the HERS, it is sensible to set $δ_{y0} = 0$. Intuitively, it means that in the untreated condition, people who would always receive HAART had a higher degree of HIV progression (lower CD4 on average) than those who would never receive HAART.

4. The difference in average treatment effects between those who would always receive HAART and those who would never receive HAART is bounded below by a known $δ_{trt}$

$$E \{Y (1) - Y (0) | P_{11}\} - E \{Y (1) - Y (0) | P_{00}\} = \{μ_{11} (1) - μ_{11} (0)\} - \{μ_{00} (1) - μ_{00} (0)\} \geq δ_{trt}$$

For example, letting $δ_{trt} = 0$ implies that the average treatment effect on those who would always receive HAART is no less than that on those who would never do so.

Under this set of assumptions, it can be shown that the ATE is bounded by

$$[b (c_{0}, μ_{00} (0) + δ_{00}), b (0, c_{1})]$$

and τ by

$$[τ (b (0, c_{1}), γ), τ (b (c_{0}, μ_{00} (0) + δ_{00}), γ)]$$

where $c_{0} = \min \{μ_{11} (1) - δ_{11}, μ_{00} (0) + δ_{y0}\}$ and $c_{1} = μ_{11} (1) + μ_{00} (0) - δ_{trt}$.
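To make the bounds under Assumptions (A) and (B) concrete, the sketch below evaluates $b (\cdot)$, $c_{0}$ and $c_{1}$ for given values of the identified quantities. It is written in Python; the class `Identified`, the function names, and all numerical inputs are hypothetical and are not estimates from the HERS.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Identified:
    """Quantities in constraint (1) that are identified by the data."""
    pi00: float    # Pr(P_00)
    pi01: float    # Pr(P_01)
    pi11: float    # Pr(P_11)
    late: float    # local average treatment effect
    mu11_1: float  # E{Y(1) | P_11}
    mu00_0: float  # E{Y(0) | P_00}


def b(q: Identified, mu11_0: float, mu00_1: float) -> float:
    """ATE implied by constraint (1) for given unidentified means."""
    return (q.pi01 * q.late
            + q.pi11 * (q.mu11_1 - mu11_0)
            + q.pi00 * (mu00_1 - q.mu00_0))


def ate_bounds_a(q: Identified, xi0: float, xi1: float) -> Tuple[float, float]:
    """Bounds under Assumption (A): 0 <= mu11(0) <= xi0, 0 <= mu00(1) <= xi1."""
    return b(q, xi0, 0.0), b(q, 0.0, xi1)


def ate_bounds_b(q: Identified, d11: float, d00: float,
                 dy0: float, dtrt: float) -> Tuple[float, float]:
    """Bounds under Assumption (B) with known delta constants."""
    c0 = min(q.mu11_1 - d11, q.mu00_0 + dy0)  # largest admissible mu11(0)
    c1 = q.mu11_1 + q.mu00_0 - dtrt           # largest mu00(1) at mu11(0) = 0
    return b(q, c0, q.mu00_0 + d00), b(q, 0.0, c1)


# Hypothetical inputs, for illustration only.
q = Identified(pi00=0.5, pi01=0.25, pi11=0.25,
               late=40.0, mu11_1=240.0, mu00_0=200.0)
bounds_a = ate_bounds_a(q, xi0=500.0, xi1=500.0)
bounds_b = ate_bounds_b(q, d11=0.0, d00=0.0, dy0=0.0, dtrt=0.0)
```

With these made-up inputs the interval under (B) lies inside the interval under (A), which illustrates how the relative constraints tighten the lower limit in particular.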

5.3 Constraint conditional on measured covariates

Given the HERS data, it may be more realistic to assume that Assumption (B) holds conditional on the measured covariates V. So we propose our third set of assumptions as

Assumption (B′): We assume that for known $δ_{11}$, $δ_{00}$, $δ_{y0}$ and $δ_{trt}$:

$E \{Y (1) - Y (0) | P_{11}, V\} \geq δ_{11}$; $E \{Y (1) - Y (0) | P_{00}, V\} \geq δ_{00}$.

$E \{Y (0) | P_{11}, V\} - E \{Y (0) | P_{00}, V\} \leq δ_{y0}$.

$E \{Y (1) - Y (0) | P_{11}, V\} - E \{Y (1) - Y (0) | P_{00}, V\} \geq δ_{trt}$.

Further, we assume that the monotonicity and exclusion restriction assumptions hold conditional on $V$. So the constraint equation (1) becomes

$$β^{ATE} = π_{01} β^{LATE} + \int_{\mathcal{V}} E \{Y (1) - Y (0) | P_{11}, V\} \Pr (P_{11} | V) \, d F (V) + \int_{\mathcal{V}} E \{Y (1) - Y (0) | P_{00}, V\} \Pr (P_{00} | V) \, d F (V) \tag{5}$$

where $\mathcal{V}$ is the support of $V$ with a distribution $F (V)$. We write the product $π_{01} β^{LATE}$ as before because both the LATE and $π_{01}$ are identified by the data. Again, no observed data are available for $E \{Y (0) | P_{11}, V\}$ and $E \{Y (1) | P_{00}, V\}$, and therefore the two quantities are not identified. Henceforth, we denote them by $μ_{11} (0, V)$ and $μ_{00} (1, V)$, respectively.

Under (B′) and equation (5), it can be shown that a bound on the ATE is

$$\left[ π_{01} \times LATE + \int_{\mathcal{V}} \tilde{b} \{c_{0} (V), c_{2} (V)\} \, d F, \; π_{01} \times LATE + \int_{\mathcal{V}} \tilde{b} \{0, c_{1} (V)\} \, d F \right]$$

and a bound on τ is

$$\left[ τ \left( π_{01} \times LATE + \int_{\mathcal{V}} \tilde{b} \{0, c_{1} (V)\} \, d F, γ \right), \; τ \left( π_{01} \times LATE + \int_{\mathcal{V}} \tilde{b} \{c_{0} (V), c_{2} (V)\} \, d F, γ \right) \right]$$

where

$$\tilde{b} (μ_{11} (0, V), μ_{00} (1, V)) = [E \{Y (1) | P_{11}, V\} - μ_{11} (0, V)] \Pr (P_{11} | V) + [μ_{00} (1, V) - E \{Y (0) | P_{00}, V\}] \Pr (P_{00} | V),$$

$c_{0} (V) = \min (E \{Y (1) | P_{11}, V\} - δ_{11}, E \{Y (0) | P_{00}, V\} + δ_{y0})$, $c_{1} (V) = E \{Y (1) | P_{11}, V\} + E \{Y (0) | P_{00}, V\} - δ_{trt}$, and $c_{2} (V) = E \{Y (0) | P_{00}, V\} + δ_{00}$.

5.4 Inference from finite samples

With a finite sample size n, we can estimate the bounds on ATE and τ, based on the results in Sections 5.1 to 5.3. We proceed by first estimating the identifiable parameters in the constraint (1). Specifically, we assume two regression models

$$E (A | Z) = {logit}^{-1} \{η (Z; θ_{1})\} \quad \text{and} \quad E (Y | A, Z) = κ (A, Z; θ_{2})$$

for some known functions $η (Z; θ_{1})$ and $κ (A, Z; θ_{2})$. For binary Z and A, we can use two saturated models and specify that $η (Z; θ_{1}) = θ_{10} + θ_{11} Z$ and $κ (A, Z; θ_{2}) = θ_{20} + θ_{21} Z + θ_{22} A + θ_{23} A Z$ with $θ_{1} = (θ_{10}, θ_{11})^{⊤}$ and $θ_{2} = (θ_{20}, θ_{21}, θ_{22}, θ_{23})^{⊤}$. Here, a saturated additive model for $κ (A, Z; θ_{2})$ is compatible with the structural model (2), which suggests that $E (Y | A, Z)$ is linear in A, Z, and AZ. The estimated regression parameters ${\hat{θ}}_{1}$ and ${\hat{θ}}_{2}$ can be obtained by solving

$$U_{5} (θ_{1}, θ_{2}) := \begin{pmatrix} \sum_{i = 1}^{n} W_{3i} [A_{i} - {logit}^{-1} \{η (Z_{i}; θ_{1})\}] \\ \sum_{i = 1}^{n} W_{4i} \{Y_{i} - κ (A_{i}, Z_{i}; θ_{2})\} \end{pmatrix} = 0$$

where $W_{3i} = \partial {logit}^{-1} \{η (Z_{i}; θ_{1})\} / \partial θ_{1}$ and $W_{4i} = (1, Z_{i}, A_{i}, A_{i} Z_{i})^{⊤}$. Based on the above regression models, we have

$${\hat{π}}_{11} = {logit}^{-1} \{η (0; {\hat{θ}}_{1})\}, \quad {\hat{π}}_{00} = 1 - {logit}^{-1} \{η (1; {\hat{θ}}_{1})\}, \quad {\hat{π}}_{01} = {logit}^{-1} \{η (1; {\hat{θ}}_{1})\} - {logit}^{-1} \{η (0; {\hat{θ}}_{1})\},$$

${\hat{μ}}_{11} (1) = κ (1, 0; {\hat{θ}}_{2})$, and ${\hat{μ}}_{00} (0) = κ (0, 1; {\hat{θ}}_{2})$.
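When both models are saturated in binary Z and A, their fitted values coincide with the corresponding cell proportions and cell means, so the plug-in estimates can be computed without an iterative fit. The Python sketch below relies on that property; the function name `identified_quantities` and the toy data are hypothetical.

```python
from typing import Dict, Sequence


def identified_quantities(
    y: Sequence[float], a: Sequence[int], z: Sequence[int]
) -> Dict[str, float]:
    """Plug-in estimates of pi_11, pi_00, pi_01, mu_11(1) and mu_00(0)."""

    def uptake(zval: int) -> float:
        # Fitted Pr(A = 1 | Z = zval) under the saturated logistic model.
        arm = [ai for ai, zi in zip(a, z) if zi == zval]
        if not arm:
            raise ValueError("no observations with Z = %d" % zval)
        return sum(arm) / len(arm)

    def cell_mean(aval: int, zval: int) -> float:
        # Fitted E(Y | A = aval, Z = zval) under the saturated linear model.
        cell = [yi for yi, ai, zi in zip(y, a, z) if ai == aval and zi == zval]
        if not cell:
            raise ValueError("empty cell A = %d, Z = %d" % (aval, zval))
        return sum(cell) / len(cell)

    p1, p0 = uptake(1), uptake(0)
    return {
        "pi11": p0,                 # treated even when Z = 0
        "pi00": 1.0 - p1,           # untreated even when Z = 1
        "pi01": p1 - p0,            # treatment changed by the IV
        "mu11_1": cell_mean(1, 0),  # E(Y | A = 1, Z = 0)
        "mu00_0": cell_mean(0, 1),  # E(Y | A = 0, Z = 1)
    }


# Toy data (hypothetical, for illustration only).
est = identified_quantities(
    [30.0, 30.0, 30.0, 10.0, 50.0, 10.0, 10.0, 10.0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
)
```

Under monotonicity the estimate of $π_{01}$ should be non-negative; a negative value in a sample would be a warning sign about the instrument or about sampling variability.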

For Assumptions (A) and (B), we then estimate the function $b (\cdot)$ by $\hat{b} (μ_{11} (0), μ_{00} (1)) = {\hat{π}}_{01} {\hat{β}}_{IV} + {\hat{π}}_{11} \{{\hat{μ}}_{11} (1) - μ_{11} (0)\} + {\hat{π}}_{00} \{μ_{00} (1) - {\hat{μ}}_{00} (0)\}$. Further, we estimate $c_{0} (\cdot)$ by ${\hat{c}}_{0} (δ_{11}, δ_{y0}) = \min \{{\hat{μ}}_{11} (1) - δ_{11}, {\hat{μ}}_{00} (0) + δ_{y0}\}$ and $c_{1} (\cdot)$ by ${\hat{c}}_{1} (δ_{trt}) = {\hat{μ}}_{11} (1) + {\hat{μ}}_{00} (0) - δ_{trt}$. By substituting these estimates for their estimands, we obtain the bound estimates on the ATE and τ, as shown in Table 2.

Table 2.

Bound estimates on ATE and $τ$ under Assumptions (A), (B) and (B′).

	Parameter	Estimated bound
(A)	ATE	$[\hat{b} (ξ_{0}, 0), \hat{b} (0, ξ_{1})]$
	$τ$	$[{\hat{τ}}_{n} (\hat{b} (0, ξ_{1}), \hat{γ}), {\hat{τ}}_{n} (\hat{b} (ξ_{0}, 0), \hat{γ})]$
(B)	ATE	$[\hat{b} ({\hat{c}}_{0}, {\hat{μ}}_{00} (0) + δ_{00}), \hat{b} (0, {\hat{c}}_{1})]$
	$τ$	$[{\hat{τ}}_{n} (\hat{b} (0, {\hat{c}}_{1}), \hat{γ}), {\hat{τ}}_{n} (\hat{b} ({\hat{c}}_{0}, {\hat{μ}}_{00} (0) + δ_{00}), \hat{γ})]$
(B′)	ATE	$[{\hat{π}}_{01} {\hat{β}}_{IV} + \frac{\sum_{i} \hat{b} ({\hat{c}}_{0} (V_{i}), {\hat{c}}_{2} (V_{i}))}{n}, {\hat{π}}_{01} {\hat{β}}_{IV} + \frac{\sum_{i} \hat{b} (0, {\hat{c}}_{1} (V_{i}))}{n}]$
	$τ$	$[{\hat{τ}}_{n} {{\hat{π}}_{01} {\hat{β}}_{IV} + \frac{\sum_{i} \hat{b} (0, {\hat{c}}_{1} (V_{i}))}{n}}, {\hat{τ}}_{n} {{\hat{π}}_{01} {\hat{β}}_{IV} + \frac{\sum_{i} \hat{b} ({\hat{c}}_{0} (V_{i}), {\hat{c}}_{2} (V_{i}))}{n}}]$

ATE: average treatment effect.
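
The plug-in construction of the ATE bounds under Assumptions (A) and (B) can be illustrated with a minimal Python sketch. It relies on the fact that, with binary $A$ and $Z$, the saturated models are fitted exactly by cell proportions and cell means. Two points are assumptions of this sketch rather than statements from the text: ${\hat{β}}_{IV}$ is taken to be the Wald ratio estimator, and the function and key names are hypothetical. The bounds on $τ$ are not covered because they require ${\hat{τ}}_{n} (\cdot)$ from the IPW estimating equations.

```python
import numpy as np

def plug_in_estimates(y, a, z):
    """Plug-in quantities for a binary instrument z and a binary treatment a.

    With binary A and Z the saturated models reduce to cell summaries, so the
    estimating equations are solved by sample proportions and sample means.
    """
    y, a, z = (np.asarray(v, dtype=float) for v in (y, a, z))
    p1 = a[z == 1].mean()   # Pr(A = 1 | Z = 1)
    p0 = a[z == 0].mean()   # Pr(A = 1 | Z = 0)
    est = {
        "pi11": p0,          # stratum P_11
        "pi00": 1.0 - p1,    # stratum P_00
        "pi01": p1 - p0,     # stratum P_01
        "mu11_1": y[(a == 1) & (z == 0)].mean(),  # kappa(1, 0)
        "mu00_0": y[(a == 0) & (z == 1)].mean(),  # kappa(0, 1)
    }
    # ASSUMPTION: beta_IV is computed as the Wald ratio estimator.
    est["beta_iv"] = (y[z == 1].mean() - y[z == 0].mean()) / est["pi01"]
    return est

def b_hat(est, mu11_0, mu00_1):
    """b-hat evaluated at candidate values of mu_11(0) and mu_00(1)."""
    return (est["pi01"] * est["beta_iv"]
            + est["pi11"] * (est["mu11_1"] - mu11_0)
            + est["pi00"] * (mu00_1 - est["mu00_0"]))

def ate_bounds_A(est, xi0, xi1):
    """Row (A) of Table 2: [b(xi0, 0), b(0, xi1)]."""
    return b_hat(est, xi0, 0.0), b_hat(est, 0.0, xi1)

def ate_bounds_B(est, d11=0.0, d00=0.0, dy0=0.0, dtrt=0.0):
    """Row (B) of Table 2: [b(c0, mu00(0) + d00), b(0, c1)]."""
    c0 = min(est["mu11_1"] - d11, est["mu00_0"] + dy0)
    c1 = est["mu11_1"] + est["mu00_0"] - dtrt
    return b_hat(est, c0, est["mu00_0"] + d00), b_hat(est, 0.0, c1)
```

Calling `plug_in_estimates(y, a, z)` once and passing the result to `ate_bounds_A` or `ate_bounds_B` keeps the identified quantities separate from the sensitivity parameters, which makes it easy to re-evaluate the bounds over a grid of $ξ$ or $δ$ values.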

For Assumption (B′), we estimate the bounds on the ATE and $τ$ using the following steps.

Step 1. We assume two observed-data models conditional on $V$: $E (A | Z, V) = {logit}^{- 1} η (Z, V; θ_{3})$ and $E (Y | Z, A, V) = κ (Z, A, V; θ_{4})$, for known functions $η (Z, V; θ_{3})$ and $κ (Z, A, V; θ_{4})$. For example, we can assume two linear models without interactions, $η (Z, V; θ_{3}) = θ_{30} + θ_{31} Z + θ_{32}^{⊤} V$ with $θ_{3} = {(θ_{30}, θ_{31}, θ_{32}^{⊤})}^{⊤}$ and $κ (Z, A, V; θ_{4}) = θ_{40} + θ_{41} Z + θ_{42} A + θ_{43}^{⊤} V$ with $θ_{4} = {(θ_{40}, θ_{41}, θ_{42}, θ_{43}^{⊤})}^{⊤}$. Let ${\hat{θ}}_{3}$ and ${\hat{θ}}_{4}$ denote the corresponding estimated parameters.

Step 2. With the monotonicity assumption and exclusion restriction, we estimate $\hat{E} \{Y (0) | P_{00}, V\} = \hat{E} (Y | Z = 1, A = 0, V) = κ (1, 0, V; {\hat{θ}}_{4})$, $\hat{E} \{Y (1) | P_{11}, V\} = κ (0, 1, V; {\hat{θ}}_{4})$, $\hat{\Pr} (P_{11} | V) = {logit}^{- 1} η (0, V; {\hat{θ}}_{3})$, and $\hat{\Pr} (P_{00} | V) = 1 - {logit}^{- 1} η (1, V; {\hat{θ}}_{3})$. We then estimate $\hat{b} (\cdot)$, ${\hat{c}}_{0} (\cdot)$, ${\hat{c}}_{1} (\cdot)$, and ${\hat{c}}_{2} (\cdot)$ by plugging in these estimates.

Step 3. We estimate the distribution $F (V)$ in (5) by the empirical cumulative distribution function and integrals by empirical sums; e.g. we estimate $\int_{V} b (c_{0} (V), c_{2} (V)) d F$ by $\sum_{i = 1}^{n} \hat{b} ({\hat{c}}_{0} (V_{i}), {\hat{c}}_{2} (V_{i})) / n$.

The resulting bound estimates on the ATE and $τ$ for Assumption (B′) are shown in Table 2.
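
Steps 1 to 3 can be sketched in Python as follows, assuming the two example working models without interactions given in Step 1. The Newton-Raphson fitter and all function names are hypothetical choices made for this illustration. The sketch stops at the per-subject identified pieces and a generic empirical average, because the covariate-specific functions ${\hat{c}}_{0} (V)$, ${\hat{c}}_{1} (V)$ and ${\hat{c}}_{2} (V)$ are defined in Section 5.3.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_logistic(X, a, n_iter=50):
    """Newton-Raphson for a logistic regression of a on the columns of X."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ theta)
        grad = X.T @ (a - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        theta = theta + np.linalg.solve(hess, grad)
    return theta

def conditional_strata_estimates(y, a, z, v):
    """Steps 1-2: fit the working models, then predict per-subject quantities."""
    n = len(y)
    one, zero = np.ones(n), np.zeros(n)
    # Step 1: logit E(A | Z, V) = theta30 + theta31 Z + theta32' V
    theta3 = fit_logistic(np.column_stack([one, z, v]), a)
    # Step 1: E(Y | Z, A, V) = theta40 + theta41 Z + theta42 A + theta43' V
    theta4, *_ = np.linalg.lstsq(np.column_stack([one, z, a, v]), y, rcond=None)
    # Step 2: evaluate the fitted models at the (Z, A) values that identify
    # each stratum, keeping every subject's own covariates V_i.
    return {
        "pr_p11": expit(np.column_stack([one, zero, v]) @ theta3),        # Z = 0
        "pr_p00": 1.0 - expit(np.column_stack([one, one, v]) @ theta3),   # Z = 1
        "mu11_1": np.column_stack([one, zero, one, v]) @ theta4,          # Z = 0, A = 1
        "mu00_0": np.column_stack([one, one, zero, v]) @ theta4,          # Z = 1, A = 0
    }

def empirical_average(values):
    """Step 3: replace an integral over F(V) by the average over V_1, ..., V_n."""
    return float(np.mean(values))
```

Any covariate-specific bound function evaluated at the returned arrays can be passed to `empirical_average` to obtain the corresponding entry for Assumption (B′).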

5.5 Uncertainty region for estimated bounds

With a finite sample size, we quantify the sampling uncertainty of the estimated bounds using uncertainty regions (URs), which are defined as intervals that provide a $(1 - α) 100\%$ coverage probability on the true bounds. Unlike the usual confidence intervals (CIs), a UR takes into account both sampling variability and partial identifiability. We consider two types of URs in this paper: point-wise and strong $(1 - α) 100\%$ coverage URs.

Let $(L, U)$ denote the true bound. A point-wise UR is defined as an interval $(\hat{L}, \hat{U})$ that contains any particular value $ϱ \in (L, U)$ with a probability of at least $(1 - α)$, where $ϱ$ is the parameter generating the data. If $\hat{L}$ and $\hat{U}$ are consistent and asymptotically normal estimates, a large-sample $(1 - α) 100\%$ point-wise UR is given as⁴⁹

$${UR}_{P-CAN} = [\hat{L} - c^{*} se (\hat{L}), \hat{U} + c^{*} se (\hat{U})]$$

where $se (\cdot)$ is the standard error and $c^{*}$ is a critical value. When $(U - L)$ is large compared to $se (\hat{L})$ and $se (\hat{U})$, $c^{*}$ can be approximated by $Φ^{- 1} (1 - α)$, with $Φ$ being the standard normal distribution function.

A strong UR is an interval that contains the entire true bound $(L, U)$ with a probability of at least $(1 - α)$.⁴⁹,⁵⁰ If both $\hat{L}$ and $\hat{U}$ are consistent and approximately normal, a strong $(1 - α) 100\%$ UR is

$${UR}_{S-CAN} = [\hat{L} - c \, se (\hat{L}), \hat{U} + c \, se (\hat{U})]$$

with $c = Φ^{- 1} (1 - α / 2)$.
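
The two normal-approximation URs differ only in the critical value, which the following minimal Python sketch makes explicit. The function name is hypothetical, the standard errors are taken as given inputs, and the point-wise critical value uses the large-width approximation $Φ^{- 1} (1 - α)$ described above.

```python
from statistics import NormalDist

def can_uncertainty_regions(l_hat, u_hat, se_l, se_u, alpha=0.05):
    """Return the point-wise and strong (1 - alpha) URs as (lower, upper) tuples."""
    std_normal = NormalDist()
    # Point-wise UR: approximation valid when U - L is large relative to the SEs.
    c_point = std_normal.inv_cdf(1 - alpha)
    # Strong UR: covers the whole interval (L, U), hence the two-sided quantile.
    c_strong = std_normal.inv_cdf(1 - alpha / 2)
    pointwise = (l_hat - c_point * se_l, u_hat + c_point * se_u)
    strong = (l_hat - c_strong * se_l, u_hat + c_strong * se_u)
    return pointwise, strong
```

For $α = 0.05$ the two critical values are approximately 1.645 and 1.960, so the strong UR extends each end of the point-wise UR by roughly 0.315 standard errors.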

Without assuming $\hat{L}$ and $\hat{U}$ to be normally distributed, we can also obtain a strong $(1 - α) 100\%$ UR using the bootstrap method.⁵¹ Specifically, let $({\tilde{L}}^{*}, {\tilde{U}}^{*})$ denote the estimated bound from a bootstrap resample. A bootstrap strong $(1 - α) 100\%$ UR can be defined as the interval $(L^{*}, U^{*})$ that satisfies $\overset{*}{\Pr} (L^{*} \leq {\tilde{L}}^{*}, {\tilde{U}}^{*} \leq U^{*}) = 1 - α$ with $\overset{*}{\Pr} ({\tilde{L}}^{*} < L^{*}) = \overset{*}{\Pr} ({\tilde{U}}^{*} > U^{*})$, where $\overset{*}{\Pr}$ is the empirical probability function induced by the bootstrap resamples. In practice, a bootstrap strong UR, henceforth denoted by ${UR}_{S-BTS} = (L^{*}, U^{*})$, can be obtained by finding the shortest interval that satisfies $\frac{\# (L^{*} \leq {\tilde{L}}^{*} < {\tilde{U}}^{*} \leq U^{*})}{K} \geq 1 - α$ and $\frac{\# (L^{*} \leq {\tilde{L}}^{*})}{K} ≃ \frac{\# (U^{*} \geq {\tilde{U}}^{*})}{K}$, where $\# (\cdot)$ counts the number of bootstrap resamples for which the statement holds and $K$ is the number of bootstrap resamples.
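
One simple way to carry out this search is sketched below in Python. It is an illustrative implementation under stated assumptions, not necessarily the search used for the reported results: the candidate intervals trim the same number $j$ of resamples from the lower tail of ${\tilde{L}}^{*}$ and from the upper tail of ${\tilde{U}}^{*}$, and the largest $j$ whose joint coverage is at least $1 - α$ gives the shortest such interval. The function name is hypothetical.

```python
import numpy as np

def bootstrap_strong_ur(l_boot, u_boot, alpha=0.05):
    """Shortest equal-tail interval (L*, U*) with joint coverage >= 1 - alpha.

    l_boot, u_boot: paired bootstrap estimates of the lower and upper bounds.
    """
    l_boot = np.asarray(l_boot, dtype=float)
    u_boot = np.asarray(u_boot, dtype=float)
    k = len(l_boot)
    l_sorted = np.sort(l_boot)
    u_sorted = np.sort(u_boot)
    # Start from the most aggressive trim and relax it until the joint
    # coverage requirement holds; j = 0 uses (min, max) and always succeeds.
    for j in range(int(alpha * k), -1, -1):
        l_star = l_sorted[j]
        u_star = u_sorted[k - 1 - j]
        covered = np.mean((l_boot >= l_star) & (u_boot <= u_star))
        if covered >= 1 - alpha:
            return l_star, u_star
    raise RuntimeError("unreachable: j = 0 covers every resample")
```

Because the two ends are usually positively correlated across resamples, the selected trim is typically larger than $α K / 2$ per tail, which is why the bootstrap strong UR can be slightly shorter than a Bonferroni-type interval.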

6 Application to the HERS data

6.1 Preliminary analyses

The upper panel of Table 3 shows the “as-treated” (AT) effect of initial-stage HAART on CD4 count, the IPW estimate of the ATE, and the IV estimate of the LATE. The AT effect is estimated by the contrast of the average CD4 counts between those actually receiving HAART and those not. The IPW estimate uses the variables listed in Table 1 as the measured confounders $V$ and assumes that $e (V; γ) = {logit}^{- 1} (γ^{⊤} V)$.
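
The AT contrast and an IPW contrast can be sketched in Python as follows. The sketch takes fitted propensity scores $e (V_{i}; \hat{γ})$ as given and uses a normalized (Hájek-type) weighting, which is an assumption of this illustration; the exact IPW estimating equations used in the paper are those of Section 4.1. Function names are hypothetical.

```python
import numpy as np

def as_treated_effect(y, a):
    """Difference in mean outcome between treated and untreated subjects."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    return y[a == 1].mean() - y[a == 0].mean()

def ipw_ate(y, a, e):
    """Normalized IPW contrast given fitted propensity scores e = Pr(A = 1 | V).

    ASSUMPTION: weights are normalized within each arm; no sensitivity
    parameter for unmeasured confounding is included here.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    e = np.asarray(e, dtype=float)
    w1 = a / e                  # weights for treated subjects
    w0 = (1.0 - a) / (1.0 - e)  # weights for untreated subjects
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
```

When the propensity score is constant, the weights cancel and `ipw_ate` coincides with `as_treated_effect`, so any gap between the two reflects adjustment for the measured confounders only.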

Table 3.

Estimates of HAART treatment effect on CD4 (cells/mm³) and $τ$.

Method or assumption	Estimate	ATE	$τ$
AT	Point estimate (95% CI)	13 (−30, 56)	–
IPW	Point estimate (95% CI)	27 (−16, 70)	–
IV	Point estimate (95% CI)	207 (−250, 664)	–
Assumption (A)	Bound estimate	(−195, 256)	(−229, 223)
	95% ${UR}_{P-CAN}$	(−229, 289)	(−269, 266)
	95% ${UR}_{S-CAN}$	(−235, 295)	(−277, 274)
	95% ${UR}_{S-BTS}$	(−233, 294)	(−274, 273)
Assumption (B)	Bound estimate	(20, 231)	(−204, 7.5)
	95% ${UR}_{P-CAN}$	(−9, 280)	(−260, 49)
	95% ${UR}_{S-CAN}$	(−15, 289)	(−271, 57)
	95% ${UR}_{S-BTS}$	(−14, 285)	(−270, 57)
Assumption (B′)	Bound estimate	(18, 218)	(−191, 9.1)
	95% ${UR}_{P-CAN}$	(−10, 261)	(−234, 48)
	95% ${UR}_{S-CAN}$	(−16, 270)	(−243, 56)
	95% ${UR}_{S-BTS}$	(−14, 270)	(−243, 56)

Note: We assume that $ξ_{0} = ξ_{1} = 500$ for Assumption (A); and that $δ_{00} = δ_{11} = δ_{y 0} = δ_{trt} = 0$ for Assumptions (B) and (B′). Uncertainty regions are highlighted using bold font. HAART: highly active antiretroviral therapy; ATE: average treatment effect.

The IPW estimate suggests that, at the initial stage, HAART could boost patients’ CD4 count by an average of 27 cells/mm³ among all patients, with a 95% CI of (−16, 70) cells/mm³. The IPW estimate is higher than the AT estimate, but the difference is not statistically significant. The difference between the IPW and AT estimates can be regarded as the bias of the AT estimate that is attributable to the measured confounders. The IV estimate of the LATE suggests that HAART could increase CD4 count by 207 cells/mm³ on average, with a 95% CI of (−250, 664) cells/mm³, among those who would receive HAART at academic medical centers but not at community health clinics. The IV estimate is not subject to the impact of unmeasured confounding but applies only to an unidentified subpopulation.

6.2 Bounds on HAART treatment effect and unmeasured confounding

We then estimate the initial-stage HAART treatment effect and the degree of unmeasured confounding using our proposed method. For Assumption (A), we set the upper limits to $ξ_{0} = ξ_{1} = 500$. (Recall that the two limits are on the expected values of $Y (0)$ among $P_{11}$ and $Y (1)$ among $P_{00}$.) We choose the two limits based on the facts that the average CD4 count at the previous visit was lower than 350 cells/mm³ and that, at the “current” (eighth) visit, the average CD4 count was 229 cells/mm³ for those treated and 216 cells/mm³ for those untreated (see Table 1). For Assumptions (B) and (B′), we let $δ_{11} = δ_{00} = δ_{y 0} = δ_{trt} = 0$. The implications of choosing these values have been discussed in Section 5.2.

The lower panel of Table 3 summarizes the bound estimates of the ATE and $τ$. The bound estimate of the ATE under Assumption (A) is $(-195, 256)$, which is not informative compared with those under Assumptions (B), $(20, 231)$, and (B′), $(18, 218)$. Assumption (B′) is not necessarily stronger than (B), but by imposing observed-data models and smoothing the estimated bounds over covariates, (B′) leads to slightly tighter bounds than (B), although the difference is negligible. The difference between the IPW (point) estimate and the bound estimates under (B) and (B′) is likely due to the bias of unmeasured confounding. Noticeably, the IPW estimate is close to the lower bound estimates under Assumptions (B) and (B′), suggesting that accounting only for measured confounders may not be adequate.

To obtain uncertainty regions on these bound estimates, we draw $K = 1000$ bootstrap resamples, fixing the number of patients at each of the two types of study sites. Because the bound estimates contain the $\min (\cdot)$ operation, which complicates the derivation of their standard errors, we use the $K$ bootstrap resamples to calculate the standard errors of the two ends of the bounds.
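
A resampling scheme of this kind can be sketched in Python as follows. The sketch assumes that resampling is done with replacement within each site type, so that the numbers of patients with $Z = 0$ and $Z = 1$ are the same in every resample; `estimator` is a hypothetical user-supplied function that returns the (lower, upper) bound estimate for one data set.

```python
import numpy as np

def stratified_bootstrap(y, a, z, estimator, k=1000, seed=0):
    """Bootstrap within each site type; return the draws and their SEs.

    estimator(y, a, z) must return a pair (lower, upper).
    """
    y, a, z = map(np.asarray, (y, a, z))
    rng = np.random.default_rng(seed)
    idx0 = np.flatnonzero(z == 0)   # patients at one type of site
    idx1 = np.flatnonzero(z == 1)   # patients at the other type of site
    draws = np.empty((k, 2))
    for r in range(k):
        # Resample each stratum separately so both stratum sizes stay fixed.
        idx = np.concatenate([
            rng.choice(idx0, size=idx0.size, replace=True),
            rng.choice(idx1, size=idx1.size, replace=True),
        ])
        draws[r] = estimator(y[idx], a[idx], z[idx])
    # Column-wise standard deviations estimate se(L-hat) and se(U-hat).
    return draws, draws.std(axis=0, ddof=1)
```

The returned `draws` array can be reused for the bootstrap strong UR, while the two standard errors feed the normal-approximation URs.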

Table 3 summarizes the point-wise, strong, and bootstrap strong 95% coverage URs. The URs under Assumptions (B) and (B′) have lower limits comparable to that of the IPW 95% CI, while the difference in their upper limits indicates that unmeasured confounding might cause a downward bias and that the true ATE is likely to be higher than what the IPW 95% CI suggests.

The bound estimates on $τ$ under Assumptions (B) and (B′) are $(-204, 7.5)$ and $(-191, 9.1)$, which are tighter and more informative than the bound estimate under Assumption (A), $(-229, 223)$. A possibly negative value of $τ$, as suggested by the 95% URs, implies that unmeasured factors might cause a selection bias in a way that resulted in preferential prescription of HAART to those with lower CD4 counts. The degree of unmeasured confounding, defined as the adjusted difference in CD4 between those on HAART (if left untreated) and those not, is estimated to be roughly between $-250$ and $50$.

6.3 Sensitivity to unknown parameters

In this section, we conduct a simple sensitivity analysis for the unknown parameters used in the three sets of assumptions. We impose a common upper limit $ξ = ξ_{0} = ξ_{1}$ for Assumption (A) and let $ξ$ vary from $300$ to $500$ . The bound estimates on the ATE and $τ$ along with bootstrap strong 95% URs are shown in Figure 1 (first row). As suggested by the figure, $ξ$ has more influence on the upper (lower) bound estimate on ATE (on $τ$ ) for the considered range of $ξ$ , and the resulting bound estimates remain wide and non-informative.
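
The relative influence of $ξ$ on the two ends of the ATE bound can be read off from the plug-in form of $\hat{b} (\cdot)$ in Section 5.4. As a short worked derivation, using only the bound expressions for Assumption (A) in Table 2 with the common limit $ξ = ξ_{0} = ξ_{1}$:

$$\hat{b} (ξ, 0) = {\hat{π}}_{01} {\hat{β}}_{IV} + {\hat{π}}_{11} \{{\hat{μ}}_{11} (1) - ξ\} - {\hat{π}}_{00} {\hat{μ}}_{00} (0), \qquad \hat{b} (0, ξ) = {\hat{π}}_{01} {\hat{β}}_{IV} + {\hat{π}}_{11} {\hat{μ}}_{11} (1) + {\hat{π}}_{00} \{ξ - {\hat{μ}}_{00} (0)\}$$

so that

$$\frac{\partial \hat{b} (ξ, 0)}{\partial ξ} = - {\hat{π}}_{11}, \qquad \frac{\partial \hat{b} (0, ξ)}{\partial ξ} = {\hat{π}}_{00}, \qquad \hat{b} (0, ξ) - \hat{b} (ξ, 0) = ({\hat{π}}_{11} + {\hat{π}}_{00}) ξ$$

Under this reading, the lower bound on the ATE moves with $ξ$ at rate ${\hat{π}}_{11}$ and the upper bound at rate ${\hat{π}}_{00}$, so the upper bound is the more sensitive end whenever ${\hat{π}}_{00} > {\hat{π}}_{11}$, and the width of the bound grows linearly in $ξ$.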

Figure 1.

Sensitivities of bound estimates to $ξ = ξ_{0} = ξ_{1}$ under Assumption (A), and to $δ_{00}$ under Assumptions (B) and (B′). The gray zones show the bound estimates as a function of $ξ$ or $δ_{00}$. The bootstrap strong 95% URs (${UR}_{S-BTS}$) are shown as dashed lines.

For (B) and (B′), we let $δ_{00}$ (the lower limit of the treatment effect among $P_{00}$) range from $-60$ to $20$ and fix $δ_{11} = δ_{y 0} = δ_{trt} = 0$. (More sophisticated sensitivity analyses that jointly evaluate $δ_{11}$, $δ_{00}$, $δ_{y 0}$ and $δ_{trt}$ are possible.) We choose this range for $δ_{00}$ based on the magnitude of the AT and IPW estimates, and have it tilt toward the negative side to allow for the possibility that HAART could be harmful for those never receiving HAART. Figure 1 (second and third rows) shows that $δ_{00}$ only affects the lower (upper) bound estimates of the ATE (of $τ$). The estimated ATE can be as high as over 200 cells/mm³, and the lower bound of the ATE varies around zero depending on the value of $δ_{00}$. Again, a possibly negative value of $τ$ suggests that unmeasured confounding likely caused HAART to be preferentially prescribed to those with poorer health.
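
The one-sided effect of $δ_{00}$ on the ATE bound under Assumption (B) can be checked with a small Python sketch of the grid evaluation. All numerical inputs below are hypothetical values chosen only for illustration; they are not the HERS estimates, and the function names are likewise hypothetical.

```python
def b_hat(mu11_0, mu00_1, pi11, pi00, pi01, beta_iv, mu11_1, mu00_0):
    """Plug-in b-hat evaluated at candidate values of mu_11(0) and mu_00(1)."""
    return (pi01 * beta_iv
            + pi11 * (mu11_1 - mu11_0)
            + pi00 * (mu00_1 - mu00_0))

def sensitivity_to_delta00(est, deltas, d11=0.0, dy0=0.0, dtrt=0.0):
    """ATE bounds under (B) for each delta_00, holding the other deltas fixed."""
    c0 = min(est["mu11_1"] - d11, est["mu00_0"] + dy0)
    c1 = est["mu11_1"] + est["mu00_0"] - dtrt
    rows = []
    for d00 in deltas:
        lower = b_hat(c0, est["mu00_0"] + d00, **est)  # depends on delta_00
        upper = b_hat(0.0, c1, **est)                  # free of delta_00
        rows.append((d00, lower, upper))
    return rows

# HYPOTHETICAL plug-in values for illustration only (not HERS estimates).
est = dict(pi11=0.2, pi00=0.5, pi01=0.3, beta_iv=100.0,
           mu11_1=230.0, mu00_0=220.0)
rows = sensitivity_to_delta00(est, range(-60, 21, 20))
```

In this sketch the upper bound is constant across the grid, while the lower bound changes by ${\hat{π}}_{00}$ for each unit change in $δ_{00}$, which matches the pattern described for the ATE in Figure 1.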

7 Discussions

We propose to use an IV and sets of contextually plausible assumptions to quantify the population causal effect of a treatment as well as the degree of unmeasured confounding. We describe three sets of assumptions that are suitable in an observational study (the HERS). Assumption (A) specifies the limits of the expected unobservable potential outcomes, which leads to a simplified version of the Robins-Manski bounds on the ATE. Assumptions (B) and (B′) specify the relative magnitudes between identified and unidentified potential outcome averages. The bound estimates of the ATE obtained under (B) and (B′) are tighter than that under (A), are less prone to unmeasured confounding bias than the IPW estimate, and are more informative than the IV estimate (judging by the CI of the IV estimate). The bound estimates on the ATE and $τ$ reveal that unmeasured confounding could cause a downward bias on the ATE because HAART was preferentially prescribed to those in poorer health.

Quantifying the degree of unmeasured confounding can be valuable for the analysis of studies conducted in similar settings but having no IV. Several HIV observational studies³⁷ were conducted contemporaneously with the HERS and could suffer from unmeasured confounding as well. In those studies, when unmeasured confounding is of concern, analyses should be complemented with a sensitivity analysis as described in Section 6.3, where a plausible range for $τ$ can be informed by our study.

In this paper, we use the type of study site as an instrumental variable, assuming that two crucial IV assumptions (monotonicity and exclusion restriction) are satisfied. The observed HAART assignment rate at academic centers is higher than that at community clinics, which suggests that the deterministic monotonicity $\Pr (A_{1} \geq A_{0}) = 1$ is plausible, although the assumption is not verified. As one limitation of our study, this assumption is violated if some individuals would receive HAART at community clinics but not at academic medical centers. If the proportion of these individuals ($P_{10}$) is small, it is reasonable to believe that the bias due to the violation of the monotonicity assumption is negligible. Alternatively, one can assume that $P_{00}$ is absent, so that $P_{11}$, $P_{01}$, and $P_{10}$ form a partition of the population. This assumption allows everyone to have some chance of receiving HIV therapy, which is also sensible for the HERS because these patients’ CD4 counts were less than 350 cells/mm³ six months earlier, and it allows for the possibility that some people would be treated at a community clinic but not at an academic medical center. With this assumption, the proportions of $P_{11}$, $P_{01}$, and $P_{10}$ are identified because $π_{01} = \Pr (A = 0 | Z = 0)$, $\Pr (P_{10}) = \Pr (A = 0 | Z = 1)$, and $\Pr (P_{11}) = 1 - π_{01} - \Pr (P_{10})$. The following estimands are also identified: $E \{Y (0) | P_{01}\} = E (Y | A = 0, Z = 0)$, $E \{Y (0) | P_{10}\} = E (Y | A = 0, Z = 1)$, $E \{Y (1) | P_{11} \text{ or } P_{01}\} = E (Y | A = 1, Z = 1)$, and $E \{Y (1) | P_{11} \text{ or } P_{10}\} = E (Y | A = 1, Z = 0)$. A challenge here is how to incorporate the IV estimator, whose estimand is now a “weighted” contrast of the average treatment effects between $P_{01}$ and $P_{10}$, to construct a constraint similar to equation (1). This is worth further investigation.
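
The identification results for the alternative partition (with $P_{00}$ assumed absent) translate directly into sample analogues, as in the following Python sketch. It is an illustration only: the function and key names are hypothetical, and no use is made of the IV estimator, whose incorporation is left open above.

```python
import numpy as np

def strata_without_p00(y, a, z):
    """Sample analogues of the identified quantities when P_00 is absent."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=int)
    z = np.asarray(z, dtype=int)
    pr_p01 = np.mean(a[z == 0] == 0)   # Pr(A = 0 | Z = 0)
    pr_p10 = np.mean(a[z == 1] == 0)   # Pr(A = 0 | Z = 1)
    return {
        "pr_p01": pr_p01,
        "pr_p10": pr_p10,
        "pr_p11": 1.0 - pr_p01 - pr_p10,
        "EY0_p01": y[(a == 0) & (z == 0)].mean(),
        "EY0_p10": y[(a == 0) & (z == 1)].mean(),
        "EY1_p11_or_p01": y[(a == 1) & (z == 1)].mean(),
        "EY1_p11_or_p10": y[(a == 1) & (z == 0)].mean(),
    }
```

A negative value of `pr_p11` in a given data set would indicate that the assumed partition is incompatible with the observed treatment rates.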

Moreover, replacing the deterministic monotonicity with a stochastic monotonicity⁵²,⁵³ assumption deserves exploration. Roy et al.⁵⁴ assumed $\Pr (A_{1} = 1 | A_{0} = 1, V) \geq \Pr (A_{1} = 1 | A_{0} = 0, V)$ and proposed to use auxiliary covariates to estimate the memberships of principal strata. Small et al.⁵⁵ assumed $\Pr (A_{1} = 1 | U) \geq \Pr (A_{0} = 1 | U)$, with $U$ being a latent variable satisfying certain conditions. These stochastic monotonicity assumptions allow the possible presence of “defiers” and are generally more plausible than the deterministic monotonicity. In the HERS, given that the physicians at the academic centers were more likely to prescribe HAART to patients when it first became available, we assumed that the fraction of “defiers” was small and that the deterministic monotonicity was a plausible condition in this context.

The exclusion restriction could also be violated if the type of study site $Z$ remains associated with the outcome $Y$ after accounting for the effect of $Z$ on HAART receipt. A weaker exclusion restriction assumption can be made if the association between the instrument and the outcome can be removed after conditioning on some measured covariates $V^{*} \subseteq V$, i.e. $\{Y (1), Y (0)\} ⊥ Z | V^{*}$. In this case, various methods¹⁴,²²,⁵⁶–⁵⁸ can be implemented for an IV analysis conditional on $V^{*}$, and our method for bound estimation on the ATE and $τ$ still applies.

There are several ways to account for measured confounding. We use the method of inverse probability weighting by specifying a propensity score model. Alternatively, we can specify both an outcome regression model and a propensity score model and use the doubly robust (DR) estimator⁸ to estimate the ATE. We do not implement the DR estimator in this paper because, when unmeasured confounding exists, the DR estimator is no longer guaranteed to be consistent for the ATE and could suffer more bias than other estimators. The simulations of Kang et al.¹⁰ suggest that IPW is relatively robust to the impact of unmeasured confounding in terms of estimation bias. Because the focus of this paper is unmeasured confounding, we use IPW for estimating the ATE.

Finally, it should be pointed out that bound estimates may not be normally distributed asymptotically, especially when a bound occurs at the boundary of the parameter space or when the likelihood is not smooth around their true values. So practically, data analysts should check these regularity conditions in a similar way as they do when conducting statistical inference with other methods.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was facilitated by the Providence/Boston Center for AIDS Research (P30AI042853).

ORCID iDs

Tao Liu

Joseph W Hogan

References

1. Rosenbaum PR. Observational studies. New York, NY: Springer, 2002.
2. VanderWeele, Shpitser. On the definition of a confounder. Ann Stat 2013; 41: 196–220.
3. Rosenbaum, Rubin DB. Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc 1984; 79: 516–524.
4. Robins, Rotnitzky, Zhao LP. Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 1994; 89: 846–866.
5. D’Agostino RB. Tutorial in biostatistics: propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med 1998; 17: 2265–2281.
6. Robins, Hernán, Brumback. Marginal structural models and causal inference in epidemiology. Epidemiology 2000; 11: 550–560.
7. Hogan, Lancaster. Instrumental variables and inverse probability weighting for causal inference from longitudinal observational studies. Stat Meth Med Res 2004; 13: 17–48.
8. Bang, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics 2005; 61: 962–972.
9. Cole, Hernán, Margolick, et al. Marginal structural models for estimating the effect of highly active antiretroviral therapy initiation on CD4 cell count. Am J Epidemiol 2005; 162: 471–478.
10. Kang JDY, Schafer JL. Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat Sci 2007; 22: 523–539.
11. Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Sci 2010; 25: 1–21.
12. Wright. Appendix to the tariff on animal and vegetable oils. vol. 26. New York, NY: Macmillan, 1928.
13. Stock, Trebbi. Retrospectives: who invented instrumental variable regression? J Econ Perspect 2003; 17: 177–194.
14. Angrist, Imbens GW. Two-stage least squares estimation of average causal effects in models with variable treatment intensity. J Am Stat Assoc 1995; 90: 431–442.
15. Imbens, Angrist JD. Identification and estimation of local average treatment effects. Econometrica 1994; 62: 467–475.
16. Angrist, Imbens, Rubin DB. Identification of causal effects using instrumental variables. J Am Stat Assoc 1996; 91: 444–455.
17. Robins, Greenland. Identification of causal effects using instrumental variables: comment. J Am Stat Assoc 1996; 91: 456–458.
18. Robins JM. The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Sechrest L, Freeman H and Bailey A (eds) Health service research methodology: a focus on AIDS. Washington, DC: U.S. Public Health Service, 1989, pp. 113–159.
19. Manski CF. Nonparametric bounds on treatment effects. Am Econ Rev 1990; 80: 319–323.
20. Balke, Pearl. Bounds on treatment effects from studies with imperfect compliance. J Am Stat Assoc 1997; 92: 1171–1176.
21. Joffe MM. Using information on realized effects to determine prospective causal effects. J R Stat Soc Ser B 2001; 63: 759–774.
22. Cheng, Small DS. Bounds on causal effects in three-arm trials with non-compliance. J R Stat Soc Ser B 2006; 68: 815–836.
23. Zhang, Rubin DB. Estimation of causal effects via principal stratification when some outcomes are truncated by death. J Edu Behav Stat 2003; 28: 353–368.
24. Smith, Warren, Vlahov, et al. Design and baseline participant characteristics of the Human Immunodeficiency Virus Epidemiology Research (HER) Study: a prospective cohort study of human immunodeficiency virus infection in US women. Am J Epidemiol 1997; 146: 459–469.
25. Hogan, Mayer KH. Estimating causal treatment effects from longitudinal HIV natural history studies using marginal structural models. Biometrics 2003; 59: 152–162.
26. Walker AM. Confounding by indication. Epidemiology 1996; 7: 335–336.
27. Brookhart, Schneeweiss. Preference-based instrumental variable methods for the estimation of treatment effects: assessing validity and interpreting results. Int J Biostat 2007; 3: 14.
28. Robins, Rotnitzky, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. Stat Model Epidemiol: Environ Clin Trials 1999; 116: 1–92.
29. Brumback, Hernán, Haneuse SJPA, et al. Sensitivity analyses for unmeasured confounding assuming a marginal structural model for repeated measures. Stat Med 2004; 23: 749–767.
30. Frangakis, Rubin DB. Principal stratification in causal inference. Biometrics 2002; 58: 21–29.
31. Carpenter CCJ, Cooper, Fischl, et al. Antiretroviral therapy in adults: updated recommendations of the International AIDS Society-USA Panel. J Am Med Assoc 2000; 283: 381–391.
32. Ten Have, Normand SLT, Marcus, et al. Intent-to-treat vs. non-intent-to-treat analyses under treatment non-adherence in mental health randomized trials. Psych Ann 2008; 38: 772.
33. Johnston SC. Combining ecological and individual variables to reduce confounding by indication. J Clin Epidemiol 2000; 53: 1236–1241.
34. Brookhart, Wang, Solomon, et al. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology 2006; 17: 268–275.
35. Neyman. On the application of probability theory to agricultural experiments: essay on principles. Stat Sci 1923; 1990: 465–472.
36. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 1974; 66: 668–701.
37. Gange, Kitahata, Saag, et al. Cohort profile: The North American AIDS Cohort Collaboration on Research and Design (NA-ACCORD). Int J Epidemiol 2007; 36: 294–301.
38. Robins JM. Association, causation, and marginal structural models. Synthese 1999; 121: 151–179.
39. Rosenbaum, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika 1983; 70: 41–55.
40. Hernán, Brumback, Robins JM. Marginal structural models to estimate the joint causal effect of nonrandomized treatments. J Am Stat Assoc 2001; 96: 440–448.
41. Wooldridge JM. Econometric analysis of cross section and panel data. Cambridge, MA: MIT Press, 2002.
42. Hernán, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology 2006; 17: 360–372.
43. Davidson, MacKinnon. Estimation and inference in econometrics. Oxford, UK: Oxford University Press, 1993.
44. Stock JH. Instrumental variables in statistics and econometrics. Amsterdam: Elsevier, 2001.
45. Freedman. Statistical models: theory and practice. Cambridge, UK: Cambridge University Press, 2009.
46. Bhattacharya, Shaikh, Vytlacil. Treatment effect bounds under monotonicity assumptions: an application to Swan-Ganz catheterization. Am Econ Rev 2008; 98: 351–356.
47. Vansteelandt, Bowden, Babanezhad, et al. On instrumental variables estimation of causal odds ratios. Stat Sci 2011; 26: 403–422.
48. Siddique. Partially identified treatment effects under imperfect compliance: the case of domestic violence. J Am Stat Assoc 2014; 108: 504–513.
49. Vansteelandt, Goetghebeur, Kenward, et al. Ignorance and uncertainty regions as inferential tools in a sensitivity analysis. Stat Sinica 2006; 16: 953–979.
50. Horowitz, Manski CF. Nonparametric analysis of randomized experiments with missing covariate and outcome data. J Am Stat Assoc 2000; 95: 77–84.
51. Bickel, Freedman DA. Some asymptotic theory for the bootstrap. Ann Stat 1981; 9: 1196–1217.
52. DiNardo, Lee DS. Program evaluation and research designs. In: Ashenfelter O and Card D (eds) Handbook of labor economics. vol. 4. Amsterdam, the Netherlands: Elsevier, 2011, pp. 463–536.
53. de Chaisemartin. Tolerating defiance? Local average treatment effects without monotonicity. Quantitative Economics 2017; 8: 367–396.
54. Roy, Hogan, Marcus BH. Principal stratification with predictors of compliance for randomized trials with 2 active treatments. Biostatistics 2008; 9: 277–289.
55. Small, Tan, Ramsahai, et al. Instrumental variable estimation with a stochastic monotonicity assumption. Stat Sci 2017; 32: 561–579.
56. Hirano, Imbens, Rubin, et al. Assessing the effect of an influenza vaccine in an encouragement design. Biostatistics (Oxford, England) 2000; 1: 69–88.
57. Abadie. Semiparametric instrumental variable estimation of treatment response models. J Econom 2003; 113: 231–263.
58. Tan. Regression and weighting methods for causal inference using instrumental variables. J Am Stat Assoc 2006; 101: 1607–1618.
