Abstract
Most of the individual difference variance in the population is found within families, yet studying the processes causing this variation is difficult due to confounds between genetic and nongenetic influences. Quasi-experiments can be used to test hypotheses regarding environment exposure (e.g., timing, duration) while controlling for genetic confounds. To illustrate, two studies of cognitive self-regulation in childhood (i.e., working memory, effortful control, attention span/persistence) are presented. Study 1 utilized an identical twin differences design (N = 85-98 pairs) to control for genetic differences while using relative twin birth weight difference to predict relative twin difference in working memory and effortful control. Larger relative twin difference in working memory and effortful control was predicted by the combination of shorter gestation and larger relative birth weight difference. Study 2 utilized an adoptive sibling relative difference design (N = 123 same-sex pairs) to control for genetic similarity while using relative sibling difference in age at time of adoption to predict relative sibling difference in attention span/persistence. Larger relative sibling difference in attention span/persistence was predicted by the combination of larger relative difference in time in the adoptive home and age at adoption. Within-family quasi-experimental designs allow stronger inferences about hypothesized environmental influences than between-family designs permit.
Introduction
Imagine being able to conduct ethical, ecologically valid, real world experiments with humans that would allow you to identify the ways in which environments “work”. Typically, the priority is on randomized controlled trials (RCT) that sometimes
RCTs often are not feasible. An under-utilized complementary approach for identifying environmental effects in the real world is to use quasi-experiments that control for or at least minimize the statistical effects of unidentified genetic factors that correlate and interact with identified environmental factors. The litmus test for whether a particular environmental factor has an effect is not only whether there is an average effect in an RCT (in which genetic factors typically are not considered), but also whether there is an effect of environment exposure in quasi-experiments that control for genetic factors through selection. This has been the central argument for behavioral genetic designs (McCartney & Weinberg, 2009). In most cases, behavioral genetic designs are used to decompose variances and covariances, and inferences are made based on that decomposition (e.g., heritability, shared vs. nonshared environment; see Plomin, DeFries, Knopik, & Neiderhiser, 2012). The key feature of behavior genetic designs is that they control for genetic confounds while permitting tests of hypotheses about specific environment features such as the effects of developmental timing and duration of exposure (i.e., initiation, dosage).
The broad goal of the current paper was to present two studies that highlight different sibling designs for conducting genetically controlled quasi-experiments with humans, to test hypotheses about potential environmental effects regarding developmental timing and exposure on individual differences in cognitive self-regulation. Cognitive self-regulation includes executive functions and closely related behaviors (e.g., attention, inhibitory control, working memory, “on task” persistent behavior). These comprise a major component of a broader self-regulation constellation of constructs spanning affective, cognitive, behavioral and biological domains—all of which include moderate to substantial genetic variance, and that are important correlates or predictors of physical, behavioral and mental health functions (for a review of self-regulation constructs see Bridgett, Burt, Edwards, & Deater-Deckard, 2015). The two current studies addressed the question of whether there would be discernible effects of developmental timing and duration of environment exposure on cognitive self-regulation, while controlling for potential genetic confounds.
Within-family Variation
At its most basic level, the rationale for within-family quasi-experiments rests on there being sufficient within-family (i.e., sibling and parent-offspring) variability. Developmental scientists have become more aware of the need to consider the nested structure of variance between- and within “groups” (e.g., families, classrooms, schools, neighborhoods, etc.), as seen in the rapid growth in the use of multi-level modeling approaches (e.g., Hayes, 2006). Descriptive and explanatory models of this kind can identify how much of the variation in the population rests within each family, pedigree, or kin network (narrowly or broadly construed).
There is ample evidence from evolutionary biology and population genetics that the variation in observed “phenotypes” in the population for a species arises from the tremendous variability observed within the pedigrees or families of that species (Plomin et al., 2012). This is no less true for humans. Families are the generators of much of the individual difference variation that exists, due to additive and interactive effects of genes and environments from sexual reproduction and childrearing (Reiss, Neiderhiser, Hetherington, & Plomin, 2000). Thus, for many if not most psychological constructs of interest, the variation observed in the population arises from the variation within families. This pattern of substantial within-family differentiation is seen in many attributes pertaining to a wide variety of psychological and health outcomes—and testing environmental effects using only between-family variation results in biased estimates (Almond, Chay, & Lee, 2005; Conley & Glauber, 2008; Gottfredson, 2004; Price & Swigert, 2012).
Genetically Controlled Sibling Quasi-experiments
Behavioral and social scientists can capitalize on within-family variation to test environmental hypotheses, by comparing certain types of siblings and parent-offspring pairs to control for genetic confounds. In general, these within-family designs test whether different types of sibling or parent-offspring pairs become more or less alike in outcomes of interest, depending on different timing and levels of exposure to pre- and postnatal environments.
One ideal experiment would involve manipulating environment exposure in order to differentiate observed behaviors of individuals who are genetically very similar (via selective breeding in inbred strains) or identical (i.e., clones) (Plomin et al., 2012). These experiments are common in animal studies. The quasi-experimental extension in humans is to study naturally occurring environmental variations that differentiate genetically identical monozygotic (MZ) twins (Vitaro, Brendgen, & Arseneault, 2009).
With regard to cognitive self-regulation, our prior work using this design showed that differential exposure to warm, supportive maternal behavior (e.g., affection, positive affect, rewarding verbalizations) was associated with greater MZ twin relative differences in attention span/persistence (A/P) over a one-year period in middle childhood (Deater-Deckard & Wang, 2012). However, that study did not examine any specific hypothesis regarding developmental timing or exposure effects. In the current Study 1, I tested whether genetically identical individuals were differentiated through prenatal environment variations. Specifically, the study tested the interaction between MZ twin relative difference in prenatal environmental risk (measured indirectly as twin difference in birth weight) and gestation length (i.e., duration of exposure), to predict MZ twin relative difference in cognitive self-regulation (i.e., short term and working memory [WM] and effortful control [EC]) in middle childhood.
Another ideal experiment would involve manipulating the environment to differentiate randomly selected genetically unrelated individuals via cross-fostering (Plomin et al., 2012). This design involves pairing genetically unrelated offspring with “adoptive” parents, and is commonly conducted in animal studies. The quasi-experimental extension in humans is to study naturally occurring environments and test whether and how they differentiate genetically unrelated individuals (e.g., siblings, parent-child pairs) in adoptive and step families (Reiss et al., 2000). In the current Study 2, I tested whether genetically unrelated adoptive siblings were differentiated through variations in developmental timing and duration of exposure to the postnatal adoptive home. Specifically, the study tested the interaction between adoptive sibling difference in duration of exposure to the postnatal adoptive home environment, and developmental timing of initiation of that exposure (i.e., age at adoption) on a measure of attention/persistence (A/P) in early and middle childhood.
Study 1
Environmental Differentiation of Genetically Identical Individuals
Lower birth weight and shorter gestation length are two of the most consistent prenatal predictors of poorer self-regulation (indicated by executive function, cognitive skills, attention deficits and hyperactivity) in childhood and adolescence (Bhutta, Cleves, Casey, Cradock, & Anand, 2002; Burnett et al., 2015). These statistical effects reflect alterations in the development of brain structures (including overall brain volume) and their related processes for cognitive functions and other capacities involved in learning and self-regulation that persist into adulthood (Nosarti et al., 2014; Walhovd et al., 2012). Accordingly, deleterious alterations to underlying neurobiological structures and functions are caused or enhanced by non-optimal prenatal environmental conditions that correspond with delayed growth (i.e., intrauterine growth retardation) and truncated gestation.
However, the effects linking birth weight, duration of gestation, and self-regulation outcomes could arise from confounded genetic factors. In the case of cognitive skills and executive function, individual differences are heritable, and the magnitude of this genetic variance increases with age over childhood into adolescence (Deater-Deckard, 2014; McCartney, Harris, & Bernieri, 1990). Birth weight also is heritable, with additive genetic influences accounting for about one-quarter to one-third of the variance that increases with age from prenatal to postnatal periods (Mook-Kanamori et al., 2012). Gestation duration also is heritable, with about one-third of the variance arising from additive genetic sources of variation (Plunkett & Muglia, 2008).
Thus, overlapping genetic influences very likely contribute to correlations between prenatal risks and self-regulation outcomes, calling into question “environmental” interpretations of such links when they are estimated without genetic controls (Boomsma, van Beijsterveldt, Rietveld, Bartels, & van Baal, 2001). Also, statistical estimates of presumed environmental influences on subsequent development are biased when confounded genetic effects are not correctly specified in the statistical model. Ignoring these confounds leads to inflated estimates of developmental effects arising from early (and often presumed to be “environmental”) risks such as premature birth and low birth weight (Almond et al., 2005).
One way to address the concern is to use a within-family quasi-experiment to control for genetic factors through selection of MZ twins, who are genetically identical. Studying their differences permits tests of the statistical predictive effects of birth weight (an indicator of prenatal environment risk) on twin differences in measures of subsequent self-regulation, while controlling for a host of potential confounds including additive and nonadditive genetic effects. Several published studies of identical twin differences in birth weight have shown evidence that the link between birth weight and later cognitive performance (e.g., IQ test scores) for singletons generalizes to within-pair differences (Newcombe, Milne, Caspi, Poulton, & Moffitt, 2007; Willerman & Churchill, 1967), although one study did not find this within-pair association (Boomsma et al., 2001).
Prior studies of MZ differences in birth weight and IQ have shown modest effect sizes, but this may be because they have not tested the moderating role of gestation duration (i.e., developmental timing of birth relative to conception). Indeed, the broader literature on birth weight and developmental outcomes has not tested whether gestation time and birth weight interact in their effects on subsequent developmental outcomes. Birth weight and duration of gestation are moderately correlated (e.g., in the .4 to .6 range; Karn & Penrose, 1951; Lesiński, 1962). This covariation reflects a presumed causal effect of a shorter gestation on a smaller birth weight, but the fetus also influences gestation: complex interactions involving fetal biochemical signals are known to have an impact on the timing of labor (Condon, Jeyasuria, Faust, & Mendelson, 2004; Mendelson, 2009). Furthermore, genetic influences on birth weight probably operate differently across the range of viable gestational lengths (Mook-Kanamori et al., 2012). In addition, the effects of birth weight variation on subsequent health and developmental outcomes are systematically larger in the lower half of the distributions of birth weight and gestation length (Almond et al., 2005; Eriksen, Sundet, & Tambs, 2010).
Therefore, it is important to elucidate potential moderating effects of gestation length on birth weight effects. Estimating the independent or additive statistical effects of gestation duration and birth weight is not sufficient; the effect sizes may be attenuated because the synergy of their combined effects is not considered. The common practice of adjusting birth weight for gestation length (typically done using nationally established norms) does not address this. Rather, the statistical interaction should be estimated directly. Based on the literature, the current study tested the hypothesis that the interactive combination of premature birth and low birth weight (relative to the genetically identical co-twin) would most strongly predict poorer self-regulation later in childhood. Specifically, the deleterious effect on self-regulation (operationalized in the current study as working memory and effortful control) of low birth weight relative to co-twin would be strongest in the shortest gestations and weakest in the longest gestations.
Method
Sample
For the analysis of WM scores, the sample included 98 MZ twin pairs (62% female) with valid data on key variables. For the analysis of EC scores the sample included 85 pairs (60% female) with valid data on key variables. Twins were enrolled in the Western Reserve Reading and Math Projects (WRRMP) (Petrill, Deater-Deckard, Thompson, DeThorne, & Schatschneider, 2006). Twins in the WRRMP were M = 6.10 years old (SD = 0.68 years) at intake and were assessed annually thereafter. The current analyses focused on data collected in the third year of the study when the twins were nine years old on average, because that is the only assessment wave in which both WM and EC were assessed. WRRMP was a longitudinal twin study involving 436 pairs of same-sex twins that enrolled through school nominations, Ohio State Birth Records, and media advertisements. Questionnaires and testing of the twins occurred in their homes annually for eight years/waves. Identical twin zygosity was determined by genotyping via buccal swab or saliva sample, or in rare instances (around 13% of families) using a reliable and validated zygosity questionnaire (Goldsmith, 1991). The sample in the WRRMP was predominantly white, with 5% reporting as being African American and 2% as being Asian. In addition, the vast majority of parents were married, with 2% reporting as cohabiting and 6% as single mother household. Parent education level was widely distributed, with 10% with high school education or less, 16% some college education, 42% bachelor’s degree, 20% some postgraduate education, and 5% unspecified.
Measures
At enrollment, when the children were in kindergarten or first grade, each twin’s birth weight and the gestational length of the pregnancy were reported by the mothers in a questionnaire (Smith, DeThorne, Logan, Channell, & Petrill, 2014). Two years later (in the third annual wave of this longitudinal study), each twin was assessed for WM using the sum of area scores from the digit span/memory for sentences subtest of the Stanford-Binet Intelligence Scale (Thorndike, Hagen, & Sattler, 1986), using standard scores based on the IQ distribution of M = 100 and SD = 15. In addition, one or both parents reported on each twin’s EC using the Children’s Behavior Questionnaire Short Form (sample items: “can easily stop an activity when s/he is told ‘no’”; “is easily distracted…” [reverse scored]), a validated, reliable (alphas in .7-.8 range) and widely used instrument that captures individual differences in attention focusing and inhibitory control using a seven-point Likert-type scale with higher scores corresponding to higher EC (Putnam & Rothbart, 2006). The mother-father average was used in those cases where both parents provided reports (agreement r in the .4 to .6 range, p < .001, depending on the specific subscale; see, for example, Mullineaux, Deater-Deckard, Petrill, Thompson, & DeThorne, 2009). Relative difference scores were computed for each twin pair [twin 1 – twin 2] on birth weight, WM, and EC. For any difference score variable, a positive difference score indicated that twin 1 score > twin 2 score, and a negative difference score indicated that twin 1 < twin 2.
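The difference-score step can be sketched as follows. This is only a minimal illustration, not the study's code: the `TwinPair` record, its field names, and the example values are hypothetical and were invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class TwinPair:
    # Hypothetical record layout; all values used below are invented.
    bw1: float  # birth weight of twin 1 (pounds)
    bw2: float  # birth weight of twin 2 (pounds)
    wm1: float  # working memory standard score of twin 1
    wm2: float  # working memory standard score of twin 2


def relative_difference(score_twin1: float, score_twin2: float) -> float:
    """Twin 1 minus twin 2: a positive value means twin 1 scored higher."""
    return score_twin1 - score_twin2


pairs = [
    TwinPair(bw1=5.5, bw2=4.75, wm1=104.0, wm2=98.0),
    TwinPair(bw1=5.0, bw2=5.5, wm1=95.0, wm2=101.0),
]

# One difference score per pair and per variable.
bw_diffs = [relative_difference(p.bw1, p.bw2) for p in pairs]
wm_diffs = [relative_difference(p.wm1, p.wm2) for p in pairs]
```

In the first invented pair, twin 1 is heavier and scores higher, so both differences are positive; in the second pair both are negative.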
Results and Discussion
Descriptive statistics are shown in Table 1. There were slightly different sample sizes for the WM and EC analyses because not all of the parents of the MZ twins who were tested completed the EC questionnaires; separate statistics for each analysis are shown. Pregnancies lasted about 35 weeks on average, with a wide, normally distributed range. Twins were close to 5-1/2 pounds on average. The birth weights and gestation times for this sample were broadly similar to the descriptive statistics for the population of live twin births in the US (see Table 2 in Almond et al., 2005). The twins’ WM scores were normally distributed, and approximated the population normal distribution (i.e., M = 100 and SD = 15). Parents reported twin EC to be just above five on the seven-point Likert-type scale. The distribution of EC scores approximated a normal distribution but had a modest negative skew, suggesting that the majority of children were rated as moderate or high in EC, a typical distribution for EC in community samples of older children and young adolescents (e.g., Valiente et al., 2013).
Table 1. Mean (M), standard deviation (SD) and min/max for gestation, birth weight, working memory and effortful control.
Mean (M), Standard Deviation (SD) and Min/Max for Sibling (n = 123 pairs) Age at Adoption, Time in Adoptive Home, and Attention/Persistence.
I tested the hypothesis using standard regression to predict variance in identical twin relative difference in self-regulation from the main effects and interaction of twin relative difference in birth weight, and each pair’s gestation duration. Consistent with classical applications of probability theory, one-tailed p-values were used for the significance tests of the regression coefficients because they corresponded to a directional hypothesis. Although it has become common for researchers to use two-tailed p-values when testing directional hypotheses to be more conservative, this conservative approach to reducing type-I error comes at the cost of inflating type-II error by reducing statistical power (Hays, 1988, pp. 276–277).
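A regression of this form (two main effects plus their product term) can be sketched in plain Python. This is a hedged illustration rather than the analysis actually run: the function names are hypothetical, the data are invented and noise-free, and mean-centering the predictors before forming the product term is an assumption of this sketch that the text does not confirm.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_moderated_regression(bw_diff, gestation, outcome_diff):
    """Ordinary least squares with an interaction term.

    Returns [intercept, b_bw_diff, b_gestation, b_interaction].
    Predictors are mean-centered first (an assumption of this sketch).
    """
    bw_bar, ga_bar = mean(bw_diff), mean(gestation)
    rows = [
        [1.0, x1 - bw_bar, x2 - ga_bar, (x1 - bw_bar) * (x2 - ga_bar)]
        for x1, x2 in zip(bw_diff, gestation)
    ]
    k = 4
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome_diff)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented toy data generated from known coefficients so the fit can be checked.
bw_diff = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.25, -0.75, 0.75]
gestation = [30.0, 38.0, 34.0, 28.0, 36.0, 32.0, 40.0, 35.0]
true_b = [1.0, 0.2, -0.3, -0.25]
bw_bar, ga_bar = mean(bw_diff), mean(gestation)
outcome_diff = [
    true_b[0]
    + true_b[1] * (x1 - bw_bar)
    + true_b[2] * (x2 - ga_bar)
    + true_b[3] * (x1 - bw_bar) * (x2 - ga_bar)
    for x1, x2 in zip(bw_diff, gestation)
]
coefs = fit_moderated_regression(bw_diff, gestation, outcome_diff)
```

Because the toy outcome contains no noise, `coefs` recovers the generating coefficients up to floating-point error. With real data, standard errors and p-values would also be needed, which this sketch does not compute.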
Identical twin relative differences in WM were examined first. The equation was significant, F (3, 94) = 4.82, p < .01, and explained 13% of the variance in identical twin relative difference in WM: relative difference in birth weight (β = .11, one-tailed p < .13), weeks’ gestation (β = −.36, one-tailed p < .001), and their interaction (β = −.25, one-tailed p < .01). Post-hoc probing using simple slopes at several cut-points above vs. below the sample mean for weeks’ gestation revealed a clear pattern, as shown in Figure 1. Twin relative difference in birth weight was positively associated with relative difference in WM (i.e., the heavier twin had a higher WM score compared to the lighter co-twin), but only among pairs born early—an effect that increased in size linearly as a function of how early. For example, among twins born at 27/28 weeks’ gestation or earlier (i.e., 2.5 SD below the sample mean for weeks gestation), the simple slope estimate was β = .78, p < .01—about 60% of the variance in twin relative difference in WM. In contrast, at the sample mean and among those born full term (36.5 weeks or more, 0.5 SD above M), there was a modest nonsignificant association approaching zero.
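The logic of probing an interaction with simple slopes can be sketched as below. In a model with a product term, the slope of the focal predictor at a chosen value of the moderator is the focal coefficient plus the interaction coefficient times that moderator value. The function names are hypothetical. Plugging in the standardized WM coefficients reported above is only a rough approximation and is not expected to reproduce the reported simple slopes exactly, since those were estimated with the author's own procedure. Squaring a standardized slope to express it as a proportion of variance follows the usage in the text.

```python
def simple_slope(b_predictor: float, b_interaction: float, moderator_z: float) -> float:
    """Slope of the focal predictor at a given moderator value (in SD units)."""
    return b_predictor + b_interaction * moderator_z


def variance_explained(standardized_slope: float) -> float:
    """Squared standardized slope, read as a proportion of variance."""
    return standardized_slope ** 2


# Reported standardized WM coefficients, used here only as rough inputs.
b_birth_weight_diff = 0.11
b_interaction = -0.25

# Gestation levels in SD units relative to the sample mean.
slopes = {
    z: simple_slope(b_birth_weight_diff, b_interaction, z)
    for z in (-2.5, -1.0, 0.0, 0.5)
}

# The reported simple slopes of .78 (WM) and .74 (EC) as proportions of variance.
wm_share = variance_explained(0.78)
ec_share = variance_explained(0.74)
```

The computed slope grows as gestation shortens (more negative `z`) and shrinks toward zero at and above the mean, which is the qualitative pattern described in the text.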

Figure 1. Within-family standardized simple slopes (p < .05 unless ns, nonsignificant) for twin relative difference in birth weight predicting twin relative difference in working memory (n = 98 pairs) and effortful control (n = 85 pairs). Slopes were estimated at various standard deviations (SD) above (+) and below (−) the sample mean for weeks gestation. As the twins’ weeks of gestation get shorter (i.e., moving from right to left in the figure), twin relative difference in birth weight is an increasingly strong statistical predictor of twin relative difference in working memory and effortful control. Dashed lines show the between-family estimates of the same simple slopes; for bars without dashed lines, the between-family estimate was at or below zero and nonsignificant.
Next, identical twin relative differences in EC were examined. The full equation was not significant, F (3, 81) = 1.53, p = .21, R2 = .05. However, this was because only the statistical interaction between relative difference in birth weight and pairs’ gestation duration was significant: relative difference in birth weight (β = .11, one-tailed p < .18), weeks’ gestation (β = −.14, one-tailed p < .13), and their interaction (β = −.23, one-tailed p < .05). Post-hoc probing using simple slopes was conducted again, with results shown in Figure 1. The pattern was much the same as that found for WM. For example, among twins born at 27/28 weeks’ gestation or earlier (i.e., 2.5 SD below M), the simple slope estimate was β = .74, p < .01—about 55% of the variance in twin relative difference in EC. In contrast, at the sample mean and among those born full term (36.5 weeks or more, 0.5 SD above the sample mean), there was a modest nonsignificant association approaching zero.
What were the results if the same analyses were conducted using only between-family comparisons? This was done by averaging the twins’ birth weight, WM, and EC scores. For WM, the full equation was not significant, F (3, 95) = 1.02, p = .39, R2 = .03. Neither the main effects nor the two-way interaction was significant: birth weight (β = −.11, one-tailed p < .25), weeks’ gestation (β = .17, one-tailed p < .07), and their interaction (β = −.11, one-tailed p < .23). For EC, the full equation was not significant, F (3, 82) = 0.62, p = .61, R2 = .02. Neither the main effects nor the two-way interaction was significant: birth weight (β = −.15, one-tailed p < .18), weeks’ gestation (β = −.01, one-tailed p < .50), and their interaction (β = −.21, one-tailed p < .09). Results from post-hoc probing of the hypothesized interaction effect are shown in Figure 1.
As hypothesized, the within-pair relative difference in birth weight was an increasingly strong statistical predictor of the within-pair relative difference in WM and EC at shorter pregnancy durations. In contrast, there was no link found for full-term pregnancies. It is worth noting that the magnitude of the main effect of birth weight difference for WM and EC (.11 in both equations) was very similar to the effect found for IQ in the largest MZ twin birth weight differences study conducted to date (Newcombe et al., 2007). By comparison, the parallel analyses using only between-family differences showed a much weaker effect.
By using an MZ twin relative differences study design, these findings can be interpreted more stringently with respect to inferences about probable environmental influences. In this design, the equation statistically and longitudinally predicted within-family differences in WM and EC from within-family differences in prenatal environmental risk, given that genetic differences cannot explain the birth weight difference for MZ twin pairs. The variances in, and covariances between, birth weight differences and WM/EC differences cannot be attributable to allele differences in the twins, nor to interactive effects of fetal genotype with maternal uterine environment factors. Rather, within-pair differences in birth weight among MZ twins arise from variables in the structure and function of the placenta and the chorion or chorions, such as the location and structure of the placenta and umbilical cords (i.e., single, fused, or double placentas/chorions; Almond et al., 2005). These are the same structural and functional variations that differentiate singletons’ birth weights (Salafia et al., 2008; Yampolsky et al., 2009).
As to whether such evidence generalizes to understanding population-level variation, an ample proportion of the population variance in birth weights originates within families. In their analysis of nearly 190,000 live twin births in the US, Almond and colleagues (2005) reported that nearly half of the total variance in birth weights was found within twin pairs. An even more substantial proportion of population variance is found within families when non-twin siblings are examined (Eriksen et al., 2010). As noted above in the Introduction, a similar pattern is likely to be found for all stable individual differences attributes, including those that capture cognitive and behavioral self-regulation.
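The partition of total variance into between-pair and within-pair parts can be sketched as follows. For pairs, the population variance across all individuals equals the variance of the pair means plus the mean squared half-difference within pairs. The function name and the birth weights below are hypothetical; they are invented for illustration and are not taken from any of the cited studies.

```python
from statistics import mean, pvariance


def partition_variance(pairs):
    """Return (between_pair_variance, within_pair_variance) for a list of pairs.

    Population (not sample) variances are used so that the two parts sum
    exactly to the total variance across all individuals.
    """
    pair_means = [(a + b) / 2 for a, b in pairs]
    between = pvariance(pair_means)
    within = mean([((a - b) / 2) ** 2 for a, b in pairs])
    return between, within


# Invented birth weights (pounds) for four hypothetical twin pairs.
pairs = [(5.0, 6.0), (4.0, 4.5), (6.5, 5.5), (5.5, 5.5)]

between, within = partition_variance(pairs)
total = pvariance([score for pair in pairs for score in pair])
within_share = within / (between + within)
```

In this toy example about a quarter of the total variance lies within pairs; the share reported for real twin births in the text is considerably larger.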
There is a caveat. Birth weight and gestation are distal indicators of the overall “riskiness” of the prenatal environment, and MZ twin pregnancies have distinct features. Two-thirds of MZ twin pregnancies are monochorionic (in nearly every case, one placenta and chorion but two separate amnions), and the other third are dichorionic (two placentas or a fused placenta, two chorions, two amnions) (Ollikainen et al., 2010; Riese, 1999). MZ twin differences in birth weight are greater and gestation periods shorter, in monochorionic compared to dichorionic pairs (D’Antonio, Khalil, Dias, & Thilaganathan, 2013; Victoria, Mora, & Arias, 2001). Thus, the current finding could reflect an effect of chorion type on MZ twin differentiation.
This caveat aside, the implications of research findings for intervention or prevention with low birth weight and premature birth children ultimately depend on the identification of mediators of those prenatal environmental effects (Almond et al., 2005). Based on preliminary evidence from a few studies so far, these mediators are likely to include specific brain regions and networks (e.g., Nosarti et al., 2014; Walhovd et al., 2012) that have been shown to respond to self-regulation enhancement conditions (e.g., attention and mindfulness training; Tang et al., 2007). Furthermore, postnatal environments can ameliorate such effects (e.g., Madigan, Wade, Plamondon, Browne, & Jenkins, 2015). Effects of exposure to the home environment after birth are considered in Study 2.
Study 2
Environmental Differentiation of Genetically Unrelated Individuals
Postnatal exposure to particular childrearing environments is a key aspect of the development of individual differences in self-regulation skills over childhood and adolescence. A variety of home and parenting environment variables covary with stronger cognitive self-regulation, indicated by
As highlighted in the rationale for Study 1, prenatal environmental risk factors such as birth weight and gestation length are confounded with additive and interactive genetic effects. The same confound arises for tests of postnatal environmental effects. Furthermore, the literature on the home environment and self-regulation has not addressed a major confound of child age with two critical features of the postnatal environment: developmental timing of initiation, and duration of exposure (i.e., “dose”). Adoption studies have tackled directly the main effect of timing of initiation by examining whether the child’s age at time of placement in the adoptive home predicts an outcome of interest. Particularly noteworthy is the growing literature on internationally adopted children, who tend to be adopted later in childhood and also come from a much wider array of pre-adoption postnatal environments, such as orphanages and hospitals, compared to domestically adopted children (Juffer & van IJzendoorn, 2005). That literature shows that timing of placement matters. Positive effects of warm, supportive and stimulating adoptive home environments are strongest for children from adverse pre-adoptive environments who are adopted as early as possible, ideally in infancy (e.g., Rutter, 1998). For some, the ameliorative effects persist but for others they do not, leading to greater heterogeneity in developmental outcomes over longer periods of development into and through middle childhood and early adolescence (e.g., Beckett et al., 2006).
Although earlier adoptive placement consistently is associated with better developmental outcomes on average, this effect could arise from between-family selection effects whereby the parents who are most likely to adopt younger infants have more socioeconomic resources and developmentally enriching environments on average, compared to those who adopt older children. Furthermore, studying genetically related parent-child dyads limits inferences about the environment because of the gene-environment confound noted earlier. It is unknown whether differential “dosage” (in this case, sibling relative difference in time spent in the adoptive home) moderates this differential timing of placement effect. What is needed is a within-family analysis comparing adoptive siblings in the same home to test whether the developmental timing and dosage effects differentiate siblings. This is the goal of Study 2.
Adoptive siblings who come from different birth families are “pseudo-randomized”; they are no more alike or different genetically, on average, than two randomly selected individuals from a population unless selective adoptive placement occurs (e.g., if children from the same family are adopted together into the same adoptive home, or if features of the biological and adoptive parent are matched by the adoption agency). Researchers can examine the overall degree of differences in adoptive siblings’ self-regulation outcomes, to test for hypothesized effects of timing and duration of exposure to a childrearing environment while controlling for potential effects of genetic and prenatal environment differences (Plomin et al., 2012). Sibling relative differences in time spent in the postnatal home can be used to examine within-family effects for a host of developmental outcomes. In traditional sibling/family pedigree studies of children raised by their birth parents, this is computed as the sibling age difference (e.g., Eriksen et al., 2010). In the case of adopted siblings, this is computed as sibling relative difference in the time spent in the adoptive home (i.e., current age minus age at time of adoption).
Based on the literature, the hypothesis for Study 2 was that the interactive combination of sibling relative differences in age at time of adoption and time spent in the adoptive home would predict sibling relative difference in A/P. Specifically, the sibling who was adopted earlier would have stronger A/P, and this differential timing effect would be strongest in pairs with the largest relative differences in time spent in the adoptive home.
Method
Sample
The sample included 123 same-sex adoptive sibling pairs (63% female, age in years: older sibling “1”, M = 7.58 yrs [SD = 2.91 yrs], younger sibling “2”, M = 4.67 yrs [SD = 2.54 yrs]). They were selected from a larger dataset (described below) based on these criteria: same-sex pair adopted from different biological families; older sibling lived in the home longer or the same amount of time as the younger sibling; both were adopted before they reached 10 years of age (to focus the analysis on variance in adoptive placement in childhood); both lived in the adoptive home for less than 18 years (to parallel the years in home typical of legal minors); and both had valid data on one or both adoptive parents’ reports of A/P.
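The selection criteria listed above amount to a filter applied to each candidate sibling pair. The following is a minimal sketch in Python of how such a filter might be written; it is not the procedure actually used with the N2CAP data, and the record layout and all field names (e.g., `birth_family_id`, `ap_mother`, `ap_father`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Child:
    # All field names are hypothetical and chosen only for illustration.
    sex: str
    age_years: float                 # age at study enrollment
    age_at_adoption_years: float     # age on arrival in the adoptive home
    birth_family_id: str
    ap_mother: Optional[float] = None  # mother-rated A/P, if available
    ap_father: Optional[float] = None  # father-rated A/P, if available

    @property
    def years_in_home(self) -> float:
        # Time in the adoptive home: current age minus age at adoption.
        return self.age_years - self.age_at_adoption_years


def pair_is_eligible(older: Child, younger: Child) -> bool:
    """Apply the Study 2 inclusion criteria to one sibling pair."""
    children = (older, younger)
    return (
        older.sex == younger.sex                                # same-sex pair
        and older.birth_family_id != younger.birth_family_id    # different birth families
        and older.years_in_home >= younger.years_in_home        # older in home at least as long
        and all(c.age_at_adoption_years < 10 for c in children) # adopted before age 10
        and all(c.years_in_home < 18 for c in children)         # under 18 years in the home
        and all(c.ap_mother is not None or c.ap_father is not None
                for c in children)                              # at least one parent report
    )
```

A pair is retained only if every criterion holds, so a single missing parent report for either child excludes the pair.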
Siblings were drawn from the Northeast-Northwest Collaborative Adoption Projects or N2CAP (Deater-Deckard, Petrill, & Wilkerson, 2003), a national survey study of 1797 adoptive families in the US. Parents were recruited anonymously by mail through adoption agencies and adoption lawyers, and participating parents completed their surveys by mail (or email in a small number of cases). The sample of adoptive parents was predominantly white (95%) and married (88%). The majority of the families (75%) had one or both parents with at least a four-year college degree; this is typical of adoptive parents in the US, who are higher in SES than nonadoptive parents (Child Trends, 2012; Stoolmiller, 1999). The sample was dominated by international adoptions (88%): 74% were adopted from Asia (China or Korea, with fewer from India, Thailand, Vietnam, the Philippines, or other locations), and 14% were adopted from locations either in Central or South America, or Eastern Europe.
Measures
At enrollment, parents reported the age at which each adoptive sibling arrived in the adoptive home and documented that each child was adopted from a different biological family. The child’s age at the time of study enrollment was used to calculate the amount of time spent in the adoptive home. One or both parents rated each child on her or his attention span and behavioral persistence (A/P) on the attention-persistence scale from the Colorado Childhood Temperament Inventory or CCTI (sample items: “persists at a task until successful”; “plays with a single toy for long periods of time”). The CCTI is the predecessor to the EAS Temperament Questionnaire (Buss & Plomin, 1984) and has strong internal consistency and test-retest reliability (McClelland, Acock, Piccinin, Rhea, & Stallings, 2013). It is scored using a five-point Likert-type scale, and the five items are summed such that higher scores correspond with greater A/P. In those cases where the child had a rating from both parents, mother-father scores were averaged (sibling 1, inter-parent agreement r [61] = .60, p < .001; sibling 2, agreement r [58] = .55, p < .001). Sibling relative difference scores (older sibling 1 minus younger sibling 2) were computed for age at time of placement, time in the adoptive home, and A/P.
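To make the scoring steps above concrete, the following minimal sketch in Python derives the three sibling relative difference scores for a single pair. It assumes each child is represented as a dictionary; the keys (`age_now`, `age_at_adoption`, `ap_mother`, `ap_father`) and the function names are hypothetical, and the values in the usage example are invented rather than taken from the study data.

```python
from statistics import mean


def time_in_home(age_now, age_at_adoption):
    """Years in the adoptive home: current age minus age at adoption."""
    return age_now - age_at_adoption


def ap_score(mother=None, father=None):
    """Average the available parent ratings of A/P (one or both parents)."""
    ratings = [r for r in (mother, father) if r is not None]
    if not ratings:
        raise ValueError("at least one parent rating is required")
    return mean(ratings)


def difference_scores(older, younger):
    """Older sibling 1 minus younger sibling 2, for each of the three variables."""
    def summarize(child):
        return {
            "age_at_adoption": child["age_at_adoption"],
            "time_in_home": time_in_home(child["age_now"], child["age_at_adoption"]),
            "ap": ap_score(child.get("ap_mother"), child.get("ap_father")),
        }

    s1, s2 = summarize(older), summarize(younger)
    return {key: s1[key] - s2[key] for key in s1}


# Invented example pair (not study data).
older = {"age_now": 7.5, "age_at_adoption": 1.0, "ap_mother": 18, "ap_father": 16}
younger = {"age_now": 4.5, "age_at_adoption": 1.5, "ap_father": 20}
pair_differences = difference_scores(older, younger)
```

Under this sign convention, a negative difference in A/P means that the older sibling was rated lower than the younger sibling, and a positive difference in age at adoption means that the older sibling was placed at a later age.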
Results and Discussion
Descriptive statistics are shown in Table 2. Both children had been about one year old on average at the time of placement in the adoptive home, with a very wide range. Average sibling relative difference in age of placement was modest and highly variable between families. Sibling 1 had been in the home for more than six years, and the younger sibling 2 for over three years. Thus, the average difference was around three years, again with a wide range in the sample. The distribution for A/P scores in this sample was very similar to the distribution of parent-rated EC in Study 1. The mean fell just above the middle of the Likert-type scale, and the distribution had a modest negative skew such that most children were rated as being moderate to high in A/P. The sibling relative difference in A/P was modest and highly variable between families.
I tested the hypothesis regarding the interactive effect of sibling relative difference in age of adoption and sibling relative difference in time in the adoptive home using standard regression, predicting sibling relative difference in A/P from relative difference in age of adoption, relative difference in time in adoptive home, and their interaction term. The overall equation was marginally significant, F (3, 119) = 2.21, p < .10, because one of the predictors was nonsignificant. The model explained 5% of the variance in the adoptive sibling relative difference in A/P: relative difference in time in adoptive home (β = −.13, one-tailed p < .11), relative difference in age at adoption (β = −.27, one-tailed p < .05), and the two-way interaction (β = .19, one-tailed p < .05). Post-hoc probing was conducted in the same manner as Study 1; Figure 2 shows simple slopes for the prediction of sibling relative difference in A/P from sibling relative difference in age at adoption, as a function of sibling relative difference in years in the home. The negative association between sibling relative difference in age at adoption and sibling relative difference in A/P became stronger as a function of larger sibling relative difference in years in the home. Thus, at larger sibling differences in years in the home, the child who was older at time of adoption (relative to the sibling) also had poorer A/P compared to the sibling.

Within-family standardized simple slopes (p < .05) for sibling relative difference in age at adoption predicting sibling relative difference in attention span/persistence (n = 123 pairs). Slopes were estimated at various standard deviations (SD) above (+) and below (−) the sample mean for sibling relative differences in years in the adoptive home. As the sibling relative difference in time in the adoptive home gets larger (i.e., moving from right to left in the figure), the relative difference in age at adoption is an increasingly strong statistical predictor of sibling relative difference in attention span/persistence. Dashed lines show the between-family estimates of the same simple slopes.
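For readers less familiar with moderated regression, the following minimal sketch in Python shows one conventional way to fit a model with a two-way interaction and then compute simple slopes at chosen levels of the moderator. It is an illustration under stated assumptions, not the analysis code used here: it assumes all three difference scores are z-scored before the product term is formed, it solves the normal equations directly so that the example needs only the standard library, and the function names (`zscore`, `ols`, `fit_moderated`, `simple_slope`) are hypothetical.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a list of scores to mean 0 and SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def ols(rows, y):
    """Least-squares coefficients via Gauss-Jordan on the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[k] for row in aug]


def fit_moderated(d_age, d_time, d_ap):
    """Regress the A/P difference on the age-at-adoption difference, the
    time-in-home difference, and their product.
    Returns [intercept, b_age, b_time, b_interaction]."""
    z_age, z_time, z_ap = zscore(d_age), zscore(d_time), zscore(d_ap)
    rows = [[1.0, a, t, a * t] for a, t in zip(z_age, z_time)]
    return ols(rows, z_ap)


def simple_slope(b_age, b_interaction, z_time):
    """Slope of the age-at-adoption difference at a given level (in SD
    units) of the time-in-home difference."""
    return b_age + b_interaction * z_time
```

Given the fitted coefficients, calling `simple_slope` at values such as −1, 0, and +1 yields the kind of slope-by-moderator profile that a simple slopes figure displays.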
Thus, with genetic effects controlled through selection of genetically unrelated siblings and parent-offspring pairs, the evidence for early initiation of exposure to the home environment playing a role in differentiating sibling children in their A/P behavior was contingent on sibling differences in duration of exposure to that environment. As in Study 1, the same analysis based on between-family comparisons showed a much weaker effect. Furthermore, by using an adoptive sibling within-family design, the findings can be interpreted more precisely regarding probable environmental influences. In this design, the equation predicted within-family differences in A/P from within-family differences in environment exposure, given that potential genetic confounds were controlled by examining genetically unrelated individuals who were being raised together by adoptive parents. The results point to the critical role of timing of initiation of environmental exposure, with the first year or two of life representing a potential sensitive period for predicting within-family similarity and the ameliorative effects of supportive, stimulating environments (e.g., Rutter, 1998).
Adoptive home environments may well be a naturally occurring group of childrearing environments that would maximally support growth in cognitive self-regulation in children (as in the environments studied in correlational and experimental intervention studies such as those highlighted by Hughes, 2011). Within the US, adoptive families represent the middle and top of the socioeconomic distribution (e.g., Stoolmiller, 1999)—households that are maximally enriching and supportive of stronger language and cognitive outcomes (Hart & Risley, 1992). When compared to home environments internationally, middle and upper SES households in the US are even more advantaged, since they capture the highest end of the wide distribution of enriching childrearing environments, especially when compared to populous but economically disadvantaged developing nations (Bornstein et al., 2012). Although this range restriction raises concerns for variance decomposition models of genetic and environmental sources of variance (as noted by Stoolmiller), this restricted range of enriching home environments is advantageous for testing additive and interactive effects of timing and duration of exposure.
General Discussion
Hypotheses about developmental timing and duration effects of prenatal and postnatal environments can be tested using within-family differentiation and assimilation designs that more precisely control for genetic effects through sample selection. These and other under-utilized within-family quasi-experiments are useful not only because of their precision and less biased estimates (compared to between-family designs), but because so much of the individual difference variation in the population is observed within families.
One reason that these types of sibling study designs (i.e., genetically identical pairs, genetically unrelated pairs) are not used more often is that the results may not generalize to non-twin, non-adoptive individuals and families. Certainly, identical twins are not common. Just over 3% of live infant births in the US in 2009 were twins, the majority being dizygotic fraternal pairs (Martin, Hamilton, & Osterman, 2012). Although identical twins are rare, nearly one-third of genetic full siblings share well over half of their genetic alleles (Visscher et al., 2006). Thus, identical twins may not be as anomalous as they seem, when considered within the broader spectrum of actual genetic similarity among all siblings in biological families. Similarly, adopted children are not common. Adoptees account for 2% of all youth in the US, with the vast majority being adopted domestically either through government-sponsored foster care or private adoption agencies (Child Trends, 2012). However, living with genetically unrelated family members is actually very common in the US. Many individuals spend at least some of their childhood or adolescence with a genetically unrelated parent and sibling. Nearly half of adult Americans have a step-relationship themselves (e.g., are a step-parent or a step-child); 40% of all new marriages are remarriages, and most remarriages include children from prior relationships; and nearly one in seven adults (13%) is a step-parent of at least one co-residing minor (Livingston, 2014; Parker, 2011). Therefore, adoptive siblings may not be as anomalous as they seem, when considered within the broader spectrum of genetic similarity of siblings in step families.
Another potential reason the designs are under-utilized is that developmental scientists may be assuming that there is relatively little population-level variation observed within families relative to the variation between families. However, this assumption does not hold up to scrutiny. For instance, sibling data on A/P for same-sex DZ and MZ twin pairs from the WRRMP show very substantial within-family variation (relative to between-family variation) for both parents’ and observers’ reports (Deater-Deckard & Wang, 2012). Such patterns are observed in many psychological and health variables, including outcomes as diverse as IQ and body mass index (Gottfredson, 2004; Price & Swigert, 2012).
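The comparison of within-family to between-family variation can be made explicit with a one-way random-effects decomposition for sibling pairs. The sketch below, in Python, is one standard way of doing so and is offered only as an illustration; it is not the method used in the studies cited above, the function name `within_between` is hypothetical, and the example scores are invented.

```python
from statistics import mean


def within_between(pairs):
    """Decompose variance for sibling pairs (two children per family).

    Returns (within-family variance, between-family variance, intraclass
    correlation), using one-way ANOVA mean squares with k = 2 per family.
    """
    n = len(pairs)
    grand = mean(x for pair in pairs for x in pair)
    # Between-family mean square: k * sum of squared deviations of the
    # family means from the grand mean, divided by (n - 1).
    ms_between = 2 * sum((mean(pair) - grand) ** 2 for pair in pairs) / (n - 1)
    # Within-family mean square: for a pair, the sum of squares around the
    # family mean equals (a - b)^2 / 2; degrees of freedom are n.
    ms_within = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    var_within = ms_within
    var_between = max((ms_between - ms_within) / 2, 0.0)
    icc = var_between / (var_between + var_within)
    return var_within, var_between, icc


# Invented scores for three sibling pairs (not study data).
example_pairs = [(1, 3), (2, 4), (5, 9)]
var_within, var_between, icc = within_between(example_pairs)
```

The proportion of total variance that lies within families is one minus the intraclass correlation, so a modest sibling correlation implies that most of the variation is found within rather than between families.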
In conclusion, the entire system of processes that differentiate us (from production of germ cells to sexual reproduction, and from prenatal to ongoing postnatal gene-environment interaction effects, including epigenetic processes) ensures that the biological and phenotypic diversity of humans is continually regenerated within each and every family (Meaney, 2010; Plomin et al., 2012). Analysis of potential environmental influences on this wide-ranging variability necessitates examination of within-family environmental differences. Ignoring the within-family variation by relying on statistical tests that only examine between-family variation not only confounds gene-environment effects, but also results in biased estimates of effects. Although they are not without their own limitations, a host of quasi-experimental family designs can be used to test hypotheses about specific environmental influences. These designs examine the within- and between-family variation, resulting in less biased statistical estimates of the potential impact of a particular environmental manipulation for improving healthy developmental outcomes in childhood and throughout the lifespan.
Footnotes
Acknowledgements
The author is grateful to the research participants, research assistants, colleagues and fellow investigators (in particular Stephen A. Petrill, PI for the Western Reserve Reading and Math Projects (NICHD: HD38075, HD46167) and co-PI for the Northeast-Northwest Collaborative Adoption Projects (NSF: BCS-9907860, BCS-9907811, BCS-0196511)). The content is solely the responsibility of the author and does not necessarily represent the official views of the NICHD, NIMH, NIH or NSF.
Funding
The author(s) received the following financial support for the research, authorship, and/or publication of this article: The author's time was supported by NIMH grant MH99437 and NSF grant DRL-1118571.
