Abstract
Introduction
Social activity has been examined as a protective factor for several health outcomes (Cohen, 2004). Higher levels of social activity across the life span may be associated with better cognitive function and reduced risk of cognitive decline and Alzheimer’s disease (AD; Fratiglioni, Wang, Ericsson, Maytan, & Winblad, 2000; James, Wilson, Barnes, & Bennett, 2011; Seeman, Lusignolo, Albert, & Berkman, 2001; Seeman et al., 2011; Zuelsdorff et al., 2013); however, mechanisms behind this relationship are not well documented. Here, we consider two dimensions of social activity, social support and social (verbal) interaction, both of which may plausibly and independently influence cognitive function and rate of decline with advancing age. Research on social engagement and cognitive health has often interpreted the reduced risk of decline within a mechanistic framework of stress and coping, wherein social support provides relief from health-damaging stress (Cohen & Wills, 1985). Few studies have methodologically addressed other effects of social activity, such as intellectual stimulation during verbal interaction, despite the common conclusion that such stimulation represents a form of environmental enrichment and likely contributes to brain health and cognitive reserve (Bassuk, Glass, & Berkman, 1999; Hultsch, Hertzog, Small, & Dixon, 1999; Seeman et al., 2011; Ybarra et al., 2008; Ybarra & Winkielman, 2012).
The lack of tools designed to quantify verbal interaction in large populations of adults represents a gap that must be addressed before more rigorous exploration of potential benefits is possible. Although some studies have specifically sought to determine the potentially enriching role of verbal interaction, no validated instrument for collecting detailed information on quantity and quality of such interaction currently exists. One of the most ambitious time-use surveys (Hamermesh, Frazis, & Stewart, 2005) does attempt to quantify time spent socializing and communicating within select social domains, but the diary format used may be difficult to utilize in large cohort studies and/or those seeking to minimize additional participant burden. In addition, a few validated instruments assess the valence (positive vs. negative), but not quantity, of social exchanges (Newsom, Rook, Nishishiba, Sorkin, & Mahan, 2005; Ruehlman & Karoly, 1991). The ideal instrument should serve several purposes. First, it should accurately approximate the total amount of time that a participant spends actively engaged in conversation with other people. Second, it should assess average quality or valence of those interactions, which might be pleasant, unpleasant, or neutral. Valence of interaction could play a role in mitigation or generation of stress, thereby becoming important in stress-and-coping pathways (Ingersoll-Dayton, Morgan, & Antonucci, 1997; Rook, Luong, Sorkin, Newsom, & Krause, 2012; Sneed & Cohen, 2014; Windsor, Gerstorf, Pearson, Ryan, & Anstey, 2014); ergo, controlling for valence is essential in distinguishing the benefits of neuronally stimulating but affect–neutral verbal interaction from the cognitive health effects of interaction-related feelings of social support (positive valence) or stress (negative valence). 
Third, it should interrogate different domains of interaction, both to stimulate more accurate recall by participants and to differentiate between domains that represent a positive, neutral, or negative, potentially stress-generating, experience.
We developed and piloted a novel verbal interaction questionnaire, assessing quantity and quality of verbal interaction across a broad range of social milieus within the Wisconsin Registry for Alzheimer’s Prevention (WRAP; Sager, Hermann, & La Rue, 2005). The WRAP study is a longitudinal cohort study collecting rich biological, sociodemographic, and lifestyle data from middle-aged and older adults to explore factors that may, over the adult life course, influence risk of developing AD. Our larger goal is to rigorously examine the cognitive benefits of modifiable sociobehavioral factors, including support as well as the more rarely explored verbal interaction construct, in the WRAP sample; we also hope to encourage and enable similar studies in other samples. As such, the primary aim of this preliminary study was to evaluate both the short- and long-term reliability of our social activity instrument. We hypothesized that among the population as a whole, test–retest reliability should be high over both intervals, and that it should be higher over a short 6- to 8-week interval than over a 2-year interval. In secondary analyses, undertaken as an early exploration of construct validity, we examined the impact of key life events on reliability. We hypothesized that if discrepancies in test–retest data are due to real changes in social network rather than noise, then test–retest reliability should be notably lower for individuals reporting such upheavals. Finally, in consideration of the theory of socioemotional selectivity, which describes age-related emotional prioritization of close social ties, positing that over time more casual interactions become less emotionally rewarding (Carstensen, 1992), we also explored the relationship between quantity of interaction and the perception of available social support stratified by age. We predicted that interaction quantity would be less strongly associated with perceived support in older participants.
Material and Method
Participants
Study participants were drawn from WRAP, an ongoing, longitudinal cohort study of initially asymptomatic middle-aged and older adults. The WRAP sample is enriched for a parental history of AD; two thirds of the total WRAP sample (n = 1,500) have at least one parent with diagnosed AD. All participants were cognitively normal at baseline as determined by scores on neuropsychological testing at their Wave 1 visit. Exclusion criteria for the current study include a history of multiple sclerosis, Parkinson’s disease, stroke, epilepsy, or meningitis. WRAP’s study design and assessment protocols are described in detail elsewhere (Sager et al., 2005). In brief, enrollment began in 2001; participants returned for follow-up visits 4 years after baseline (Wave 2) and every 2 years after that. At each visit, participants complete an extensive neuropsychological test battery and respond to questionnaires examining health and lifestyle factors. Although certain elements of the battery and questionnaires are constant across all visits, the study protocol does allow for addition of promising new measures such as the one described in this article.
Development and Implementation of the Questionnaire
The Perceived Social Support and Verbal Interaction Questionnaire described in this study represents a combination of established and novel items. First, social support items were drawn from the previously validated Medical Outcomes Study Social Support Survey (Sherbourne & Stewart, 1991). For each of nine items representing types of social support, participants were asked to rate the frequency of availability of social support on a 5-point scale, with response options ranging from 0 (none of the time) to 4 (all of the time). Responses from all nine items were summed to create a support index score (possible range, 0-36). Second, questions on the quantity and quality of face-to-face verbal interactions were developed for seven distinct social domains; domains were chosen based on existing social network and social exchange literature (Heaney & Israel, 2008; Litwin, 2001; Peek & O’Neill, 2001; van Tilburg, 1998) and tailored for WRAP participants, many of whom are still in the workforce and/or engaged in hobbies or group activities. Participants reported quantity of verbal interaction on a 6-point scale (less than 30 min, between 30 and 59 min, between 1 and 1.5 hr, between 1.5 and 2 hr, more than 2 hr, or a not applicable option) in each of seven social domains (see Supplementary Table 1). To account for the possibility of substitution or trade-off (e.g., a decrease in time spent interacting with friends if time spent interacting with colleagues increased), a summed total time index score, using either midpoint of time range in a given response or written-in quantity, was created to represent the average number of minutes per week spent verbally interacting with others. To assess qualitative valence of verbal interactions, each of the social domains included a follow-up inquiry on quality. 
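As a rough illustration of the scoring described above, the sketch below sums nine support items into a 0-36 index and converts categorical time responses into a total number of minutes. The category keys, the domain names, the zero value for "not applicable," and the 120-minute fallback for the open-ended top category are all assumptions made for illustration; the questionnaire's actual coding may differ.

```python
# Hypothetical scoring sketch; labels, midpoints, and fallbacks are assumptions.
N_SUPPORT_ITEMS = 9  # nine MOS-derived items, each scored 0 (none) to 4 (all of the time)

# Assumed minutes for each response category. Bounded categories use the
# midpoint of their range. "gt_2hr" has no upper bound, so a written-in
# quantity is preferred; 120 is only a placeholder fallback.
CATEGORY_MINUTES = {
    "lt_30min": 15.0,
    "30_59min": 44.5,
    "1_1.5hr": 75.0,
    "1.5_2hr": 105.0,
    "gt_2hr": 120.0,
    "not_applicable": 0.0,
}


def support_index(item_scores):
    """Sum nine 0-4 item scores into a support index (possible range 0-36)."""
    if len(item_scores) != N_SUPPORT_ITEMS:
        raise ValueError("expected exactly nine support items")
    if any(score not in (0, 1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item must be an integer from 0 to 4")
    return sum(item_scores)


def total_time_index(responses, written_in=None):
    """Sum interaction minutes across domains.

    responses:  dict mapping domain name -> category key
    written_in: optional dict mapping domain name -> minutes written in by
                the participant; a written-in value overrides the category.
    """
    written_in = written_in or {}
    total = 0.0
    for domain, category in responses.items():
        if domain in written_in:
            total += float(written_in[domain])
        else:
            total += CATEGORY_MINUTES[category]
    return total


if __name__ == "__main__":
    print(support_index([4, 3, 4, 2, 3, 4, 4, 1, 3]))
    print(total_time_index(
        {"spouse_partner": "gt_2hr", "friends": "30_59min", "strangers": "lt_30min"},
        written_in={"spouse_partner": 300},
    ))
```

Summing across domains, rather than analyzing each domain alone, is what lets the index absorb trade-offs such as less time with friends offset by more time with colleagues.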
Items on mediated forms of synchronous interaction including phone calls and Skype, as well as asynchronous interactions including instant messaging, text messaging, email, and letter-writing, are also included in the questionnaire. However, because evolving technologies and associated learned skills are an intrinsic and time-dependent part of such communications, and because different brain processes may be involved, the data from those items were considered separately and are not analyzed in the present study.
Survey Administration
The full Perceived Social Support and Verbal Interaction Questionnaire was added to the WRAP assessment packet for Wave 2 and all subsequent visits beginning in 2010. Two subsamples of WRAP participants are represented in the current study, designed to provide reliability data for future analyses incorporating the entirety of the longitudinal WRAP sample. Short-interval reliability of data was assessed in a convenience sample of participants (n = 107) returning for a regularly scheduled study visit in the 5 months following institutional review board (IRB) approval of the new items. These participants completed the new questionnaire as part of their visit and, at that visit, consented to complete the verbal interactions portion of the questionnaire again, via postage-paid mail-out, approximately 6 to 8 weeks later. Long-interval reliability was assessed in a convenience sample of participants (n = 136) with data from two regularly scheduled visits, approximately 2 years apart, available subsequent to the addition of the questionnaire to the WRAP assessment. All study activities were conducted with the approval of the University of Wisconsin Health Sciences IRB, and all subjects provided signed informed consent before participation.
Statistical Analyses
Acceptability and short- and long-interval test–retest reliability were assessed for the novel quantity and quality of verbal interaction variables. Long-interval test–retest reliability of social support items was assessed; reliability for up to 1 year has been established previously (Sherbourne & Stewart, 1991).
Acceptability of the verbal interaction items was assessed by a simple count of missing responses. Reliability was determined using quadratic-weighted kappa coefficients, percent agreement, or Pearson correlation coefficients. Pearson correlations were used to examine agreement for continuous variables (total weekly quantity of interaction time and a social support index score). A quadratic-weighted kappa statistic, which gives partial credit to similar (but not identical) categories, was used to determine reliability of within-domain categorical quantity- and quality-of-interaction variables; strength of agreement for these variables was defined according to Landis and Koch (1977) as poor to fair (0.00-0.40), moderate (0.41-0.60), substantial (0.61-0.80), or almost perfect (≥0.81). Because low prevalence of one or more responses for a given item may lead to lower kappa values even when agreement is high, and this proved a concern for quality-of-interaction variables, we calculated overall percent agreement in addition to a weighted kappa coefficient for those variables.
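To make the agreement statistics concrete, the following is a minimal sketch of a quadratic-weighted kappa and simple percent agreement for ordinal responses coded 0 to k-1. The function names and the integer coding are assumptions for illustration, and the original analysis may have used different software. Because the bands quoted above begin at 0.00, the helper lumps negative values into "poor to fair," which is also an assumption.

```python
def quadratic_weighted_kappa(test, retest, n_categories):
    """Cohen's kappa with quadratic disagreement weights.

    test, retest: equal-length sequences of integer codes in 0..n_categories-1
    Returns NaN when chance-expected disagreement is zero (kappa undefined).
    """
    if len(test) != len(retest) or len(test) == 0:
        raise ValueError("test and retest must be non-empty and equal in length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    k, n = n_categories, len(test)

    # Observed joint proportions of (test, retest) response pairs.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(test, retest):
        observed[a][b] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]                       # test marginals
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]  # retest marginals

    weighted_obs = 0.0  # observed weighted disagreement
    weighted_exp = 0.0  # disagreement expected by chance from the marginals
    for i in range(k):
        for j in range(k):
            weight = ((i - j) ** 2) / ((k - 1) ** 2)  # 0 on the diagonal, 1 at the extremes
            weighted_obs += weight * observed[i][j]
            weighted_exp += weight * row[i] * col[j]

    if weighted_exp == 0.0:
        return float("nan")
    return 1.0 - weighted_obs / weighted_exp


def percent_agreement(test, retest):
    """Proportion of exactly repeated responses."""
    if len(test) != len(retest) or len(test) == 0:
        raise ValueError("test and retest must be non-empty and equal in length")
    return sum(a == b for a, b in zip(test, retest)) / len(test)


def landis_koch_label(kappa):
    """Strength-of-agreement bands as quoted in the text (Landis & Koch, 1977)."""
    if kappa >= 0.81:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    return "poor to fair"


if __name__ == "__main__":
    # One of five participants shifts to an adjacent category at retest.
    test, retest = [0, 1, 2, 3, 4], [1, 1, 2, 3, 4]
    kappa = quadratic_weighted_kappa(test, retest, n_categories=5)
    print(round(kappa, 3), percent_agreement(test, retest), landis_koch_label(kappa))
```

In this toy example, a single one-category shift lowers exact agreement to 80% while the weighted kappa stays near 0.94, which is the "partial credit" behavior described above.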
To assess the contribution of real changes in social network connections to discordance in test–retest data, we utilized demographic and lifestyle data collected in WRAP, and stratified by the presence/absence of one or more major life events (change in marital status, living arrangement, or work status; long-term interval only). Strength of agreement was assessed in these “Stable” and “Change” strata. To simplify presentation while accounting for response prevalence, in these stratified analyses, quadratic-weighted kappa coefficients were calculated for quantity of interaction, and simple percent agreement for quality of interaction.
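The stratification rule can be sketched as follows: a participant falls in the "Change" stratum if any of the three life events occurred between visits, and in the "Stable" stratum otherwise. The record layout and field names are hypothetical and do not reflect WRAP's actual data structure.

```python
from dataclasses import dataclass


@dataclass
class LifeEvents:
    """Hypothetical per-participant flags for events between the two visits."""
    participant_id: str
    marital_status_changed: bool
    living_arrangement_changed: bool
    work_status_changed: bool


def stratum(record):
    """Return "Change" if one or more major life events occurred, else "Stable"."""
    any_event = (
        record.marital_status_changed
        or record.living_arrangement_changed
        or record.work_status_changed
    )
    return "Change" if any_event else "Stable"


def split_by_stratum(records):
    """Group participant IDs by stratum so agreement can be computed per group."""
    groups = {"Stable": [], "Change": []}
    for record in records:
        groups[stratum(record)].append(record.participant_id)
    return groups


if __name__ == "__main__":
    sample = [
        LifeEvents("p01", False, False, False),
        LifeEvents("p02", False, False, True),   # e.g., retired between visits
        LifeEvents("p03", True, True, False),
    ]
    print(split_by_stratum(sample))
```

Agreement statistics would then be computed separately within each returned group.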
Finally, to explore our hypothesized distinction between the two dimensions of social activity (perceived social support and verbal interaction), we used Pearson correlations to examine the association between social support and total interaction quantity. We also combined the short- and long-interval samples and examined the same association after stratifying by age (<65 vs. ≥65 years) to assess the socioemotional selectivity theory in our data set.
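A minimal sketch of the age-stratified correlation, assuming each record is a tuple of age, total interaction minutes, and support index score. The tuple layout, the group labels, and the toy data are illustrative assumptions rather than study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def correlation_by_age_group(records, cutoff=65):
    """Correlate interaction minutes with support index within each age stratum.

    records: iterable of (age, total_minutes, support_index) tuples
    """
    groups = {"under_65": ([], []), "65_and_older": ([], [])}
    for age, minutes, support in records:
        key = "under_65" if age < cutoff else "65_and_older"
        groups[key][0].append(minutes)
        groups[key][1].append(support)
    return {key: pearson_r(xs, ys) for key, (xs, ys) in groups.items()}


if __name__ == "__main__":
    # Toy records only; not study data.
    toy = [
        (52, 600, 20), (58, 900, 27), (61, 1500, 33), (63, 1100, 25),
        (66, 700, 30), (70, 1300, 29), (74, 1000, 31), (68, 400, 28),
    ]
    for group, r in correlation_by_age_group(toy).items():
        print(group, round(r, 2))
```

A weaker correlation in the older stratum than in the younger one would be the pattern predicted by socioemotional selectivity theory.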
Results
Two subsamples of WRAP participants are represented in the current study; sample characteristics for each group can be seen in Table 1. Mean sample ages were 59.0 and 59.5 years. Both samples were majority female and highly educated, with nearly two thirds of each sample possessing at least a bachelor’s degree.
Table 1. Key Sample Characteristics, Short- (8-Week) and Long- (2-Year) Interval Subsamples.
Note. A total of three participants were included in both groups.
Frequency of Response and Acceptability
Distribution of responses (combined sample including both short- and long-interval subsamples, n = 243) within each social domain can be seen in Table 2. The distribution of both quantity and quality responses varied by social domain. Acceptability, measured by the number of missing responses, appears excellent.
Table 2. Percentage Distribution of Response to Quantitative and Qualitative Verbal Interaction Items, by Timepoint.
Only participants (n = 167 at test timepoint and n = 161 at retest timepoint) who reported working outside the home were asked to quantify time spent interacting with colleagues. Those responding that they “don’t have this contact” do work outside the home but do not verbally interact with colleagues.
Sample sizes are variable because only individuals reporting some interaction in a given social domain answered a follow-up quality question.
Quantity of Interaction
Results from reliability analyses are shown in Table 3. For both short and long intervals, weighted kappa values for quantity items indicated moderate to substantial agreement between test and retest points. Only spouse/partner and church/religious meeting domains showed substantial agreement for both intervals, whereas family member and stranger domains demonstrated only moderate repeatability for both intervals. No major differences between short-term and long-term reliability were seen for any domain. Percent agreement, based on number of exactly repeated responses, was generally lower than the quadratic-weighted kappa statistic. Correlation in reported total weekly time spent interacting was reasonably strong for both short (r = .61, p < .001) and long (r = .58, p < .001) intervals.
Table 3. Measures of Agreement for Quantity and Quality of Verbal Interactions Over Short (8-Week) and Long (2-Year) Intervals.
Note. CI = confidence intervals.
Quality of Interaction
Reports of “Unpleasant” interaction were rare in every domain (1.8% of all quality-of-interaction responses, across both samples), and even “Neutral” interactions were rarely reported in select domains such as friends and club meetings; that low prevalence is reflected in the very low quality-of-interaction weighted kappa values. Percent agreement was fairly high for both the short (79%-97%) and long (83%-95%) intervals. Agreement was lowest for interactions with strangers and highest for interactions with friends.
Social Support
Association between social support index scores over the long test–retest interval was strong (r = .78, p < .001). Data were not available to assess the test–retest reliability of social support over the short interval.
Life Events
In a stratified analysis, those participants in the long-interval sample who experienced major life events between visits—change in marital status (7%), living arrangement (7%), or employment status (25%)—showed patterns of test–retest agreement that were distinct from those seen in individuals with relative life stability (Figure 1). For domain-specific quantity of interaction, weighted kappa values were lower among the life change group in all but one domain; the church/religious meeting domain represented a marked exception (Figure 1a). For domain-specific quality of interaction variables, percent agreement was similar between the stable and life change groups across all domains (Figure 1b). However, in our summary index scores for quantity of interaction and social support, key differences reappeared (Figure 1c). In the group experiencing a major life event, test–retest agreement on interaction was relatively lower (r = .19, p = .287) than it was in the stable group (r = .67, p < .001). Social support showed somewhat greater agreement in the stable group.

Figure 1. Agreement for quantity of interaction, quality of interaction, and summary index scores among stable and life change groups.
Interaction and Support
In a combined sample of all test–retest comparisons (n = 243), the correlation between total weekly interaction and social support was statistically significant but somewhat weak (r = .25, p < .001). In a stratified analysis, participants younger than 65 years showed a stronger correlation between verbal interaction and social support (r = .30, p < .001, n = 183) than did participants aged 65 and older (r = .06, p = .67, n = 60).
Discussion
This preliminary study reports the development of a novel instrument designed to assess verbal interaction in adults, as well as the reliability assessment of this instrument and an established Perceived Social Support Questionnaire. Repeatability of verbal interaction quantity was promising in a population of middle-aged and older men and women. Weighted kappa values, which do accord partial credit for agreement when test–retest responses are similar but not identical, were stronger than percent agreement, indicating that test–retest response shifts to nearby categories were common and should be expected. In contrast, the quality of interaction responses showed excellent percent agreement; moves between responses were rare. Consistent with the body of evidence showing that increased age is associated with reduced exposure to negative social interactions (Birditt, Fingerman, & Almeida, 2005; Fingerman, Miller, & Charles, 2008), the prevalence of the “Pleasant” response was often very high, and the “Unpleasant” response nearly absent. Because the response agreement due to chance alone (expected agreement) is high in such a situation, and kappa values decrease as expected agreement increases, the relatively low weighted kappa values were unsurprising (Viera & Garrett, 2005). Reliability of social support data, assessed only for the long interval, was good. Items were taken from the validated Medical Outcomes Study (MOS) questionnaire, so this finding was expected. This analysis does provide evidence that the good social support reliability seen in a study of a 1-year interval in a community sample (Sherbourne & Stewart, 1991) holds for longer intervals in middle-aged to older samples as well.
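The prevalence effect described above can be illustrated with a small worked example. For simplicity, the sketch uses an unweighted two-category kappa (for instance, "Pleasant" vs. any other response) rather than the weighted three-category statistic used in the analysis, and all counts are hypothetical.

```python
def kappa_two_categories(both_a, a_then_b, b_then_a, both_b):
    """Unweighted Cohen's kappa for a 2 x 2 test-retest table.

    both_a:   response A at test and at retest
    a_then_b: A at test, B at retest
    b_then_a: B at test, A at retest
    both_b:   response B at test and at retest
    Returns (observed agreement, expected agreement, kappa).
    """
    n = both_a + a_then_b + b_then_a + both_b
    observed = (both_a + both_b) / n
    p_a_test = (both_a + a_then_b) / n    # prevalence of A at test
    p_a_retest = (both_a + b_then_a) / n  # prevalence of A at retest
    expected = p_a_test * p_a_retest + (1 - p_a_test) * (1 - p_a_retest)
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


if __name__ == "__main__":
    # Hypothetical counts: 100 participants, 94% exact agreement in both tables.
    skewed = kappa_two_categories(94, 3, 3, 0)      # almost everyone answers A
    balanced = kappa_two_categories(47, 3, 3, 47)   # A and B equally common
    print("skewed:   observed=%.2f expected=%.4f kappa=%.2f" % skewed)
    print("balanced: observed=%.2f expected=%.4f kappa=%.2f" % balanced)
```

With identical 94% observed agreement, kappa is about 0.88 when the two responses are equally common but falls slightly below zero (about -0.03) when 97% of responses fall in one category, because expected agreement alone already reaches about 94%.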
A secondary analysis stratified participants in the long-interval sample by the presence or absence of at least one of three major life events during the test–retest interim; its purpose was to separate the contribution of presumably true changes in social network and behavior from that of other time-dependent fluctuations and random error or noise. Other life events that affect network and interaction, such as health events that limit mobility, were likely experienced by some individuals in the stable (no major life events) group, resulting in some misclassification. Even so, our results indicate that there were differences in agreement between the groups; the summary score for overall quantity of interaction illustrates this especially well. Although in most cases agreement was, as expected, lower in the “change” (life event) group, it is interesting to note that interactions at church or religious meetings actually remained most consistent in the change group, perhaps indicating a tendency to adhere strongly to individual religious practices and preferences (whether engaging in or eschewing religious meeting attendance and associated interactions) in the face of major life changes. Although care must be taken when making assumptions of temporal stability in small samples, the finding that interim occurrence of a major life event was associated with time-discordant verbal interaction and social support responses was encouraging in terms of instrument validity.
Finally, our finding that the relationship between reported verbal interaction and perception of social support was somewhat weak (r = .25, p < .001) provided evidence for our hypothesis that the two dimensions of social activity are distinct from one another and should be considered separately when assessing cognitive health benefits. Furthermore, we saw preliminary evidence of socioemotional selectivity in our participants; older participants demonstrated a weaker link between their quantity of social interaction and their perception of available social support than did younger participants, indicating that these social dimensions become even more distinct as individuals age.
This preliminary study has limitations. Most notably, data that would provide the opportunity for a rigorous assessment of convergent and divergent validity are unavailable for this sample. Our participants report spending an average of approximately 20 hr per week engaged in active, face-to-face conversation with other people, but the lack of similar instrumentation in other samples makes it difficult to know what “typical” weekly interactions might have been expected. In addition, although our subsamples are representative of the WRAP sample as a whole, the WRAP sample may not be representative of broader aging populations. Although WRAP enrollment is currently focused on underrepresented ethnic and socioeconomic groups, the current sample is mostly White, education levels are notably high, and women are overrepresented. Additionally, although neuropsychological tests revealed no cognitive impairment in our study subsamples, the WRAP sample as a whole is enriched for a family history of AD and, as such, is at risk of earlier cognitive change and incident dementia compared with the general population. Furthermore, sample sizes in stratified analyses were small, giving rise to wide confidence intervals. Finally, our ability to distinguish discordance due to poor recall or random error from discordance due to real changes in lifestyle and social networks was limited. Certainly self-reported verbal interaction data, as a proxy for gold standard measures such as short-term diary keeping or even automated audio recording, are potentially vulnerable to recall bias and to measurement error generally; this vulnerability must be considered particularly carefully in studies of cognitive impairment. However, maximizing recruitment, retention, and generalizability requires minimizing participant burden, and many epidemiological studies must rely on self-report.
In the absence of similar existing instruments, we believe that our verbal interaction measure, which demonstrated excellent acceptability, provides an important contribution to research on social networks and health; the benefits of casual conversation may be seen by providers and by patients alike as a particularly modifiable aspect of social networks, and an accessible form of activity regardless of age or ability. In consideration of a plausible and important role for social interaction in later life health trajectories, we believe these early findings can inform a more nuanced understanding and analytic strategy in the study of psychosocial and sociobehavioral factors in healthy aging.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the University of Wisconsin–Madison Graduate School and a Ruth L. Kirschstein National Research Service Award (T32HD049302) from the National Institute of Child Health and Human Development. The Wisconsin Registry for Alzheimer’s Prevention (WRAP) program is funded by the Helen Bader Foundation, Northwestern Mutual Foundation, Extendicare Foundation, Clinical and Translational Science Award (CTSA) program through the National Institutes of Health (NIH) National Center for Advancing Translational Sciences (NCATS) Grant UL1-TR000427, and National Institute on Aging Grant 5R01-AG27161-2 (WRAP: Biomarkers of Preclinical AD).
