Abstract
The growing number of high–functioning adults seeking accommodations from testing agencies and postsecondary institutions presents an urgent need to ensure reliable and valid diagnostic decision making. The potential for this population to make significant contributions to society will be greater if we provide the learning and testing accommodations to allow them access to knowledge, as well as the means to demonstrate their extraordinary abilities. The criteria and decision making used to identify high–functioning adults with learning disabilities (LD) must be robust yet flexible enough to account for individual differences, measurement fallibility, and examiner expertise. The purpose of this article is to explore legal, measurement, and clinical issues surrounding the provision of accommodations to high–functioning individuals with LD.
The definition of disability in the Americans with Disabilities Act of 1990 (ADA) was imported directly from Section 504 of the Rehabilitation Act of 1973. At that time, the legislative focus was primarily on individuals with sensory and physical impairments; adults with hidden disabilities such as learning disabilities (LD) were not at the forefront of the thinking of disability advocates and policy makers (Mather, Gregg, & Simon, 2005). Because fewer adults identified with LD were attending colleges and universities during that era, there was less concern over accommodating learning needs. Today, however, individuals with LD represent the largest group of students with disabilities attending postsecondary colleges and universities (Ward & Berry, 2005). This population also presents the greatest number of requests for accommodations on postsecondary entrance and licensure examinations (Brinckerhoff & Banerjee, 2007; Lindstrom & Gregg, 2007). It is therefore understandable that the identification of markers to operationalize such legal terminology as “substantial limitations” or “most people” is receiving significant attention.
Requests for accommodations by high–functioning adults with LD appear to present the most challenges to the educational and legal system. Differences of opinion as to what constitutes a substantial limitation to learning are central to the debate surrounding the educational and legal appropriateness of providing these adults access to accommodations. One view is that only students with severe academic deficits should be provided accommodations for learning (Flanagan & Mascolo, 2005; Gordon, Lewandowski, & Keiser, 1999; Lovett & Lewandowski, 2006). A second view suggests that no single test score or aspect of an individual's history should be used to determine a diagnosis. What is essential for the accurate documentation of a disability is differential diagnosis that explores ways in which the disability interferes with performance on learning tasks (Gregg, Coleman, Davis, Lindstrom, & Hartwig, 2006; Kamphaus, 2005; Mather & Gregg, 2006; Shaywitz, 2003).
Defining High Functioning
The term “gifted” describes individuals demonstrating one or more of the following characteristics: high capability in academic or intellectual areas, creativity, artistic talent, and strong leadership capacity (U.S. Department of Education, 1993). Unfortunately, there is little agreement among professionals on how best to operationalize such a term when an individual also demonstrates LD (Baum, 1990; Bradley, Danielson, & Hallahan, 2002; Brody & Mills, 1997; Lovett & Lewandowski, 2006; McCoach, Kehle, Bray, & Siegle, 2001; Nielsen, 2002; Silverman, 2002). Therefore, we have chosen to use the term high–functioning adults with LD (HFLD) rather than “above average,” “gifted,” or “talented” in order to avoid the ambiguity and controversy associated with such terms (McCoach et al., 2001). Because the focus of our discussion (as well as that of controversy among the courts, testing agencies, and postsecondary institutions) relates to individuals with strong cognitive abilities who demonstrate significant problems in learning, this term seems most appropriate.
Most definitions of LD hinge on performance in one or more specific academic areas that is significantly and unexpectedly poor in comparison to the individual's crystallized or fluid processing abilities. The underachievement should persist over time and not be due to lack of instruction, health impairment, or other peripheral factors. This pattern applies to all LD, but the high–functioning LD profile may look somewhat different, psychometrically, from other LD profiles. Imagine, for example, two adults (Anna and Beth) who demonstrate a large discrepancy (two standard deviations) between estimated verbal reasoning and basic reading skills. In each case the discrepancy was identified in childhood, has persisted over time despite remediation attempts, is attributable to documented deficits in phonemic awareness, and causes daily problems with academic and professional tasks. During her most recent evaluation, Anna scored in the average range (∼60th percentile) on the Verbal Comprehension Index of the Wechsler Adult Intelligence Scale–III (WAIS–III; Wechsler, 1997a) and well below the average range on tests of basic reading skills (∼5th percentile). During a comparable evaluation, Beth's respective scores were well above the average range (verbal comprehension ∼92nd percentile) and at the lower end of the average range (reading ∼30th percentile). Beth's case exemplifies a high–functioning LD profile. Because her reading scores are in the average range, however, some professionals may determine that she is not eligible for accommodations.
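The size of the discrepancy in the Anna and Beth example can be checked with a short calculation. The following is a minimal sketch, assuming that scores are normally distributed and that the approximate percentiles quoted above are exact; the function names are hypothetical and are not drawn from any test manual or scoring program.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1 (normality is an assumption)


def percentile_to_z(percentile: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to a z-score."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def discrepancy_in_sd(ability_percentile: float, achievement_percentile: float) -> float:
    """Ability minus achievement, expressed in standard deviation units."""
    return percentile_to_z(ability_percentile) - percentile_to_z(achievement_percentile)


# Approximate percentiles taken from the Anna and Beth example.
anna_gap = discrepancy_in_sd(ability_percentile=60, achievement_percentile=5)
beth_gap = discrepancy_in_sd(ability_percentile=92, achievement_percentile=30)

print(f"Anna: {anna_gap:.2f} SD")
print(f"Beth: {beth_gap:.2f} SD")
```

Under these assumptions both discrepancies come out at roughly 1.9 standard deviations. The two profiles are therefore nearly identical in magnitude, even though only Anna's reading score falls below the 16th percentile.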
Legal Issues Influencing Clinical Decision Making
The legal debate presented in the majority of court cases surrounding HFLD centers around whether an individual is entitled to reasonable accommodations under the ADA. Entitlement to accommodations requires individuals to demonstrate proof that they are qualified, and sufficiently limited, to be considered “disabled.” Under the ADA, a disability is defined as “… a substantial limitation to one or more major life activities of that person.” In turn, “substantial limitation” is defined as an inability or a significant restriction in the condition, manner or duration in which one performs a major life activity “as compared to most people” (28 C.F.R. Pt. 35, App. A § 35.104; italics added for emphasis). Legal interpretations of the term substantial limitations have differed in recent cases involving the adult population with LD.
Some courts have adopted an operationalization of substantial limitation in learning that is literally based on derived achievement scores (using the bell curve as the metric) that fall below the average range of the statistical mean (i.e., below the 16th percentile). Support from the courts for this argument has been documented in several cases: Gonzales v. National Board of Medical Examiners, 2000; Love v. Law School Admission Council, Inc., 2005; Price, Singleton, & Morris v. National Board of Medical Examiners, 1997; and Wong v. Regents of the University of California, 2004. Other courts have operationalized substantial limitation for this group of adults as the documentation of cognitive and linguistic integrities that, in the presence of processing deficits, result in unexpected learning failure (see Kavale, Kaufman, Naglieri, & Hale, 2005; Mather et al., 2006). Proponents of the second argument interpret the ADA as not requiring low performance or a poor outcome as the sole determiner of a disability. Rather, they suggest that the ADA requires a comprehensive assessment of the effect of a putative impairment on an adult's life. The focus is on the condition, manner, or duration in which one performs a major life activity: not whether one can perform it, but how one performs (Mather et al., 2005). Support from the courts for this argument has been documented in several cases: Albertson's, Inc. v. Kirkingburg, 1999; Bartlett v. New York State Board of Law Examiners, 1997/1998/1999/2000/2001; Murphy v. United Parcel Service, 1999; and Sutton v. United Airlines, 1999.
Measurement Issues Influencing Clinical Decision Making
Basic measurement principles guide the professional decision making that is essential to the identification of disabilities. Lack of sensitivity to these principles has led to significant variability across identification practices, causing many HFLD to be denied access to accommodations.
First, it is imperative to underscore that evidence of validity is critical to specificity and reliability for all clinical decision making. Each of the practices we will discuss is dependent on the examination of strong validity evidence. Similar to the view stressed in the 1999 Standards, Messick (1995) asserts that validity is actually a unitary concept and that there are not different types of validity, only different types of validity evidence (see also Benson, 1998). Most forms of score validity are subsumed under the concept of construct validity, which concerns whether scores measure the hypothetical construct the researcher (and/or test developer) believes they do. There is not a single, definitive test of construct validity, nor is it typically established in a single study (Kline, 2005). The Standards specify five “sources of validity evidence… that might be used in evaluating a proposed interpretation of test scores for particular purposes” (p. 11). These sources are evidence based on (a) test content (subtest and factor), (b) response processes, (c) internal structure (measurement invariance), (d) relations to other variables (concurrent), and (e) consequences of testing.
As a means of organizing our thinking about specific measurement practices influencing the diagnosis of HFLD, we have identified five specific assessment constructs to discuss. These concepts include (but are not limited to) norming stratification, subtest score interpretation, factor score interpretation, confidence intervals (CI), and measurement invariance. Careful consideration of how these measurement constructs influence score distributions for high–functioning individuals is critical to equitable and ethical practice.
Norm Stratification
Standardization sampling allows evaluators to compare an individual's performance to that of a normative group. Most test developers use stratification variables to help them select a representative sample of the national population to use as the comparison group. However, it is essential for clinicians to remain cognizant that test norms embody only a sample of performance for a specified group of individuals identified by a test developer. The stratification variables chosen by test developers differ in number, type, and sample size. A clinician must carefully consider how similar the norming sample demographics are to those of the individual being tested. In addition, when an examiner decides to compare the performance of an individual across different measures that were not co–normed, it is important to remember that significant error is introduced into the comparison process because the tests were normed on entirely different samples of individuals (Cicchetti, 1994; Kamphaus, 2005).
The ADA (or 504) specifies that the “average abilities of most people” must be used as the benchmark for differentiating what are functional limitations. While a great deal of attention is placed on the normality of the score performance of the individual being tested, little attention has been given to how representative the distribution of specific test scores is of the target population. The stratification sampling done for a test does not ensure that the average person is captured; it is simply a means of reflecting estimated percentages of subpopulations within a given society. The U.S. Census data is more representative of most people. As an example, in Tables 1 and 2, several cognitive and achievement tests commonly used in the diagnosis of LD are listed with their norming samples across specific ages. In addition, Table 1 provides the percentage of the normal population these test norms represent; as an important reminder, such samples denote only minuscule segments of their target populations (i.e., “most people”) in the United States. Clinicians must be mindful of the small sample sizes, restricted stratification variables, and measurement error of test scores when interpreting the performance of high–functioning adults on psychometric assessment tools.
Table 1. Norm Sample Sizes for Adult Intelligence Measures
Note. Data based on the Technical Manual for each measure.
U.S. population data based on 2000 Census (American Fact Finder Web site).
Wechsler Adult Intelligence Scale—3rd Edition, Full Scale Intelligence Quotient.
Reynolds Intellectual Assessment Scales, Composite Intelligence Index.
Woodcock–Johnson Tests of Cognitive Ability—3rd Edition, General Intellectual Ability Cluster (Standard Battery).
Table 2. Norm Sample Sizes for Adult Achievement Measures
Note. Normative data based on the Technical Manual for each measure.
Gray Oral Reading Test—4th Edition.
Nelson–Denny Reading Test.
Woodcock–Johnson Tests of Achievement—3rd Edition, Total Achievement.
Subtest Score Interpretation
Unfortunately, examination of isolated subtest scores is often the primary source of evidence upon which many professionals base clinical decisions in the diagnosis of HFLD. Decisions to provide or deny accommodations often rest “… on the clinician's acumen and not on any sound research base” (Kamphaus, 2001, p. 598). Historically, the examination of subtest scatter has been advocated by many professionals interested in gifted students with LD (Brody & Mills, 1997; Nielsen, 2002; Silverman, 2003). Such a practice is based on ipsative interpretation, that is, a comparison of the individual to himself or herself rather than to the normative sample. In contrast to the norm–based interpretation, the ipsative approach uses the individual as the standard of reference (Kamphaus, 2005). Through this practice, an examiner investigates the magnitude of an individual's subtest score differences on cognitive and/or achievement measures. As Lovett and Lewandowski (2006) point out, this practice is questionable due to the high incidence of extreme scatter and measurement error across assessment tools. We know that extreme scatter can be significant but not diagnostically useful (Glutting, McDermott, Watkins, Kush, & Konold, 1997). In addition, researchers have found that the scaled score range among subtests increases as intelligence scores increase and that subtest scatter increases as the value of the highest subtest score rises (Detterman & Daniel, 1989; Patchett & Stansfield, 1992; Schinka, Vanderploeg, & Curtiss, 1997).
Subtest interpretation is a frequently used method for arguing that examinees are not functionally limited in learning if their scores on any one or two subareas of an academic domain (e.g., reading fluency, reading comprehension) fall within the average range. Failure to integrate single subtest scores with other data (e.g., test scores from other measures, behavioral observations, background information), in combination with insufficient knowledge of research–based evidence and theory, leads many clinicians to make inappropriate decisions that result in denied access to accommodations for HFLD. Take, for example, the case of Jovita, a 35–year–old African American female who scored at the 96th percentile on the Verbal Intelligence Index (VIX) of the Reynolds Intellectual Assessment Scales (RIAS; Reynolds & Kamphaus, 2002). Her reading scores were markedly lower: the 6th percentile on the Woodcock–Johnson III Tests of Achievement, Word Attack (WJ III: Woodcock, McGrew, & Mather, 2001); the 15th percentile on the WJ III Letter/Word Identification subtest; the 20th percentile on the Nelson–Denny Reading Comprehension Test (Brown, Fisco, & Hanna, 1993); and the 25th percentile on the WJ III Reading Fluency subtest. On the basis that her scores on the latter two measures were average (>16th percentile), she was denied the accommodation of extended time on a high–stakes graduate entrance examination. It is difficult to view such a judgment as being equitable, justifiable, or within the spirit of the ADA.
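Jovita's profile can be laid out numerically to show what a single–score cutoff rule leaves out. The sketch below is illustrative only: it assumes normally distributed scores, treats the reported percentiles as exact, and compares scores across measures that were not co–normed, which introduces additional error. The variable names are hypothetical, and the calculation is not a diagnostic procedure.

```python
from statistics import NormalDist

CUTOFF_PERCENTILE = 16  # the "below average" threshold discussed in the text
STANDARD_NORMAL = NormalDist()

# Percentiles reported for Jovita in the example above.
verbal_percentile = 96  # RIAS Verbal Intelligence Index
reading_percentiles = {
    "WJ III Word Attack": 6,
    "WJ III Letter/Word Identification": 15,
    "Nelson-Denny Reading Comprehension": 20,
    "WJ III Reading Fluency": 25,
}


def to_z(percentile: float) -> float:
    """Convert a percentile rank to a z-score under a normal model."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


# Scores that fall below the cutoff (what a cutoff rule looks at).
below_cutoff = [name for name, p in reading_percentiles.items() if p < CUTOFF_PERCENTILE]

# Gap between verbal ability and each reading score, in SD units
# (what a cutoff rule ignores).
gaps_in_sd = {name: to_z(verbal_percentile) - to_z(p) for name, p in reading_percentiles.items()}

print("Below cutoff:", below_cutoff)
for name, gap in gaps_in_sd.items():
    print(f"{name}: {gap:.2f} SD below verbal ability")
```

Under these assumptions, two of the four reading scores fall below the cutoff, and every reading score, including the two "average" ones, sits more than two standard deviations below her verbal ability. A denial based on the two highest scores alone discards most of this information.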
Factor Score Interpretation
Factor (index) scores provide a more reliable and valid measure of abilities than do subtest scores. Concern over the validity of using full–scale scores over factor–scale scores is critical to any discussion of HFLD (see Glutting, Watkins, Konold, & McDermott, 2006). In a recent study by Bowden et al. (in press), the WAIS/WMS–III performance of three groups of adults was investigated (norm, college LD, college Attention Deficit Hyperactivity Disorder [ADHD]). When the index score means (Verbal Comprehension [VC], Perceptual Organization [PO], Working Memory [WM], and Processing Speed [PS]) were examined, highly significant differences were observed within and between groups. In the group with LD, expressed in terms of Cohen's
We agree with Elwood (1993) that “significance alone does not reflect the size of the group differences, nor does it imply the test can discriminate subjects with sufficient accuracy for clinical use” (p. 409). At the same time, one should not ignore these differences across factor scores. As can be seen in Table 3, which includes data from three populations of adults (i.e., college students with LD, norm sample of WAIS–III, and college students without LD), the pattern of lower performance on the working memory and processing speed factors was observed only in the population with LD. As Bowden et al. (in press) found, most people (regardless of educational attainment) do not appear to demonstrate significant factor score differences.
Table 3. Comparison of WAIS–III Index Scores Across Three Adult Samples
Note. WAIS–III = Wechsler Adult Intelligence Scale—Third Edition; VCI = Verbal Comprehension Index (Wechsler, 1981).
POI = Perceptual Organization Index; WMI = Working Memory Index; PSI = Processing Speed Index.
Table 4 reflects our investigation of whether this same pattern (i.e., low working memory and processing speed index scores) was present across adults with LD representing different cognitive ability levels (high functioning and average ability). Despite the influence of regression to the mean, the two groups of adults with LD exhibited a similar pattern of lower working memory and processing speed scores. Although these profiles are not specific enough for the purpose of independent differential diagnosis, they do provide a sensitivity to the population. We suggest that examiners follow Kamphaus’ (2005) advice to always use “at least two pieces of corroborating evidence for each test interpretation made. Such a standard forces the examiner to carefully consider other findings and information prior to offering conclusion” (p. 35). A high–functioning adult's performance across cognitive and achievement factor scores represents one form of evidence that can be used to support diagnostic decisions. Utilization of multiple sources of evidence is equally important when a professional is in the position of denying an individual access to accommodations.
Table 4. WAIS–III Index Scores for Average and High–Functioning Adults with Learning Disabilities
Note. WAIS–III = Wechsler Adult Intelligence Scale—3rd Edition; FSIQ = Full Scale Intelligence Quotient; VCI = Verbal Comprehension Index; POI = Perceptual Organization Index; WMI = Working Memory Index; PSI = Processing Speed Index.
Confidence Intervals
The popularity of subtest selection methods has contributed to an environment in which professionals focus primarily on single subtest scores and give little attention to the standard error of measurement or confidence intervals. When cutoff methods are applied to determine the extent or severity of an adult's functional limitation in learning, the standard of the 16th percentile is often seen as a single demarcation point when, given the error implicit in assessment, a range of scores would be more appropriate. We strongly suggest the use of CI as a means to establish the probable presence or absence of functional limitations for the population of high–functioning adults.
CI reflect the probability that a particular score represents a true score. According to the 2001 APA Publication Manual, CI are “in general, the best reporting strategy. The use of confidence intervals is therefore strongly recommended” (p. 22). While related to standard error (SE), CI also include the critical value of the relevant test statistic (see Cumming & Finch, 2005 and Thompson, 2006 for in–depth discussions). However, Thompson warns clinicians not to “… fall in love with your point estimate, at least when SE is large” (p. 205). Accordingly, wider CI imply less precise estimates of performance. Fortunately, modern software for most advanced assessment batteries computes CI for an examiner.
Three points related to CI are important in the diagnosis of HFLD. First, the CI (rather than a single standard score) should be the data used for diagnostic decision making as they represent the “true score” as opposed to the observed score. Second, it is important to keep in mind that intervals can overlap and still be significantly different. Further, intervals that do not overlap are always significant, but not necessarily diagnostic (Thompson, 2006). Third, very wide CI might indicate that either the construct being measured or the particular measurement instrument is neither reliable nor valid. In such a situation, additional evidence will be required to determine an adult's actual proficiency on either achievement or cognitive tasks.
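To illustrate why a range of scores is more defensible than a single demarcation point, a confidence interval can be built around an observed standard score from the standard error of measurement. The following is a minimal sketch using the classical test theory formula SEM = SD × √(1 − reliability). The observed score of 88 and the reliability of .90 are hypothetical values chosen for illustration, and the interval is centered on the observed score, whereas some test manuals center it on an estimated true score.

```python
from math import sqrt
from statistics import NormalDist

SCALE_MEAN = 100.0  # conventional standard-score metric
SCALE_SD = 15.0


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - reliability)


def confidence_interval(observed: float, sem: float, level: float = 0.95) -> tuple[float, float]:
    """Symmetric interval around the observed score at the given confidence level."""
    critical_z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return observed - critical_z * sem, observed + critical_z * sem


# Hypothetical values, not taken from any test manual.
observed_score = 88.0
reliability = 0.90

sem = standard_error_of_measurement(SCALE_SD, reliability)
lower, upper = confidence_interval(observed_score, sem)

# Standard score corresponding to the 16th percentile under a normal model.
cutoff_score = NormalDist(SCALE_MEAN, SCALE_SD).inv_cdf(0.16)

print(f"SEM = {sem:.2f}")
print(f"95% CI = [{lower:.1f}, {upper:.1f}]")
print(f"16th percentile cutoff = {cutoff_score:.1f}")
print("Interval spans the cutoff:", lower < cutoff_score < upper)
```

With these hypothetical values the interval extends on both sides of the 16th percentile cutoff, so the observed score alone cannot establish whether the examinee's true performance falls above or below it.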
Measurement Invariance
Most widely used cognitive and achievement measures are developed based upon theory and factor–analytic studies (Kamphaus, 2005). Professionals must retain a healthy degree of skepticism for the idea that the inferences drawn from test batteries are equivalent across populations with and without disabilities. Accurate identification of LD requires explicit evaluation of the assumption of measurement invariance. This issue is of particular importance in view of the debate regarding the most appropriate methods for operationalizing the definition of HFLD. Examination of measurement equivalence provides a direct test of the hypothesis that the same set of latent variables underlies a set of test scores in different groups and that the metric relationships between observed scores and the corresponding latent variables are the same. Unfortunately, the lack of data related to measurement invariance in the assessment of the adult population with LD has led to untested assumptions regarding cognitive and achievement abilities (Bowden, Cook, Bardenhagen, Shores, & Carstairs, 2004; Vandenberg & Lance, 2000).
Score comparability ensures that the meaning and interpretation of the test score is the same for all groups of students (Pomplun & Omar, 2001). However, despite its common use in the Standards for Educational and Psychological Testing (AERA/APA/NCME, 1999, Standard 10.11) and elsewhere, the term “score comparability” is not defined anywhere in the standards. The lack of a clear definition has led many researchers to conceptualize it within the limited framework of differences in mean scores across groups. In fact, for many years, researchers assumed that to appropriately measure these differences one had only to administer a measure across different testing situations and/or different groups and compute the difference between the two (or more) observed scores (Cronbach & Furby, 1970). However, a number of potentially problematic issues in the use of difference scores have been identified (see Cronbach, 1992; Edwards, 1994). To obtain an adequate assessment of these differences when comparing mean scores across groups, it is essential that the measure is perceived to be used in the same way by the examinees. In other words, in order to make valid comparisons across groups of respondents it is necessary to show that the two measurements are psychometrically equivalent (Horn & McArdle, 1992).
Eligibility Methods for Determining Substantial Limitations
Under the ADA, as mentioned earlier, a disability refers to a substantial limitation to a major life activity. For individuals with LD, that activity is learning. The substantial limitation is defined as an inability or restriction in the condition, manner, and duration of learning. Operationalizing this definition of LD requires valid and reliable eligibility criteria. The two most common practices applied by professionals in identifying functional limitations have been (a) examination of discrepancies across performance and (b) cutoff standards. These practices, if used with little other evidence, will lead to professional decisions that either over– or underidentify individuals differing in demographic variables (Brackett & McPherson, 1996; Gregg, 1999; Hoy et al., 1996).
Establishing valid eligibility criteria for HFLD is dependent primarily upon the recognition by professionals that the condition is a developmental disorder. Barkley (2006) provides a scholarly discussion of the adult population with ADHD that has direct implications for individuals with LD. The symptoms demonstrated by an adult with ADHD or LD are, by definition, viewed as a developmental disorder as compared to age–expected behaviors and thus cause individuals to be substantially limited in a major life activity. Barkley concludes his discussion with two relevant points: (a) the symptoms are not “static pathological states or absolute deficits in or less of formerly typical functioning” (p. 265) and (b) the symptoms must be determined by “age–relative thresholds” (p. 265). The idea that we apply fixed symptom thresholds across the life span leads to invalid criteria for the adult population. The importance of looking at how cognitive, language, and achievement abilities change and influence each other differently across the life span is critical to reliable and valid diagnostic decision making appropriate to the adult population (see Gregg, in press, for an in–depth discussion).
Discrepancy Methods
Examining discrepancies across performance areas is common practice and can help guide decision making. By definition, an individual with a disability is substantially limited in a major life activity compared to the average person. Discrepancy scores are a critical component of eligibility criteria for the adult population, but they should never be the sole criterion for eligibility decisions because they are sensitive but not specific enough for diagnosis. Unfortunately, a discrepancy between a full–scale intelligence score (i.e., IQ) and an achievement subtest score became the hallmark of LD diagnosis for many years with little consideration of other diagnostic factors (Mather & Gregg, 2006). We are not going to discuss in depth the problems with ability/achievement discrepancy formulas because a great deal of literature is available pertaining to their limitations (e.g., Berninger, 2001; Fletcher et al., 2001; Fuchs, Mock, Morgan, & Young, 2003; Gregg et al., 1999; Kavale, Kaufman, Naglieri, & Hale, 2005; Mather et al., 2006). However, what we can conclude from this research is that diagnostic decisions based solely upon an ability–achievement discrepancy formula are not sensitive to the significant shared variance across ability and achievement measures. The ability–achievement discrepancy method treats the full–scale intelligence factor as the predictor of achievement when rarely are items so distinct across these constructs (Anastasi, 1988; Gregg et al., 1999; Kaufman, 1990).
The theoretical distinction between ability and achievement measures is being challenged by some researchers because the correlations between the two are usually very strong. For instance, Kamphaus (2005) states: “I treat intelligence and achievement tests as measures of differing types of achievements that have reciprocal influences upon one another” (p. 475). However, as noted by Vandenburg and Vogler (1985), high correlations between subtests do not necessarily mean that psychometric tasks measure the same construct. The recognition by professionals of the multiplicative effects of knowledge, achievement, and cognitive processing on tasks measuring performance is critical to decision making. For instance, Gregg et al. (2005) raised the question of whether the dependence of college students with LD on such cognitive and language processes as working memory, processing speed, and phonemic awareness is the result of deficits in crystallized knowledge and underlying verbal abilities or whether cognitive and language processes led to deficits in crystallized knowledge.
Intra–cognitive and intra–achievement discrepancy methods are used to investigate an individual's discrepancies across cognitive abilities and across achievement measures. Such within–person discrepancies share some of the same problems as ability–achievement discrepancies if used as the sole criterion for disability determination (Brackett & McPherson, 1996; Hoy et al., 1996; Mather et al., 2005). Scatter across cognitive and/or achievement measures does not define LD any more than a discrepancy between ability and achievement is specific to the diagnosis. In addition, recognizing the influence of the age of the person (adult rather than child), educational attainment, and task demands is critical to diagnostic decision making. Not all scatter is relevant when age, task, and educational experiences are considered. However, significant scatter across cognitive and achievement measures requires further investigation. As Mather and Gregg (2006) note, “Researchers at times focus on isolated abilities (e.g., phonological awareness, working memory, reading decoding, spelling) while ignoring the total system surrounding learning” (p. 102).
Yet, all definitions of LD characterize the condition as a disorder within the basic psychological processes that functionally limit specific types of academic performance. As Kavale et al. (2005) note, it is important to align the definitions with the methods used to identify these individuals. Recognition that an adult with LD demonstrates deficits with specific cognitive and linguistic processes that significantly contribute to functional limitations with different types of learning (e.g., reading, writing, mathematics) differentiates this group from adults with low literacy but no disabilities. While some critics raise the concern that one cannot differentiate between low achievement due to LD and low literacy, recent research is challenging this perspective (Bowden et al., in press; Gregg, in press; Kavale, Fuchs, & Scruggs, 1994; Mather et al., 2005). As with the ability–achievement discrepancy formula, researchers found that neither the presence nor the absence of cognitive and/or achievement discrepancies can be the sole method of valid decision making in the identification of LD (Brackett & McPherson, 1996; Hoy et al., 1996; Mather et al., 2005).
Cutoff Methods
Cutoff methods define LD as significant underachievement where academic performance must fall below the 16th percentile (Dombrowski, Kamphaus, & Reynolds, 2004; Gordon et al., 1999; Siegle, 1990). This approach results in a high false positive rate for low–achieving adults and a high false negative rate for high–achieving individuals (Brackett & McPherson, 1996; Hoy et al., 1996). Unfortunately, by defining LD as simply low achievement in the form of test scores below the 16th percentile, the condition becomes confounded with a myriad of reasons for low academic performance that are separate from LD (e.g., malingering, lack of instruction, poor school attendance, affective factors, or health impairments). For instance, Hoy et al. (1996) found that, if one simply uses a cutoff criterion (<16th percentile), approximately 90 percent of adults referred to Vocational Rehabilitation Services for LD testing would qualify for services. By contrast, based on an integrated model, only 70 percent of these adults would have met the criteria. Cutoff methods equate underachievement with performance below the 16th percentile and identify this profile as a disability without considering the possible cause(s) for the low literacy.
An adult's past and current experiences with literacy through exposure to print at home and with school discourse have a significant influence on test performance that can mask HFLD (Clay, 1985; Mason, 1992; Wilkinson & Silliman, 2000). As noted earlier, educational attainment influences performance on cognitive and achievement measures. While some HFLD score above the 16th percentile on select tests of reading, this is not sufficient evidence that they do not struggle with literacy tasks. Most people do not struggle to recognize words to which they have had extensive exposure. Most people are not exhausted by the process of reading and writing. Most people do not need extended time for reading and writing. HFLD are not less capable than most people, but they do continue to struggle with literacy tasks as compared to their peers. In other words, the condition makes learning difficult, but it does not prevent these adults from learning and performing academic tasks when provided appropriate accommodations.
Through years of special tutoring and persistence some individuals with LD can improve enough academically to break the 16th percentile barrier. However, this does not mean that a disability is no longer present—it only means that the person has managed to achieve a certain level of academic success on some measure(s). The context, task demands, and format of a test can influence an individual's performance (e.g., use of a multiple–choice format might lead to an overestimate of functional skill). Researchers have provided evidence that LD is not the same as underachievement (Kavale et al., 2005; Kavale & Forness, 2003; Scruggs & Mastropieri, 2002); neither the presence nor absence of underachievement is the sole discriminator for LD.
Integrated Method
The integrated method requires clinicians to integrate (a) quantitative data (standardized and criterion), (b) observational data, (c) background information, and (d) research–based evidence into the decision–making process. Researchers supporting such methods emphasize that a single score or formula (e.g., ability–achievement discrepancy) is not a valid means of identifying LD (Mather & Gregg, 2006; Kamphaus, 2005; Flanagan & Mascolo, 2005). The following examples of integrated methods highlight the importance of using multiple sources of information in decision making. For instance, in addition to norm–referenced measures, all of the researchers advocating an integrated method point out the need to consider the adult's developmental and instructional history, medical and psychological history, and family and environmental factors. As Mather and Gregg (2006) note: “Tests results can be an aid to judgment, but they should not be a substitute” (p. 100).
However, the issue of defining functional limitations differs across these integrated models. Kamphaus (2005) considers that a functional academic limitation occurs when “scores that are in the extreme 10th percentile (approximately) of the population (standard score of 80 or less and 120 or more where mean = 100 and SD = 15) indicate a deviant level where functional impairment or strength is probably present… ” (p. 480). In other words, he recommends a specific threshold (standard scores of 80 or below) for identification of a disability. Although Kamphaus labels this process “therapeutic testing,” it is simply a variation of previous cutoff methods (Dombrowski, Kamphaus, & Reynolds, 2004). No measurement issues or evidence–based research with the adult population are discussed as factors to consider. While Kamphaus (2005) encourages reliance on factor indices rather than subtests, functional limitations appear to hinge on single subtest selection based on achievement deficits.
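To make the percentile and standard–score thresholds discussed above easier to compare, the following minimal sketch converts between the two metrics. It assumes that scores are normally distributed with mean = 100 and SD = 15 (the metric given in the Kamphaus quotation); the function names are illustrative only and are not drawn from any test manual.

```python
from statistics import NormalDist

# Assumption: standard scores follow a normal distribution, mean 100, SD 15.
SCALE = NormalDist(mu=100, sigma=15)


def percentile_of(standard_score: float) -> float:
    """Percent of the reference population scoring at or below this score."""
    return 100 * SCALE.cdf(standard_score)


def score_at(percentile: float) -> float:
    """Standard score that sits at the given percentile (0 < percentile < 100)."""
    return SCALE.inv_cdf(percentile / 100)


# Compare the two thresholds mentioned in the text.
print(f"Standard score 80 -> percentile {percentile_of(80):.1f}")
print(f"Standard score 85 -> percentile {percentile_of(85):.1f}")
print(f"16th percentile   -> standard score {score_at(16):.1f}")
```

Under this assumption, a standard score of 80 falls near the 9th percentile, while the 16th percentile used by cutoff methods corresponds to a standard score of about 85, that is, one SD below the mean.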
Flanagan, Keiser, Bernier, and Ortiz (2003) and more recently Flanagan and Mascolo (2005) proposed an integrated model to help operationalize the definition of LD across the life span. While Flanagan et al. encourage investigation of the academic and cognitive processing relationships required for learning, their definition of functional limitations is similar to that of Kamphaus (2005). In their model, LD, regardless of age, educational attainment, and/or ethnic or cultural experiences, is synonymous with academic underachievement. Hoy et al. (1996) and Brackett and McPherson (1996) also advocated using an integrated model. They suggest, however, that the functional limitation be based on clinical judgment where gender, severity of disability, ethnicity, age, motivation, experience, correlations between achievement and ability measures, and the validity of the tests are factored into diagnostic decision making. Because the definition of functional limitations was not operationalized, replication is very difficult.
Eligibility Criteria for High–Functioning Adults
None of the current eligibility methods (i.e., discrepancy, cutoff, integrated) for identifying LD within the adult population, particularly for the high–functioning range, provides professionals a reliable and valid means of operationalizing the definition of LD for the purposes of diagnostic decision making. If high–functioning adults are required to be compared to most people as defined solely by the limited samples on norm–based tests that are not adjusted for educational attainment, or by achievement performance only below the 16th percentile within a subtest selection method, then the majority of them will never be provided accommodations. However, as Mather et al. (2005) point out,
one should not confuse the comparison group with what is being compared—that is, being compared is the condition, manner, and duration in which one performs the activity, not the actual resulting achievement itself. Clearly, use of the term ‘most people in the population’ without consideration of a person's history and the way in which the person studies or learns will result in the denial of accommodations to many high functioning individuals… . (p. 9).
It is our purpose to provide a valid and reliable means by which to operationalize the definition of LD in a manner that would also be inclusive of high–functioning adults (as noted earlier, the term high–functioning applies to individuals with fluid or crystallized abilities in the high average–to–superior range). Our criteria (summarized in Figure 1) are based upon the three constructs underlying the ADA definition of disability: duration, condition, and manner in which one performs the activity of learning.

Figure 1. Guideline for diagnosing learning disabilities in adults identified as high–functioning (i.e., high average–to–superior factor score for crystallized or fluid ability).
Criterion 1
Professionals base their clinical decision making on: (a) advanced measurement theory; (b) current cognitive, linguistic, and achievement research with the adult population without disabilities; and (c) recent research investigating the abilities of the adult population with LD.
Criterion 2
Clinicians identify and document the duration of a disability by collecting background information pertaining to the learning history of an examinee. This requires the examiner to collect information from multiple sources. In agreement with other researchers, we consider multiple sources of information essential to diagnostic decision making (Dombrowski et al., 2004; Kamphaus, 2005; Flanagan, Keiser, Bernier, & Ortiz, 2003; Flanagan & Mascolo, 2006). A clinician must consider the environmental, cognitive, language, achievement, and behavioral signs influencing the ability to learn specific tasks set across specific contexts. Both intrinsic dynamics and extrinsic factors that affect performance should be part of all clinical decision making (Schrank et al., 2005). However, the quantity of information gathered is not the key to accurate decision making. Rather, the critical factor in effective diagnosis is the expertise and experience of the clinician in comparing, contrasting, and interpreting the obtained results in light of evidence–based research and best practice.
Unfortunately, many professionals and testing agencies appear to believe that, if students with LD were successful in previous schooling (with or without accommodations), they are not substantially limited enough to receive accommodations. In other words, they assume that the condition cannot be influenced by the context. There is no evidence or body of research to support this bias. Many HFLD have compensated through long hours of study and extra tutoring that are not reflected in school records. Unfortunately, historical documentation is used by some professionals as a double–edged sword. If one succeeded in school (as recorded by grades) and/or did not request accommodations, all other information pertaining to current functioning is ignored because history is treated as more relevant than today's context. If one does provide the historical documentation, but current scores are not below the 16th percentile, then history is discounted. As with all evidence, history of accommodations is not specific to diagnosis. Rather, it is information to help support clinical decision making.
Criterion 3 (Broad Academic Ability Analysis)
The condition is first addressed by the examination of a broad domain–specific academic factor score. The CI must be 1.5 standard deviations below estimated ability (based on a crystallized or fluid factor score from a broad cognitive ability measure). Again, it is important to consider that the CIs of two scores can overlap while the scores still differ significantly. However, when the CIs do not overlap, the difference is always significant. Information would then be examined to determine whether performance could be attributed to factors other than LD (exclusionary).
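The point about overlapping CIs can be illustrated with a short sketch. It rests on several assumptions that are not part of the criteria themselves: each CI is built around the observed score using the standard error of measurement, SEM = SD × sqrt(1 − reliability); a 95 percent level is used; the errors of the two measures are treated as independent; and the scores and reliability coefficients are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

SD = 15.0  # standard-score metric: mean 100, SD 15


def sem(reliability: float) -> float:
    """Standard error of measurement from a reliability coefficient."""
    return SD * sqrt(1.0 - reliability)


def z_for(level: float) -> float:
    """Two-sided critical z value for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)


def confidence_interval(score: float, reliability: float, level: float = 0.95):
    """CI centered on the observed score (a simplifying assumption)."""
    half_width = z_for(level) * sem(reliability)
    return (score - half_width, score + half_width)


def intervals_overlap(a, b) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def critical_difference(rel_a: float, rel_b: float, level: float = 0.95) -> float:
    """Smallest score difference that is significant at the given level."""
    return z_for(level) * sqrt(sem(rel_a) ** 2 + sem(rel_b) ** 2)


def difference_is_significant(score_a, rel_a, score_b, rel_b, level=0.95) -> bool:
    return abs(score_a - score_b) > critical_difference(rel_a, rel_b, level)


# Hypothetical examinee: ability factor 120 (r = .95), achievement factor 108 (r = .90).
ability_ci = confidence_interval(120, 0.95)
achievement_ci = confidence_interval(108, 0.90)
print(intervals_overlap(ability_ci, achievement_ci))    # True
print(difference_is_significant(120, 0.95, 108, 0.90))  # True
```

With these hypothetical values the two intervals overlap, yet the 12–point difference exceeds the critical difference of roughly 11.4 points. The reverse cannot happen under this model: the sum of the two half–widths is never smaller than the critical difference, so non–overlapping CIs always imply a significant difference.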
Criterion 4 (Specific Academic Ability Analysis)
The manner in which the individual performs domain–specific academic tasks would be observed from select data sources that document (a) skill, (b) fluency, and (c) application tasks or context. As with the condition criteria (academic and cognitive), CI ranges must fall 1.5 standard deviations below the crystallized or fluid factor score from a broad cognitive ability measure on at least two subtest scores from one or more of the three areas (skill, fluency, and application). Again, efforts would be made to determine if performance could be attributed to factors other than LD (exclusionary).
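One possible way to check the quantitative part of this criterion is sketched below. The sketch reads "CI ranges must fall 1.5 standard deviations below" as requiring the upper bound of a subtest's 95 percent CI to lie at or below the ability factor score minus 22.5 points; this reading, the SEM–based CI, and all subtest names, scores, and reliabilities are assumptions made for illustration. The exclusionary review and the clinician's observations are not modeled.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist

SD = 15.0         # standard-score metric: mean 100, SD 15
GAP_IN_SD = 1.5   # required distance below the ability factor score


@dataclass
class Subtest:
    name: str
    area: str           # "skill", "fluency", or "application"
    score: float
    reliability: float


def ci_upper(score: float, reliability: float, level: float = 0.95) -> float:
    """Upper bound of a CI centered on the observed score (assumed form)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return score + z * SD * sqrt(1.0 - reliability)


def qualifying_subtests(ability_score: float, subtests):
    """Subtests whose whole CI lies at least 1.5 SD below the ability score."""
    threshold = ability_score - GAP_IN_SD * SD
    return [s for s in subtests if ci_upper(s.score, s.reliability) <= threshold]


def meets_specific_academic_criterion(ability_score: float, subtests) -> bool:
    """At least two qualifying subtests, drawn from any of the three areas."""
    return len(qualifying_subtests(ability_score, subtests)) >= 2


# Hypothetical battery for an examinee with an ability factor score of 125.
battery = [
    Subtest("word reading", "skill", 104, 0.92),
    Subtest("reading fluency", "fluency", 88, 0.90),
    Subtest("spelling", "skill", 92, 0.90),
    Subtest("passage comprehension", "application", 110, 0.88),
]
flagged = [s.name for s in qualifying_subtests(125, battery)]
print(flagged)
print(meets_specific_academic_criterion(125, battery))
```

In this hypothetical battery, two of the four subtests satisfy the rule even though every score is well above the 16th percentile, which is the situation the cutoff methods fail to capture.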
Documenting the manner of performance is critical to the identification of HFLD. For instance, imagine two adults who both read a list of single words at the 30th percentile—a score within the normal range on a bell curve. What would not be evident from a score comparison is that one adult, John, took three times longer to read the words due to halting self–corrections. The clinician testing him would be observing this reading behavior. In addition, John, despite superior intelligence, performs poorly on specific language measures. By contrast, Mary read the words quickly (though she did not recognize some low–frequency items) and performed all measures of language with no apparent difficulty. Thus, it is necessary to consider the manner in which the act of reading was performed as well as the processing measures to determine whether John, as opposed to Mary, demonstrates an LD and should have access to accommodations. The clinician's observations concerning the student's performance are critical to decision making.
Criterion 5 (Cognitive/language Ability Analysis)
The second part of meeting the criteria to determine the existence of the condition or disability is documentation from cognitive and linguistic processing factor scores. The factor score CI must represent 1.5 standard deviations below the crystallized or fluid factor score from a broad cognitive ability measure. In addition, there must be evidence–based research to support the relationship of the cognitive/language factor score deficits to the achievement score deficits (e.g., phonemic/orthographic awareness deficits to dyslexia). Information is documented to determine if performance could be attributed to factors other than LD (exclusionary).
Summary
The growing number of high–functioning adults seeking accommodations from testing agencies and postsecondary institutions presents an urgent need to provide reliable and valid methods for accommodation decisions. The potential for this population to make significant contributions to society will be greater if we provide the learning and testing accommodations to allow them access to knowledge, as well as the means to demonstrate their extraordinary abilities. The criteria and decision making used to identify HFLD must be robust enough to account for individual differences, measurement fallibility, and examiner expertise. We suggest that the five criteria presented previously become standard practice in the identification of HFLD.
LD is not defined by any single behavior (e.g., discrepancy, underachievement, history). Rather, substantial limitations are defined by the condition, manner, and duration in which an individual learns. In addition, professionals must take care not to be too generous in predicting from assessment data. Many evaluators make direct inferences from the achievement and cognitive evidence collected during an assessment to some external performance criterion (e.g., performance at law or medical school). The fallacy in such inferences rests with the fact that often there is little to no evidence of a relationship between test scores and criterion performance. For example, it is highly unlikely that an adult's performance on a single timed multiple–choice reading comprehension test or one sentence–level reading fluency measure will accurately predict the manner in which an adult will perform the more complex learning tasks required in the context of law or medical school. In addition, we have little evidence that psychological measures can predict career success or failure. When discussing HFLD, Shaywitz (2003) noted:
From these individuals we learn that reading slowly tells nothing about the ability to comprehend, that poor spelling has little to do with one's ability to write creatively, and that an inability to memorize the names of anatomical structures does not portend one's skills in operating on those same bodily parts (p. 341).
The key to accurate LD identification of high–functioning adults is the use of trained professional judgment—not the test battery or test paradigm (Bateman, 1992). HFLD is no more difficult or confusing to diagnose than anxiety disorders or depression. However, without knowledge of current research, or experience with specific populations, a clinician becomes a less reliable tool.
