Abstract
Elementary and Secondary Education Act (ESEA) flexibility requires states to develop and implement teacher effectiveness measures that consider student assessment results, including assessment results for students with disabilities participating in general and alternate assessments. We describe how alternate assessment results for students with significant cognitive disabilities could appropriately be used in teacher effectiveness measures. In addition, we discuss the unique parameters faced by teachers serving students with significant cognitive disabilities that may warrant a multiple measures approach to evaluating teacher effectiveness. Using one of the two national initiatives presently developing alternate assessments based on the Common Core State Standards as an example, we describe how these new assessments might be applied in measuring teacher effectiveness. Finally, we offer implications for both policy makers and practitioners in measuring teacher effectiveness for teachers serving students with significant cognitive disabilities participating in alternate assessments.
Introduction
Many states have implemented policies that require the inclusion of assessment results in teacher evaluation systems. The Elementary and Secondary Education Act (ESEA) flexibility guidance of 2012 requires that state departments of education receiving flexibility from school accountability requirements develop educator evaluation systems that can be used to assess the performance of all teachers, including those who serve children with disabilities (U.S. Department of Education, 2012). Indeed, the Council for Exceptional Children (CEC; 2012) “recognizes the importance of special education teachers in the education of all children and youth” (p. 9), and recognizes that the principles of high-quality evaluation apply to all teachers. This necessarily includes those educators who teach students with significant cognitive disabilities (i.e., those students who participate in states’ alternate assessments based on alternate achievement standards).
The educator evaluation systems are required to include student assessment data, specifically data on student growth for all students, including English learners and students with disabilities (U.S. Department of Education, 2011). Many states have included their large-scale assessments as one component of their system to ensure that there is one consistent measure across schools. Most attention has been paid to the inclusion of data from regular assessments, either the state’s current assessment or a soon-to-be implemented consortium assessment (either the Partnership for Assessment of Readiness for College and Careers [PARCC] or the Smarter Balanced Assessment Consortium [Smarter Balanced]). Yet these assessments do not include the group of students with significant cognitive disabilities who participate in alternate assessments based on alternate achievement standards. Moreover, only a few states’ ESEA flexibility requests mentioned alternate assessments in the educator evaluation section of their applications (Lazarus, Edwards, Thurlow, & Hodgson, 2014).
Most educator evaluation systems include other measures in addition to state assessment data. These other data generally recognize the importance of multiple measures and often include observations of teachers in their classrooms (Holdheide, 2013; Holdheide, Goe, Croft, & Reschly, 2010). Given the need to ensure that teacher effectiveness systems include student assessment data under ESEA flexibility waiver requests, and that teacher evaluation systems are typically based on multiple measures (and not just student assessments), it is timely for our field to consider (a) appropriate models for teacher effectiveness for students with significant cognitive disabilities, (b) how student results from large-scale assessments should be incorporated into those models, and (c) the other indicators of teacher effectiveness that should be part of that model.
This article thus reviews the important features of high-quality teacher evaluation systems for all teachers, the unique considerations in evaluating teacher effectiveness for teachers of students participating in an alternate assessment based on alternate achievement standards, and the features of an alternate assessment that could support the use of student results in teacher evaluation systems. Specifically, we pose the following questions that we will address in turn:
What does the literature tell us about effective teacher evaluation systems for all teachers?
For teachers of students with significant cognitive disabilities, what additional considerations are essential in designing an effective teacher evaluation system?
To what extent can students’ scores on their respective state alternate assessments be validly and reliably used as part of an effective teacher evaluation system?
What are the policy and practice implications of these questions for students with significant cognitive disabilities and their teachers?
Question 1: Measuring Teacher Effectiveness: What Do We Know?
Measuring teacher effectiveness is not an easy task. In this section, we describe the literature on measuring teacher effectiveness. We draw primarily on two comprehensive reviews of the literature, one on teacher evaluation (Goe, 2007) and one on teacher effectiveness (Goe, Bell, & Little, 2008), supplemented by the other resources cited.
Research-based evidence suggests that the most influential factor affecting student achievement is teacher effectiveness (Darling-Hammond, 2000; Rivkin, Hanushek, & Kain, 2002; Sanders & Horn, 1998). Teacher effectiveness refers specifically to outputs (i.e., student achievement), while teacher quality refers to factors that may be independent of these outputs (e.g., teacher qualifications, teacher characteristics, and practices; Goe, 2007). In a comprehensive review of the literature in teacher effectiveness, Goe et al. (2008) suggested that although there appears to be agreement that teacher effectiveness matters in student achievement, there is less agreement about the specific teacher effectiveness features that matter or about how those features should be evaluated. Indeed, Goe et al. proposed a five-point definition of teacher effectiveness which includes student assessment results, as well as the extent to which teachers achieve positive academic and social outcomes for students, use multiple sources of evidence to plan and evaluate student learning, contribute to the development of schools that value diversity and civic-mindedness, and work collaboratively with other professionals to ensure student success. Working collaboratively with other professionals is particularly important for serving students with significant disabilities, especially those who participate in alternate assessments (Kleinert & Kearns, in press).
Only after the characteristics that define teacher effectiveness have been determined can we consider the measures that evaluate those characteristics. With any mention of measurement comes the question of the validity of the measure. Validity, in this case, refers to the degree to which the interpretation of the measure or score is supported by evidence and the degree to which the results are used in ways consistent with the purpose for which the instrument was designed (Kane, 2006). Furthermore, the consequences that follow from the use of a measure (in this case, personnel decisions related to continued employment, merit recognition, promotion, and tenure) increase the importance of the evidence provided by the measure or measures.
Commonly used measures in evaluating teacher effectiveness include classroom observations, principal evaluation, student work or artifacts, portfolios, student and/or parent rating, teacher self-report, and value-added models for determining teacher contribution to student scores (Goe et al., 2008; National Board for Professional Teaching Standards, 2011; Wiener & Jacobs, 2011). Although classroom observations are a popular measure of teacher effectiveness, few of them have actually been linked to student achievement (Goe et al., 2008). For each of these measures, factors related to validity (e.g., inter-rater agreement, training and calibrating raters, statistical modeling) present considerable challenges for use in high-stakes situations. The extent to which the measure can be linked to student achievement in a manner that is reliable and valid is a necessary consideration (Goe et al., 2008). In a synthesis of research on the link between student outcomes and teacher quality, Goe (2007) found that although many studies have addressed this issue, the findings have been mixed or less than significant. The practice of using student assessment results (value-added modeling) as an indicator of student achievement represents one approach currently used to measure teacher quality. Although assessment results have been linked to teacher quality (Sanders & Horn, 1998), the use of student assessments is particularly challenging because, as Goe pointed out, these assessments are not designed for the purpose of measuring teacher effectiveness, other school or classroom effects are difficult to account for in these models, and student characteristics are not considered. Furthermore, Newton, Darling-Hammond, Haertel, and Thomas (2010) suggested that student gains appear to be related to student characteristics. 
These authors found that teacher scores varied dramatically depending on the characteristics of students, that is, the same teacher could have high achievement if the class was advantaged or low achievement if the class was disadvantaged. This is particularly important for teachers who have students with disabilities and especially with the documented variability of student characteristics in alternate assessment populations (Kearns, Towles-Reeves, Kleinert, Kleinert, & Thomas, 2011). For this reason, a comprehensive system of professional development that includes multiple sources of evidence, high-quality measures aligned to high-quality teaching standards, and ongoing professional development opportunities is necessary to provide a more complete picture of teacher effectiveness (Bill & Melinda Gates Foundation, 2010; Holdheide et al., 2010).
Those who support the goal of using educator effectiveness measures as one component of a support system for teachers argue that the critical aspect of such a system is not the specific measures that are used, but rather the vision and goals, conceptualization, development, and implementation of the system. As noted by the National Board for Professional Teaching Standards (2011), “A well-functioning teacher evaluation system goes beyond the checklists commonly used in schools. The system must specify what will be measured, define how it will be measured, clarify how the measures will be applied consistently, lay out a plan for providing feedback and continuous support, and have buy-in and leadership from key stakeholders. It will also highlight how to use the evaluation results to improve the school culture, teacher practice, and student outcomes” (p. 9).
Wiener and Jacobs (2011) agreed in their identification of six principles of a comprehensive performance management system, especially in Principle 5 (Continuous improvement is modeled throughout the system). Wiener and Jacobs noted that one of the ways in which continuous improvement is modeled is by schools and districts becoming “learning organizations” in which learning from experience is encouraged, mid-course corrections are made, and employees are invited to surface concerns so that they can be addressed.
A second feature of effective teacher evaluation systems is a model of professional development focused on sustainable improvements in student achievement. As school districts and educators consider the implications of Common Core State Standards (CCSS) on the curriculum and instructional variables related to improving achievement for all students, including students participating in alternate assessments, a professional development model is essential. Making and sustaining such improvements on a district-wide basis often involves the use of a framework (e.g., Universal Design for Learning, Multi-Tiered System of Support) or set of guiding core beliefs for improving the quality, consistency, and coherence of instruction provided to all students across the district. Districts successful in implementing system-wide improvement focus their goals, select and implement shared instructional practices, implement those practices deeply, monitor and provide feedback and support, use data continuously and systematically, and require all personnel to engage in continuous inquiry and learning (McNulty & Besser, 2011; Telfer, 2012). Inherent in this district-wide approach is the expectation that all adults in the district share responsibility for the foundational belief that every student can learn at a higher level (see www.movingyournumbers.org for additional information) and that they act to improve the learning and achievement of all students, including those who participate in alternate assessments.
Effective teacher evaluation systems assist principals and teachers in the identification and selection of ongoing professional development opportunities that build both the individual and collective expertise of teachers in the implementation of evidence-based practices. In districts and their schools that have used assessment and accountability to improve student learning and results, the delivery of targeted professional development is directly aligned with a limited number of district-wide goals and strategies, and intentionally provided to all personnel to support the consistent and full implementation of identified instructional practices. In addition to building teacher expertise, high-quality teacher evaluation systems also contribute to the development of principal knowledge and skills about the appropriate use of assessment results, observation tools, service delivery options, and evidence-based practices for more effectively meeting the differentiated instructional needs of all students, including students with disabilities.
Question 2: What Are the Additional Considerations in Teacher Effectiveness for Students With Significant Cognitive Disabilities?
Assessing and supporting teacher effectiveness for teachers serving students with disabilities present unique challenges for teacher effectiveness systems (Holdheide, 2013; Holdheide et al., 2010; Steinbrecher, Selig, Cosbey, & Thorstensen, 2014; Warren et al., 2012). A number of factors should be considered, including the availability of teachers with the appropriate credentials, teacher knowledge and skills including highly qualified teachers in particular academic content areas, and the use of evidence-based practices (Holdheide et al., 2010). These challenges apply to measuring the effectiveness of teachers of students with significant cognitive disabilities as well.
Moreover, alternate assessment populations necessitate the consideration of additional factors in measuring teacher effectiveness. For example, the characteristics of the learner, complexity of learner needs, and lack of opportunity to learn all contribute to a high degree of variability in the sophistication with which students engage in academic content that is grade-specific and chronologically age-appropriate (Steinbrecher et al., 2014).
Special Considerations in Measuring Teacher Effectiveness for Students With Significant Cognitive Disabilities
Students participating in alternate assessments present unique challenges to the use of assessment results in teacher evaluation. These include the following: (a) a need for collaborative team supports, (b) a range of educational placements and teacher academic content knowledge, and (c) size and complexity of the student population, including the status of student communicative competence. We describe each of these considerations below.
Need for collaborative team supports
Students participating in alternate assessments represent approximately 1% or less of the total assessed population statewide and in most districts (Kearns et al., 2011). These students typically receive services from a team of professionals, including speech-language pathologists, occupational and physical therapists, and vision and hearing specialists, as well as general and special education teachers. Best practice suggests that each member of the service delivery team incorporate the recommendations from other disciplines when working with the student. For example, the physical therapist might recommend a particular positioning strategy that enables a student to provide a response; the other members of the student’s team would use this particular positioning strategy to best support the student’s ability to respond. Similarly, speech-language pathologists should utilize the vocabulary from the classroom to the extent it is appropriate for addressing individual student needs. A collaborative team model (Orelove, Sobsey, & Gilles, in press; Orelove, Sobsey, & Silberman, 2004) ensures that new knowledge and skills, implemented with fidelity by the entire team, provide for multiple opportunities to practice across the student’s daily activities, and are guided by consistent and specific feedback. Although evidence-based practices, including a collaborative model of related services, systematic instruction with consistent and specific feedback, and distributed practice opportunities, are designed to ensure acquisition of knowledge and skills specifically for this population of students, the extent to which these specialized, evidence-based practices are considered as an integral part of teacher evaluation systems is not well understood.
Although collaborative team service delivery is itself an exemplary practice, the extent to which student assessment results are considered in the evaluation of each team member’s performance has not been well articulated in teacher evaluation systems (Steinbrecher et al., 2014).
Range of educational placements and teacher academic content knowledge
Students who participate in alternate assessments may receive services in a variety of placements ranging from general education classrooms with the special education teacher providing supports to classrooms in which only special education teachers deliver instruction; however, the vast majority (approximately 93%) of students with significant disabilities participating in state alternate assessments are served primarily in self-contained classrooms or separate school settings (Kleinert et al., in press). Yet the majority of students in the Kleinert et al. study had some opportunities for inclusive experiences: A significant percentage had at least some level of either academic (15.9%) or non-academic inclusion (70.7%) with their peers without disabilities, and Kleinert et al. found substantial variation in access to general education settings across states for students in alternate assessments.
The importance of educational setting is further highlighted by studies finding that instructional practices and expectations themselves vary across educational placements. For example, in a comparative study of teachers in regular public schools versus separate schools for students with severe disabilities, Ruppar, Dymond, and Gaffney (2011) found that teachers of students who use augmentative/alternative communication systems (AAC) placed a higher degree of emphasis on teaching literacy skills (defining words, identifying relevant phrases, answering comprehension questions) when they taught in regular schools as opposed to separate schools, and teachers who taught in inclusive settings placed the most emphasis on those skills.
Educational context does matter: As Jackson, Ryndak, and Wehmeyer (2008/2009) have noted, the general education classroom provides specific factors, including opportunities for learning with peers without disabilities, increased opportunities for incidental learning, and core academic instruction provided by a teacher trained in the general curriculum. Moreover, the extent to which special education teachers serving this population have academic content knowledge and expertise depends on the requirements for teacher credentialing in each state and the presence of ongoing professional development. States should consider how to ensure that all teachers serving the population of students participating in alternate assessment receive content support. This is certainly the case for mathematics, given the evidence correlating content expertise and student results (Goe, 2007). It is also the case for literacy skills, given the pervasive belief that children in this population do not develop literacy despite evidence that they do indeed develop literacy skills regardless of whether they use oral speech (Kliewer, 2008). States need to consider how to ensure that effective pre-service and ongoing professional development are occurring (Delano, Keefe, & Perner, 2008/2009). Providing opportunities to learn within the context of the grade-level curriculum for all students is a significant challenge for our field (Hunt, McDonnell, & Crockett, 2012; Kleinert et al., in press) and one that is integral to effective teacher evaluation systems.
Variability in the student population and communicative competence
Although students participating in alternate assessments represent a very small percentage of the total assessed population of school-age students in a state (generally less than 1% of all students participate in alternate assessments), the variability and complexity among students with significant cognitive disabilities in this group are pronounced. It is important to note that 70% of students with significant cognitive disabilities use oral speech to communicate, are likely to have sight word reading skills, and use a calculator to compute basic math functions (Kearns et al., 2011). Characteristics of typical learners in this 70% of students with significant cognitive disabilities include slower rates of knowledge and skill acquisition, and limited application of knowledge and skills to new situations without systematic opportunities for practice, often resulting in a failure to maintain and generalize skills (Kleinert, Browder, & Towles-Reeves, 2009). The lack of opportunity to learn academic knowledge and skills, with a commensurate focus on mainly functional or life skills curriculum, is also pronounced (Towles-Reeves et al., 2012) for this population.
The remaining 30% of students with significant cognitive disabilities are emerging in their use of symbolic language or have been rated as pre-symbolic (no formal mode of expressive communication) by their teachers. Emerging and pre-symbolic communicators are more likely to have multiple disabilities including hearing, vision, and mobility problems, and health challenges or degenerative conditions. Communication and symbolic language form the foundation for the acquisition of academic knowledge and skills. Students receiving appropriate intervention in communication do increase in the acquisition of symbolic language (Rowland & Schweigert, 2000; Snell et al., 2010), and increases in communicative competence for students with limited communication skills are an important part of effective teaching. It is certainly reasonable to consider whether increased communicative competence should be one measure of teacher effectiveness for this population.
Measurement Issues in Designing Effective Evaluation Systems
In the previous section, we considered the unique characteristics, learning and support needs of students with significant cognitive disabilities that have important ramifications for the design of a valid teacher evaluation model. In this section, we consider the measurement issues in the design of such a model, given the small numbers of students with significant cognitive disabilities assigned to individual schools and teachers.
Although the percentage of students participating in an alternate assessment based on alternate achievement standards is extremely small, the number of students assigned to any one teacher should also be considered in teacher evaluation if assessment results are to be used. General education teachers may provide services to only 1 or 2 students who participate in alternate assessments, balanced by 24 or more students who are “typical” learners. As we have noted, most students participating in alternate assessments receive services in separate/self-contained classrooms or other separate settings (Kleinert, 2015). Self-contained classrooms generally focus on 10 to 12 students across 3 to 4 grades, which averages to about 2 students per grade. The impact of a student’s alternate assessment score on a general education teacher’s evaluation when that student is 1 of 22 in the classroom, all the rest of whom have general assessment scores, will be different from the impact of a student’s alternate assessment score when he or she is 1 of 12 (or fewer) students in a classroom. Furthermore, as Steinbrecher et al. (2014) noted, teachers with small caseloads are differentially affected by the statistical properties of certain models for calculating teacher effectiveness (e.g., value-added modeling). In effect, it is statistically much more difficult to show a value other than “expected” growth for a teacher with 10 students than it is for a teacher who has 22 students. Steinbrecher et al. have referred to this as “shrinkage to the mean” (Steinbrecher et al., 2014, p. 330).
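The caseload effect described above can be illustrated with a small numerical sketch. The code below is not the model analyzed by Steinbrecher et al. (2014) or any state's value-added formula; it assumes a generic textbook shrinkage estimator in which a teacher's observed mean gain is pulled toward the overall mean by a factor that depends on class size. The function names and the variance values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a generic shrinkage estimator, not any state's
# actual value-added model. All variance values below are hypothetical.


def shrinkage_factor(n_students, between_teacher_var, within_teacher_var):
    """Share of a teacher's observed deviation from the mean that is retained.

    The factor approaches 1 as the number of students grows and approaches 0
    as the number of students shrinks.
    """
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    sampling_noise = within_teacher_var / n_students
    return between_teacher_var / (between_teacher_var + sampling_noise)


def shrunken_estimate(observed_mean, overall_mean, n_students,
                      between_teacher_var, within_teacher_var):
    """Pull the observed class mean toward the overall mean."""
    factor = shrinkage_factor(n_students, between_teacher_var,
                              within_teacher_var)
    return overall_mean + factor * (observed_mean - overall_mean)


def single_student_weight(n_students):
    """Weight of one student's score in a simple class average."""
    return 1.0 / n_students


if __name__ == "__main__":
    # Hypothetical variances: teachers differ a little, students differ a lot.
    between_var, within_var = 4.0, 100.0
    # Both teachers' classes gained 5 points more than the overall mean of 0.
    for caseload in (10, 22):
        estimate = shrunken_estimate(5.0, 0.0, caseload,
                                     between_var, within_var)
        print(caseload, round(estimate, 2),
              round(single_student_weight(caseload), 3))
```

Under these assumed values, two teachers with the same observed class gain receive different estimates: the estimate for the teacher with 10 students is pulled closer to the overall mean than the estimate for the teacher with 22 students. At the same time, any single student's score carries more weight in the smaller class's average.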
Furthermore, some students in the class may not be attending their neighborhood school or the school they would attend if they did not have a disability. This requires thoughtful policy decisions as to how that student’s score should be assigned for school accountability purposes. Finally, special schools have unique issues related to the number, concentration, and complexity of needs for their students, as well as potentially limited teacher expertise in grade-level academic instruction. Measurement expertise will be needed to ensure the use of formulas that provide for the appropriate attribution of student results for students not attending their neighborhood schools, as well as for those students still served in separate schools.
Differentiation of measures
The measures used for teacher evaluation should be appropriately differentiated to reflect the evidence-based practices and instructional roles through which teachers provide services. For example, a teacher observation system must be differentiated based on the unique conditions under which both general and special education teachers deliver services to students with disabilities (Holdheide, 2013; Holdheide et al., 2010; Warren et al., 2012). Observing teachers in inclusive classrooms in which general and special educators co-teach may look different from observing a special education teacher serving students with disabilities in a resource or self-contained setting, and is certainly different from observing general education teachers in general education classrooms in terms of the number of students, the sophistication of the content, and the number of support teachers or paraprofessionals. In short, the use of multiple measures is not just a question of which measures (student assessment results, classroom observations) should be included in a teacher evaluation system but the form those measures should take, given the context of instruction.
Question 3: What Is the Place of Alternate Assessment Scores in Teacher Evaluation?
Student results from large-scale assessments represent one source of information for teacher evaluation of all teachers. In this section, we consider specific models of how alternate assessment scores might be used in teacher evaluation systems, and then through the description of one multi-state alternate assessment consortium, we offer specific examples of how the use of alternate assessment scores might translate into practice.
Use of Large-Scale Assessments in Teacher Evaluation: Current Models
Three prevalent teacher evaluation models use student assessment results in teacher evaluation systems. In Table 1, we describe the value-added model, student growth percentile, and student learning objectives (SLO) models, as well as two additional teacher evaluation models that could provide enhanced information in a multiple measures model of teacher evaluation. We briefly note how alternate assessment scores might be used in the context of these models.
Table 1. Current Models in Teacher Evaluation.
Note. VAM = value-added model; SLO = student learning objectives; IEP = individualized education program.
Multi-State Consortia Alternate Assessments Linked to Grade-Level Content Standards
In their policy analysis of the use of state assessment scores in teacher evaluation systems for students with disabilities, Steinbrecher et al. (2014) noted substantial problems in using student scores as one measure in calculating teacher effectiveness, specifically in the use of those scores in calculating teacher effectiveness through a value-added model. Moreover, Steinbrecher et al. added that alternate assessments for students with significant cognitive disabilities, given significant problems with technical adequacy, alignment or linkage to grade-level content, small numbers of students, and so on, were especially inappropriate to include in teacher effectiveness calculations. With the recent U.S. Department of Education funding of two large, multi-state consortia to develop high-quality alternate assessments clearly linked to grade-level content standards, it is possible, though not yet assured, that these fundamental issues with alternate assessment can be addressed.
Those two initiatives, National Center and State Collaborative (NCSC; http://www.ncscpartners.org/) and Dynamic Learning Maps (DLM™; http://dynamiclearningmaps.org/), are presently developing item-based assessments linked to the CCSS that will be delivered primarily through technology systems. Both are undergoing rigorous analysis of item content validity, technical adequacy, and scoring reliability, and both will have the advantage of comparatively large numbers of students with significant cognitive disabilities across many states to provide estimates of reasonable or expected growth. Both the NCSC and DLM™ consortia are striving to develop alternate assessments that incorporate the elements of universal design and that are compatible with students’ communication modes and assistive technology applications.
As of late 2014, the NCSC included 24 states working to build alternate assessments based on alternate achievement standards (AA-AAS) for students with the most significant cognitive disabilities. The goal of the NCSC (2013) project “is to ensure that students with the most significant cognitive disabilities achieve higher academic outcomes. All students should aim to leave high school ready for college and/or careers.”
The DLM™ project is “guided by the core belief that all students should have access to challenging grade-level content.” A fundamental dimension of the DLM™ approach is the capacity to embed assessment tasks in ongoing or daily instruction. Like NCSC, the DLM™ project is also developing and conducting a large-scale pilot of an end of the year assessment that will function as a summative assessment for ESEA accountability requirements. Currently DLM™ is working with 18 states. Both the NCSC and DLM™ assessments are scheduled for full-scale implementation in their respective participating states in 2014-2015. Both assessments will feature on-line delivery systems, and both assessments have been designed specifically to link to good instruction on academic content linked to grade-level standards, and not as simple stand-alone tests.
Although we argue for a multi-methods teacher effectiveness approach for students with significant cognitive disabilities throughout this article, as well as a limited and very cautious use of alternate assessment scores in any formula for teacher effectiveness, these multi-state consortia projects may provide alternate assessments that could legitimately be used (as one of several essential components) in both measuring teacher effectiveness and creating a platform to improve evidence-based instructional practices. We explore here both the potential and the limitations of this approach using the NCSC summative alternate assessment as an example.
One Example: The NCSC Alternate Assessment
The NCSC summative alternate assessment system is one example of a comprehensive approach that includes curriculum and instructional materials, classroom assessments, and end-of-year summative assessments. The NCSC assessment is linked to the CCSS for each grade, and a learning progression framework (Hess & Kearns, 2011) focuses key content from the CCSS and promotes continuous skill acquisition across grades. Finally, core content connectors (CCCs) represent accessible and assessable learning targets for individual students.
The theory of action undergirding the NCSC summative alternate assessment suggests that teachers must have knowledge and skills, as well as professional development opportunities, to provide high-quality instruction that supports higher levels of student learning. Resources to enhance teacher knowledge and skill are an essential support for any alternate assessment. They become even more critical if those assessments are also used to measure teacher effectiveness. Ongoing professional development to provide high-quality instruction in relevant academic content linked to grade-level content standards is critical to an evaluation of teacher effectiveness based on an alternate assessment linked to those standards.
As one example of ensuring that alternate assessments are linked to professional development in such areas as access to grade-level curriculum, research-based instructional strategies, and communicative competence for students with the most significant disabilities, the NCSC foundational curriculum and instructional materials are incorporated in an electronic system that provides professional development opportunities for teachers endeavoring to link student learning to the CCSS and the learning progression framework. Included are on-demand professional development modules for understanding the CCSS in English language arts and mathematics, as well as modules for supporting students who are emerging in their use of symbolic language. The system also includes progress monitoring tools, a critical aspect of data-based decision making for this population (Quenemoen, Flowers, & Forte, 2014). Each module follows a consistent format that includes embedded quizzes and a tracking system for documentation.
In addition to the learning management system, a curriculum and instructional WIKI provides quick and systematic electronic access to the CCSS, learning progression framework, CCCs, sample grade-level instructional units with appropriate supports, and systematic sequenced instruction. The goal of the curriculum and instruction WIKI is to provide quick information about academic content in the CCSS for teachers who may not have had experience with the content at a particular grade or content area. Our point here is not necessarily to suggest that the NCSC model of professional development is one that other states should adopt, but rather that a model embodying similar components should be in place when alternate assessment scores are used as part of teacher effectiveness measures. A comprehensive system of curriculum, instruction, and assessment, undergirded by a robust system of professional development, supports the use of alternate assessment scores as an indicator of teacher effectiveness for students with the most significant disabilities.
NCSC summative assessment use in teacher evaluation
The way in which any alternate assessment can be used in a state educator evaluation system depends on the type of educator evaluation system the state has adopted. Given this proviso, large-scale state consortia like NCSC and DLM™ provide an important added element to traditional state approaches to alternate assessment. Because of the number of students they include, the assessment developers are better able to address both reliability and validity concerns, and possibly also demonstrate vertical alignment (e.g., learning progressions; see Case & Zucker, 2005) and growth across grades. Using the NCSC summative alternate assessment as an example, we note how its assessment results might be applied within the most prevalent teacher evaluation approaches identified earlier in this article (see Table 1).
Value-added approach
As an example, NCSC summative assessment scores can be included in almost all value-added approaches because the NCSC alternate assessment will have scale scores based on a robust scale, will be tightly connected to the CCSS, will be appropriately standardized across students and conditions, will benefit from technical analyses based on more students than any one state could muster, and will be equated and comparable across years. As we have previously noted, most value-added approaches require at least 2 years of scores. The first score from NCSC will be available from the spring 2015 administration.
Student growth percentile approach
A student growth percentile approach could be constructed for NCSC summative assessment scores. However, NCSC scores should not be mixed with other assessment scores in a student growth percentile approach because they are based on different achievement standards. For example, using alternate assessment scores, combined with progress monitoring data, would not yield an accurate student growth percentile, because it is not possible to weight these two distinct approaches within a single, continuous model of growth.
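To make this restriction concrete, the following is a minimal Python sketch of a growth percentile computed only within a single alternate assessment cohort. It is not the NCSC method or any state's method: operational student growth percentile models typically rely on quantile regression, whereas this sketch simply ranks a student's current score among peers whose prior-year score fell within a band. The function name, the band width, the minimum peer count, and all scale scores are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def simple_growth_percentile(
    prior: float,
    current: float,
    cohort: List[Tuple[float, float]],
    band: float = 5.0,
    min_peers: int = 5,
) -> Optional[float]:
    """Percentile rank of `current` among peers with a similar prior score.

    cohort: (prior_score, current_score) pairs, all from the SAME assessment.
    Returns None when too few academic peers exist to rank against.
    """
    # Academic peers: students whose prior score is within `band` points.
    peers = [c for p, c in cohort if abs(p - prior) <= band]
    if len(peers) < min_peers:
        return None
    below = sum(1 for c in peers if c < current)
    ties = sum(1 for c in peers if c == current)
    # Mid-rank percentile: tied scores count as half.
    return 100.0 * (below + 0.5 * ties) / len(peers)


# Hypothetical alternate assessment scale scores (prior year, current year).
aa_cohort = [(1200, 1210), (1202, 1215), (1198, 1205),
             (1203, 1220), (1199, 1212), (1250, 1260)]

sgp = simple_growth_percentile(1200, 1215, aa_cohort)
too_few = simple_growth_percentile(1250, 1265, aa_cohort)
```

In this sketch the cohort holds scores from one assessment only, which mirrors the point that scores based on different achievement standards should not share a single growth model. Returning no value for very small peer groups is one possible way to reflect the caution that a low-incidence population may not support a stable ranking.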
SLO approach
The curriculum resources being developed by each of the two multi-state alternate assessment consortia provide an overall framework for this approach. For example, NCSC’s curriculum, instruction, and formative assessment resources are designed to support an SLO approach. We recommend that state-level, teacher communities of practice aligned with each of these consortia assist in identifying appropriate learning targets, aligned directly to the CCCs, and in ensuring comparability and scoring accuracy.
Curriculum-based measures (CBM) approach
The NCSC summative assessment, as a single “snapshot in time,” is inconsistent with a progress monitoring approach, although it has embedded progress monitoring tools within its curriculum materials. In contrast, the DLM™ alternate assessment can be embedded in instructional tasks throughout the year, and thus is a form of progress monitoring. Indeed, both consortia are developing curriculum and instructional materials that could be used as models for designing CBM or a progress monitoring approach.
Teacher observation approach
Teacher observations typically are used to evaluate teaching practices. Most often, the Danielson (2012) approach is used, although its appropriateness for classrooms in which there are students with significant cognitive disabilities has not been addressed. Teacher observation approaches combined with summative assessment results may provide a better understanding of teacher performance than either methodology used individually.
Question 4: What Are the Policy and Practice Implications for Students With Significant Cognitive Disabilities and Their Teachers?
In this final section, we provide recommendations for the use of alternate assessments as one indicator of teacher effectiveness for students with significant cognitive disabilities. As we have noted consistently above, a basic proviso for the use of any alternate assessment is that it be linked to grade-level content standards, with a clear set of learning progressions across grades in prioritized content for students with significant disabilities.
Recommendations and Implications for Policymakers
Policymakers can contribute to the development of sound educator effectiveness evaluation systems through a number of initiatives. We consider both capacity and measurement issues here:
Support school districts in building the collective capacity of both instructional and related service personnel to achieve higher levels of learning for all students, including students with the most significant cognitive disabilities (Telfer, 2012).
In the design of teacher effectiveness models that incorporate alternate assessment scores, ensure that teacher supports, including curriculum and instructional materials, electronic organization of those materials, and ample opportunities for professional development, are readily available for teachers to improve their own instruction and to enhance student achievement. Examples of such curricular and instructional supports to enhance student participation in the general curriculum can be found in both NCSC (http://www.ncscpartners.org/) and DLM™ (http://dynamiclearningmaps.org/). The intent is to design materials that support participation in the general curriculum (Browder & Spooner, 2011), not to create a separate or parallel curriculum.
Include representatives from each of the key stakeholders (i.e., general education teachers, special education teachers, related services providers) in designing teacher evaluation systems that align with and are used to support improvements in the implementation of effective instructional practices.
Determine the extent to which the specific contributions of instructional and related service personnel can be directly linked to student growth or value-added models; understand that such calculations are problematic at best even for teachers who have primary responsibility for specific students, and the problem of attribution of added value or growth is further compounded in “teasing out” contributions of other instructional and therapy staff involved with that student.
Consider a multiple measures approach to teacher evaluation, with large-scale assessment results being just one measure (Warren et al., 2012). Within any teacher evaluation model that uses alternate assessment scores, especially with value-added models of calculating student growth, consider the impact of “shrinkage to the mean” (Steinbrecher et al., 2014) that results from small caseloads. The weight of alternate assessment scores in evaluating teacher effectiveness should be determined in proportion to the confidence we can place in their results. Even with the advent of the multi-state consortia alternate assessments based on the CCSS (NCSC and DLM™), alternate assessments have not been developed or specifically designed for this use.
Develop and differentiate observation tools and procedures so that they accurately measure specialized teacher knowledge and skills applicable to supporting higher levels of learning among this population of students. Although it is difficult to conceptualize how to “tease out” the relative impact of related service personnel (speech/language pathologists, physical therapists) within individual student scores, it is possible to differentiate effectiveness based on observation tools that clearly measure the extent to which related service personnel are using evidence-based practices to improve student learning for this population (e.g., data-based decision making; embedding therapy targets that support enhanced participation into the contexts of daily routines; promoting generalization across general education, other school, and community settings; implementing strategies that increase opportunities for interactions with peers without disabilities).
Study the effects of attributing scores in terms of the unique service delivery models, the array of service providers, and the characteristics of learners as they relate to teacher evaluation models. This recommendation parallels, in part, the one immediately preceding. As we noted, it would be highly problematic to use alternate assessment scores in evaluating the effectiveness of related service personnel. Still, it might be possible to determine models that differentiate the contribution of educators on the basis of service delivery models. For example, for a student with a significant cognitive disability who is served primarily in a separate classroom (the most frequent placement for students in states’ alternate assessments; see Kleinert et al., 2015), it seems reasonable that most, if not all, of the student’s alternate assessment score (or growth in performance) should be attributed to that student’s special education teacher of record. For a student with a less restrictive educational placement, and especially for those students who are served primarily in general education settings, it is reasonable to assume that a portion of that student’s score (or growth) can be attributed to the general education teachers. Whether it is indeed feasible to do so, especially for middle and high school students who may have multiple general education teachers, is a matter that merits discussion as states develop their teacher evaluation systems.
For students with the most severe disabilities, who have limited communicative competence (students at either a pre-symbolic or emerging symbolic level of communication), consider whether a part of teacher effectiveness should be based on the student’s improved communicative competence. Clearly, there is no more fundamental educational outcome than the ability to communicate (Kearns et al., 2011; Kleinert et al., 2015).
Consider whether other indicators of academic engagement in the general curriculum (e.g., increased opportunities for active participation in general education settings with peer supports; see Carter, Cushing, Clark, & Kennedy, 2005; Carter, Sisco, Brown, Brickham, & Al-Khabbaz, 2008) or important non-academic outcomes (increased self-determination, participation in community-based vocational instruction, and paid employment related to positive post-school outcomes; see Kleinert, Kearns, Quenemoen, & Thurlow, 2013) could be fairly included in a teacher evaluation system.
Facilitate university–district–state partnerships that incorporate effective clinical practice to ensure that new teachers are well prepared to support the learning and achievement of all students. University faculty are in an ideal position not only to ensure that the new teachers are well equipped to deliver effective instruction in relevant academic content but also to work with state departments in the design of teacher internships that require the demonstration of evidence-based practices (see Whetstone, Abell, Collins, & Kleinert, 2013) and to work with their state department of education on multi-measure teacher evaluation systems that do fairly represent teacher effectiveness for students with significant disabilities.
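The small-caseload concern raised above in connection with “shrinkage to the mean” can be illustrated with a short Python sketch. It uses a generic empirical-Bayes style reliability weight, n / (n + k), which is an assumption made for illustration rather than the estimator of any particular value-added model. The constant k, the function name, and all numbers are hypothetical.

```python
def shrunken_estimate(teacher_mean_gain: float,
                      n_students: int,
                      overall_mean_gain: float,
                      k: float = 10.0) -> float:
    """Pull a teacher's mean gain toward the overall mean.

    weight = n / (n + k): with few students the weight is small, so the
    estimate sits near the overall mean regardless of classroom results.
    k is a hypothetical constant standing in for the ratio of
    within-teacher noise to between-teacher variance.
    """
    if n_students < 0:
        raise ValueError("n_students must be non-negative")
    weight = n_students / (n_students + k)
    return weight * teacher_mean_gain + (1.0 - weight) * overall_mean_gain


# Same observed mean gain (12 points) against an overall mean of 4 points.
small_caseload = shrunken_estimate(12.0, n_students=5, overall_mean_gain=4.0)
large_caseload = shrunken_estimate(12.0, n_students=90, overall_mean_gain=4.0)
```

Under these assumed values, the teacher with five students receives an estimate of roughly 6.7 points, while the teacher with ninety students stays near 11.2, even though both observed the same average gain. This is one way to see why the weight given to alternate assessment scores should reflect how much confidence a small caseload can support.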
Recommendations and Implications for Practitioners
Ensure that all students with significant cognitive disabilities have a reliable and understandable mode of communication. As Kleinert et al. (2015) have found, approximately 10% of high school students in state alternate assessments are rated by their teachers as functioning at a pre-symbolic level of communication (i.e., they do not yet have a regularized mode of communication). Yet as Creech-Galloway, Collins, Knight, and Bausch (2014) and Kleinert et al. have all noted, without a reliable mode of communication, we simply do not know what students with the most significant disabilities know and can do, and we certainly cannot use their alternate assessment scores for measuring either their own learning or the effectiveness of their teachers. However, the use of reliable communication systems among students is evidence of effective practice that could be determined by direct observation.
Develop coaching models that support all teachers in the implementation of instructional practices that improve student achievement. For teachers of low-incidence students, especially in rural districts, having a peer coach who understands both the population of students and core academic content can be challenging (Whetstone et al., 2013). Pairing veteran teachers with new teachers in neighboring schools or districts, using technology to provide coaching when distance creates travel barriers, and ongoing administrative support to ensure that professional development training is more than “just a one-time event,” are all important elements in enabling teachers of students with significant disabilities to receive the support they need to be effective educators.
Conclusion
As states struggle with the creation of teacher evaluation systems under the ESEA Waiver requirements, and especially with the requirement that large-scale assessment scores be part of that equation of teacher effectiveness, it is essential that teachers and policymakers knowledgeable about the needs of students in state alternate assessments be at the table and involved in creating and implementing these evaluation models. We have attempted in this article to outline the possibilities, as well as the limitations and the cautions, of the use of alternate assessment scores in calculating teacher effectiveness for teachers and related service personnel working with students with significant cognitive disabilities. We have advocated for a multi-method model of teacher evaluation, one that is founded upon a solid system of ongoing teacher support and professional development. It can never be fair to anyone, teacher or student, to evaluate student learning or teacher effectiveness on tests developed without the support of robust instructional and curricular resources linked to grade-level content standards for all students (Kleinert et al., 2013). We believe very strongly, with Creech-Galloway et al. (2014), that students with the most significant disabilities should never be limited in their learning by what their teachers believe they can accomplish, but we also recognize that teachers’ expectations are shaped by what they themselves know. We agree with authors cited previously in this article (Curtis & Wiener, 2012; Goe, 2007; Goe et al., 2008; Wiener & Jacobs, 2011): Teacher effectiveness is much more than simply measuring teachers by how their students do on a test; it is really about documenting what their students know and can do on the educational outcomes we value for all students.
In the best sense, only when students have both access to the core curriculum and a reliable mode of communication, students have the opportunity to learn alongside their peers without disabilities, and teachers have access to high-quality professional development, can we fairly use alternate assessments based on the core academic content as an integral element of teacher evaluation.
Footnotes
Acknowledgements
The authors would like to thank the following members of the NCSC Teacher Evaluation Expert panel for their significant contributions to this work: Michael Abell, Lynn Holdheide, Lori Nixon, Diane Ryndak, Deb Telfer, and Sandra Warren.
Authors’ Note
We use the term significant cognitive disability throughout this article, as this is the term used by the U.S. Department of Education to identify students who participate in state alternate assessments based on alternate achievement standards. The contents do not necessarily represent the policy of the U.S. Department of Education, and no assumption of endorsement by the Federal Government should be made.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a grant from the U.S. Department of Education, Office of Special Education Programs (H373X100002, Project Officer:
