Comparing Validity Evidence of Two ECERS-R Scoring Systems
Over 30 states have adopted the Early Childhood Environment Rating Scale–Revised (ECERS-R) as a component of their program quality assessment systems, but use of the ECERS-R on such a large scale has raised important questions about implementation. One of the most pressing questions centers on the choice users must make between two scoring systems: stop scoring, in which scoring on an item ceases once the indicators of a lower category are not fulfilled, and alternative scoring, in which all indicators are scored regardless of anchor scores. This question has implications not only for researchers interested in the psychometric properties of assessments, but also for coaches who use the ECERS-R for training, coaching, or technical assistance in their state's Quality Rating and Improvement System (QRIS). The purpose of this study, therefore, was to compare the validity evidence generated by the two scoring systems in the context of a state's QRIS. Using a state-representative early childcare sample collected in 2013-2015, I evaluated the descriptive differences between the two scoring methods and compared their convergent validity with the CLASS tool. I also conducted a series of regressions to examine their predictive validity for child learning outcomes. To gather consequential validity data, I interviewed 13 coaches about their use of the ECERS-R scoring systems to identify coaching goals and to implement data-based decision making. Quantitative findings suggested that quality scores produced by the two scoring systems could differ dramatically. Some ECERS-R subscales significantly predicted children's language and early math outcomes, but effect sizes were small for both scoring systems. Qualitative findings indicated that coaches felt frustrated with the expectations of some quality indicators because of limited feasibility, cultural adaptation, and compatibility with program philosophy and facilities.
Coaches preferred the alternative scoring system because it provided more information and cultivated a strengths-based coaching partnership. Data-based decision making was grounded in valid information, coaches' knowledge of the tools and the programs, and their ability to interpret findings and negotiate with program providers, all of which could vary across coaches. This study is the first attempt to deepen our understanding of how scoring systems may affect other aspects of validity evidence. Implications and suggestions for future research directions are provided.