VALIDATING A BATTERY OF READING LITERACY ASSESSMENTS FOR KAZAKHSTANI HIGH-SCHOOL STUDENTS

dc.contributor.author: Olzhayeva, Aliya
dc.date.accessioned: 2025-05-13T11:30:45Z
dc.date.available: 2025-05-13T11:30:45Z
dc.date.issued: 2025-02-04
dc.description.abstract: Shifts toward a neoliberal agenda and a human capital approach to education have led to the global spread of standardized tests. Standardized testing is one of the means of implicit control and governance that allows policymakers and politicians to audit education systems. At the same time, such testing can provide reliable information about student performance that can help identify and address learners’ needs and facilitate teaching and learning. However, it is important to ensure that assessment is aligned with teacher instruction and curriculum standards in order to maximize its reliability and validity. If standardized tests are not properly aligned with curricula, this may lead to negative unintended testing consequences such as teaching to the test and an over-prioritization of tested subjects and tested content within the curriculum. The present study addresses the issue of alignment between testing and curriculum for one subject, English Reading Literacy, within one selective school network in Kazakhstan. For many years, senior high-school (i.e., Grades 11 and 12) performance in this system was assessed by an international high-stakes exam (IELTS), which had little alignment with the curriculum standards. After recent policy changes, IELTS was replaced by a locally developed high-stakes English exam; however, it is not clear whether this high-stakes test was developed under rigorous standards of test design. Moreover, the test itself is not used for diagnostic or formative purposes to inform teaching and student learning. Employing an evidence-centered design framework (Mislevy & Riconscente, 2006), the current study validates a battery of reading literacy assessment instruments that are aligned with local Kazakhstani curriculum objectives and are locally appropriate. The study is mixed-methods in nature and involved three main stages: pre-pilot, pilot, and main studies.
Each stage ensured that a sufficient amount of evidence was accumulated to derive valid interpretations about the test instruments. The study underscores the importance of all stages in test development and validation. The subject expert review panel provided feedback on the test items and examined the alignment of test questions with curriculum objectives and cognitive reading levels. Based on inductive and deductive analytical approaches, feedback from experts was classified into five categories: (1) thematic appropriateness of the reading passages to the target population of students; (2) adequacy of the complexity of the test items; (3) clarity and comprehensibility of test items and distractors; (4) visual perception of the texts, questions, and images; and (5) punctuation. Following the feedback of the review panel, test questions were edited and piloted with Grade 11 and Grade 12 students in one of the selective schools. This stage enabled an initial inspection of the quality of test items in terms of their difficulty level, alignment with the reading construct, and gender bias. Furthermore, high-school student test-takers took part in interviews where they shared their perceptions of the tests and provided feedback on the test items in relation to (1) the clarity of questions and response options, (2) the difficulties they experienced in understanding particular texts and vocabulary, and (3) the choice of the reading texts, test timing, test format, and visual perception of the tests. Feedback from students was instrumental in the further refinement and improvement of the final tests. Two sets of final tests were administered in four schools (in three different regions) in Kazakhstan: Final Test I was conducted in autumn–winter 2022, and Final Test II was administered in spring 2023, six months after the first test. The results illustrate that both tests exhibited acceptable levels of reliability.
Some items underfit the Rasch model, suggesting that they may not have been optimally aligned with the measured construct of reading. Some items disadvantaged female test-takers, while others disadvantaged male test-takers. Notably, the sizes of these biases, expressed as Cohen’s d effect sizes, ranged from medium to large. Moreover, nine link items (constituting 26% and 30% of items for Final Tests I and II, respectively) were included in order to facilitate the estimation of growth in reading ability over a six-month period. The findings illustrate that the link items functioned well, with minimal equating error (0.12 logits), and that growth was substantial (d = 0.84). The application of confirmatory factor analysis and multidimensional Rasch modelling was also suggestive of an Adjusted Two-Factor TOEFL Reading Framework that could be used as a guiding framework for developing reading assessments for senior high-school EFL learners. Constituting a substantive contribution to the literature and further informing the test development process, the study’s framework conceptualizes senior high-school EFL reading competency as two factors: (1) Reading for Basic Comprehension and Learning, and (2) Reading to Integrate Information. Furthermore, factors predicting student reading ability in English and growth in reading ability were also explored through multi-level regression-based analyses. Gender and school location were statistically significant predictors of student reading ability. Moreover, mother’s occupation was a statistically significant predictor of the rate of growth in student reading. Overall, the validated battery of test instruments could be used among high-school students for diagnostic or formative purposes; however, some moderate-to-high-difficulty test items could be added to better target high-ability students.
In addition, items that were not properly aligned with the reading construct or that displayed large effect sizes against female or male test-takers should be further revised and edited.
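Several of the quantitative results in the abstract (item fit, equating error, ability growth) are expressed on the Rasch logit scale. As a minimal illustrative sketch of the underlying model (not the study's actual estimation code), the dichotomous Rasch model gives the probability of a correct response from a person's ability and an item's difficulty, both in logits:

```python
import math

def rasch_probability(theta, b):
    """Dichotomous Rasch model.

    P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is person ability and b is item difficulty, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals difficulty, the predicted probability is exactly 0.5.
# rasch_probability(0.0, 0.0) → 0.5
```

When ability equals difficulty the model predicts a 50% chance of success; "underfitting" items are those whose observed responses deviate from these model-implied probabilities more than expected, which is why such items raise questions about alignment with the intended construct.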
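The growth figure reported in the abstract (d = 0.84) is a Cohen's d standardized mean difference. A minimal sketch of how such an effect size can be computed from two sets of ability estimates follows; the sample values in the comment are hypothetical and not taken from the study:

```python
import math

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    # Sample variances (n - 1 denominator).
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m2 - m1) / pooled_sd

# Hypothetical ability estimates (logits) at two time points:
# cohens_d([0, 1, 2], [2, 3, 4]) → 2.0
```

By conventional benchmarks, d around 0.2 is small, 0.5 medium, and 0.8 large, which is why the reported growth of 0.84 over six months is described as substantial.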
dc.identifier.citation: Olzhayeva, A. (2025). Validating a battery of reading literacy assessments for Kazakhstani high-school students. Nazarbayev University Graduate School of Education.
dc.identifier.uri: https://nur.nu.edu.kz/handle/123456789/8463
dc.language.iso: en
dc.publisher: Nazarbayev University Graduate School of Education
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject: Access type: Embargo
dc.subject: Assessment
dc.subject: Psychometrics
dc.subject: Reading literacy
dc.subject: Rasch
dc.title: VALIDATING A BATTERY OF READING LITERACY ASSESSMENTS FOR KAZAKHSTANI HIGH-SCHOOL STUDENTS
dc.type: PhD thesis

Files

Original bundle

Name: Aliya_Olzhayeva_Thesis.pdf
Size: 8.37 MB
Format: Adobe Portable Document Format
Description: PhD Thesis
Access status: Embargo until 2028-02-01
