When a test instrument or a set of measures performs consistently across assessments, it can be described as a reliable test or test instrument. Because reliability depends on many factors, several classes of reliability can be distinguished for a particular assessment. These are:
-Inter-rater reliability
-Test-retest reliability
-Inter-method reliability
-Internal consistency reliability
The first class, inter-rater reliability, deals with the variation seen when different assessors use the same test instrument under the same conditions, whereas test-retest reliability refers to the variation that can occur when the same person or instrument measures on two different occasions under the same conditions. Inter-method reliability refers to the error that arises when different methods are used to assess the same attribute, whereas internal consistency reliability describes the variation that takes place across different items within the same test.
Accordingly, the score observed for a person undergoing an assessment is affected by many external factors in addition to that person's 'true score', which leads to the inference that:
Observed score = True score + Error
The true score of an individual sitting an exam cannot be measured directly by any existing test instrument; therefore, quantifying the reliability gives us the best chance of getting at the candidate's true score. Methods such as ANOVA can be used to estimate reliability statistically, and hence the expected true score, according to the following formula.
Reliability = CSVC / (CSVC + EVC)
CSVC = candidate (true) score variance component
EVC = error variance component
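To make the variance-component idea concrete, the sketch below estimates the candidate score variance (CSVC) and error variance (EVC) from a candidates × items mark matrix using a simple one-way ANOVA decomposition, then forms the reliability ratio above. The function name, the example marks and the intraclass-correlation-style estimator are illustrative assumptions, not a prescribed procedure.

```python
import numpy as np

def variance_component_reliability(scores):
    """Estimate reliability as candidate score variance over total variance.

    `scores` is a candidates x items array of marks; the variance components
    are estimated from a one-way ANOVA with candidates as the grouping factor
    (a simple intraclass-correlation-style estimate).
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape                      # n candidates, k items each
    grand_mean = scores.mean()
    candidate_means = scores.mean(axis=1)

    # Mean squares between candidates and within candidates (error)
    ms_between = k * np.sum((candidate_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((scores - candidate_means[:, None]) ** 2) / (n * (k - 1))

    # Variance components: candidate score variance (CSVC) and error variance (EVC)
    csvc = max((ms_between - ms_within) / k, 0.0)
    evc = ms_within

    return csvc / (csvc + evc)               # Reliability = CSVC / (CSVC + EVC)

# Hypothetical marks for 5 candidates on 4 items
marks = [[8, 7, 9, 8],
         [5, 6, 5, 4],
         [9, 9, 8, 9],
         [4, 5, 4, 5],
         [7, 6, 7, 8]]
print(round(variance_component_reliability(marks), 3))
```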
Measuring reliability:
The literature describes several methods of determining the reliability of a test score; the two available approaches are:
-Multiple-administration method
-Single-administration method
In the multiple-administration approach, the same measure is applied at two different times and the correlation between the two administrations is analyzed statistically. In contrast, the single-administration approach requires administering two different forms of the same measure on a single occasion; the correlation is then calculated between the different forms of the measure.
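As a minimal illustration of the multiple-administration approach, the sketch below correlates the scores of the same candidates across two sittings of the same test; the score values are invented for demonstration.

```python
import numpy as np

# Hypothetical total scores for the same ten candidates on two administrations
# of the same test (multiple-administration / test-retest approach).
first_sitting  = np.array([62, 75, 58, 80, 69, 55, 71, 66, 78, 60])
second_sitting = np.array([65, 73, 60, 82, 70, 53, 74, 64, 76, 62])

# The Pearson correlation between the two administrations serves as the
# test-retest reliability coefficient.
r = np.corrcoef(first_sitting, second_sitting)[0, 1]
print(f"Test-retest reliability: {r:.2f}")
```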
For example, a test administered before a learning activity and again after the activity can be analyzed to arrive at a reliability figure consistent with the multiple-administration method. Alternatively, a test administered once can be divided into two halves (the split-half method), e.g. odd- and even-numbered items, and the correlation between the two halves used to analyze reliability.
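The split-half calculation can be sketched in the same way. The example below divides a candidates × items response matrix into odd- and even-numbered items, correlates the two half-scores, and applies the Spearman-Brown correction to step the estimate up to full test length; the correction step and all data are assumptions added for illustration.

```python
import numpy as np

def split_half_reliability(item_scores):
    """Split-half reliability from a single administration.

    `item_scores` is a candidates x items array; the test is split into
    odd- and even-numbered items, the two half-scores are correlated, and
    the Spearman-Brown formula adjusts the result to full-test length.
    """
    item_scores = np.asarray(item_scores, dtype=float)
    odd_half  = item_scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even_half = item_scores[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...

    r_halves = np.corrcoef(odd_half, even_half)[0, 1]
    return (2 * r_halves) / (1 + r_halves)         # Spearman-Brown correction

# Hypothetical responses: 6 candidates answering 8 items scored 0/1
responses = np.array([[1, 1, 0, 1, 1, 0, 1, 1],
                      [0, 1, 0, 0, 1, 0, 0, 1],
                      [1, 1, 1, 1, 1, 1, 1, 0],
                      [0, 0, 0, 1, 0, 0, 1, 0],
                      [1, 0, 1, 1, 0, 1, 1, 1],
                      [0, 1, 1, 0, 1, 0, 0, 1]])
print(round(split_half_reliability(responses), 2))
```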
Similarly, another approach, known as internal consistency, can be used to derive reliability from a single administration; it makes use of statistics such as Cronbach's alpha.
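A minimal sketch of Cronbach's alpha, computed directly from its standard formula on a hypothetical candidates × items score matrix, is shown below; the data and function name are illustrative assumptions.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha: an internal-consistency estimate of reliability.

    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores),
    computed from a candidates x items score matrix.
    """
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]                           # number of test items
    item_variances = item_scores.var(axis=0, ddof=1)   # variance of each item
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical scores: 6 candidates, 5 items each scored 0-4
scores = np.array([[4, 3, 4, 3, 4],
                   [2, 2, 1, 2, 2],
                   [3, 3, 3, 4, 3],
                   [1, 1, 2, 1, 1],
                   [4, 4, 3, 4, 4],
                   [2, 3, 2, 2, 3]])
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")
```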