Sunday 16 September 2012

ACHIEVING BENEFICIAL BACKWASH



·         Test abilities whose development is encouraged/wanted:   test what is important to test, not what is easy to test (Content Validity: if oral skills are to be encouraged, then test oral abilities) + give them enough weight relative to other activities (if they carry very few marks in exams, students will not bother preparing for them).
·         Sample widely/unpredictably:   when tests are predictable, students/teachers focus only on those areas. Test samples should cover a wide variety of the specifications (objectives).
·         Use direct testing:    content and results are more authentic in direct testing.
·         Make testing criterion-referenced:   students are motivated since only their abilities are measured instead of them being compared to others. Students taking a norm-referenced test already have a fair idea of their position and fear of failure in competition keeps them from improving.
·         Base achievement tests on objectives:    base tests on the course objectives rather than on the detailed syllabus content (the syllabus-content approach).
·         Ensure test is understood by students:   clear/explicit instructions
·         Provide necessary assistance to teachers:   a phonetics teacher (used to objective marking) should be given guidance before he is made to mark a subjective paper, such as one on the novel, because he might not be acquainted with the latter.
Practicality:     along with validity and reliability, it is desirable that a test be cheap to construct, administer, score and interpret. However, validity/reliability should not be sacrificed for the sake of practicality. Instead of lowering standards, the concerned authorities should plan in advance for the reasonable construction and administration of a valid and reliable test. Money allocated for comparatively less important issues could be directed towards this purpose.

RELIABILITY



·         Human beings don’t behave exactly the same way on every occasion, even when circumstances seem identical. The solution is to construct, administer, and score tests in such a way that the results are similar on different days/times (Reliability).
·         Reliability Coefficient:                  Two sets of scores obtained by the same group of students on the same test are compared (as with the validity coefficient). The ideal coefficient is 1 (a test which would give the same results for a particular group of students regardless of the day/time of the test). 0 means unconnected results (the score on one day doesn’t predict the score on another day). The coefficient to be expected differs from skill to skill (e.g. 0.9 for vocab, 0.7 for an oral test).  Test reliability (consistency of candidates’ performance from occasion to occasion) and Scorer reliability (scorer’s consistency) are interrelated: if scoring is unreliable, test results are also unreliable. Objective tests (e.g. those scored by computers) almost always give a higher scorer reliability coefficient than subjective tests (marked by a human), since computers are perfectly reliable (consistent) scorers.
·         How to make tests more Reliable:
Test Reliability
Ø  Take enough samples of behavior:    the more items (questions, passages, etc) we have on a test, the more reliable that test is, because many samples of behavior are more representative of a person’s true ability. However, tests shouldn’t be so long that students become bored or tired (their performance then stops being representative of their ability).
Ø  Restrict freedom of candidates:    students should not be given too many choices, because on a very broad subject the same student might produce different results on different occasions (writing an essay on tourism vs. writing an essay on tourism in Kashmir; views might change with changing conditions in Kashmir). However, too much restriction might distort the task.
Ø  Write unambiguous items:    don’t ask Qs that can be interpreted in two ways or that allow an answer different from the one anticipated by the examiner.
Ø  Provide clear/explicit instructions:   written/oral tests should have clear instructions so that students attempt the test in accordance with the scorer’s requirements.
Ø  Ensure legibility of tests:   students should not be burdened with unwanted tasks such as deciphering badly typed/handwritten tests (variation in font, spacing, print, etc).
Ø  Familiarity of candidates with format/testing techniques:    teachers should tell students in advance about the test format + their requirements
Ø  Uniform/non-distracting administration:    the greater the differences in administrative conditions, the greater the differences in test results
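The reliability coefficient described above can be illustrated with a short sketch. It assumes the coefficient is estimated as a Pearson correlation between the scores of the same candidates on two sittings of the same test (a common way to estimate it, though not the only one). The function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean


def reliability_coefficient(first_sitting, second_sitting):
    """Pearson correlation between two sets of scores from the same candidates.

    Scores must be listed in the same candidate order in both lists.
    """
    if len(first_sitting) != len(second_sitting) or len(first_sitting) < 2:
        raise ValueError("need two equally long lists with at least two scores")

    mean_a = mean(first_sitting)
    mean_b = mean(second_sitting)

    # Sum of products of deviations, and sums of squared deviations.
    covariation = sum((a - mean_a) * (b - mean_b)
                      for a, b in zip(first_sitting, second_sitting))
    spread_a = sum((a - mean_a) ** 2 for a in first_sitting)
    spread_b = sum((b - mean_b) ** 2 for b in second_sitting)

    if spread_a == 0 or spread_b == 0:
        raise ValueError("scores in a sitting must not all be identical")

    return covariation / sqrt(spread_a * spread_b)


# Hypothetical scores of five candidates on two different days.
day_one = [12, 15, 9, 18, 14]
day_two = [13, 14, 10, 17, 15]
print(round(reliability_coefficient(day_one, day_two), 2))
```

A value close to 1 means candidates are ranked and spaced almost identically on both days; a value near 0 means the first sitting tells us little about the second.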
Scorer Reliability
Ø  Use items that need more objective marking:   fill in the blanks, MCQs, and where these are not possible (such as comprehension tasks), ask direct/unambiguous questions
Ø  Make direct comparison between students:   similar to restricting the freedom of candidates. Scoring everyone on one topic is more reliable than giving students the choice to write on any one of four topics and then comparing results.
Ø  Scoring key:   the examiner should anticipate all the different answers/approaches of students. The scoring key should clearly state the points awarded for totally correct and partially correct answers.
Ø  Train scorers:   scorers’ scoring patterns should be analyzed from time to time to ensure consistency
Ø  Identify candidates by number:   candidates’ names/photographs should not be visible while marking, to ensure objective scoring (a scorer may be prejudiced by certain names, nationalities, or genders).
Ø  Employ multiple/independent scoring:   type of syndicate marking. This way, a senior colleague can investigate discrepancies between scorers
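Multiple independent scoring can be sketched as a simple check that lists the scripts a senior colleague should look at. This is only an illustration: the function name, the tolerance of 2 marks, and the scores are assumptions, and candidates are identified by number as recommended above.

```python
def flag_discrepancies(scores_a, scores_b, tolerance=2):
    """Return candidate numbers whose two independent scores differ by more than `tolerance`.

    `scores_a` and `scores_b` map candidate number -> mark given by each scorer.
    Only candidates marked by both scorers are compared.
    """
    flagged = []
    for candidate in sorted(scores_a.keys() & scores_b.keys()):
        if abs(scores_a[candidate] - scores_b[candidate]) > tolerance:
            flagged.append(candidate)
    return flagged


# Hypothetical marks from two scorers working independently.
scorer_one = {101: 14, 102: 9, 103: 17}
scorer_two = {101: 15, 102: 13, 103: 16}
print(flag_discrepancies(scorer_one, scorer_two))  # [102]
```

How large a difference counts as a discrepancy is a local decision; the tolerance here is arbitrary.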
·         Reliability and Validity:    a valid test must be reliable, but a reliable test may not be valid, e.g. a writing test requiring candidates to write down translation equivalents of 500 words might be reliable (results will be the same/similar on different occasions), but not valid as a test of writing (it doesn’t really measure a student’s writing ability).