This article throws light upon the nine main requisites of a good test of intelligence. The requisites are: 1. A Clear Conceptual Frame 2. Testing Conditions 3. Item Characteristics 4. Psychometric Characteristics 5. Range of Difficulty Level 6. Objectivity of Scoring 7. Reliability 8. Validity 9. Norms.
Requisite # 1. A Clear Conceptual Frame:
Intelligence testing has become an extensive practice, and there has been almost a mushroom growth of intelligence tests. Nevertheless, there is still no definition of intelligence on which a large majority of psychologists agree and which is also comprehensive, taking into account the different operations or abilities involved in intelligent action.
In view of this, a person using a particular test must be clear as to what aspects of intelligence the test measures: whether it measures ‘g’, a group factor, or a specific ability such as perception. When interpreting a test score, one should also ask whether the test provides a reasonably accurate score when used with children who live in a disadvantaged milieu. There is ample and consistent evidence that test scores are affected by socio-economic variations.
Requisite # 2. Testing Conditions:
All of us can do our best in any sphere of activity only if the conditions are congenial. One may be an expert driver, but the expression of this expertise is affected by traffic conditions and by the driving competence of others on the road. The subject should be tested only under conditions in which he or she is comfortable, relaxed, and free from tension and from anxiety about what would happen if he or she did not perform well.
The purpose and implications of the testing should be made clear. Further, if a number of persons are being tested, the conditions under which the testing is done should be constant and uniform for all of them. The physical conditions, including temperature, seating arrangements, ventilation, and freedom from noise, should all be carefully arranged and set up.
Requisite # 3. Item Characteristics:
The wording of items, whether questions, problems, or puzzles, should be clear and unambiguous. Where verbal items are involved, the language should be simple and easily comprehensible to the persons taking the test. Ambiguity refers to a lack of clarity: when two people read an item independently, they should get the same meaning.
There are certain other characteristics, such as the total number of items to be answered, the length of the test, and consequently the time taken. An extremely long test with a large number of items takes a long time, which may result in fatigue and even loss of motivation, especially when the person reaches the later stages of the test. This can be a serious problem because in most tests the items are arranged in increasing order of difficulty, the more difficult items being placed towards the end.
We can now understand why very long tests may not be effective: if the test is too long, the person may be fatigued and uninterested by the time he or she reaches the more difficult items.
Yet it is precisely at this stage that he or she has to be most attentive, energetic, and motivated. A very short test, on the other hand, may not do justice to the real ability of the individual, since it gives little room to translate all his or her potentialities into action.
Further, where comparisons have to be made, as when an intelligence test is used to screen a few people from a large number of aspirants for jobs or for admission to educational programmes, a very short test may not clearly discriminate or bring out the real differences in the intelligence of the people being tested. The relation of the length of a test to what is known as its reliability has been researched extensively.
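One well-known way of expressing the relation between length and reliability is the Spearman-Brown formula, which predicts the reliability of a test whose length is multiplied by a factor k from the reliability r of the original test: r_k = k r / (1 + (k - 1) r). The sketch below is only an illustration. The function name and the sample reliability of 0.60 are hypothetical, and the formula assumes that the items added or removed are comparable to the existing ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when the test length is multiplied by length_factor.

    reliability   -- reliability of the original test, between 0 and 1
    length_factor -- new length divided by old length (2 doubles, 0.5 halves)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical test with a reliability of 0.60.
doubled = spearman_brown(0.60, 2)    # test made twice as long
halved = spearman_brown(0.60, 0.5)   # test cut to half its length
print(round(doubled, 2), round(halved, 2))
```

Under these assumptions the doubled test is predicted to be more reliable than the original and the halved test less reliable, which is one reason why a very short test may fail to discriminate well.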
Requisite # 4. Psychometric Characteristics:
Some characteristics are essential for ensuring correct and meaningful measurement of human behaviour, and an intelligence test is no exception. These characteristics are called psychometric because they are expressed in quantitative terms and because they are equally essential for other psychological tests.
Requisite # 5. Range of Difficulty Level:
The items in an intelligence test should reflect a wide range of difficulty levels, from very easy items to very difficult items. Easy items, with a low level of difficulty, are necessary to measure low levels of intelligence.
For example, if all the items are difficult and a person who takes the test does not pass any of them, the result is a score of zero. This would be a ridiculous situation because no normal person can have an IQ of zero. Secondly, items of low difficulty, if placed at the beginning, help the individual to sense the possibility of success and build up his or her confidence. Such easy items may also help the person to get oriented to the test.
Difficult items are essential to discriminate between those who are more intelligent and those who are less intelligent. If all the items are successfully completed by most, if not all, people, then individual differences cannot be brought out. The reader may wonder what the range of difficulty levels should be.
This is a question for which a categorical answer cannot be given. The required range of difficulty depends on the persons for whom the test has been developed. For example, if a school meant for gifted or very bright children decides to use an intelligence test to select 30 students for admission out of about 200 applicants, then there should be a greater proportion of difficult items, including items with a high level of difficulty.
On the other hand, if the authorities of an ordinary school decide to study the pattern of distribution of the IQs of their students, then there should be a proper representation of easy items, moderately difficult items, and difficult items, and probably very few items of very high difficulty. The difficulty level of an item is decided by the percentage of people passing it at a trial testing held before the finalisation of the test.
A range which is usually followed is to have items passed by 90% of the trial group (low difficulty) at one end, ranging up to items passed by not more than 10% of the group (high difficulty) at the other. Of course, the items should preferably be distributed over the varying levels of difficulty in between. As may be seen, there cannot be any categorical rule in this regard.
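The procedure described above, in which difficulty is taken as the proportion of the trial group passing each item, can be sketched in a few lines. This is a minimal illustration only. The trial data, the function names, and the treatment of the 10% and 90% limits as inclusive are all assumptions made for the example.

```python
def proportion_passing(responses):
    """Proportion of the trial group passing each item.

    responses -- one row per person; each row holds 1 (pass) or 0 (fail) per item
    """
    n_people = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_people for i in range(n_items)]


def within_usual_range(p_values, low=0.10, high=0.90):
    """True for each item whose pass rate lies between low and high (inclusive)."""
    return [low <= p <= high for p in p_values]


# Hypothetical trial testing: 10 persons, 4 items.
trial = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]

p_values = proportion_passing(trial)
print(p_values)                      # pass rate of each item
print(within_usual_range(p_values))  # the last item is passed by everyone
```

In this made-up data the fourth item is passed by every person in the trial group, so it cannot bring out any individual differences and would be a candidate for revision or removal.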
Requisite # 6. Objectivity of Scoring:
An important requirement is that there should be only one correct answer to every item, and if the tests are scored by different people, all of them should agree, without any doubt, that the designated correct answer is really the correct answer. Test responses scored independently by different evaluators must result in the same scores. Of course, in the case of some tests, the authors award partial scores to a partially correct answer. But here again clear guidelines are essential.
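A simple check on objectivity is to have two evaluators score the same answer sheets independently and then compare the results. The sketch below is a rough illustration using hypothetical scores and function names. It counts exact agreement only and is not a substitute for formal statistical indices of scorer agreement.

```python
def scoring_disagreements(scores_a, scores_b):
    """Positions of the answer sheets on which the two scorers differ."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both scorers must score the same answer sheets")
    return [i for i, (a, b) in enumerate(zip(scores_a, scores_b)) if a != b]


def agreement_rate(scores_a, scores_b):
    """Fraction of answer sheets given exactly the same score by both scorers."""
    n_sheets = len(scores_a)
    n_differ = len(scoring_disagreements(scores_a, scores_b))
    return (n_sheets - n_differ) / n_sheets


# Hypothetical total scores given by two scorers to the same five sheets.
scorer_a = [12, 15, 9, 20, 17]
scorer_b = [12, 15, 10, 20, 17]

print(scoring_disagreements(scorer_a, scorer_b))  # sheets that need review
print(agreement_rate(scorer_a, scorer_b))
```

For a fully objective test the list of disagreements would be empty. Any sheet that appears in it points to an item whose scoring guidelines need to be made clearer.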
Requisite # 7. Reliability:
Reliability refers to that characteristic of the scores on a test which allows one to conclude that the score obtained by an individual will not vary much over a short period of time if other conditions remain the same. For example, if an individual’s IQ is indicated by his or her score as 115 and the test is repeated after a few days, the score on the second occasion may be a few points less or a few points more.
But if on the second occasion the estimated IQ turns out to be much lower or much higher, then it is clear that the scores obtained on the test are not stable. If this is the case, then no major decision can be made on the basis of the score. For example, if a student is denied admission to a course on the basis of the score on such a test, then there will be a serious problem.
The same will be the case if a student is admitted on the basis of a score which is not reliable. Here a really bright student may be overlooked in favour of a student whose score is higher but unreliable. The problem will be much more serious if the test involved is for selection for jobs.
Still more serious will be the case if an unreliable score indicates a low IQ and the person is consequently considered not normally developed. In many instances, the progress of psychological treatment for disorders or disabilities is monitored by scores on intelligence tests, and if the scores are not reliable, the picture of progress they give may be misleading.
Thus one can see how important ‘reliability’ is. When we discuss reliability, there are two considerations. The first is that the scores obtained should be stable and not change much over a reasonable period of time. The other is that the score should be free from errors in the test, errors in scoring, and errors arising from unsatisfactory conditions under which the test was administered. There are different statistical methods of estimating reliability.
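One common way of estimating the stability aspect of reliability is the test-retest method: the same test is given twice to the same group, and the two sets of scores are correlated. The sketch below computes a Pearson correlation by hand for this purpose. The scores and the function name are hypothetical, and this is only one of several available methods.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between scores from two administrations of a test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_1) * (y - mean_2) for x, y in zip(first, second))
    ss_1 = sum((x - mean_1) ** 2 for x in first)
    ss_2 = sum((y - mean_2) ** 2 for y in second)
    return cross / math.sqrt(ss_1 * ss_2)


# Hypothetical IQs of five persons tested twice, a few days apart.
first_testing = [115, 98, 130, 105, 88]
second_testing = [113, 101, 128, 107, 90]

print(round(pearson_r(first_testing, second_testing), 3))
```

A coefficient close to 1 indicates that individuals keep roughly the same relative position on both occasions, as in this made-up example where each score shifts by only a few points. A low coefficient would mean the scores are too unstable to support important decisions.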
Requisite # 8. Validity:
The term validity refers to the correctness of the test. If a thermometer is used to measure body temperature it must measure body temperature and not something else. Similarly, if a test is designed to measure intelligence in terms of a conceptual description of intelligence, the test must really test it.
Thus if the author of a test defines intelligence as ability for abstract thinking, then the test must measure ability for abstract thinking. Let us take our examinations. While they may measure intelligence to some extent, they also measure language ability, ability to remember, speed of writing, amount of reading done, etc. While the latter may be important, they are not essential components of intelligence. Difficulty levels, reliability, and the validity of a test need to be established before the test form is finalised.
Requisite # 9. Norms:
The intelligence of a person is not considered a fixed characteristic. It is not something fixed or absolute like a person’s height. In the earlier days of intelligence testing, when the Stanford-Binet tests were the only tests, the concept of mental age was used, implying that the intelligence of a person was something relatively fixed and true of him or her. Over subsequent years, the concept of a fixed and absolute intelligence has given place to a comparative or relative measure of intelligence.
All intelligence tests today report IQs as representing the standing or position of an individual in comparison with others belonging to a similar group. Thus a person whose IQ is reported as 130, and who is regarded as bright, may not be so regarded if his or her scores were compared with the standard performance of some other group. Such standards of comparison, against which an individual’s performance is compared and interpreted, are known as Norms.
Tests have different kinds of norms: sex-wise norms, age-wise norms, socioeconomic and educational level-wise norms, etc. Against this background, it is obvious that the score of a person should be evaluated only against the norms of the group to which he or she belongs, as otherwise the assessment can be very erroneous.
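The dependence of an IQ on the norm group can be illustrated with a small calculation. The sketch below assumes the widely used convention of a deviation IQ scaled to a mean of 100 and a standard deviation of 15, which the discussion above does not itself specify. The raw score, the two norm groups, and the function name are hypothetical.

```python
def deviation_iq(raw_score, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Express a raw score as an IQ relative to the norms of a given group.

    norm_mean, norm_sd   -- mean and standard deviation of raw scores in the norm group
    scale_mean, scale_sd -- assumed IQ scale (mean 100, SD 15 by convention)
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw_score - norm_mean) / norm_sd  # standing relative to the norm group
    return scale_mean + scale_sd * z


# The same hypothetical raw score of 60 judged against two different norm groups.
iq_group_a = deviation_iq(60, norm_mean=40, norm_sd=10)
iq_group_b = deviation_iq(60, norm_mean=55, norm_sd=10)
print(iq_group_a, iq_group_b)
```

The same performance yields a much higher IQ against the first set of norms than against the second, which is why a score should be interpreted only against the norms of the group to which the person belongs.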
This discussion of the requisites was felt necessary because intelligence tests have spread widely and are used for very important purposes. While intelligence testing and the measurement of intelligence can definitely be helpful, there are a lot of intricacies involved.
One cannot produce a test of intelligence overnight. It takes years and a lot of effort and care to develop a sound test of intelligence, and even when so developed, there is a constant need for regular review and improvement of the test. In view of this, the student can appreciate the need to exercise great care in interpreting test scores and in taking decisions on the basis of IQ scores alone.
There is probably no decision that can be taken entirely on the basis of measured IQ value, except perhaps where the scores are repeatedly found to be low, in the case of individuals who have not developed normally. In choosing an intelligence test one should ensure that the given requisites are satisfied.