Methodology of Test Development for ISAI Tests
ISAI tests are based on classical test theory.
Classical test theory (CTT) has been the foundation for measurement theory for over 80 years. The conceptual foundations, assumptions, and extensions of the basic premises of CTT have allowed for the development of some excellent psychometrically sound scales.
Classical test theory (CTT) is based on the assumption that the raw score (X) obtained by any one individual is made up of a true component (T) and a random error (E) component:
X = T + E.
The true score of a person can be found by taking the mean score that the person would get on the same test if they had an infinite number of testing sessions. Because it is not possible to obtain an infinite number of test scores, T is a hypothetical, yet central, aspect of CTT.
There are several variants of CTT. The foundation for them all rests on aspects of a total test score made up of multiple items. Domain sampling theory is the variant most commonly used for practical purposes. It assumes that the items selected for any one test are just a sample of items from an infinite domain of potential items. Parallel test theory assumes that two or more tests with different domains sampled (i.e., each is made up of different but parallel items) will give similar true scores but have different error scores.
For example, suppose one were to administer the Analytical Ability test to a candidate every day for two years. Sometimes his score would be higher and sometimes lower (maybe he was tired or distracted). The average of his raw scores (X), however, would be the best estimate of his true score (T). The random errors around his true score are assumed to be normally distributed with an expected value of 0 (i.e., the mean of the distribution of errors over an infinite number of trials is 0). In addition, those random errors are uncorrelated with each other; that is, there is no systematic pattern to why his scores would fluctuate from time to time. Finally, those random errors are also uncorrelated with the true score, T; that is, there is no systematic relationship between a true score (T) and whether that person will have positive or negative errors. All of these assumptions about the random errors form the foundations of CTT. An important point to remember is that the theory of true and error scores developed over multiple samplings of the same person (i.e., the same candidate taking the test 1,000 times) can be extended to a single administration of an instrument over multiple persons (i.e., administering the test once to a group of 1,000 different people). Thus it is possible to collect data once (single administration) on a sample of individuals (multiple persons).
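The thought experiment above can be sketched as a small simulation. This is only an illustration under stated assumptions: the true score of 70, the error standard deviation of 5, the number of sessions, and the helper name `simulate_scores` are hypothetical values chosen for the example and are not ISAI data.

```python
import random
import statistics


def simulate_scores(true_score, error_sd, n_sessions, seed=0):
    """Return n_sessions observed scores X = T + E.

    E is drawn from a normal distribution with mean 0 and standard
    deviation error_sd (the CTT assumption about random error).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_sessions)]


# Hypothetical candidate: true score 70, error SD 5, 10,000 sessions.
scores = simulate_scores(true_score=70.0, error_sd=5.0, n_sessions=10_000)

# The mean of the observed scores estimates the true score T.
estimated_true_score = statistics.fmean(scores)

# The errors (X - T) should average out close to 0.
mean_error = statistics.fmean(x - 70.0 for x in scores)

print(round(estimated_true_score, 2), round(mean_error, 2))
```

With more sessions the estimate settles closer to the true score, which is why T is defined over an infinite number of testing sessions.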
Based on the assumption of the 'True score' and the 'Error score', the usefulness of an ISAI test is assessed through the following:
1. The Standard Error of Measurement and Reliability Index - The standard deviation of the distribution of random errors around the true score is called the standard error of measurement. The lower it is, the more tightly the observed scores cluster around the true score. Instead of one person taking the test for two years to obtain an estimate of the standard error, the test can be given once to 1,000 people to obtain a standard error of measurement that generalizes to the population. The underlying equation is as follows:
VAR(X) = VAR(T) + VAR(E).
From the above equation it follows that the proportion of the observed-score variance VAR(X) that is due to true-score variance VAR(T) provides the reliability index of the test:
VAR(T) / VAR(X) = R.
When the variance of true scores is high relative to the variance of the observed scores, the reliability (R) of the measure will be high (e.g., 50/60 = 0.83), whereas if the variance of true scores is low relative to the variance of the observed scores, the reliability (R) of the measure will be low (e.g., 20/60 = 0.33). Reliability values range from 0.00 to 1.00.
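The two worked ratios above, and the standard error of measurement they imply, can be reproduced with a minimal sketch. The function names are hypothetical, and the standard error is taken as the square root of VAR(E) = VAR(X) - VAR(T), which follows directly from the variance equation given earlier.

```python
import math


def reliability(var_true, var_observed):
    """Reliability index R = VAR(T) / VAR(X)."""
    if var_observed <= 0:
        raise ValueError("observed-score variance must be positive")
    return var_true / var_observed


def standard_error_of_measurement(var_true, var_observed):
    """Standard deviation of the errors: sqrt(VAR(X) - VAR(T))."""
    return math.sqrt(var_observed - var_true)


# The two examples from the text.
high = reliability(50, 60)  # true-score variance is most of the total
low = reliability(20, 60)   # true-score variance is a small share

sem_high = standard_error_of_measurement(50, 60)
sem_low = standard_error_of_measurement(20, 60)

print(round(high, 2), round(low, 2))
```

The more reliable measure also has the smaller standard error of measurement, since less of its observed variance is error variance.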
2. Internal Consistency Reliability - For ISAI tests the internal consistency of each test is checked. In internal consistency reliability estimation, a single measurement instrument is administered to a group of people on one occasion to estimate reliability. In effect, the reliability of the instrument is judged by estimating how well the items that reflect the same construct yield similar results, that is, how consistent the results are for different items measuring the same construct within the measure. Two ways in which internal consistency is calculated are:
- Homogeneity (alpha) - As a measure of scale internal consistency, Cronbach's coefficient alpha is calculated, which is essentially the average of all possible split-half reliabilities.
- Kuder-Richardson Formula 20 - A measure of internal consistency reliability for measures with dichotomous choices.
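Both coefficients can be computed from a single administration, as in the sketch below. It uses the standard textbook formulas, assumes population variances throughout (some texts use sample variances, which changes the values slightly), and runs on a small made-up response matrix; none of the data or function names come from ISAI.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per person."""
    k = len(responses[0])  # number of items
    item_vars = [
        statistics.pvariance([row[i] for row in responses]) for i in range(k)
    ]
    total_var = statistics.pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(responses):
    """Kuder-Richardson Formula 20 for items scored 0 (wrong) or 1 (right)."""
    n = len(responses)
    k = len(responses[0])
    # p = proportion answering each item correctly, q = 1 - p
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = statistics.pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: 5 test takers, 4 right/wrong items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(cronbach_alpha(answers), kr20(answers))
```

For dichotomous items the variance of an item equals p times q, so the two functions agree on this data; alpha is the more general form because it also accepts multi-point items.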
3. Validity - For ISAI tests the validity of each test is estimated. Validity is a judgment of the degree to which a test assesses what it purports to assess. Where validity is expressed as a correlation coefficient, it can take any value between -1 and +1. Validity is assessed through the following methods:
A) Content Validity - Match between the test questions and the content to be assessed. Evidence comes from content experts.
B) Criterion Validity - Match between a test score and some desired outcome. Evidence comes from correlating test scores with the criterion (which is some measure of performance, such as interview performance or performance during training).
C) Construct Validity - Degree to which a test assesses the underlying theoretical construct. Example: a numerical ability test whose questions are phrased in long and complex reading passages inadvertently assesses reading skills instead of knowledge of basic mathematics. Evidence comes from correlating two tests that measure closely related constructs.
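The correlational evidence described for criterion validity can be sketched with a plain Pearson correlation. The test scores and training ratings below are invented for illustration, and `pearson_r` is a hypothetical helper written out in full rather than taken from any ISAI tooling.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: test scores and later training performance ratings
# for the same six candidates.
test_scores = [52, 61, 70, 74, 83, 90]
training_ratings = [2.9, 3.1, 3.6, 3.4, 4.2, 4.5]

validity_coefficient = pearson_r(test_scores, training_ratings)
print(round(validity_coefficient, 2))
```

A coefficient near +1 would indicate that higher test scores go with better performance on the criterion; a value near 0 would indicate that the test says little about it.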
4. Descriptive Statistics - For ISAI tests a crucial step is to calculate descriptive statistics such as the mean and standard deviation of each item, which can provide clues about which items will be useful and which will not.
a) Mean - The average score of a group of test takers.
b) Standard Deviation - A measure of the "spread" of scores around the mean in a group of test takers.
For example, if the variance of an item is low, there is little variability on the item and it may not be useful. If the mean response to an item is 4.5 on a 5-point scale, then the item is negatively skewed and may not provide the kind of information needed. Generally, the higher the variability of the item and the closer the mean of the item is to the center point of the distribution, the better the item will perform.
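The item screening described above can be sketched as follows. The ratings are made up, the 1-to-5 scale matches the example in the text, and the `min_sd` cut-off of 0.25 is an arbitrary assumption used only to show how a flag might be set; it is not an ISAI standard.

```python
import statistics


def item_summary(responses, scale_min=1, scale_max=5, min_sd=0.25):
    """Return the mean, SD and simple screening flags for each item.

    responses: one row per test taker, one column per item.
    min_sd: hypothetical threshold below which an item counts as
            having too little variability.
    """
    midpoint = (scale_min + scale_max) / 2
    summary = []
    for i in range(len(responses[0])):
        column = [row[i] for row in responses]
        mean = statistics.fmean(column)
        sd = statistics.pstdev(column)
        summary.append({
            "item": i + 1,
            "mean": mean,
            "sd": sd,
            "distance_from_midpoint": abs(mean - midpoint),
            "low_variability": sd < min_sd,
        })
    return summary


# Hypothetical ratings from 4 test takers on 3 items (5-point scale).
ratings = [
    [5, 1, 3],
    [4, 2, 3],
    [5, 5, 3],
    [4, 4, 3],
]

summary = item_summary(ratings)
for row in summary:
    print(row)
```

In this made-up data the first item has a mean of 4.5 (far from the midpoint of 3), the second is centered with a wide spread, and the third has no variability at all.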
All ISAI tests conform to the psychometric properties of validity and reliability. Certain standards for these properties are required of such tests, especially those measuring skills, abilities and personality. Knowledge tests are domain specific (e.g. Computer Skills) and as such are less sensitive to these test properties than the others.
ISAI Tests vis-a-vis similar tests of international repute
Communication Tests in English: ISAI Tests compared to TOEFL
The ISAI Tests in English Communication are the Spoken English Test, Written English Test and the cognitive Verbal Ability Test. The TOEFL is a battery of tests measuring similar components in varying measures and forms.
| TOEFL section | TOEFL test | Duration | No. of questions | Parameter(s) evaluated | Equivalent ISAI test |
|---|---|---|---|---|---|
| Speaking | Expressing an opinion on a familiar topic | 20 minutes | 2 (tasks) | Spoken English skills | ASET (10-15 minutes) |
| Writing | Writing based on what is read to | 50 minutes | 2 (tasks) | Written English skills | WET (E-mail) - 3 e-mails x 7 minutes = approx. 20 minutes |
| Writing | Supporting an opinion on a topic | as above | as above | as above | WET (Descriptive writing) - 2 essays x 15 minutes = 30 minutes |
| Writing | Answer multiple-choice questions | 25 minutes | 40 | Structure and grammar | VAT 4.0 (Standard form) |
| Listening | Listening to clips and answering questions | 30-40 minutes | 50 | Ability to understand short and long conversations, English vocabulary and idioms | LCT Short Conversations and Passages (3 LCTs x 15 questions each = 45 questions) |
| Reading | Answering questions based on a given passage | 55 minutes | 50 | Ability to understand non-technical information from a written text | RC, VAT (10 RCs x 5 questions each = 50 questions) |
Personality Test: GPQ in comparison to 16PF
Introduction:
I) The 16PF Questionnaire (16PF) is a personality test that assesses an individual on 16 personality factors. The 16PF model was developed by Raymond Cattell. The 16 personality factors are shown below:
| Factor | Factor name | Low score description | High score description |
|---|---|---|---|
| A | Warmth | Reserved, Critical | Outgoing, Easy going, Warm hearted |
| B | Reasoning | Concrete thinking | Bright, More intelligent |
| C | Emotional Stability | Affected by feelings, Easily upset | Emotionally stable, Faces reality |
| E | Dominance | Humble, Accommodating, Conforming | Assertive, Independent, Aggressive |
| F | Liveliness | Sober, Serious | Happy-go-lucky, Enthusiastic, Impulsive |
| G | Rule Consciousness | Evades rules, Expedient | Conscientious, Rule bound |
| H | Social Boldness | Shy, Restrained, Timid | Venturesome, Socially bold |
| I | Sensitivity | Tough-minded, Self reliant, No nonsense | Tender minded, Over protected |
| L | Vigilance | Trusting, Free of jealousy | Suspicious, Hard to fool |
| M | Abstractedness | Practical, Careful, Conventional | Imaginative, Careless of practical matters |
| N | Privateness | Forthright | Shrewd, Calculating |
| O | Apprehension | Placid, Confident | Apprehensive, Worrying |
| Q1 | Openness to Change | Conservative, Tolerant of tradition | Experimenting, Liberal, Analytical |
| Q2 | Self Reliance | Group dependent, Sound follower | Self sufficient, Resourceful |
| Q3 | Perfectionism | Undisciplined, Careless of protocol | Controlled, Socially precise |
| Q4 | Tension | Relaxed, Unfrustrated | Tense, Frustrated |
The 16PF Questionnaire also evaluates Personality in terms of Five Global Factors. Each of these Global Factors is a combination of some of the 16 Factors. The five Global Factors are shown below:
1. Relating to Others (Extroversion)
- Warmth (Factor A)
- Liveliness (Factor F)
- Social Boldness (Factor H)
- Forthright (Factor N)
- Group Orientation (Factor Q2)
2. Management of Pressure (Anxiety)
- Emotionally Stable (Factor C)
- Trusting (Factor L)
- Self Assured (Factor O)
- Relaxed (Factor Q4)
3. Thinking Style (Tough Mindedness)
- Warm (Factor A)
- Sensitive (Factor I)
- Abstractedness (Factor M)
- Openness to change (Factor Q1)
4. Influence and Collaboration (Independence)
- Deferential (Factor E)
- Timid (Factor H)
- Trusting (Factor L)
- Traditional (Factor Q1)
5. Structure and Flexibility (Self Control)
- Lively (Factor F)
- Expedient (Factor G)
- Abstractedness (Factor M)
- Tolerates Disorder (Factor Q3)
II) The Generic Personality Questionnaire (GPQ) has been designed to assess a test taker on 5 personality traits. Applied in corporate, business and personal situations, the Generic Personality Questionnaire can lead to professional and personal insights.
The Generic Personality Questionnaire has been conceptualized along the lines of the Five Factor Model, which is widely used in the fields of organizational behavior and human resource management. This model is designed to measure 5 major dimensions of normal personality, as listed below:
1. Conscientiousness - Dependable, hardworking, organized, self disciplined, persistent, responsible
2. Emotional Stability - Calm, secure, happy, unworried
3. Agreeableness - Cooperative, warm, caring, good natured, courteous, trusting
4. Extraversion - Sociable, outgoing, talkative, assertive, gregarious
5. Openness to experience - Curious, intellectual, creative, cultured, artistically sensitive, flexible, imaginative
Description of Items:
I) The 16PF contains 185 items that make up the 16 primary personality factor scales. Each scale contains 10-15 items with a three-choice response format. The questionnaire takes 30-50 minutes to complete.
II) The GPQ contains 50 items that make up the 5 dimensions. Each scale contains 10 items with a five-choice response format. The questionnaire takes 30 minutes to complete.
Similarity between the 16PF and the GPQ Framework:
In the validation of the 16PF, the test was compared to the NEO Personality Inventory, Revised (NEO PI-R). The NEO PI-R is designed to measure the 5 major dimensions of the Five Factor Model (which is the basis on which the GPQ was developed): Neuroticism, Extraversion, Openness, Conscientiousness, and Agreeableness.
The 16PF and NEO PI-R were administered to 257 US university undergraduates. The findings of the study were as follows:
- Predictably, the 16PF Extraversion Global Factor correlated with the NEO Extraversion facets (r = 0.43).
- The Anxiety Global Factor correlated with the Neuroticism domain (r = 0.61).
- The Tough Mindedness Global Factor correlated negatively with the Openness facets.
- The Independence Global Factor correlated with the NEO Extraversion facets, especially with Assertiveness (r = 0.60).
- The Anxiety Global Factor also correlated negatively with the NEO facet of Trust (r = -0.44), suggesting that high anxiety is associated with skepticism and distrust.
From the above it can be summarized that the Five Global Factors of the 16PF can be compared to the Five factors of the GPQ, as summarized in the table below:
| 16PF Global Factors | GPQ Parameters | Descriptive characteristics of 'High' scorers |
|---|---|---|
| Structure and Flexibility (Self Control) | Conscientiousness | Dependable, hardworking, organized, self disciplined, persistent, responsible |
| Management of Pressure (Anxiety) | Emotional Stability | Calm, secure, happy, unworried |
| Thinking Style (Tough Mindedness) | Agreeableness | Cooperative, warm, caring, good natured, courteous, trusting |
| Relating to Others (Extroversion) | Extraversion | Sociable, outgoing, talkative, assertive, gregarious |
| Influence and Collaboration (Independence) | Openness to experience | Curious, intellectual, creative, cultured, artistically sensitive, flexible, imaginative |
Cognitive Abilities Tests: Numerical Ability, Analytical Ability & Verbal Ability vis-à-vis the Differential Aptitude Test (DAT) battery
The DAT battery is renowned worldwide for testing aptitude in various aspects of abilities and reasoning as detailed herein below:
Subtests Help Measure Aptitude for Success
- Verbal Reasoning - is appropriate for measuring general cognitive ability and for placing employees in professional, managerial, and other positions of responsibility requiring higher order thinking skills.
- Numerical Ability - tests the understanding of numerical relationships and facility in handling numerical concepts. It is a good predictor of the success of applicants in such fields as mathematics, physics, chemistry, engineering, and in occupations such as laboratory assistant, bookkeeper, statistician, shipping clerk, carpenter, tool-making, and other professions related to the physical sciences.
- Abstract Reasoning - is a nonverbal measure of the ability to perceive relationships in abstract figure patterns. Useful in selection when the job requires perception of relationships among things rather than among words or numbers, such as mathematics, computer programming, drafting, and automobile repair.
- Clerical Speed and Accuracy (paper administration only) - measures the speed of response in a simple perceptual task. This is important for jobs such as filing and coding, and for jobs involving technical and scientific data.
- Mechanical Reasoning - closely parallels the Bennett Mechanical Comprehension Test and measures the ability to understand basic mechanical principles of machinery, tools, and motion. It is useful in selection decisions about applicants for jobs such as carpenter, mechanic, maintenance worker, and assembler.
- Space Relations - measures the ability to visualize a three dimensional object from a two dimensional pattern, and how this object would look if rotated in space. This ability is important in fields such as drafting, clothing design, architecture, art, die making, decorating, carpentry, and dentistry.
- Spelling (paper administration only) - measures an applicant's ability to spell common English words, a basic skill necessary for success in a wide range of jobs including business, journalism, proofreading, advertising, or any occupation involving written language.
- Language Usage - measures the ability to detect errors in grammar, punctuation, and capitalization. When Language Usage and Spelling are both administered, they provide a good estimate of the ability to distinguish correct from incorrect English usage, which is important in business communication.
ISAI tests of cognitive abilities, i.e. the Verbal Ability Test, Analytical Ability Test and Numerical Ability Test, address most of the subtests of the DAT, especially the cognitive ones, as required for testing candidates' aptitude for career development and recruitment in the Indian socio-economic space. These tests can be used flexibly, as requirements dictate, to identify candidates for hiring, training, and career development, and they provide clear indications of a candidate's strengths and weaknesses.
Each of the tests is developed for ISAI to international standards by internationally acclaimed organization(s) and is supported by strong psychometric properties, as established by the validity and reliability measures.