Test norms
need to be specific to the user and the test context. This should be obvious,
yet it is often ignored, perhaps because of the expense involved. What happens if
norms are not specific?
1. A very important aspect is faking.
Faking is abundant among job applicants. If norms are collected from incumbents
or, even worse, the population at large, test scores can be grossly misleading.
The reason is that many applicants fake, so the distribution of their test
scores is shifted towards a higher mean than that of incumbents, who fake very
little or not at all. As a consequence, applicants' test scores will be systematically
overestimated when judged against such norms. On a stanine scale, the error could
easily be 2 or 3 steps. This problem can be greatly mitigated by a correction
procedure based on one or several scales that measure the tendency to respond in a
socially desirable manner. In our data, about 95 % of the effect is eliminated this way. Note,
however, that the correction model must be scale specific since scales are
usually not equally vulnerable to distortion.
2. Test
scores may depend strongly on the organizational context. In some contexts,
independence is not a desired trait, and people will on average have low
scores on this trait. Another example is perseverance in the face of failure. If
failure is rarely obvious, test takers will report low perseverance. For
reasons such as these, norms need to be specific to the organization.
It is not
excessively demanding to construct specific norms, given modern information technology,
and a sample as small as 300, or in some cases even 120, is sufficient.
The first step is to realize the importance of specific norms, of norms
corrected for impression management if they are based on incumbents or the population
at large, and the fact that the sample size can be fairly small. In our practice,
we work with such norms, but many Swedish test providers seem unaware of the
issue and of the fact that the problems can be solved with relatively modest resources.
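
One way to see why samples of this size can be adequate is to look at how precisely they pin down the norm mean. The sketch below is an illustration under simplifying assumptions: it assumes a random sample from the relevant group, it considers only the mean and not the tails of the distribution, and the helper name `norm_mean_standard_error` is hypothetical. It uses the standard formula SD / sqrt(n) and the fact that stanines are defined with an SD of 2.

```python
from math import sqrt

STANINE_SD = 2.0  # stanines are defined with mean 5 and SD 2


def norm_mean_standard_error(n, sd=STANINE_SD):
    """Standard error of a norm-group mean, in stanine units, for sample size n."""
    return sd / sqrt(n)


for n in (120, 300):
    print(n, round(norm_mean_standard_error(n), 2))
```

Under these assumptions, the standard error of the norm mean is about 0.18 stanine steps for 120 people and about 0.12 for 300, which is small compared with the 2 or 3 steps that unsuitable norms can cost.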