# Re: Hello, and questions (long!)

Arne Raeithel (raeithel@rzdspc1.informatik.uni-hamburg.de)
Wed, 23 Feb 1994 09:57:28 +0100 (MET)

Dear Sari Luoma,

> First problem: size and shape of the resulting grids. I cannot get square
> grids - there simply don't appear to be more than eight (at the most)
> important assessment features that can be considered to be different from
> each other.
It is not unusual to get rectangular grids, especially when the constructs
offered are carefully constrained. With regard to any analysis that I know
of, this is of no concern whatever.

> Second problem: I have these interesting-looking grids, the assessors have
> gone over them again and verified that this is indeed what they feel, and
> found explanations in them to the qualms they had regarding the
> (previously made) holistic assessment of some of the performances, but: I
> don't know how to analyse them. I have run correlations, they are
> mostly very high, .7 to .9, some individual ones at .5-range. Naturally,
> factor analysis produces one factor.
Generally, I strongly advise against using Factor Analysis, in the strict
sense used in psychometrics and testing, as the model for understanding the
mathematical procedure called Principal Components Analysis (PCA, see
Slater's definition).

These are some reasons for my opinion:
(1) The usual packages for statistical analysis, like the SPSS you have
used, have standard thumb-rule criteria developed for data matrices
with few variables (about 50 at most) and many respondents (at least
twice as many respondents as variables). If you get just one factor, it is
most probably because of the rule "eigenvalue greater than one".

In a Grid PCA it usually pays to look at all the "factors" (principal
axes), especially the last ones, where rating errors may be identified,
or a contradictory construct may show itself.

Remedy: Tell SPSS to extract four factors.
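The effect of the "eigenvalue greater than one" thumb rule can be seen in a
small numerical sketch (NumPy assumed; the grid scores below are invented
for illustration, not taken from your data). One strong underlying
dimension drives all constructs, so only the first eigenvalue clears 1.0,
yet the small trailing axes are exactly where rating errors or a
contradictory construct would show up:

```python
import numpy as np

# Hypothetical grid: 8 constructs (rows) x 10 elements (columns),
# scores on a 1..5 scale, all driven by one "ability" dimension.
rng = np.random.default_rng(0)
ability = rng.uniform(1, 5, 10)                       # one latent dimension
grid = np.clip(ability + rng.normal(0, 0.4, (8, 10)), 1, 5)

# Correlation-based PCA over the constructs.
corr = np.corrcoef(grid)
eigvals = np.linalg.eigvalsh(corr)[::-1]              # largest first

print("eigenvalues:", np.round(eigvals, 2))
# The thumb rule would retain only the dominant first axis here;
# inspecting ALL axes, including the last, is what reveals noise
# or a contradictory construct.
```

With highly intercorrelated constructs like yours (.7 to .9), this is the
expected picture: one big eigenvalue, and the rest below the cutoff.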

(2) Factor analysis is based on correlation coefficients. This means
that either the constructs or the elements (depending on the way
you give the matrix to SPSS, elements as rows or elements as columns)
are "centered". That is, the original scores on the constructs get
replaced by the deviation from the mean score. This shifts the
neutral point used by the respondents to a new zero that is only
formally defined, and the position of which is heavily dependent
on the choice of the elements (in the case when constructs are
centered). I think that in analyzing individual grids the raw
scores should not be tampered with at all. I regard those raw
scores as estimates of values on a ratio scale (no zero shifting
allowed, zero at the middle point of the score range).

Remedy: An SPSS Matrix procedure I have written. (Will send it soon,
must re-edit it first.)

Please excuse the abrupt ending: I am under time pressure, and want this to go out today.

Sincerely,

Arne Raeithel
Isestrasse 7
D-20144 Hamburg

Internet: raeithel@swt1.informatik.uni-hamburg.de

> ... This should mean there's only one
> construct operating. In that case, I would call this construct language
> ability, rather than any of the individual labels elicited. But what of
> the labels, should I say they don't mean anything? But they do! They
> are very real to the assessors, and clearly separable, rate of speaking
> for instance, from extent of vocabulary, which is again different from
> structural knowledge. It just so happens that language is an integrated
> skill, as ability grows, all of these features improve, some more
> linearly than others. But this doesn't mean the features don't
> exist, not to me at least. Will I ever be able to do more than label
> comparisons?
>
> I have Mancuso & Jaccard's PAREP and SELFGRID, but they will not
> work with my present setup without substantial modifications, and I am in
> the process of acquiring Metzler's PCGRID. For now, all I have is SPSS
> for Windows.
>
> I don't mean to underestimate the value (of sorts) of label comparisons,
> qualitatively the whole process was very rewarding to the assessors, and
> comparing the labels helps develop interesting hypotheses about language
> test performance. Some almost self-evident differences were detected, we
> just hadn't thought of them before in that way. But. Where's the proof, I