QUESTIONS AND ANSWERS

My advisor tells me to choose a VARIMAX rotation when doing factor analysis, while you choose DIRECT OBLIMIN. What is the difference and why should I believe you?

DIRECT OBLIMIN is an SPSS option for oblique rotation (Dutch: ‘scheve rotatie’); VARIMAX (like most of the other options) refers to orthogonal rotation (‘rechte rotatie’). Forget about the word rotation and what it means (it refers to representing factor analysis in a vector space), and try to understand the difference by viewing factor analysis as a set of regression equations. OBLIMIN allows for correlation between the latent factors (indeed, it estimates it); VARIMAX constrains this correlation to be 0.00.

So the choice should follow from your understanding of the set of items that you are analyzing: if you really think they represent multiple latent variables and that these latent variables cannot be correlated, choose VARIMAX. But why would you be analyzing a set of items in one analysis if you think the underlying dimensions are independent? I find it hard to think of such research situations.

OBLIMIN rotations produce more (and more complicated) output than VARIMAX. Apart from the factor correlations, the factor loadings matrix (= relationships between latent factors and observed indicators) is decomposed into the factor pattern matrix and the factor structure matrix, which represent the direct and total effects of the latent factors, respectively. [Try to understand why these two matrices coincide in orthogonal rotations.] You want to interpret the pattern matrix, in particular when you are looking for ‘simple structure’ (!).

Why do you construct the index variable using COMPUTE index = MEAN(indicators) instead of using factor scores?

The main
reason for not using factor scores is that SPSS produces these only for
complete cases and leaves out any case that has some missing value. The
COMPUTE / MEAN construction uses all available information and also
produces a score for cases that have missing values in the indicators
(‘item non-response’). So this is a great way to preserve data. (You can
instruct SPSS FACTOR to do MEAN SUBSTITUTION to get a factor score for all
cases, but this is a weaker procedure for treating missing values.)

Other
differences between factor score scales and mean value scales
are:

· Factor scores use different weights for the indicators and are in this sense ‘optimal’.
· (Cronbach’s) alpha reliability refers to mean value scales. For factor score scales you should refer to theta or omega reliability (see appendix in Carmines and Zeller).

If the
indicators are relatively homogeneous and there is a clear simple
structure, and there are few missing values, the two alternatives will be
very close.

What is the difference between factor
analysis (extraction = PAF) and component analysis (extraction = PC)?
Which one should I use?

The difference between PAF and PC is very large conceptually, but often makes
little difference in practice. Briefly: in PAF the observed indicators are
the consequences of the latent factors (this is the LISREL measurement
model); in PC the components are the result (consequences) of the
indicators. Understanding these different (causal) logics is crucial in
understanding LISREL models, so I will say much more about this in the
future. Note that
for the time being I use PC analysis, but explain the results as if it were
PAF analysis. The difference will be introduced next week.
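
To make the remark that PAF and PC often differ little in practice a bit more concrete, here is a minimal Python/NumPy sketch. It is not SPSS output and makes no claim to reproduce SPSS's exact algorithm. It assumes a made-up one-factor correlation matrix; `true_loadings`, `first_loadings`, `pc_loadings` and `paf_loadings` are hypothetical names introduced only for this illustration. PC analyses the correlation matrix with 1.0 on the diagonal, while PAF replaces the diagonal with communality estimates (starting from squared multiple correlations) and iterates.

```python
import numpy as np


def first_loadings(matrix):
    """Loadings on the first dimension: eigenvector scaled by sqrt(eigenvalue)."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    vector = eigenvectors[:, -1]
    if vector.sum() < 0:  # the sign of an eigenvector is arbitrary
        vector = -vector
    return np.sqrt(eigenvalues[-1]) * vector


def pc_loadings(R):
    """Principal components: analyse R with 1.0 on the diagonal."""
    return first_loadings(R)


def paf_loadings(R, max_iter=500, tol=1e-10):
    """Principal axis factoring (one factor): iterate on the communalities."""
    # Starting values: squared multiple correlation of each indicator
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)  # communalities replace the 1.0s
        loadings = first_loadings(reduced)
        new_h2 = loadings ** 2
        if np.max(np.abs(new_h2 - h2)) < tol:
            break
        h2 = new_h2
    return loadings


# Hypothetical example: four indicators caused by one latent factor
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

pc = pc_loadings(R)
paf = paf_loadings(R)
print("PC loadings: ", np.round(pc, 3))
print("PAF loadings:", np.round(paf, 3))
```

In this constructed example PAF recovers the loadings that generated the correlations, while the PC loadings come out somewhat higher because PC also absorbs the unique variance of each indicator. The ordering of the indicators is the same in both solutions, which is why the two are usually interpreted in the same way.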