Factor Analysis for scaling a test with pre-post data

In summary, the thread discusses how to create scales for an attitude survey before running t-tests and ANOVA, how factor analysis is used to develop such scales, and a reply suggesting that the mathematical models involved need to be explained to a "general audience" of mathematicians.
  • #1
thelema418
I need to create scales for a test before running t-tests / ANOVA.

Instrument: One attitude survey with 37 questions. Each question is a Likert-type item scored from 1 to 5.

Data Set: The data include pre-test and post-test scores, with a 6-month delay between the pre-test and the post-test.

The question is this: my original table has rows of participants and columns for pre-scores and post-scores. Some people have told me to run FA on the differences between pre-scores and post-scores. (This reduces the sample size, because some people skipped questions or did not take the post-test.) The KMO drops below .300 because of this.

Another idea is to treat the participants as different in terms of time, so that each person contributes a pre-participant row and a post-participant row, with the columns containing only the questions. This way we can create factor scores and then merge them back as pre- and post-scores. When the data are handled this way, the KMO is greater than .700, but some have raised concerns about "replicating" individuals when doing this.

In the statistical literature, I haven't seen anything written about either way of arranging the data for FA when building scales. Any insight would be appreciated. Thanks.
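For concreteness, here is a minimal sketch of the two layouts being compared. It assumes a hypothetical pandas DataFrame wide with one row per participant and columns named q1_pre … q37_pre and q1_post … q37_post (those names are illustrative, not from the original data), and it includes a hand-rolled KMO so the sampling adequacy of each layout can be checked:

import numpy as np
import pandas as pd

# "wide" is assumed to exist: one row per participant, columns q1_pre ... q37_post.

def kmo_total(df):
    # Kaiser-Meyer-Olkin measure of sampling adequacy for the columns of df,
    # using the standard anti-image formula: squared correlations versus
    # squared partial correlations from the inverse of the correlation matrix.
    r = df.corr().to_numpy()
    r_inv = np.linalg.pinv(r)          # pseudo-inverse guards against a near-singular matrix
    d = np.sqrt(np.outer(np.diag(r_inv), np.diag(r_inv)))
    partial = -r_inv / d               # anti-image (partial) correlations
    off = ~np.eye(r.shape[0], dtype=bool)
    return (r[off] ** 2).sum() / ((r[off] ** 2).sum() + (partial[off] ** 2).sum())

items = [f"q{i}" for i in range(1, 38)]    # the 37 Likert items

# Layout A: factor-analyze post-minus-pre difference scores.
# Listwise deletion of anyone with a missing pre or post answer shrinks the sample.
diff = pd.DataFrame({q: wide[f"{q}_post"] - wide[f"{q}_pre"] for q in items}).dropna()
print("KMO on difference scores:", kmo_total(diff))

# Layout B: stack pre and post responses as separate rows ("replicated" participants).
pre = wide[[f"{q}_pre" for q in items]].set_axis(items, axis=1).assign(time="pre")
post = wide[[f"{q}_post" for q in items]].set_axis(items, axis=1).assign(time="post")
stacked = pd.concat([pre, post], ignore_index=True).dropna(subset=items)
print("KMO on stacked pre/post rows:", kmo_total(stacked[items]))

The difference-score layout inherits every skipped or missing answer through the subtraction and the listwise deletion, which matches the shrinking sample and falling KMO described above.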
 
  • #2
The mathematics section isn't a good place to find social scientists. To get advice, I think you need to explain the mathematical models that are involved to a "general audience" of mathematicians.

From my casual acquaintance with the subject:
1) A "likert" question is a question that asks the respondent to give an answer that (intuitively) represents the value of some scalar variable. (e.g. It might be a multiple choice question with answers such as a) never b) sometimes c) often d) most of the time e) always.)

2) A "scale" is a real valued function of all the respondent's answers to a questionnaire (Often a weighted average of the ordinal numbers corresponding to the answers - e.g. 1 for a), 2 for b),...etc.). The scale attempts to measure some aspect of the respondent, which I will call a "state".

Intuitively, a "scale" is a more reliable measure of a state than a single question. For example, answering the question:

How often do you carry a hatchet with you when you answer the door?

a) never b) sometimes c) often d) most of the time e) always

may involve both the state of "aggressiveness" and the state of "paranoia". So if we wish to measure the "aggressiveness" of respondents, it seems more reliable to use a function that depends on answers to several questions.

I gather factor analysis develops scales by detecting functions ("scales") that are uncorrelated.
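A hedged sketch of that idea on simulated data, using scikit-learn's FactorAnalysis as one readily available estimator (the simulation, sample sizes, and numbers are invented for illustration and are not the original poster's procedure):

import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulated data: 200 respondents, 10 items driven by 2 latent traits plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))                      # the unobserved "states"
true_loadings = rng.uniform(0.4, 0.9, size=(10, 2))     # how strongly each item reflects each trait
responses = latent @ true_loadings.T + rng.normal(scale=0.5, size=(200, 10))

fa = FactorAnalysis(n_components=2)
scores = fa.fit_transform(responses)                    # one row of estimated factor scores per respondent
print(fa.components_.shape)                             # (2, 10): estimated loadings, factors x items
print(np.round(np.corrcoef(scores, rowvar=False), 2))   # correlation between the two recovered scales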

I think you have to explain the states you are trying to measure and how the "pre-test" and "post-test" conditions are liable to affect them. You need to answer some (social) scientific questions.
 
  • #3
I suppose this falls under "measurement theory." FA does a lot of things; it is a huge domain of study in itself with multiple branches.

I may be providing too much information by stating that the items are Likert scales: the responses are just ordinal measurements. I usually use factor analysis for dimension reduction on interval measurements, such as SAT-math scores, math exams, etc.

The type of factor analysis I am speaking of was developed by Charles Spearman (R-method statistics). There are some other types of factor analysis, such as Stephenson's Q-method (which uses ipsative testing devices and transposes the tables) and Cattell's P-method (which is factor analysis with a time variable; I have no background at all with this method).

The central idea of Spearman's factor analysis is to find latent variables which correlate with the known variables. The best example I can give is the purpose for which he developed it: to create a measure of general intelligence. The idea is that intelligence must somehow be related to tests of English, Math, Music, Foreign Language, etc. So, if you find the latent variable that correlates with all of them, then you have a measure of "something" (which Spearman labeled "general intelligence"). You can find any number of latent variables, but in Spearman's case the first latent variable explained almost all of the variance, again supporting his claims about general intelligence. Spearman used the correlations to calculate a measure for each participant.
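A small numerical sketch of that "first factor dominates" idea, using a made-up correlation matrix for four tests (the numbers are invented for illustration):

import numpy as np

# Invented correlation matrix for four tests (say English, Math, Music, Foreign Language),
# all positively correlated, as in Spearman's setting.
R = np.array([
    [1.00, 0.60, 0.50, 0.55],
    [0.60, 1.00, 0.45, 0.50],
    [0.50, 0.45, 1.00, 0.40],
    [0.55, 0.50, 0.40, 1.00],
])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # eigenvalues of R, largest first
print(np.round(eigvals / eigvals.sum(), 2))      # share of total variance per latent dimension
# The first share is far larger than the rest: that dominant dimension is the
# "general intelligence"-style factor described above.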

In the case of Zimbardo's development of the ZTPI, this generated 5 latent variables, e.g. Present-Hedonic. In Zimbardo's case, he found the items that load together with FA, named the factors, then scored the factors by averaging the scores of the loaded items after inverting any score that loaded negatively.
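That scoring rule (average the loaded items, reversing negatively loaded ones) is easy to sketch; the item names, loadings, and data below are hypothetical:

import numpy as np
import pandas as pd

# Hypothetical factor: items q1, q2, q5 load on it, with q5 loading negatively.
# On a 1-5 Likert scale a reversed item is scored as 6 - response.
factor_items = {"q1": +1, "q2": +1, "q5": -1}

def factor_score(row, items=factor_items, scale_max=5):
    # Average the loaded items, reversing any that load negatively.
    vals = [row[q] if sign > 0 else (scale_max + 1) - row[q] for q, sign in items.items()]
    return float(np.mean(vals))

data = pd.DataFrame({"q1": [4, 2], "q2": [5, 1], "q5": [1, 5]})
print(data.apply(factor_score, axis=1))   # per-respondent scale scores: 4.67 and 1.33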

My question, though, is really about the statistical assumptions for doing the FA. Maybe I'll post the question to a social science group too.
 

Related to Factor Analysis for scaling a test with pre-post data

1. What is factor analysis?

Factor analysis is a statistical method used to identify underlying factors or dimensions that explain patterns of relationships among a set of observed variables. It is commonly used in psychological and educational research to understand the underlying structure of a test or questionnaire.

2. How is factor analysis used for scaling a test with pre-post data?

Factor analysis can be used to assess the reliability and validity of a test by examining how well the observed variables (e.g. test items) are related to each other and to the underlying factor. It can also be used to reduce the number of variables and simplify data interpretation, which is particularly useful for analyzing pre-post data.
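One way this plays out in practice is to reduce the items to factor scores and then compare pre and post with a paired t-test. In the sketch below everything is simulated; in a real analysis the scores would come from the factor analysis of the survey items:

import numpy as np
from scipy.stats import ttest_rel

# Simulated first-factor scores for the same 80 participants at pre and at post.
rng = np.random.default_rng(1)
pre_scores = rng.normal(size=80)
post_scores = pre_scores + rng.normal(loc=0.3, scale=0.5, size=80)   # a simulated shift after 6 months

result = ttest_rel(post_scores, pre_scores)   # paired t-test on the reduced scale
print(result.statistic, result.pvalue)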

3. What are the steps involved in conducting a factor analysis?

The first step is to determine the number of factors to extract, using methods such as Kaiser's rule or a scree plot. Next, the data are analyzed using techniques such as principal component analysis or maximum likelihood estimation to extract the factors. Finally, the factors are interpreted and named based on the underlying variables they represent.
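A hedged sketch of those steps, assuming the responses are available as a respondents-by-items NumPy array; it uses eigenvalues of the correlation matrix for Kaiser's rule and scikit-learn's FactorAnalysis for extraction (one of several possible extraction methods):

import numpy as np
from sklearn.decomposition import FactorAnalysis

def choose_and_extract(responses):
    # responses: respondents-by-items array (e.g. the 37 survey items).
    R = np.corrcoef(responses, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # plot these, largest first, for a scree plot
    k = int((eigvals > 1.0).sum())                   # Kaiser's rule: keep factors with eigenvalue > 1
    fa = FactorAnalysis(n_components=k).fit(responses)
    loadings = fa.components_.T                      # items x factors: used to interpret and name factors
    return eigvals, k, loadings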

4. How can factor analysis be used to evaluate the effectiveness of a test?

Factor analysis can be used to assess the reliability and validity of a test by examining the relationships among the observed variables. A high degree of correlation among the variables indicates that the test is measuring a single underlying construct, while a lack of correlation may suggest that the test is not effectively measuring the desired construct.
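As a quick check of that kind of inter-item agreement, one could compute the average off-diagonal correlation (a simple, illustrative index, not the only or best one):

import numpy as np

def mean_interitem_correlation(responses):
    # responses: respondents-by-items array; returns the average off-diagonal correlation.
    R = np.corrcoef(responses, rowvar=False)
    off = ~np.eye(R.shape[0], dtype=bool)
    return float(R[off].mean())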

5. What are some limitations of factor analysis for scaling a test with pre-post data?

One limitation is that factor analysis assumes that the observed variables are linearly related to the underlying factor. If this assumption is not met, the results of the analysis may be inaccurate. Additionally, factor analysis may not be appropriate for data sets with small sample sizes or highly correlated variables.
