Exploratory Factor Analysis and How to Do It

Exploratory factor analysis (EFA) is a statistical method that aims to discover the underlying structure of a set of observed variables. It is often used to reduce the dimensionality of data, identify latent factors, and test hypotheses about the relationships among variables. EFA can also help to simplify data interpretation, improve measurement reliability, and validate scales or instruments.

[Figure: A scree plot showing the eigenvalues of each factor in descending order, with the elbow point indicating the number of factors to retain for exploratory factor analysis.]

In this article, you will learn what EFA is, how it works, and how to perform it. You will also learn about the different types of factor analysis, the steps involved in conducting EFA, the criteria for choosing the number of factors, the methods of factor extraction and rotation, and how to interpret and report the results of EFA. By the end of this article, you will be able to apply EFA to your data and answer questions such as:

  • What are the main factors that explain the variation in my data?
  • How many factors should I retain for my analysis?
  • How are the observed variables related to the extracted factors?
  • How can I name and describe the factors based on their loadings?
  • How can I use the factor scores for further analysis?

What is Exploratory Factor Analysis?

Exploratory factor analysis (EFA) is a type of factor analysis that explores the underlying structure of a set of observed variables. It assumes that some unobserved or latent factors account for the common variance among the observed variables. In other words, EFA tries to discover the hidden dimensions or constructs that explain why some variables correlate.

For example, suppose you have a questionnaire that measures different aspects of personality, such as extraversion, agreeableness, conscientiousness, neuroticism, and openness. You can use EFA to find out how many factors are needed to represent these personality traits and how each trait is related to each factor. You may find out that five factors correspond to the five personality traits, or you may find out that there are fewer or more factors that capture some higher-order dimensions of personality.

EFA differs from confirmatory factor analysis (CFA), another type of factor analysis that tests whether a predefined model or theory fits the data. CFA requires specifying in advance how many factors there are and which variables are associated with which factors. EFA imposes no such restrictions and allows the data to reveal the factor structure.

How Does Exploratory Factor Analysis Work?

EFA extracts one or more factors from a correlation matrix of observed variables. A correlation matrix shows how each variable is correlated with every other variable in the data set. A factor is an unobserved (latent) variable that captures as much of the common variance among the observed variables as possible; each observed variable is modeled as a linear combination of the factors plus a part that is unique to that variable. A factor loading is a coefficient that indicates how strongly each variable is related to a factor.

EFA aims to find factors that best reproduce the original correlation matrix. The extracted factors should account for most of the common variance in the observed variables and have high correlations with them. In an orthogonal solution the factors are also uncorrelated with one another; oblique solutions relax this requirement and allow the factors to correlate.
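To make "reproducing the correlation matrix" concrete, here is a minimal Python sketch. It assumes uncorrelated factors and uses made-up loadings for four variables on two factors; the numbers are illustrative and do not come from any real data set. Under these assumptions, the correlation between two variables implied by the model is the sum of the products of their loadings, and whatever variance the factors do not explain is treated as unique variance.

```python
import numpy as np

# Hypothetical loadings of four variables on two uncorrelated factors
# (illustrative numbers, not estimates from real data).
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

# Communality: share of each variable's variance explained by the factors.
communalities = (loadings ** 2).sum(axis=1)

# Uniqueness: the remaining variance (specific variance plus error).
uniquenesses = 1.0 - communalities

# Correlation matrix implied by the orthogonal common factor model:
# loadings times loadings-transposed, plus the uniquenesses on the diagonal.
implied_corr = loadings @ loadings.T + np.diag(uniquenesses)

print(np.round(implied_corr, 2))
```

EFA works in the opposite direction: it starts from the observed correlation matrix and searches for loadings whose implied correlations come as close to it as possible.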

There are three main steps involved in conducting EFA:

  1. Choosing the number of factors to retain
  2. Choosing the method of factor extraction
  3. Choosing the method of factor rotation

These steps will be explained in more detail in the following sections.

How to Choose the Number of Factors to Retain?

One of the most important decisions in exploratory factor analysis (EFA) is how many factors to retain for further analysis. There is no definitive rule for choosing the number of factors, but some criteria can help guide this decision.

Some of these criteria are:

  • Eigenvalue criterion: An eigenvalue measures how much variance a factor explains. A common rule is to retain only those factors with eigenvalues greater than 1, meaning they explain more variance than a single variable.
  • Scree plot criterion: A scree plot shows each factor's eigenvalues in descending order. A common rule is to retain only those factors before the curve flattens out or “elbows”.
  • Parallel analysis criterion: Parallel analysis is a method that compares the eigenvalues of the actual data with those of randomly generated data. A common rule is to retain only those factors whose eigenvalues are larger than those of the random data.
  • Variance explained criterion: This criterion considers how much total variance in the observed variables is explained by the extracted factors. A common rule is to retain enough factors to explain at least 50% or 60% of the total variance.
  • Interpretability criterion: This criterion considers the extracted factors' meaning and coherence based on their factor loadings and theoretical relevance. A common rule is to retain only those factors that can be named and described clearly and logically.
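The eigenvalue, variance explained, and parallel analysis criteria above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a replacement for a dedicated statistics package: the data are simulated from two known factors, the helper names (`simulate_two_factor_data`, `correlation_eigenvalues`, `parallel_analysis`) are invented for this example, and the parallel analysis compares against the 95th percentile of eigenvalues from random normal data, which is one common choice among several.

```python
import numpy as np

def simulate_two_factor_data(n_obs=500, seed=0):
    """Simulate six variables driven by two uncorrelated latent factors."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_obs, 2))
    loadings = np.array([
        [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
        [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
    ])
    unique_sd = np.sqrt(1.0 - (loadings ** 2).sum(axis=1))
    noise = rng.standard_normal((n_obs, 6)) * unique_sd
    return factors @ loadings.T + noise

def correlation_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def parallel_analysis(data, n_sims=200, percentile=95, seed=1):
    """Count leading eigenvalues that exceed those of random normal data."""
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape
    observed = correlation_eigenvalues(data)
    random_eigs = np.array([
        correlation_eigenvalues(rng.standard_normal((n_obs, n_vars)))
        for _ in range(n_sims)
    ])
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Stop at the first eigenvalue that fails the comparison.
    n_factors = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, threshold

data = simulate_two_factor_data()
eigenvalues = correlation_eigenvalues(data)

# Eigenvalue criterion: count eigenvalues greater than 1.
kaiser_count = int((eigenvalues > 1.0).sum())

# Variance explained criterion: cumulative share of total variance.
cumulative_variance = np.cumsum(eigenvalues) / eigenvalues.sum()

# Parallel analysis criterion.
pa_count, pa_threshold = parallel_analysis(data)

print("Eigenvalues:", np.round(eigenvalues, 2))
print("Cumulative variance:", np.round(cumulative_variance, 2))
print("Eigenvalue criterion:", kaiser_count, "| Parallel analysis:", pa_count)
```

Plotting `eigenvalues` against their rank gives the scree plot. In practice the criteria do not always agree, which is why the interpretability criterion still matters.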

How to Choose the Method of Factor Extraction?

Another important decision in exploratory factor analysis (EFA) is how to extract the factors from the correlation matrix. There are different methods of factor extraction, each with its assumptions and advantages. Some of the most common methods are:

  • Principal component analysis (PCA): This method extracts factors that account for the maximum variance in the observed variables. It does not distinguish between common and unique variance and assumes that all variables are measured without error. PCA is useful for data reduction and summarization but may not reflect the true factor structure.
  • Common factor analysis: This family of methods extracts factors that account for the maximum common variance in the observed variables. It distinguishes between common and unique variance and allows for measurement error in the variables. Common factor analysis is useful for finding latent constructs and testing hypotheses but may not converge or produce a unique solution. It is not abbreviated in this article, because CFA refers to confirmatory factor analysis.
  • Principal axis factoring (PAF): This type of common factor analysis extracts factors from the reduced correlation matrix, in which the 1s on the diagonal are replaced by communality estimates so that unique variance and error variance are excluded. PAF is useful for finding latent constructs and testing hypotheses, but it may underestimate the number of factors and the factor loadings.
  • Maximum likelihood (ML): This method is another type of common factor analysis that extracts factors based on the likelihood function, which estimates how well the factor model fits the observed data. ML is useful for finding latent constructs and testing hypotheses, but it requires large sample sizes and multivariate normality of the data.
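To show what extraction from a reduced correlation matrix can look like, here is a minimal Python sketch of iterated principal axis factoring. It is an illustration under assumptions, not the exact routine used by SPSS or R: the function name `principal_axis_factoring` is invented for this example, the starting communalities are squared multiple correlations (a common default), and the correlation matrix is built from hypothetical loadings so that the correct answer is known.

```python
import numpy as np

def principal_axis_factoring(corr, n_factors, max_iter=500, tol=1e-6):
    """Iterated principal axis factoring on a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    # Starting communalities: squared multiple correlations (SMC).
    communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    for _ in range(max_iter):
        # Reduced correlation matrix: communalities replace the 1s.
        reduced = corr.copy()
        np.fill_diagonal(reduced, communalities)
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        # Clip tiny negative eigenvalues before taking square roots.
        loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        updated = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(updated - communalities)) < tol
        communalities = updated
        if converged:
            break
    return loadings, communalities

# Hypothetical population loadings (illustrative numbers only).
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.7], [0.0, 0.6], [0.0, 0.5],
])
true_communalities = (true_loadings ** 2).sum(axis=1)

# Correlation matrix implied by those loadings, with 1s on the diagonal.
corr = true_loadings @ true_loadings.T
np.fill_diagonal(corr, 1.0)

est_loadings, est_communalities = principal_axis_factoring(corr, n_factors=2)
print(np.round(est_loadings, 2))
print(np.round(est_communalities, 2))
```

The estimated loadings are unrotated and their signs are arbitrary, so they are usually rotated before interpretation. Running PCA instead would amount to decomposing `corr` itself, with the 1s left on the diagonal.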

Factor Loading: How to Interpret and Report the Relationship between Variables and Factors in EFA

A factor loading is a coefficient that indicates how strongly each variable is related to each factor in exploratory factor analysis (EFA). For standardized variables and uncorrelated factors, it is the correlation between the variable and the factor, so it ranges from -1 to 1, where -1 indicates a perfect negative relationship, 0 indicates no relationship, and 1 indicates a perfect positive relationship. Factor loadings help you to interpret and report the factor structure, the factor names, and the factor reliability and validity.

To interpret the factor structure, you need to look at the factor loading matrix, which shows the factor loadings of each variable on each factor. A high factor loading (absolute value above 0.4) means that the variable has a strong association with the factor, while a low factor loading (absolute value below 0.4) means that the variable has a weak or no association with the factor. The 0.4 cutoff is a common convention rather than a fixed rule. You can also look at the factor pattern, which shows the standardized factor loadings, or the factor space, which shows the graphical representation of the factors and variables.

To report the factor names, you need to look at the variables with high factor loadings on each factor and try to find a common theme or concept that describes them. For example, if you have a factor with high loadings on variables such as talkative, active, adventurous, and dominant, you can name it extraversion. You can also use theoretical or empirical evidence to support your naming.
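As a small illustration of reading a loading matrix, the following Python sketch flags which factor each variable belongs to using the 0.4 cutoff. The variable names echo the traits used in this article, but the loadings are invented for the example, and the helper `summarize_loadings` is a hypothetical name rather than part of any library.

```python
import numpy as np

# Hypothetical rotated loadings (illustrative numbers only).
variables = ["talkative", "active", "warm", "helpful", "careful"]
loadings = np.array([
    [0.78, 0.10],
    [0.65, 0.05],
    [0.12, 0.81],
    [0.08, -0.70],
    [0.30, 0.25],
])

def summarize_loadings(variables, loadings, threshold=0.4):
    """Label each variable by the factor(s) on which it loads saliently."""
    summary = {}
    for name, row in zip(variables, loadings):
        # Use absolute values so strong negative loadings also count.
        salient = np.flatnonzero(np.abs(row) >= threshold)
        if salient.size == 0:
            label = "no salient loading"
        elif salient.size == 1:
            label = f"factor {salient[0] + 1}"
        else:
            label = "cross-loading on factors " + ", ".join(
                str(j + 1) for j in salient
            )
        summary[name] = label
    return summary

summary = summarize_loadings(variables, loadings)
for name, label in summary.items():
    print(f"{name}: {label}")
```

Variables that share a label are the ones to read together when naming a factor. Variables with no salient loading or with cross-loadings are candidates for closer inspection.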

[Figure: A flowchart showing the steps involved in performing exploratory factor analysis in SPSS: preparing the data, choosing the number of factors, selecting the extraction and rotation methods, interpreting and reporting the results, and comparing with confirmatory factor analysis.]

To report the factor reliability and validity, you need to compute and report some statistics that measure how well the factors measure their intended constructs. For example, you can use Cronbach’s alpha or other internal consistency measures to assess the reliability of each factor. You can also use convergent or discriminant validity measures to assess how well your factors relate to other measures of similar or different constructs.
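Cronbach's alpha can be computed directly from the item scores. The sketch below is a minimal Python version that assumes the items belonging to one factor are stored as columns of an array. The data are simulated (one true score plus independent noise per item), so the resulting value only illustrates the calculation.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (observations x items) array."""
    items = np.asarray(items, dtype=float)
    n_items = items.shape[1]
    # Variance of each item and of the summed scale score.
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return n_items / (n_items - 1) * (1.0 - item_variances.sum() / total_variance)

# Simulated responses: four items measuring one construct with noise.
rng = np.random.default_rng(7)
true_score = rng.standard_normal(500)
items = true_score[:, None] + rng.standard_normal((500, 4))

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Items with negative loadings need to be reverse-scored before alpha is computed, otherwise the value will be misleadingly low.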

How to Choose the Method of Factor Rotation?

The final step in exploratory factor analysis (EFA) is to rotate the factors to achieve a simpler and more interpretable factor structure. Rotation changes the orientation of the factor axes without changing how well the solution fits the data: the communalities and the total variance explained stay the same, while the individual loadings are redistributed across the factors. Rotation can make individual factor loadings higher or lower, depending on how close the variables are to the rotated factor axes.

Types of factor rotation

  • Orthogonal rotation: assumes that the factors are uncorrelated with each other.
  • Oblique rotation: allows for some correlation among the factors.

Methods of factor rotation 

  • Varimax: This orthogonal rotation method maximizes the variance of the squared factor loadings within each factor. Varimax tends to produce a simple structure, where each variable has a high loading on one factor and low loadings on the others.
  • Quartimax: This is another orthogonal rotation method that maximizes the variance of the squared factor loadings within each variable, across the factors. Quartimax simplifies the variables, so that each variable loads highly on as few factors as possible, and it often produces a general factor on which many variables load.
  • Equamax: This orthogonal rotation method combines varimax and quartimax, balancing the simplicity within and across factors.
  • Oblimin: This oblique rotation method seeks a simple structure by minimizing the cross-products of the squared loadings on different factors, while allowing the factors to correlate.
  • Promax: This is another oblique rotation method that applies an orthogonal rotation (usually varimax) first, followed by an oblique transformation that raises the loadings to a power to sharpen the pattern. Like oblimin, promax aims for a simple structure with correlated factors.
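To illustrate what an orthogonal rotation does, here is a compact Python sketch of varimax using a widely circulated SVD-based iteration. It is a sketch under assumptions rather than the routine of any particular package: the function names `varimax` and `varimax_criterion` are chosen for this example, and the unrotated loadings are made up by taking a simple structure and turning it by 30 degrees.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float((loadings ** 2).var(axis=0).sum())

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Varimax rotation; returns rotated loadings and the rotation matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated ** 3
            - (gamma / n_vars) * rotated @ np.diag((rotated ** 2).sum(axis=0))
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical simple structure, deliberately turned by 30 degrees so the
# "unrotated" loadings are hard to read (illustrative numbers only).
simple = np.array([
    [0.8, 0.1], [0.7, 0.1], [0.6, 0.0],
    [0.1, 0.8], [0.0, 0.7], [0.1, 0.6],
])
angle = np.deg2rad(30)
turn = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
unrotated = simple @ turn

rotated, rotation = varimax(unrotated)
print("Unrotated:\n", np.round(unrotated, 2))
print("Rotated:\n", np.round(rotated, 2))
```

The communalities (row sums of squared loadings) are identical before and after rotation; only the way the variance is split between the factors changes. The order and sign of the rotated factors are arbitrary.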

How to Interpret and Report the Results of Exploratory Factor Analysis?

After performing exploratory factor analysis (EFA), you must interpret and report the results clearly and concisely.

You should include the following information in your report:

  • The purpose and rationale of conducting EFA
  • The data source, sample size, and variables used in EFA
  • The criteria for choosing the number of factors
  • The method of factor extraction and rotation
  • The amount of variance explained by each factor
  • The factor loading matrix, showing how each variable relates to each factor
  • The names and descriptions of each factor based on their factor loadings
  • The reliability and validity of each factor
  • The implications and limitations of EFA
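Several of the quantities in this checklist come straight from the loading matrix. The Python sketch below assumes an orthogonal solution and uses invented loadings for six variables on two factors; it computes the sum of squared loadings per factor, the proportion of total variance each factor explains, and the communality of each variable.

```python
import numpy as np

# Hypothetical rotated loadings for six variables on two factors
# (illustrative numbers only).
loadings = np.array([
    [0.80, 0.10],
    [0.70, 0.15],
    [0.60, 0.05],
    [0.10, 0.75],
    [0.05, 0.70],
    [0.20, 0.60],
])
n_vars = loadings.shape[0]

# Sum of squared loadings per factor (column-wise).
ss_loadings = (loadings ** 2).sum(axis=0)

# Each standardized variable contributes 1 unit of variance, so the
# total variance equals the number of variables.
proportion_variance = ss_loadings / n_vars
cumulative_variance = np.cumsum(proportion_variance)

# Communality: variance of each variable explained by all factors.
communalities = (loadings ** 2).sum(axis=1)

for j in range(loadings.shape[1]):
    print(
        f"Factor {j + 1}: SS loadings = {ss_loadings[j]:.2f}, "
        f"proportion = {proportion_variance[j]:.1%}, "
        f"cumulative = {cumulative_variance[j]:.1%}"
    )
print("Communalities:", np.round(communalities, 2))
```

With an oblique rotation the factors overlap, so the variance cannot be split cleanly between them and these per-factor proportions should be reported with that caveat.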

Example: How to report the results of EFA

We conducted exploratory factor analysis (EFA) to examine the underlying structure of 10 personality traits measured by a questionnaire. We used data from 500 respondents who completed the questionnaire online. We used RStudio to perform EFA with principal axis factoring as the extraction method and varimax as the rotation method. We decided to retain four factors based on the eigenvalue criterion (>1), scree plot criterion (elbow point), parallel analysis criterion (eigenvalues > random eigenvalues), variance explained criterion (>50%), and interpretability criterion (meaningful factors).

The four factors explained 62% of the total variance in the personality traits. The factor loading matrix showed that each trait had a high loading (>0.4) on one factor and low loadings (<0.4) on the other factors. The names and descriptions of the four factors based on their factor loadings are as follows:

  • Factor 1: The extraversion factor reflects the tendency to be outgoing, sociable, energetic, and assertive. It includes traits such as talkative, active, adventurous, and dominant.
  • Factor 2: The agreeableness factor reflects the tendency to be cooperative, friendly, compassionate, and trusting. It includes traits such as warm, sympathetic, helpful, and forgiving.
  • Factor 3: The conscientiousness factor reflects the tendency to be organized, diligent, responsible, and self-disciplined. It includes traits such as careful, thorough, reliable, and hardworking.
  • Factor 4: The neuroticism factor reflects the tendency to experience negative emotions, such as anxiety, anger, sadness, and insecurity. It includes traits such as nervous, moody, emotional, and self-conscious.

We assessed the reliability and validity of each factor by computing Cronbach’s alpha and correlating the factor scores with other personality measures. The results showed that each factor had a high internal consistency (alpha > 0.8) and a moderate convergent validity (r > 0.5) with the corresponding dimensions of the Big Five personality model.

These results imply that we can use the four factors as a parsimonious and meaningful representation of the personality traits measured by the questionnaire. We can also use the factor scores as predictors or outcomes in further analysis, such as regression or cluster analysis. The main limitation is that EFA is an exploratory and data-driven method whose results may not generalize to other samples or contexts. The results also depend on the choices and assumptions made by the researcher, such as the number of factors, the extraction method, and the rotation method.

EFA: An Exploratory and Data-Driven Method for Finding Latent Factors in Your Data

Exploratory factor analysis (EFA) is a statistical method that aims to discover the underlying structure of a set of observed variables by extracting one or more latent factors that account for their common variance. EFA is an exploratory and data-driven method which does not require specifying in advance how many factors there are and which variables are associated with which factors. Instead, it allows the data to reveal the factor structure.

EFA is useful for finding latent factors in your data, especially when many variables are correlated. EFA can help you to reduce the dimensionality of your data, identify hidden patterns or dimensions in your data, test hypotheses about the relationships among variables, simplify data interpretation, improve measurement reliability, and validate scales or instruments.

Limitations and Challenges

  • EFA may not reflect the true factor structure or generalize to other samples or contexts.
  • EFA depends on the choices and assumptions made by the researcher, such as the number of factors, the extraction method, and the rotation method.
  • EFA requires some criteria and judgment to determine the number of factors, name and describe the factors, and evaluate the reliability and validity of the factors.
  • EFA may not converge or produce a unique solution.

Therefore, EFA should be used cautiously and complemented by other methods, such as confirmatory factor analysis (CFA) or structural equation modelling (SEM).

Rotation Method: How to Choose and Apply the Best Type of Rotation for Your Factor Analysis

Rotation is a process that changes the orientation of the factors without changing how well the solution fits the data. It can make individual factor loadings higher or lower, depending on how close the variables are to the rotated factor axes. The aim is a simpler and more interpretable factor structure.

Types of rotation method

  1. The orthogonal rotation method assumes that the factors are uncorrelated.
  2. The oblique rotation method allows for some correlation among the factors.

The most common orthogonal rotation methods:

  • Varimax
  • Quartimax
  • Equamax

The most common oblique rotation methods:

  • Oblimin (called direct oblimin in SPSS)
  • Promax

Choice of rotation method

The choice of rotation method depends on the purpose and nature of the data and the research question. Some of the factors that can influence the choice of rotation method are:

  • The number of factors: Oblique rotation may be preferred if there are many factors, as it can capture the relationships among them. If there are few factors, orthogonal rotation may be sufficient, as it can produce clear and distinct factors.
  • The type of factors: If the factors are expected to be independent, orthogonal rotation may be appropriate, as it keeps the factors uncorrelated. If the factors are expected to be interrelated, oblique rotation may be appropriate, as it estimates the correlations among the factors.
  • The interpretability of factors: If an orthogonal rotation already gives a simple structure, with each variable loading highly on one factor and weakly on the others, it may be suitable. If many variables still load on several factors after an orthogonal rotation, an oblique rotation may give a cleaner pattern of loadings.
  • The goal of factor rotation: If factor rotation aims to find distinct and independent factors that are easy to interpret and name, orthogonal rotation may be preferred. If factor rotation aims to find interrelated factors that reflect the true factor structure, oblique rotation may be preferred, as it does not force the factors to be uncorrelated.

To apply the best type of rotation for your factor analysis, you need to use factor analysis software that allows you to choose and compare different rotation methods. For example, you can use SPSS to perform factor analysis with different rotation methods and compare their results in terms of the variance explained by each factor, the factor loading matrix, the factor correlation matrix, and the factor interpretation. You can also use graphical methods, such as a factor loading plot (factor space plot), to visualize the effect of rotation on your factor structure. A scree plot is based on the unrotated eigenvalues, so it does not change with rotation.

Conclusion

In this article, we learned about exploratory factor analysis (EFA), a statistical method for finding hidden patterns in data. We learned the steps involved in performing EFA and how to interpret and report its results. Here are some key points to remember:

  • EFA aims to discover the underlying structure of a set of observed variables by extracting one or more factors that account for their common variance.
  • EFA involves choosing the number of factors to retain, the method of factor extraction, and the factor rotation method.
  • EFA results include the amount of variance explained by each factor, the factor loading matrix, and the names and descriptions of each factor.
  • EFA can help reduce the data's dimensionality, identify latent factors, and test hypotheses about the relationships among variables.
  • EFA is an exploratory and data-driven method that may not reflect the true factor structure or generalize to other samples or contexts.

Exploratory factor analysis (EFA) is a useful statistical method for finding hidden patterns in data. It can help reduce data's dimensionality, identify latent factors, and test hypotheses about the relationships among variables. However, EFA is also an exploratory and data-driven method that may not reflect the true factor structure or generalize to other samples or contexts. Therefore, EFA should be used cautiously and complemented by other methods, such as confirmatory factor analysis (CFA) or structural equation modelling (SEM).

We hope you enjoyed this article and learned something new about EFA. If you have any questions or feedback, please get in touch with us at info@rstudiodatalab.com.

Frequently Asked Questions (FAQs)

What is the difference between EFA and PCA?

EFA and PCA are both methods of factor extraction, but they have different assumptions and goals. EFA assumes that some latent factors account for the common variance among the observed variables, while PCA assumes that some components account for the total variance in the observed variables. EFA aims to find the underlying structure of the data, while PCA aims to reduce the dimensionality of the data.

How do I know if my data is suitable for EFA?

There are some criteria that you can use to check if your data is suitable for EFA, such as:

  • Sample size: You should have at least 100 observations and at least 5 observations per variable. A more conservative rule of thumb is a ratio of at least 10 observations per variable.
  • Variable type: You should have continuous variables measured on an interval or ratio scale, or ordinal variables (such as Likert items) that can reasonably be treated as continuous. Nominal variables are not suitable for EFA.
  • Variable distribution: You should have approximately normally distributed or at least symmetric variables. Skewed or kurtotic variables may affect the factor extraction and rotation.
  • Variable correlation: You should have variables that are moderately correlated with each other (r > 0.3). There may be no common factors if the variables are too weakly correlated (r < 0.1). If the variables are too strongly correlated (r > 0.9), there may be multicollinearity problems.
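The sample size and correlation checks above are easy to automate. The Python sketch below is a rough screening aid under assumptions: the helper name `screen_for_efa` is invented for this example, the cutoffs simply mirror the rules of thumb in this list, and the data are simulated from a single common factor.

```python
import numpy as np

def screen_for_efa(data, low=0.3, high=0.9):
    """Summarize sample size and correlation strength for an EFA screen."""
    data = np.asarray(data, dtype=float)
    n_obs, n_vars = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # Absolute correlations for each distinct pair of variables.
    pair_corr = np.abs(corr[np.triu_indices(n_vars, k=1)])
    return {
        "n_obs": n_obs,
        "obs_per_variable": n_obs / n_vars,
        "share_above_low": float((pair_corr > low).mean()),
        "n_pairs_above_high": int((pair_corr > high).sum()),
    }

# Simulated data: five variables that share one common factor.
rng = np.random.default_rng(3)
common = rng.standard_normal((300, 1))
data = 0.7 * common + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((300, 5))

report = screen_for_efa(data)
for key, value in report.items():
    print(f"{key}: {value}")
```

A low share of correlations above 0.3 suggests there may be little common variance to analyze, while any pair above 0.9 is worth checking for redundancy before running EFA.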

How do I report the reliability and validity of my factors?

There are different ways to report the reliability and validity of your factors, such as:

  • Reliability: You can use Cronbach’s alpha or other internal consistency measures to assess how well the items within each factor measure the same construct. A high alpha value (> 0.7) indicates a reliable factor.
  • Validity: You can use convergent or discriminant validity measures to assess how well your factors relate to other measures of similar or different constructs. A high correlation with measures of the same construct (r > 0.5) indicates convergent validity, meaning that your factors measure what they are supposed to measure. A low correlation with measures of different constructs (r < 0.3) indicates discriminant validity, meaning that your factors measure distinct constructs.

How do I use the factor scores for further analysis?

Factor scores are standardized values that indicate how each observation scores on each factor. You can use them for further analysis, such as:

  • Descriptive statistics: You can calculate the mean, standard deviation, median, range, and other statistics of your factor scores to describe your sample characteristics.
  • Inferential statistics: You can use your factor scores as independent or dependent variables in hypothesis testing, such as t-tests, ANOVA, regression, or correlation analysis.
  • Cluster analysis: You can use your factor scores to group your observations into homogeneous clusters based on their similarity or dissimilarity on the factors.
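Statistical software normally computes factor scores for you, but the idea can be sketched in Python. The example below uses the regression method, one of several scoring methods, and rests on assumptions: the data are simulated from two known factors, the loadings passed to the function are the hypothetical ones used to generate the data (in practice you would use the loadings estimated by your EFA), and `regression_factor_scores` is a name chosen for this example.

```python
import numpy as np

def regression_factor_scores(data, loadings):
    """Regression-method factor scores for orthogonal factors."""
    data = np.asarray(data, dtype=float)
    # Standardize each variable.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    # Scoring weights: inverse correlation matrix times the loadings.
    weights = np.linalg.solve(corr, loadings)
    return z @ weights

# Simulated data from two known, uncorrelated factors.
rng = np.random.default_rng(42)
n_obs = 500
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
])
true_factors = rng.standard_normal((n_obs, 2))
unique_sd = np.sqrt(1.0 - (true_loadings ** 2).sum(axis=1))
data = true_factors @ true_loadings.T + rng.standard_normal((n_obs, 6)) * unique_sd

scores = regression_factor_scores(data, true_loadings)

# How closely do the estimated scores track the simulated factors?
score_factor_corr = [
    float(np.corrcoef(scores[:, j], true_factors[:, j])[0, 1]) for j in range(2)
]
print("Correlation with true factors:", np.round(score_factor_corr, 2))
```

Each column of `scores` can then be used like any other variable, for example as a predictor in a regression or as input to a cluster analysis. Because factor scores are estimates, they carry some error of their own.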

What is exploratory factor analysis?

Exploratory factor analysis (EFA) is a statistical method that aims to discover the underlying structure of a set of observed variables by extracting one or more latent factors that account for their common variance. EFA is a technique within factor analysis that aims to identify the underlying relationships between measured variables. 

How do you use exploratory factor analysis in research?

You can use exploratory factor analysis (EFA) in research to understand the relationships between variables, develop questions about your research topics, and identify latent variables. EFA can help you to reduce the dimensionality of your data, identify hidden patterns or dimensions in your data, test hypotheses about the relationships among variables, simplify data interpretation, improve measurement reliability, and validate scales or instruments. 

What is the difference between EFA and CFA?

EFA and CFA are both types of factor analysis, but they have different assumptions and goals. EFA assumes that some latent factors account for the common variance among the observed variables, while CFA tests whether a predefined model or theory fits the data. EFA does not require specifying how many factors there are and which variables are associated with which factors, while CFA does. EFA is an exploratory and data-driven method that allows the data to reveal the factor structure, while CFA is a confirmatory and theory-driven method that tests the fit of the factor structure.

How do you interpret exploratory factor analysis results?

To interpret exploratory factor analysis (EFA) results, you need to look at the following information:

  • The number of factors to retain indicates how many latent factors are needed to represent your data. You can use different criteria to choose the number of factors, such as eigenvalue, scree plot, parallel analysis, variance explained, and interpretability.
  • The factor extraction method indicates how the factors are extracted from the correlation matrix of observed variables. You can use different methods of factor extraction, such as principal component analysis, common factor analysis, principal axis factoring, or maximum likelihood.
  • The factor rotation method indicates how the factors are rotated to achieve a simpler, more interpretable structure. You can use different methods of factor rotation, such as orthogonal or oblique rotation and varimax, quartimax, equamax, oblimin, or promax rotation.
  • The amount of variance explained by each factor indicates how much common variance in the observed variables is accounted for by each factor. You can use this information to assess the importance and relevance of each factor.
  • The factor loading matrix shows how each variable relates to each factor. A high factor loading (absolute value above 0.4) means that the variable has a strong association with the factor, while a low factor loading (absolute value below 0.4) means that the variable has a weak or no association with the factor.
  • The names and descriptions of each factor indicate what each factor represents based on their factor loadings and theoretical relevance. You can use your judgment and evidence to name and describe each factor clearly and logically.

What is the main purpose of EFA?

The main purpose of EFA is to explore the underlying structure of a set of observed variables by extracting one or more latent factors that account for their common variance. EFA can help you to find out what are the hidden dimensions or constructs that explain why some variables are correlated with each other.

What is the p-value in exploratory factor analysis?

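The p-value in exploratory factor analysis (EFA) comes from a chi-square goodness-of-fit test of whether the chosen number of factors is sufficient to reproduce the observed correlations. It is usually obtained from the maximum likelihood (ML) factor extraction method. A non-significant p-value (for example, p > 0.05) suggests that the factor model fits the data adequately, while a significant p-value suggests that more factors may be needed. With large samples, the test tends to reject even models that fit well, so the p-value should be read alongside the other criteria.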

About the author

Zubair Goraya
Ph.D. Scholar | Certified Data Analyst | Blogger | Completed 5000+ data projects | Passionate about unravelling insights through data.