
Psychometric evaluation of an Adverse Childhood Experiences (ACEs) measurement tool: an equitable assessment or reinforcing biases?



The use of Adverse Childhood Experiences (ACEs) measurement scales to assess youths’ adversities has expanded exponentially in health and justice studies. However, most ACEs assessment scales have yet to meet critical psychometric standards, especially for key demographic and minority groups. It is critical that any assessment or screening tool does not reinforce bias, which warrants validating ACEs tools that are equitable, reliable and accurate. The current study aimed to examine the structural validity of an ACEs scale. Using data from the 2019 Behavioral Risk Factor Surveillance System (BRFSS), which collected 97,314 responses from adults across sixteen states, this study assessed the psychometric properties and measurement invariance of the ACEs tool under the structural equation modeling framework.


We found that the 11-item ACEs screening tool is best represented as a second-order factor with three subscales, all of which passed the measurement invariance tests at the metric and scalar levels across age, race, sex, socioeconomic status, gender identity, and sexual orientation. We also found that minority groups experienced more childhood adversity, with small effect sizes, with the exception of gender identity.


The ACEs measurement scale from the BRFSS is equitable and free from measurement bias regardless of one’s age, race, sex, socioeconomic status, gender identity, and sexual orientation, and thus is valid for comparing group mean differences across these groups. The scale is a potentially valid, viable, and predictive risk assessment for health, justice, and research settings to identify high-risk groups or individuals for treatment.


A relatively recent public health and justice concept, adverse childhood experiences (ACEs) (Anda et al., 2010; Ford et al., 2020), is defined as “potentially traumatic events that occur in childhood” (Centers for Disease Control and Prevention, 2022). The American Academy of Pediatrics’ (AAP’s) policy statement encourages pediatricians to screen children and adolescents early for ACEs and toxic stress (Committee on Psychosocial Aspects of Child and Family Health et al., 2012). ACEs have been found to be associated with increased physical and mental illness through engagement in health-risk behaviors (Baldwin et al., 2021a; Centers for Disease Control and Prevention, 2022; Hughes et al., 2021), and have been linked to $748 billion in related health costs (Bellis et al., 2019). Recently, the COVID-19 pandemic worsened youths’ ACEs and toxic stress (Ortiz et al., 2022), as countries around the world implemented lockdown policies, closed schools, and disrupted governmental and private services, which left many children unprotected. Several countries have started implementing ACEs screening through either universal (e.g., well-child care) or targeted platforms (e.g., pediatricians). ACE-induced health issues are unlikely to be neutralized without appropriate treatments and interventions; however, the limited studies available suggest that screening for ACEs improves adversity identification and the receipt of community-based services.

ACE assessment, structural validity and measurement invariance/bias

The bridge between adversity identification, risk assessment and intervention/treatment referral or resource allocation is the ACEs screening and assessment tools (Gordon et al., 2020). Despite the utility of ACE assessment or screening, no instrument has accumulated sufficient psychometric evidence to demonstrate its superiority in terms of predictive accuracy and economic viability (see Footnote 1) (Loveday et al., 2022). Moreover, many methodological concerns about ACEs assessment remain to be resolved (Holden et al., 2020). The most fundamental methodological concern is how well ACEs are assessed, that is, the validity of the assessment itself (Holden et al., 2020). This concern is two-fold. First, “what is the underlying factor structure of childhood adversities?” and second, “does the instrument demonstrate measurement invariance,” or “is the instrument equally appropriate for assessing adversity from a variety of individuals?” (Holden et al., 2020, p. 169).

While the first question pertains to the structural validity of the ACEs assessment, the second question deals with measurement bias embedded in the assessment instrument itself, which would produce biased estimates for key demographic groups. While there are twenty different versions of the ACEs assessment scales, ranging from 8 to 70 items per instrument, only four studies have explored the structural validity of the ACEs instruments, of which only three investigated measurement invariance/bias across certain demographic groups, such as age and sex (see Holden et al., 2020).

Briefly, when evaluating measurement invariance, researchers must provide at least three levels or tiers (configural, metric and scalar) of evidence to claim that the assessment instrument is not biased toward any subgroup (Ford et al., 2014). Nevertheless, one of the ACEs assessment scale validation studies claimed to have achieved measurement invariance when, in fact, measurement invariance failed at the scalar level for youth gender groups (girls and boys) (Meinck et al., 2017). The other two studies tested and passed measurement invariance for two different versions of ACEs scales (from the Panel Study of Income Dynamics and the BRFSS project) across gender and age (Olofson, 2018; Ford et al., 2014). Unfortunately, the equality of the assessment across group memberships, especially for socially disadvantaged minority groups (i.e., sexual and gender minorities), has yet to be tested and validated. There is evidence that disadvantaged or minority groups are more likely to suffer various types of early childhood adversities (Centers for Disease Control and Prevention, 2022). Although disadvantaged groups might score higher on the ACEs assessment scales, without measurement invariance validation the differences found between disadvantaged and non-disadvantaged groups might be artificial, a product of measurement biases embedded in the assessment itself.

Health equity tourism

Since the seminal research on ACEs (Felitti et al., 1998), ACEs studies on justice and health outcomes have proliferated (Baglivio et al., 2014). Researchers have recognized the interrelationship between ACEs and social inequalities (McEwen & Gregerson, 2019; Racine et al., 2022). However, in order to address ACEs among the youth population through preventive measures, both clinicians and researchers must have an assessment tool that accurately measures the underlying ACEs constructs across all key demographic groups.

Lett and colleagues (2022) defined ‘health equity tourism’ as researchers jumping on the bandwagon of equity research to pursue health and justice publications or funding without investigating the sources of inequality (i.e., structural racism). ACEs research to date has not resolved the measurement methodology issues (i.e., measurement invariance) of ACEs assessment, which calls into question the validity of findings from ACEs studies, especially for minority groups (Holden et al., 2020). Without empirical evidence that an ACE assessment scale is unbiased across disadvantaged groups, the ACE assessment might reinforce social inequality instead of addressing and improving it, because the assessment contents and items might be inappropriate for assessing adversity in minority groups.

Current study

Therefore, to fill the gap in the literature regarding the lack of critical evaluation of the ACEs assessment scales used by researchers and clinical professionals, we investigate the structural validity of the ACEs scale from the BRFSS while evaluating measurement bias for vulnerable and marginalized populations, such as racial and ethnic minorities, people with lower socioeconomic status, and sexual and gender minorities. We selected the ACEs scale from the BRFSS because it is one of the few promising instruments that has accumulated considerable psychometric evidence (Holden et al., 2020). This instrument can also be used in a self-report format, which has been used to generate a nationally representative sample to obtain external validity. The instrument is economical, with only 11 items, so it can easily be adopted and incorporated into many research projects without burdening the participants. In this study, we therefore attempt to validate the structural validity and the absence of measurement bias of this ACEs assessment scale from the BRFSS with a nationally representative sample. We hypothesized that this ACEs assessment scale is free from measurement bias across all major minority groups.


Study sample

In this study, we evaluated the internal latent structure (structural validity) and the measurement bias of the ACEs scale using data from the 2019 Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS was initiated by the Centers for Disease Control and Prevention (CDC) in 1984. According to the BRFSS Data User Guide (2013), state health departments, assisted by the CDC, conduct yearly telephone surveys to collect data with standard protocols on adults’ risk behaviors, preventive health practices, and health status. Each annual sample contains more than 4,000 telephone interviews per state. The BRFSS uses a stratified random sampling approach with a weighting protocol, ensuring generalizability and representativeness on many demographic characteristics, such as sex, age, race and education. We used the 2019 BRFSS sample (N = 97,314), which collected ACEs assessments from sixteen states, including Alabama, Delaware, Florida, Indiana, Iowa, Michigan, Mississippi, Missouri, New Mexico, North Dakota, Pennsylvania, Rhode Island, South Carolina, Tennessee, Virginia, West Virginia, and Wisconsin. The sample characteristics are reported in Table 1.

Table 1 Sample descriptive statistics (N = 97,314)


The outcome measure is the ACEs scale, which contains eleven binary and ordinal items assessing whether an individual suffered various types of childhood abuse, such as physical, verbal, and sexual abuse, or experienced traumatic events, such as parental incarceration and separation. The full item descriptive statistics are reported in Table 2.

Table 2 ACEs Item descriptive statistics (N = 97,314)

When testing the ACEs scale’s measurement bias, we used six nominal grouping variables: age, race, sex, socioeconomic status, sexual identity, and sexual orientation. Age was operationalized into six categories, including “18–24,” “25–34,” “35–44,” “45–54,” “55–64” and “65+.” Biological sex was operationalized as either “male” or “female.” Income was operationalized into six categories, including “less than 15,000,” “15,000 to less than 25,000,” “25,000 to less than 35,000,” “35,000 to less than 50,000,” “50,000+” (see Footnote 2), and “Don’t know/Not sure/Missing.” Race was operationalized into five categories, including “white only, non-Hispanic,” “black only, non-Hispanic,” “other race only, non-Hispanic,” “multiracial, non-Hispanic,” and “Hispanic.” Sexual orientation was measured as “straight” and “others,” the latter including gay, bisexual, something else, and “I don’t know the answer.” Sexual identity was measured as “not transgender” and “transgender.”

Analytical Strategy

We first conducted an Exploratory Factor Analysis (EFA) to discover the underlying factorial pattern. Second, we conducted a sequential Multi-group Confirmatory Factor Analysis (MGCFA) to confirm the suggested factorial pattern. We extracted a second-order factor through higher-order modeling when we identified that the factors shared a substantial amount of common variance (Chen et al., 2005; Putnick & Bornstein, 2016). Once the internal latent structure of the ACEs scale was identified, we tested three essential forms of measurement invariance (configural, metric, and scalar) across all the group memberships (Schmitt & Kuljanin, 2008). Moreover, we reported the latent mean difference (i.e., true mean difference), represented by Cohen’s d (Fritz et al., 2012), across all six group memberships. We followed the interpretations provided by Cohen (1988) when evaluating the effect size of the mean difference: small (0.20), medium (0.50), and large (0.80). In addition, we fitted a common factor model when measurement invariance was achieved at all three invariance levels.
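As an illustration of the effect size benchmarks above, the following minimal sketch computes Cohen's d from summary statistics using a pooled standard deviation and labels its magnitude. It is not the study's procedure: the study reports latent mean differences estimated within the MGCFA models, whereas this example works on observed group means. The function names and the input values in the usage note are hypothetical.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Standardized mean difference (group A minus group B), pooled SD."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def label_effect_size(d: float) -> str:
    """Label |d| against Cohen's (1988) benchmarks of 0.20, 0.50 and 0.80."""
    magnitude = abs(d)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.20:
        return "small"
    # Label chosen for this sketch; values under 0.20 fall short of the "small" benchmark.
    return "below the small benchmark"
```

For example, with hypothetical inputs, `cohens_d(1.0, 1.0, 50, 0.5, 1.0, 50)` gives 0.5, which `label_effect_size` labels as "medium".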

We followed guidelines for testing sequences of measurement invariance and higher-order factors (Chen et al., 2005; Rudnev et al., 2018). The fixed factor approach was used, and we followed the model specification and identification suggestions of previous studies (Byrne & Stewart, 2006; Millsap & Yun-Tein, 2004). We performed omnibus tests for higher-order modeling and measurement invariance and conducted further testing when the omnibus tests failed (Little, 2013). The Weighted Least Square Mean and Variance Adjusted (WLSMV) estimator was preferred because the items are categorical/ordinal and polytomous. The ‘Theta’ parameterization was selected because it allowed us to test all forms of measurement invariance (Muthén & Asparouhov, 2002). Because the items are categorical, we conducted all tests within the Item Factor Analysis (IFA)/Item Response Theory (IRT) framework (Thomas, 2011). Missing data were handled with the full information maximum-likelihood (FIML) approach with the WLSMV estimator when there was a non-substantial amount of data missing at random (Asparouhov & Muthen, 2010). FIML is superior to the listwise deletion, pairwise deletion and imputation approaches (Enders & Bandalos, 2001).

Next, we computed the coefficient omega (ω) to evaluate the construct reliability of the G-factor and subscales. The omega coefficient is advantageous over Cronbach’s alpha because it does not require the tau-equivalent (equal-loading) measurement structure that alpha assumes (Deng & Chan, 2017; Geldhof et al., 2014; Nájera Catalán, 2019) and it enables researchers to accurately evaluate the construct reliability of higher-order factors (Nájera Catalán, 2019). Thresholds of 0.65 for multidimensional (higher-order) measures and 0.80 for unidimensional (first-order) measures were used to determine the ‘acceptable’ level of construct reliability (Nájera Catalán, 2019).
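To make the reliability criterion concrete, the sketch below computes coefficient omega for a single factor from standardized loadings, assuming a congeneric model with uncorrelated residuals, and checks the result against the thresholds stated above. It is a simplified illustration only: omega for a higher-order factor requires the full second-order solution, which this example does not model. The function names and the loadings in the usage note are hypothetical.

```python
from typing import Sequence


def coefficient_omega(standardized_loadings: Sequence[float]) -> float:
    """Omega for one factor: (sum of loadings)^2 over total variance.

    Assumes standardized loadings and uncorrelated residuals, so each
    item's residual variance is 1 - loading^2.
    """
    common = sum(standardized_loadings) ** 2
    residual = sum(1.0 - loading ** 2 for loading in standardized_loadings)
    return common / (common + residual)


def is_acceptable(omega: float, multidimensional: bool) -> bool:
    """Apply the thresholds from the text: 0.65 (higher-order), 0.80 (first-order)."""
    threshold = 0.65 if multidimensional else 0.80
    return omega >= threshold
```

With three hypothetical loadings of 0.7, `coefficient_omega([0.7, 0.7, 0.7])` is roughly 0.74, which would fall short of the 0.80 threshold for a unidimensional measure but exceed the 0.65 threshold for a multidimensional one.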

When evaluating the goodness of the EFA model, we followed the standard practice of considering both theory and empirical evidence, such as the Kaiser-Guttman rule and goodness of fit, to determine the number of factors (Brown, 2015). For item loadings and cross-loadings, we followed Comrey and Lee’s (1992) guidelines, under which the strength of loadings and cross-loadings ranges from poor (.32), fair (.45), good (.55), and very good (.63) to excellent (.71). When evaluating the goodness of the CFA models, we compared item and factor loadings/cross-loadings with the same standard loading thresholds of poor (.32), fair (.45), good (.55), very good (.63), and excellent (.71) (Tabachnick et al., 2007). Model fit is ‘acceptable’ if the Comparative Fit Index (CFI) and Tucker Lewis Index (TLI) are equal to or greater than .90 and the Root Mean Square Error of Approximation (RMSEA) is equal to or less than .08. Model fit is ‘good’ when the CFI and TLI equal or exceed .95 and the RMSEA is equal to or less than .05 (Brown, 2015; Little, 2013). Models were evaluated with constraints added progressively to each successive model for the higher-order and group invariance tests. Higher-order models and those with additional measurement invariance constraints were retained if the ∆CFI and ∆TLI values were equal to or less than .01, indicating that the nested higher-order modeling or additional measurement invariance constraints did not produce any detrimental effect on the models (Little, 2013).
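The decision rules in this paragraph can be written down as a short sketch. The code below encodes the CFI/TLI/RMSEA cut-offs and the ∆CFI/∆TLI retention rule exactly as stated above; it does not fit any model, and the function names and the "below acceptable" label are assumptions made for illustration. Differences are rounded to three decimals before comparison so that floating-point noise does not reject a model sitting exactly at the .01 boundary.

```python
def classify_fit(cfi: float, tli: float, rmsea: float) -> str:
    """Classify model fit using the thresholds stated in the text."""
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.05:
        return "good"
    if cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08:
        return "acceptable"
    return "below acceptable"  # label chosen for this sketch


def retain_constrained_model(cfi_base: float, tli_base: float,
                             cfi_constrained: float, tli_constrained: float,
                             tolerance: float = 0.01) -> bool:
    """Retain the more constrained model if fit drops by no more than .01."""
    delta_cfi = round(cfi_base - cfi_constrained, 3)
    delta_tli = round(tli_base - tli_constrained, 3)
    return delta_cfi <= tolerance and delta_tli <= tolerance
```

A typical use would be to call `retain_constrained_model` once per step of the configural, metric, and scalar sequence, passing the fit indices of the less constrained model as the baseline.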


We identified the ACEs scale as a second-order model with three subscales. The EFA suggested a two-factor model because there were two eigenvalues above 1, yet the SRMR was not ideal (CFI = .985, TLI = .976, RMSEA = .027, SRMR = .055). Compared with the two-factor model, the three-factor model showed improvement in all model fit indices, with ∆CFI and ∆TLI above .010 (CFI = .997, TLI = .993, RMSEA = .015, SRMR = .023). The assessment content is aligned with the factorial pattern suggested by the three-factor model, assessing three sub-types of ACEs: household dysfunction, emotional/physical abuse, and sexual abuse.

Next, we retained the three-factor model and subjected the measurement model to measurement invariance tests and higher-order model tests. We successfully extracted a second-order model, as the second-order model did not produce detrimental model fit (with ∆CFI and ∆TLI below .010) compared to the measurement model across all six grouping models. Also, as shown in Table 3, the second-order model passed all three levels of invariance (i.e., configural, metric, and scalar) for all six groups, as the ∆CFI and ∆TLI did not exceed .010 for any model. Finally, we fitted a common factor model, combining all the groups, and the final model fit met all thresholds to be considered at least “acceptable” (CFI = .986, TLI = .985, RMSEA = .021, SRMR = .066). The reliability of the ACE scale reached an acceptable level, passing the threshold of .65 for multidimensional measures (ω = .906). We provide a visual illustration of the final ACE model in Fig. 1.

Table 3 Measurement invariance tests across age, race, sex, income, sexual identity, and sexual orientation groups
Table 4 Latent mean difference for ACEs across age, race, sex, income, sexual identity, and sexual orientation
Fig. 1 Final model of Adverse Childhood Experiences (ACEs)

Next, we present the true score differences across the six group memberships in Table 4. We found that, compared with people aged 18 to 24, people aged 45 to 54 (d = .02, p < .05), 55 to 64 (d = .04, p < .001), and 65 or older (d = .14, p < .001) reported significantly lower ACEs scores. Compared to non-Hispanic whites, non-Hispanic Black (d = .08, p < .001), non-Hispanic multiracial (d = .15, p < .001), and Hispanic (d = .06, p < .001) participants scored significantly higher. Females scored higher on ACEs than male participants (d = .08, p < .001). Compared to people whose income was less than $15,000, people in higher-income groups scored significantly lower on ACEs (d = .04 to .16, p < .001). Compared to heterosexual people, sexual minorities scored significantly higher (d = .18, p < .001). Gender minorities (i.e., people who identified as transgender) scored higher than people who are cisgender (d = .18), yet the mean difference was not statistically significant (p > .05). With Cohen’s d below .20, all the statistically significant differences we found were small.


The current study made several contributions. First, consistent with a previous study that used an earlier version of the BRFSS data (Ford et al., 2014), we found that the CDC’s ACEs scale contains three subscales: household dysfunction, emotional/physical abuse, and sexual abuse. Compared to the ACEs total score, each of its subscales has fewer items and, therefore, less variation and range. We advocate using the composite score of the ACEs scale with all items for screening instead of using the three subscales separately, because the common variance of the three subscales can be explained by one underlying factor, namely ACEs, through second-order modeling. Given that each of the three subscales has a limited number of items and range, and that the utility of the subscales is yet to be fully explored, greater weight should be given to the entire ACE assessment in clinical screening practice and public health research. Once the screening is completed, clinical practitioners could use more extensive and comprehensive tools to fully assess youths’ ACEs and to determine which type or subtype of ACEs is the most stressful and traumatic for the youths. The findings of the current study demonstrate that the ACEs assessment instrument can provide clinicians with a potentially promising, viable and economical screening tool to assess ACEs.

Second, the ACE scale passed the three levels of invariance tests (i.e., configural, metric, and scalar) across six group memberships, indicating that the ACEs assessment is equitable and free from measurement bias regardless of one’s age, race, sex, socioeconomic status, sexual identity, and sexual orientation. In other words, the ACEs scale is a valid screening tool to assess the group mean differences within these groups.

Third, since the ACEs scale is invariant, we used it to examine the group differences in age, race, gender, income, sexual identity, and sexual orientation. We found evidence suggesting that as age increases, ACEs scores decrease, although the differences from the 18–24 reference group were statistically significant only for people aged 45 and older. Given that the data were collected through the participants’ memory, there was an increased risk of recall bias for people aged 45 and older, suggesting that the ACEs scale might not be suitable for clinical and research use with individuals older than 45 years.

Furthermore, previous studies on group differences in ACEs, such as differences by gender, race, and sexual minority status, often examined different types of ACEs separately (Andersen & Blosnich, 2013; Fang et al., 2016; Lee & Chen, 2017), and this study filled this gap by examining group differences in ACEs as a single construct. Consistent with previous findings, we found that non-Hispanic black, Hispanic, and non-Hispanic multiracial people reported higher ACEs, which indicates that people from racial minority groups experienced more adverse childhood experiences than white people. Similar to a previous study that found females were at greater risk of multiple types of ACEs (Fang et al., 2016), females in this study reported higher ACEs than males. In addition, we found that people’s socioeconomic status is significantly and negatively associated with ACEs.

Moreover, gender minorities reported higher ACEs than people who are cisgender. However, such a relationship is not statistically significant. Also, sexual minorities scored higher than heterosexual people. A possible explanation is that the difference and disparity can be attributed to structural racism (Dougherty et al., 2020). Alternatively, multi-level (micro and macro) and multisystem (family and neighborhood) characteristics could also explain said disparities. Unfortunately, without adequately designed research, the challenges of explaining the health disparity cannot be properly investigated in the current study (Jeffries et al., 2019).

We found that the effect sizes of the reported group differences are small. Overall, the findings support the theory that vulnerable populations, including women, young adults, racial and ethnic minorities, people lower on the socioeconomic ladder, and LGBT groups, suffered more adverse, traumatic, physical, psychological, and sexual abuse in their early lives. Due to the limited scope of this research, we did not examine whether people at the intersection of multiple disadvantaged groups experience more ACEs. Given the current findings, it is reasonable to speculate that youths who belong to multiple disadvantaged groups could have experienced more ACEs than the non-disadvantaged population.

The current public justice and health systems might not have the capacity to address the needs of all individuals (McLennan et al., 2020). Fortunately, with this validated ACEs scale, it is possible to accurately identify high-risk vulnerable individuals. Also, the traditional prevention strategy framework recognizes that children at higher risk should be prioritized to receive prevention treatments (Brennan et al., 2020). The traditional prevention strategy often considers race, sex, sexual orientation, sexual identity, socioeconomic status and age independently and therefore fails to address the multiple intersecting needs of individuals (Qureshi et al., 2022). The findings of this research therefore call for critical examination of the underlying structures and factors that contribute to the disparities, and of how prevention programs could be tailored to the multiple intersecting, higher-than-average needs of minority populations who likely belong to multiple disadvantaged groups.


The current study has several limitations. First, not all states collected data on ACEs in the 2019 study. Although the sample is large, the generalizability of the findings to the entire U.S. population remains to be further validated. Even with more states’ participation, the generalizability of the results is still confined to the U.S. and North America. Second, due to the limited scope, the predictive accuracy of the CDC’s (11-item) version of the ACEs scale (both the composite total and the subscale scores) remains to be tested in future research across various justice and health outcomes as well as across various groups of children, as previous research demonstrated a lack of prediction precision for health issues (Baldwin et al., 2021b). Future research could maximize the predictive accuracy of the ACEs scale by using more sophisticated weighting schemes based on its empirical relationship with the health outcomes of interest to further support screening practices (Holden et al., 2020). Future research could use longitudinal instead of cross-sectional data to validate the precision of the ACE assessment when used to predict or explain justice or health outcomes, such as illegal substance abuse, chronic disease, and mental health. Researchers may even consider a more complex model to account for the mediating or neutralizing effect of positive childhood experiences on ACEs (Ortiz et al., 2022).

Third, the data probably underestimated the prevalence of ACEs, especially for sexual minorities, because of housing insecurity or instability (Tran et al., 2022). It is difficult to estimate the prevalence of homelessness because of the heterogeneity of the samples in previous studies. Yet it is evident that sexual minorities in general are at disproportionately higher risk of homelessness (Corliss et al., 2011). Therefore, sexual minorities are likely underrepresented in the 2019 BRFSS, which is a household sample. The low frequency of gender minority individuals in our sample might have produced the non-significant group mean difference between gender identity groups. Future research should reinvestigate the ACEs difference between such groups with larger samples.

Also, the data were collected from adults’ recollections and therefore likely further underestimated the prevalence, especially for the older population (Tran et al., 2022). Next, the forecasting utility of the CDC’s ACEs screening instrument remains to be validated among youth and child populations. Hence, longitudinal research tracking youths’ health development over time might offer more definitive evidence (Lacey et al., 2022). In addition, this study used EFA to identify the subscales of the ACE scale and used (multigroup) CFA to confirm the ACEs constructs with the same sample. Although the results are unlikely to differ from the current findings, future research could revalidate the ACEs scale from the BRFSS with new data.

Last, researchers have identified that most of the current ACEs instruments (including the CDC’s version) did not follow scale creation processes and standards and therefore lacked construct validity. Although the current research offers convincing evidence to support the potential utility of the scale as a screening instrument, it still lacks content validity because of the limited number of items (n = 11) and measured types of adversity (Holden et al., 2020). While using existing tools with more screening content (Brennan et al., 2020) or expanding the content validity of the ACE scale might be beneficial, the CDC’s version of the ACEs scale might be offered to health platforms as an economically viable screening tool once it demonstrates the preferred level of prediction precision for various health outcomes. The limited range and assessment content of the tool might be further mitigated with more sophisticated psychometric methods, which allow clinicians and researchers to produce weighted latent factor scores (Grice, 2001). Applying more sophisticated machine learning techniques, which have been used in justice settings to predict health outcomes such as substance abuse or drug crimes (Hamilton et al., 2021), might offer a sizeable boost to predictive accuracy in clinical practice. Once the screening is completed and high-risk individuals are identified, a complete or more comprehensive ACE clinical assessment might be employed to assess toxic stress risk (Harris, 2020).

Availability of data and materials

The datasets analyzed during the current study are available in the 2019 Behavioral Risk Factor Surveillance System (BRFSS;


  1. Economic viability refers to whether the tool is short yet reliable and valid, so that it can be used to assess ACEs in a relatively cost-effective fashion.

  2. We recognize that this categorization does not further distinguish people of higher socioeconomic status (e.g., 80,000+ or 150,000+). Unfortunately, this is one of the limitations of this study, which used secondary data (i.e., the BRFSS) for analyses.


  • Anda, R. F., Butchart, A., Felitti, V. J., & Brown, D. W. (2010). Building a Framework for Global Surveillance of the Public Health Implications of adverse childhood experiences. American Journal of Preventive Medicine, 39(1), 93–98.


  • Andersen, J. P., & Blosnich, J. (2013). Disparities in adverse childhood experiences among sexual minority and heterosexual adults: results from a multi-state probability-based sample. PLOS ONE, 8(1), e54691.


  • Asparouhov, T., & Muthen, B. (2010). Weighted Least Squares Estimation with Missing Data (p. 10)


  • Baglivio, M. T., Epps, N., Swartz, K., Huq, M. S., Sheer, A., & Hardt, N. S. (2014). The prevalence of adverse childhood experiences (ACE) in the lives of juvenile offenders. Journal of Juvenile Justice, 3(2), 1–17.

  • Baldwin, J. R., Caspi, A., Meehan, A. J., Ambler, A., Arseneault, L., Fisher, H. L., Harrington, H., Matthews, T., Odgers, C. L., Poulton, R., Ramrakha, S., Moffitt, T. E., & Danese, A. (2021a). Population vs Individual Prediction of Poor Health from results of adverse childhood Experiences Screening. JAMA Pediatrics, 175(4), 385–393.


  • Baldwin, J. R., Caspi, A., Meehan, A. J., Ambler, A., Arseneault, L., Fisher, H. L., Harrington, H., Matthews, T., Odgers, C. L., Poulton, R., Ramrakha, S., Moffitt, T. E., & Danese, A. (2021b). Population vs Individual Prediction of Poor Health from results of adverse childhood Experiences Screening. JAMA Pediatrics, 175(4), 385–393.


  • Bellis, M. A., Hughes, K., Ford, K., Rodriguez, G. R., Sethi, D., & Passmore, J. (2019). Life course health consequences and associated annual costs of adverse childhood experiences across Europe and North America: a systematic review and meta-analysis. The Lancet Public Health, 4(10), e517–e528.


  • Brennan, B., Stavas, N., & Scribano, P. (2020). Effective prevention of ACEs. In Adverse Childhood Experiences (pp. 233–264). Elsevier.


  • Brown, T. A. (2015). Confirmatory factor analysis for applied research. Guilford Publications.

  • Byrne, B. M., & Stewart, S. M. (2006). Teacher’s corner: the MACS approach to testing for multigroup invariance of a second-order structure: a walk through the process. Structural Equation Modeling, 13(2), 287–321.

  • Centers for Disease Control and Prevention (2013). The BRFSS Data User Guide. Retrieved from

  • Centers for Disease Control and Prevention. (2022). Fast facts: preventing adverse childhood experiences. National Center for Injury Prevention and Control, Division of Violence Prevention.

  • Chen, F. F., Sousa, K. H., & West, S. G. (2005). Teacher’s corner: testing measurement invariance of second-order factor models. Structural Equation Modeling, 12(3), 471–492.

  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, 18–74.

  • Committee on Psychosocial Aspects of Child and Family Health, Committee on Early Childhood, Adoption, and Dependent Care, Section on Developmental and Behavioral Pediatrics, Garner, A. S., Shonkoff, J. P., Siegel, B. S., Dobbins, M. I., Earls, M. F., McGuinn, L., Pascoe, J., & Wood, D. L. (2012). Early childhood adversity, toxic stress, and the role of the pediatrician: translating developmental science into lifelong health. Pediatrics, 129(1), e224–e231.

  • Comrey, A. L., & Lee, H. B. (1992). Interpretation and application of factor analytic results. In A first course in factor analysis (2nd ed.).

  • Corliss, H. L., Goodenow, C. S., Nichols, L., & Austin, S. B. (2011). High burden of homelessness among sexual-minority adolescents: findings from a representative Massachusetts high school sample. American Journal of Public Health, 101(9), 1683–1689.

  • Deng, L., & Chan, W. (2017). Testing the difference between reliability coefficients alpha and omega. Educational and Psychological Measurement, 77(2), 185–203.

  • Dougherty, G. B., Golden, S. H., Gross, A. L., Colantuoni, E., & Dean, L. T. (2020). Measuring structural racism and its Association with BMI. American Journal of Preventive Medicine, 59(4), 530–537.

  • Enders, C. K., & Bandalos, D. L. (2001). The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling, 8(3), 430–457.

  • Fang, L., Chuang, D. M., & Lee, Y. (2016). Adverse childhood experiences, gender, and HIV risk behaviors: results from a population-based sample. Preventive Medicine Reports, 4, 113–120.

  • Felitti, V. J., Anda, R. F., Nordenberg, D., Williamson, D. F., Spitz, A. M., Edwards, V., & Marks, J. S. (1998). Relationship of childhood abuse and household dysfunction to many of the leading causes of death in adults: the adverse childhood experiences (ACE) study. American Journal of Preventive Medicine, 14(4), 245–258.

  • Ford, D. C., Merrick, M. T., Parks, S. E., Breiding, M. J., Gilbert, L. K., Edwards, V. J., Dhingra, S. S., Barile, J. P., & Thompson, W. W. (2014). Examination of the factorial structure of adverse childhood experiences and recommendations for three subscale scores. Psychology of Violence, 4(4), 432–444.

  • Ford, K., Bellis, M. A., Hughes, K., Barton, E. R., & Newbury, A. (2020). Adverse childhood experiences: a retrospective study to understand their associations with lifetime mental health diagnosis, self-harm or suicide attempt, and current low mental wellbeing in a male Welsh prison population. Health & Justice, 8(1), 13.

  • Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect size estimates: current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141(1), 2.

  • Geldhof, G. J., Preacher, K. J., & Zyphur, M. J. (2014). Reliability estimation in a multilevel confirmatory factor analysis framework. Psychological Methods, 19(1), 72.

  • Gordon, J. B., Nemeroff, C. B., & Felitti, V. (2020). Screening for adverse childhood experiences. Journal of the American Medical Association, 324(17), 1789.

  • Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6(4), 430–450.

  • Hamilton, Z., Kigerl, A., & Kowalski, M. (2021). Prediction is local: the benefits of Risk Assessment optimization. Justice Quarterly, 1–23.

  • Harris, N. B. (2020). Screening for adverse childhood experiences. Journal of the American Medical Association, 324(17), 1788.

  • Holden, G. W., Gower, T., & Chmielewski, M. (2020). Methodological considerations in ACEs research. In Adverse Childhood Experiences (pp. 161–182). Elsevier.

  • Hughes, K., Ford, K., Bellis, M. A., Glendinning, F., Harrison, E., & Passmore, J. (2021). Health and financial costs of adverse childhood experiences in 28 European countries: a systematic review and meta-analysis. The Lancet Public Health, 6(11), e848–e857.

  • Jeffries, N., Zaslavsky, A. M., Diez Roux, A. V., Creswell, J. W., Palmer, R. C., Gregorich, S. E., Reschovsky, J. D., Graubard, B. I., Choi, K., Pfeiffer, R. M., Zhang, X., & Breen, N. (2019). Methodological approaches to understanding causes of Health Disparities. American Journal of Public Health, 109(S1), S28–S33.

  • Lacey, R. E., Howe, L. D., Kelly-Irving, M., Bartley, M., & Kelly, Y. (2022). The clustering of adverse childhood experiences in the Avon Longitudinal Study of parents and children: are gender and poverty important? Journal of Interpersonal Violence, 37(5–6), 2218–2241.

  • Lee, R. D., & Chen, J. (2017). Adverse childhood experiences, mental health, and excessive alcohol use: examination of race/ethnicity and sex differences. Child Abuse & Neglect, 69, 40–48.

  • Lett, E., Adekunle, D., McMurray, P., Asabor, E. N., Irie, W., Simon, M. A., Hardeman, R., & McLemore, M. R. (2022). Health Equity Tourism: ravaging the Justice Landscape. Journal of Medical Systems, 46(3), 17.

  • Little, T. D. (2013). Longitudinal structural equation modeling. Guilford Press.

  • Loveday, S., Hall, T., Constable, L., Paton, K., Sanci, L., Goldfeld, S., & Hiscock, H. (2022). Screening for adverse childhood experiences in children: a systematic review. Pediatrics, 149(2), e2021051884.

  • McEwen, C. A., & Gregerson, S. F. (2019). A critical assessment of the adverse childhood experiences study at 20 years. American Journal of Preventive Medicine, 56(6), 790–794.

  • McLennan, J. D., McTavish, J. R., & MacMillan, H. L. (2020). Routine screening of ACEs: Should we or shouldn’t we? In Adverse Childhood Experiences (pp. 145–159). Elsevier.

  • Meinck, F., Cosma, A. P., Mikton, C., & Baban, A. (2017). Psychometric properties of the adverse childhood experiences abuse short form (ACE-ASF) among Romanian high school students. Child Abuse & Neglect, 72, 326–337.

  • Millsap, R. E., & Yun-Tein, J. (2004). Assessing factorial invariance in ordered-categorical measures. Multivariate Behavioral Research, 39(3), 479–515.

  • Muthén, B., & Asparouhov, T. (2002). Latent variable analysis with categorical outcomes: multiple-group and growth modeling in Mplus. Mplus Web Notes, 4(5), 1–22.

  • Nájera Catalán, H. E. (2019). Reliability, population classification and weighting in multidimensional poverty measurement: a Monte Carlo study. Social Indicators Research, 142(3), 887–910.

  • Olofson, M. W. (2018). A new measurement of adverse childhood experiences drawn from the panel study of income dynamics child development supplement. Child Indicators Research, 11(2), 629–647.

  • Ortiz, R., Gilgoff, R., & Harris, N. B. (2022). Adverse childhood experiences, toxic stress, and trauma-informed neurology. JAMA Neurology, 79(6), 539–540.

  • Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting: the state of the art and future directions for psychological research. Developmental Review, 41, 71–90.

  • Qureshi, I., Gogoi, M., Al-Oraibi, A., Wobi, F., Pan, D., Martin, C. A., Chaloner, J., Woolf, K., Pareek, M., & Nellums, L. B. (2022). Intersectionality and developing evidence-based policy. The Lancet, 399(10322), 355–356.

  • Racine, N., Afifi, T. O., & Madigan, S. (2022). Childhood adversity and the link between social inequality and early mortality. The Lancet Public Health, 7(2), e100–e101.

  • Rudnev, M., Lytkina, E., Davidov, E., Schmidt, P., & Zick, A. (2018). Testing Measurement Invariance for a Second-Order Factor: A Cross-National Test of the Alienation Scale. Methods, Data, Analyses, 30.

  • Schmitt, N., & Kuljanin, G. (2008). Measurement invariance: review of practice and implications. Human Resource Management Review, 18(4), 210–222.

  • Tabachnick, B. G., Fidell, L. S., & Ullman, J. B. (2007). Using multivariate statistics (Vol. 5). Boston, MA: Pearson.

  • Thomas, M. L. (2011). The value of item response theory in clinical assessment: a review. Assessment, 18(3), 291–307.

  • Tran, N. M., Henkhaus, L. E., & Gonzales, G. (2022). Adverse childhood Experiences and Mental Distress among US adults by sexual orientation. JAMA Psychiatry, 79(4), 377.


We thank Lijie Jia for assistance with arranging research meetings for the team and for active involvement in the conceptualization and execution of the research projects. We thank you for your interest in learning how health and justice research is conducted and reviewed. We thank Dr. Zachary Hamilton for editing the paper and providing insightful feedback and comments on how to improve the manuscript.


Not applicable

Author information

Authors and Affiliations



XM and JL were involved in the conceptualization of the paper; both analyzed the data and wrote parts of the methodology and results. ZL, SH, LL, YH and JL wrote the original draft. All authors revised, read and approved the final manuscript.

Corresponding author

Correspondence to Xiaohan Mei.

Ethics declarations

Ethics approval and consent to participate

This study used a publicly available dataset and was exempted by the Institutional Review Board (IRB) of the authors’ university.

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Mei, X., Li, J., Li, ZS. et al. Psychometric evaluation of an Adverse Childhood Experiences (ACEs) measurement tool: an equitable assessment or reinforcing biases?. Health Justice 10, 34 (2022).
