IPIP-NEO-120

The IPIP-NEO-120 is an optimized assessment of the five factor model of personality. It features the ability to measure six underlying facets for each of the five traits. The assessment strikes a nice balance when compared to other five factor assessments in terms of brevity and breadth while maintaining solid validity and reliability (see below). The test was created using items from the open source project called the International Personality Item Pool (IPIP).


Validity

In test theory, validation refers to the level of evidence that supports the ability of users to draw useful, accurate, and truthful inferences/interpretations from a measure. In other words, validity refers to an instrument measuring what it says it measures and the ability of results to generalize to real-world outcomes. Three types of validation include:

Content validity

Ability to infer from test measures to larger construct domains; the degree to which the instrument measures all facets of what it says it measures

According to Johnson (2014), the IPIP-NEO-120 is derived from the larger IPIP-NEO-300 item pool (see Goldberg, 1999) and was reduced to the top four items for each facet using the following techniques:
  • Reliability: kept items with the highest alpha coefficients: mean α of .84 (traits) and .68 (facets)
  • Rational-Intuitive:
    • Replace similar/duplicate items
    • Keep items that are similar to NEO-PI-R
    • Remove any legally questionable items
The same study confirms that the factor component loadings of the IPIP-NEO-120 are representative of the five major personality traits.

Criterion validity

Ability to make inferences from test scores to other real-world behaviors

[Figure: IPIP-NEO-120; Five Factors (OCEAN Traits); Individual, Interpersonal, and Social Life Outcomes]

As summarized in Soto (2019), effect sizes between the five factors and important life outcomes generally fall between r = .20 and .40 (significant at p < .05 to p < .01).

Construct validity

Ability to make inferences from test scores to various psychological constructs (e.g., characteristics grouped as personality traits); the overall confidence that a test measures what it claims to measure

The five factor theory was popularized by Norman (1963), who proposed an "adequate taxonomy of personality attributes" by factor analyzing scores from various personality trait scales into five distinct factors. Later, a test called the NEO was built on this body of work and used the familiar OCEAN trait categories; McCrae & Costa (1985) provided validation data for the NEO assessment. Costa & McCrae (1995) then provided validity evidence for six facet constructs within each major trait in the NEO-PI-R assessment tool.

Kajonius & Johnson (2019) document factor-analytic results of the IPIP-NEO-120 based on a large internet sample in the United States (n = 320,128). This analysis confirmed a five-trait structure with corresponding facets: Root Mean Square Error of Approximation (RMSEA) values for the five scales were 0.06 or less for all except Extraversion, which was reported at 0.07. Further, IPIP-NEO-120 scales are highly correlated with scales from longer-form assessments of the five factors (see the test characteristics table below).
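For readers unfamiliar with RMSEA, the index is commonly computed from a model's chi-square statistic, its degrees of freedom, and the sample size. The following is a minimal sketch of that common point-estimate formula, using the (N - 1) variant (some software divides by N instead). The function name and the input values are hypothetical and are not taken from Kajonius & Johnson (2019).

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA using the common (N - 1) formulation.

    chi_square: model chi-square statistic
    df: model degrees of freedom
    n: sample size
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must be greater than 1")
    # Misfit beyond what is expected by chance is floored at zero,
    # then scaled by degrees of freedom and sample size.
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical fit statistics, for illustration only.
example_rmsea = rmsea(chi_square=150.0, df=100, n=501)
print(round(example_rmsea, 3))
```

Lower values indicate closer fit, which is why the 0.06 and 0.07 values reported above are read as supporting the five-trait structure.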

Reliability


Reliability refers to the consistency and repeatability of test scores or measurements. It answers the question, how likely is an individual to obtain the same score if the same measure is repeatedly administered to them?

Inter-rater reliability

Assesses the degree of agreement between two or more raters in their appraisals.

Many studies have explored the extent to which observers' ratings differ from self-ratings when evaluating the five factors. In particular, Kim et al. (2019) conducted a meta-analysis of over 150 studies examining the degree to which big five ratings from friends, family, colleagues, and strangers differ from self-ratings. Overall, no significant differences were found (average effect δ = −.038), indicating good consistency between self and other ratings across various methods of assessment.

Test-retest reliability

Assesses the degree to which test scores are consistent from one test administration to the next.

McCrae et al. (2011) indicate that the big five personality traits generally have strong retest stability when readministered to the same participants after 1 week (retest rs = .91-.93) and over the course of 6 years (retest rs = .87-.93). Trait consistency for Neuroticism, Extraversion, Openness, and Agreeableness peaks in middle age, whereas Conscientiousness continuously stabilizes over time. Trait scores are also influenced by life events (Specht et al., 2011).
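A retest r of this kind is a Pearson correlation between scores from two administrations of the same measure. The sketch below shows the calculation on a small set of made-up scores; the function name and all data are hypothetical and only illustrate the arithmetic.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical trait scores for five people at two administrations.
time_1 = [72, 55, 90, 64, 81]
time_2 = [70, 58, 88, 66, 79]
retest_r = pearson_r(time_1, time_2)
print(round(retest_r, 2))
```

A value near 1 means respondents keep roughly the same rank order across administrations, which is what the retest coefficients above describe.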

Inter-method reliability

Assesses the degree to which test scores are consistent when there is a variation in the methods or instruments used.

As demonstrated in Johnson (2014), convergent / scale correlations between IPIP-NEO-120 and other methods of measuring the five factors are relatively high:
  • NEO PI-R rs = .76 to .87
  • BFI rs = .33 to .57 (p < .05)
  • MiniMarkers rs = .31 to .49 (p < .05)
  • Acquaintance ratings rs = .25 to .49

Internal consistency reliability

Assesses the consistency of results across items within a test.

The alpha coefficient (Cronbach's α) is a measure of internal consistency reliability. It tends to be higher when an assessment's items contain less measurement error and when the items measure a single construct.
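As a rough illustration of how alpha is obtained, the sketch below applies the standard Cronbach's alpha formula to a small respondent-by-item score matrix. The function name and the response data are hypothetical; real IPIP-NEO-120 scoring would also need reverse-keyed items to be recoded first, which is not shown here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: four respondents, three items on a 1-5 scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises when the items vary together (large total-score variance relative to the summed item variances), which matches the description above.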

Johnson (2014) reports alpha coefficients from a large internet sample (n = 619,150) of users who completed the IPIP-NEO-120:
  • Neuroticism: α = .90
  • Extraversion: α = .89
  • Openness to experience: α = .81
  • Agreeableness: α = .86
  • Conscientiousness: α = .90
  • Facets: mean α = .75; α range = .63 (Liberalism) to .88 (Cautiousness)

Supporting Test Characteristics

| IPIP Scale | NEO Scale | Items (+ / -): NEO | Items (+ / -): IPIP 300 | Items (+ / -): IPIP 120 | Mean Item Intercorrelation: NEO | Mean Item Intercorrelation: IPIP 300 | Mean Item Intercorrelation: IPIP 120 | Coefficient Alpha: NEO | Coefficient Alpha: IPIP 300 | Coefficient Alpha: IPIP 120 | Scale Correlation: IPIP 300 vs NEO | Scale Correlation: IPIP 120 vs NEO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Neuroticism | | 27+21=48 | 33+27=60 | 17+7=24 | 0.22 | 0.22 | 0.24 | 0.93 | 0.93 | 0.88 | .89 [.95] | .87 [.97] |
| Anxiety | Anxiety (N1) | 4+4=8 | 5+5=10 | 4+0=4 | 0.38 | 0.34 | 0.39 | 0.83 | 0.83 | 0.71 | .76 [.92] | .76 [.99] |
| Anger | Angry Hostility (N2) | 5+3=8 | 5+5=10 | 3+1=4 | 0.34 | 0.43 | 0.47 | 0.80 | 0.88 | 0.77 | .77 [.92] | .71 [.90] |
| Depression | Depression (N3) | 6+2=8 | 7+3=10 | 3+1=4 | 0.42 | 0.44 | 0.51 | 0.84 | 0.89 | 0.80 | .81 [.94] | .76 [.93] |
| Self-Consciousness | Self-Consciousness (N4) | 5+3=8 | 6+4=10 | 3+1=4 | 0.26 | 0.28 | 0.30 | 0.74 | 0.80 | 0.63 | .73 [.95] | .60 [.88] |
| Immoderation | Impulsiveness (N5) | 4+4=8 | 5+5=10 | 1+3=4 | 0.24 | 0.25 | 0.36 | 0.72 | 0.77 | 0.69 | .74 [.99] | .65 [.92] |
| Vulnerability | Vulnerability (N6) | 3+5=8 | 5+5=10 | 3+1=4 | 0.35 | 0.33 | 0.38 | 0.79 | 0.82 | 0.70 | .78 [.97] | .74 [.99] |
| Extraversion | | 29+19=48 | 36+24=60 | 18+6=24 | 0.15 | 0.16 | 0.18 | 0.89 | 0.92 | 0.84 | .89 [.98] | .85 [.98] |
| Friendliness | Warmth (E1) | 6+2=8 | 5+5=10 | 2+2=4 | 0.35 | 0.41 | 0.45 | 0.80 | 0.87 | 0.77 | .76 [.91] | .68 [.87] |
| Gregariousness | Gregariousness (E2) | 4+4=8 | 5+5=10 | 2+2=4 | 0.35 | 0.27 | 0.28 | 0.80 | 0.79 | 0.60 | .78 [.98] | .73 [.99] |
| Assertiveness | Assertiveness (E3) | 4+4=8 | 5+5=10 | 3+1=4 | 0.33 | 0.35 | 0.44 | 0.80 | 0.84 | 0.75 | .81 [.99] | .73 [.94] |
| Activity Level | Activity (E4) | 5+3=8 | 5+5=10 | 3+1=4 | 0.27 | 0.20 | 0.35 | 0.74 | 0.71 | 0.68 | .72 [.99] | .63 [.89] |
| Excitement-Seeking | Excitement-Seeking (E5) | 6+2=8 | 8+2=10 | 4+0=4 | 0.20 | 0.27 | 0.34 | 0.65 | 0.77 | 0.67 | .67 [.95] | .59 [.89] |
| Cheerfulness | Positive Emotions (E6) | 4+4=8 | 8+2=10 | 4+0=4 | 0.37 | 0.30 | 0.39 | 0.81 | 0.81 | 0.71 | .77 [.95] | .69 [.91] |
| Openness To Experience | | 24+24=48 | 28+32=60 | 12+12=24 | 0.17 | 0.17 | 0.19 | 0.91 | 0.92 | 0.85 | .87 [.95] | .84 [.96] |
| Imagination | Fantasy (O1) | 3+5=8 | 6+4=10 | 4+0=4 | 0.36 | 0.32 | 0.37 | 0.81 | 0.82 | 0.70 | .74 [.91] | .69 [.92] |
| Artistic Interests | Aesthetics (O2) | 5+3=8 | 5+5=10 | 2+2=4 | 0.41 | 0.38 | 0.40 | 0.84 | 0.85 | 0.72 | .80 [.95] | .76 [.98] |
| Emotionality | Feelings (O3) | 5+3=8 | 5+5=10 | 2+2=4 | 0.28 | 0.30 | 0.34 | 0.75 | 0.81 | 0.67 | .71 [.91] | .65 [.92] |
| Adventurousness | Actions (O4) | 3+5=8 | 4+6=10 | 1+3=4 | 0.18 | 0.25 | 0.33 | 0.64 | 0.77 | 0.66 | .72 [.99] | .62 [.95] |
| Intellect | Ideas (O5) | 5+3=8 | 5+5=10 | 1+3=4 | 0.39 | 0.39 | 0.47 | 0.82 | 0.86 | 0.78 | .81 [.96] | .75 [.94] |
| Liberalism | Values (O6) | 3+5=8 | 3+7=10 | 2+2=4 | 0.31 | 0.39 | 0.47 | 0.78 | 0.86 | 0.76 | .71 [.87] | .63 [.82] |
| Agreeableness | | 26+22=48 | 24+36=60 | 7+17=24 | 0.14 | 0.13 | 0.16 | 0.89 | 0.90 | 0.81 | .84 [.94] | .76 [.90] |
| Trust | Trust (A1) | 5+3=8 | 6+4=10 | 3+1=4 | 0.43 | 0.32 | 0.38 | 0.84 | 0.82 | 0.70 | .78 [.94] | .73 [.95] |
| Morality | Straightforwardness (A2) | 3+5=8 | 2+8=10 | 0+4=4 | 0.26 | 0.24 | 0.29 | 0.74 | 0.74 | 0.62 | .65 [.88] | .54 [.80] |
| Altruism | Altruism (A3) | 5+3=8 | 5+5=10 | 2+2=4 | 0.26 | 0.26 | 0.33 | 0.72 | 0.77 | 0.65 | .68 [.91] | .54 [.79] |
| Cooperation | Compliance (A4) | 3+5=8 | 3+7=10 | 0+4=4 | 0.25 | 0.22 | 0.26 | 0.72 | 0.72 | 0.56 | .72 [.99] | .62 [.98] |
| Modesty | Modesty (A5) | 4+4=8 | 4+6=10 | 0+4=4 | 0.26 | 0.25 | 0.31 | 0.73 | 0.76 | 0.63 | .71 [.95] | .64 [.94] |
| Sympathy | Tender-Mindedness (A6) | 6+2=8 | 4+6=10 | 2+2=4 | 0.17 | 0.23 | 0.35 | 0.61 | 0.75 | 0.68 | .62 [.92] | .55 [.85] |
| Conscientiousness | | 28+20=48 | 31+29=60 | 11+13=24 | 0.19 | 0.23 | 0.20 | 0.91 | 0.92 | 0.84 | .85 [.93] | .80 [.92] |
| Self-Efficacy | Competence (C1) | 5+3=8 | 6+4=10 | 4+0=4 | 0.26 | 0.28 | 0.26 | 0.71 | 0.79 | 0.57 | .68 [.91] | .59 [.93] |
| Orderliness | Order (C2) | 3+5=8 | 5+5=10 | 1+3=4 | 0.28 | 0.33 | 0.44 | 0.74 | 0.83 | 0.76 | .77 [.98] | .68 [.91] |
| Dutifulness | Dutifulness (C3) | 6+2=8 | 5+5=10 | 2+2=4 | 0.24 | 0.19 | 0.19 | 0.68 | 0.71 | 0.47 | .60 [.86] | .53 [.94] |
| Achievement-Striving | Achievement Striving (C4) | 5+3=8 | 7+3=10 | 2+2=4 | 0.26 | 0.28 | 0.35 | 0.72 | 0.79 | 0.68 | .71 [.94] | .57 [.81] |
| Self-Discipline | Self-Discipline (C5) | 4+4=8 | 5+5=10 | 2+2=4 | 0.35 | 0.37 | 0.34 | 0.81 | 0.85 | 0.66 | .77 [.93] | .72 [.98] |
| Cautiousness | Deliberation (C6) | 5+3=8 | 3+7=10 | 0+4=4 | 0.23 | 0.25 | 0.38 | 0.70 | 0.76 | 0.70 | .69 [.95] | .61 [.87] |
| Mean for Facet Scales | | 4+4=8 | 5+5=10 | 2+2=4 | 0.30 | 0.30 | 0.36 | 0.76 | 0.80 | 0.68 | .73 [.94] | .66 [.91] |
Adapted from the IPIP website
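Two standard psychometric formulas help in reading the table's columns. The sketch below assumes, without confirmation from the source, that (a) the alpha values are close to standardized alpha, which depends only on the number of items and the mean item intercorrelation, and (b) the bracketed scale correlations are corrected for attenuation, meaning the observed correlation is divided by the square root of the product of the two scales' alphas. The function names are hypothetical. The inputs are the Anxiety row values for the IPIP 120 scale (4 items, mean intercorrelation 0.39, alpha 0.71), the NEO alpha (0.83), and the observed IPIP 120 vs NEO correlation (.76).

```python
import math


def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardized alpha from item count and mean inter-item correlation."""
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)


def disattenuated_r(r_xy: float, alpha_x: float, alpha_y: float) -> float:
    """Observed correlation corrected for unreliability in both scales."""
    return r_xy / math.sqrt(alpha_x * alpha_y)


# Values read from the Anxiety row of the table above.
alpha_estimate = standardized_alpha(n_items=4, mean_r=0.39)
r_corrected = disattenuated_r(r_xy=0.76, alpha_x=0.83, alpha_y=0.71)
print(f"{alpha_estimate:.2f} {r_corrected:.2f}")
```

Under these assumptions both results land within rounding distance of the tabled values (alpha of 0.71 and a bracketed correlation of .99). This also shows why the 4-item IPIP 120 facets have lower alphas than their longer counterparts even when the items intercorrelate about as strongly.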


References

Costa, P. T., & McCrae, R. R. (1995). Domains and facets: Hierarchical personality assessment using the Revised NEO Personality Inventory. Journal of Personality Assessment, 64(1), 21-50. http://doi.org/10.1207/s15327752jpa6401_2

Goldberg, L. R. (1999). A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality Psychology in Europe, Vol. 7 (pp. 7-28). Tilburg, The Netherlands: Tilburg University Press. https://ipip.ori.org/A%20broad-bandwidth%20inventory.pdf

Johnson, J. A. (2014). Measuring thirty facets of the Five Factor Model with a 120-item public domain inventory: Development of the IPIP-NEO-120. Journal of Research in Personality, 51, 78-89. http://doi.org/10.1016/j.jrp.2014.05.003

Kajonius, P. J., & Johnson, J. A. (2019). Assessing the structure of the Five Factor Model of personality (IPIP-NEO-120) in the public domain. Europe's Journal of Psychology, 15(2), 260-275. https://doi.org/10.5964/ejop.v15i2.1671

Kim, H., Di Domenico, S. I., & Connelly, B. S. (2019). Self–other agreement in personality reports: A meta-analytic comparison of self- and informant-report means. Psychological Science, 30(1), 129-138. http://doi.org/10.1177/0956797618810000

McCrae, R. R., & Costa, P. T. (1985). Updating Norman's 'adequate taxonomy': Intelligence and personality dimensions in natural language and in questionnaires. Journal of Personality and Social Psychology, 49(3), 710-721. http://doi.org/10.1037/0022-3514.49.3.710

McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). Internal consistency, retest reliability, and their implications for personality scale validity. Personality and Social Psychology Review, 15(1), 28-50. http://doi.org/10.1177/1088868310366253

Norman, W. T. (1963). Toward an adequate taxonomy of personality attributes: Replicated factor structure in peer nomination personality ratings. The Journal of Abnormal and Social Psychology, 66(6), 574-583. http://doi.org/10.1037/h0040291

Soto, C. J. (2019). How replicable are links between personality traits and consequential life outcomes? The Life Outcomes of Personality Replication Project. Psychological Science, 30(5), 711-727. https://doi.org/10.1177%2F0956797619831612

Specht, J., Egloff, B., & Schmukle, S. (2011). Stability and change of personality across the life course: The impact of age and major life events on mean-level and rank-order stability of the Big Five. Journal of Personality and Social Psychology, 101, 862-882. https://doi.org/10.1037/a0024950