<jats:sec><jats:title>Study Design</jats:title><jats:p> Cross-sectional survey. </jats:p></jats:sec><jats:sec><jats:title>Objectives</jats:title><jats:p> Injury classifications are important tools for identifying fracture patterns, guiding treatment decisions, and developing optimal treatment plans. The AO Spine-DGOU Osteoporotic Fracture (OF) classification system was developed for this purpose, and the aim of this study was to assess the reliability of this new classification system. </jats:p></jats:sec><jats:sec><jats:title>Methods</jats:title><jats:p> Twenty-three members of the AO Spine Knowledge Forum Trauma participated in the validation process. Participants were asked to rate 33 cases according to the OF classification at 2 time points, 4 weeks apart (assessments 1 and 2). The kappa statistic (κ) was calculated to assess inter-rater reliability and intra-rater reproducibility. The gold standard rating for each case was determined by agreement of at least 5 of the 7 DGOU members. </jats:p></jats:sec><jats:sec><jats:title>Results</jats:title><jats:p> A total of 1386 ratings were performed by 21 raters. The overall inter-rater agreement was moderate, with a combined kappa statistic for the OF classification of 0.496 in assessment 1 and 0.482 in assessment 2. The combined percentage of ratings matching the gold standard was 71.4% in assessment 1 and 67.4% in assessment 2. The average intra-rater reproducibility for the assessed fracture types was substantial (κ = 0.74, median 0.76, range 0.55 to 1.00, SD 0.13). </jats:p></jats:sec><jats:sec><jats:title>Conclusions</jats:title><jats:p> The overall inter-rater reliability was moderate, and substantial in some instances. The average intra-rater reproducibility was substantial. Appropriate training in the use of the classification system may further enhance inter- and intra-rater reliability. </jats:p></jats:sec>