Model performance

Gini coefficient

Definition

A summary measure of model discrimination, equal to 2 × AUC − 1 in a binary classification setting; mathematically equivalent to Somers' D for binary outcomes and known in credit-risk practice as the Accuracy Ratio. For models that rank at least as well as chance, the classifier Gini ranges from 0 (no discrimination: the model performs at chance) to 1 (perfect discrimination: every positive is ranked above every negative).

The Gini coefficient originated in economics as a measure of income inequality (Gini, 1912)[1], building on the Lorenz curve (Lorenz, 1905)[2]. In credit modeling, an analogous geometric construction is used: the Cumulative Accuracy Profile (CAP, also called the credit-scoring power curve) plots the cumulative fraction of defaults captured against the cumulative fraction of the population, ranked by model score from worst to best. The credit-modeling “Gini”, also called the Accuracy Ratio (AR) and mathematically equivalent to Somers’ D for binary outcomes[5][6], is the area between the CAP curve and the diagonal expected from a random model, divided by the corresponding area for a perfect model. The two Ginis share geometric intuition but are not numerically interchangeable across the income-inequality and binary-classifier domains.

[Figure: CAP chart. x-axis: cumulative % of population; y-axis: cumulative % of defaults captured; lines: perfect equality (diagonal) and model; shaded region: Gini area.]
The Gini coefficient is twice the area between the diagonal of perfect equality and the model’s Lorenz curve.
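The CAP construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function names `cap_curve`, `area_under`, and `accuracy_ratio` and the toy data are hypothetical, a higher score is assumed to mean higher predicted risk, and tied scores are not handled specially.

```python
def cap_curve(scores, labels):
    """Return CAP points: (population fraction, fraction of defaults captured).

    Assumption: higher score = riskier; labels are 1 (default) or 0.
    Requires at least one default, otherwise the y-axis is undefined.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(scores)
    n_defaults = sum(labels)
    xs, ys = [0.0], [0.0]
    captured = 0
    for rank, i in enumerate(order, start=1):
        captured += labels[i]
        xs.append(rank / n)
        ys.append(captured / n_defaults)
    return xs, ys


def area_under(xs, ys):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum(
        (xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) / 2 for k in range(1, len(xs))
    )


def accuracy_ratio(scores, labels):
    """Area between CAP and diagonal, relative to the same area for a perfect model."""
    xs, ys = cap_curve(scores, labels)
    default_rate = sum(labels) / len(labels)
    area_model = area_under(xs, ys) - 0.5
    # A perfect CAP rises linearly to 1 at x = default_rate, then stays flat.
    area_perfect = (1.0 - default_rate / 2.0) - 0.5
    return area_model / area_perfect


# Toy data (hypothetical): 8 accounts, 3 defaults.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
print(round(accuracy_ratio(scores, labels), 4))  # 0.8667, i.e. 13/15
```

On this toy data, 14 of the 15 positive-negative pairs are ranked correctly, so 2 × AUC − 1 = 13/15, which matches the Accuracy Ratio computed from the CAP areas.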

In a binary classification setting, the Gini coefficient is mathematically equivalent to 2 × AUC − 1[3], where AUC is the area under the ROC curve. AUC has a probabilistic interpretation: it equals the probability that a randomly chosen positive (default) case is scored higher than a randomly chosen negative case, which is equivalent to the Wilcoxon / Mann-Whitney U statistic[4]. Some practitioners prefer Gini because it stretches the meaningful range: credit models rarely score below an AUC of 0.50, so the [0, 1] Gini scale is more interpretable than the [0.5, 1.0] AUC scale. The Basel Committee on Banking Supervision adopted the Accuracy Ratio / Gini as a standard discriminatory-power measure for internal-ratings-based model validation[7].
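The probabilistic interpretation can be made concrete by counting pairs directly. The sketch below is illustrative only: `auc_pairwise` and `gini_from_auc` are hypothetical helper names, ties are counted as half a win (a common convention, assumed here), and the O(positives × negatives) loop is meant for small examples rather than real portfolios.

```python
def auc_pairwise(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Assumption: a tie between a positive and a negative counts as 0.5.
    Requires at least one positive and one negative case.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    """Convert AUC to the classifier Gini: Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Toy data (hypothetical): 2 positives, 3 negatives, so 6 pairs in total.
scores = [0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]

auc = auc_pairwise(scores, labels)  # 5 of 6 pairs are ordered correctly
print(round(auc, 4), round(gini_from_auc(auc), 4))  # 0.8333 0.6667
```

Only one pair is mis-ordered here (the positive scored 0.6 sits below the negative scored 0.7), which gives AUC = 5/6 and Gini = 2/3.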

AUC ↔ Gini conversion

AUC    Gini   Typical interpretation
0.50   0.00   No discrimination; equivalent to random ranking
0.65   0.30   Modest; acceptable for some thin-file scoring problems
0.75   0.50   Strong; competitive in consumer credit
0.85   0.70   Very strong; typical of well-tuned fraud or specialty lending models
1.00   1.00   Perfect ranking (essentially never observed on out-of-time data)

Like AUC, the Gini coefficient captures only the ranking quality of a model, that is, its ability to score positives higher than negatives. It does not measure calibration, the accuracy of the predicted probabilities themselves. A model can score a high Gini while systematically over- or under-predicting default rates within score bands, which is why monitoring practices typically pair Gini with a calibration measure such as Brier score or expected calibration error[9][10]. Mathematically the binary-classifier Gini can be negative (a model that ranks worse than random), but credit-scoring models in practice live in [0, 1].
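A small experiment illustrates why Gini alone cannot detect miscalibration. In the sketch below, which uses hypothetical helper names (`gini`, `brier_score`) and made-up data, halving every predicted probability preserves the ranking and therefore the Gini, while the Brier score changes because the probability levels no longer match the outcomes in the same way.

```python
def gini(probs, labels):
    """Classifier Gini = 2 * AUC - 1, with AUC computed by pair counting."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return 2.0 * wins / (len(pos) * len(neg)) - 1.0


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Toy data (hypothetical): predicted default probabilities and outcomes.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

# Same ranking, different probability levels (an illustrative distortion).
halved = [p / 2 for p in probs]

print(gini(probs, labels) == gini(halved, labels))  # True: ranking is unchanged
print(brier_score(probs, labels), brier_score(halved, labels))  # values differ
```

Any strictly increasing transformation of the scores would leave the Gini untouched in the same way, which is the reason a separate calibration check is needed.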

Sources

  1. Gini, C. Variabilità e mutabilità: contributo allo studio delle distribuzioni e delle relazioni statistiche. C. Cuppini, Bologna, 1912 (retrieved 2026-05-15)
  2. Lorenz, M.O. Methods of Measuring the Concentration of Wealth. Publications of the American Statistical Association 9(70), 1905 (retrieved 2026-05-15)
  3. Hand, D.J. & Till, R.J. A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning 45, 2001 (retrieved 2026-05-15)
    Gini + 1 = 2·AUC.
  4. Hanley, J.A. & McNeil, B.J. The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. Radiology 143(1), 1982 (retrieved 2026-05-15)
    AUC equals the probability that a randomly chosen positive case is ranked higher than a randomly chosen negative case (equivalent to the Wilcoxon / Mann-Whitney U statistic).
  5. Newson, R. Parameters behind 'nonparametric' statistics: Kendall's tau, Somers' D and median differences. Stata Journal 2(1), 2002 (retrieved 2026-05-15)
  6. Engelmann, Hayden & Tasche. Measuring the Discriminative Power of Rating Systems. Deutsche Bundesbank Discussion Paper Series 2, No. 01/2003 (also Risk 16), 2003 (retrieved 2026-05-15)
  7. Basel Committee on Banking Supervision. Studies on the Validation of Internal Rating Systems (revised). Working Paper No. 14, May 2005 (retrieved 2026-05-15)
  8. Bradley, A.P. The Use of the Area under the ROC Curve in the Evaluation of Machine Learning Algorithms. Pattern Recognition 30(7), 1997 (retrieved 2026-05-15)
  9. Open Risk Manual. Accuracy Ratio. Open Risk Manual, current (retrieved 2026-05-15)
  10. Siddiqi, N. Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards (2nd ed.). Wiley/SAS, 2017 (retrieved 2026-05-15)