Gini coefficient | Consilience AI

The Gini coefficient originated in economics as a measure of income inequality (Gini, 1912)[1], building on the Lorenz curve (Lorenz, 1905)[2]. In credit modeling, an analogous geometric construction is used: the Cumulative Accuracy Profile (CAP, also called the credit-scoring power curve) plots the cumulative fraction of defaults captured against the cumulative fraction of the population ranked by model score from worst to best. The credit-modeling “Gini”, also called the Accuracy Ratio (AR) and mathematically equivalent to Somers’ D for binary outcomes[5][6], is the area between the CAP curve and the diagonal expected from a random model, divided by the same area for a perfect model. The two Ginis share geometric intuition but are not numerically interchangeable across the income-inequality and binary-classifier domains.
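The CAP construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `accuracy_ratio` and the convention that a higher score means a riskier borrower are assumptions made here, and the area is normalised by the perfect-model area, following the usual definition of the Accuracy Ratio.

```python
from itertools import groupby


def accuracy_ratio(scores, defaults):
    """Accuracy Ratio (credit-modeling Gini) computed from the CAP curve.

    scores:   model scores; ASSUMPTION: higher means riskier.
    defaults: 1 for a defaulted case, 0 otherwise.
    """
    n = len(scores)
    n_bad = sum(defaults)
    if n_bad == 0 or n_bad == n:
        raise ValueError("need at least one default and one non-default")

    # Rank the population from worst (highest score) to best.
    ranked = sorted(zip(scores, defaults), key=lambda p: p[0], reverse=True)

    # Trapezoidal area under the CAP curve.
    # x-axis: cumulative share of population; y-axis: cumulative share of defaults.
    # Tied scores are treated as one segment so the result does not depend
    # on the arbitrary order of tied cases.
    area_under_cap = 0.0
    captured = 0
    for _, group in groupby(ranked, key=lambda p: p[0]):
        group = list(group)
        previous = captured
        captured += sum(d for _, d in group)
        width = len(group) / n
        area_under_cap += width * (previous + captured) / (2.0 * n_bad)

    area_model = area_under_cap - 0.5          # between CAP and random diagonal
    area_perfect = 0.5 * (1.0 - n_bad / n)     # same area for a perfect model
    return area_model / area_perfect


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
defaults = [1, 1, 0, 1, 0, 0]
print(round(accuracy_ratio(scores, defaults), 3))  # 7/9, about 0.778
```

On this toy sample, one non-default is ranked above one default, which is what keeps the ratio below 1.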

The Gini coefficient is twice the area between the diagonal of perfect equality and the model’s Lorenz curve.
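Written out for the classical income-inequality case, and assuming a Lorenz curve L(x) on [0, 1] that lies on or below the diagonal (the symbols A, L and G are introduced here for illustration only):

```latex
A = \int_0^1 \bigl(x - L(x)\bigr)\,dx = \tfrac{1}{2} - \int_0^1 L(x)\,dx,
\qquad
G = 2A = 1 - 2\int_0^1 L(x)\,dx .
```

The factor of two comes from dividing A by the area of the triangle under the diagonal, 1/2, so that G runs from 0 (perfect equality) to 1.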

In a binary classification setting, the Gini coefficient equals 2 × AUC − 1[3], where AUC is the area under the ROC curve. The probabilistic interpretation: AUC is the probability that a randomly chosen positive (default) case is scored higher than a randomly chosen negative case, which is equivalent to the Wilcoxon / Mann-Whitney U statistic[4]. Some practitioners prefer Gini because it stretches the meaningful range: credit models rarely score below an AUC of 0.50, so the [0, 1] Gini scale is more interpretable than the [0.5, 1.0] AUC scale. The Basel Committee on Banking Supervision adopted the Accuracy Ratio / Gini as a standard discriminatory-power measure for internal-ratings-based model validation[7].
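The pairwise-probability reading of AUC translates directly into code. The sketch below is illustrative only: the names `auc_pairwise` and `gini_from_auc` are hypothetical, ties are counted as half a win (the usual Mann-Whitney convention), and the brute-force double loop is quadratic in sample size, so it suits small examples rather than production data.

```python
def auc_pairwise(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    ASSUMPTION: a higher score indicates the positive (default) class.
    Tied pairs contribute 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def gini_from_auc(auc):
    """Convert AUC to the binary-classifier Gini: Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_pairwise(scores, labels)   # 8 of 9 pairs are ordered correctly
print(round(auc, 3), round(gini_from_auc(auc), 3))
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so AUC is 8/9 and the Gini is 7/9.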

AUC ↔ Gini conversion

AUC	Gini	Typical interpretation
0.50	0.00	No discrimination — equivalent to random ranking
0.65	0.30	Modest; acceptable for some thin-file scoring problems
0.75	0.50	Strong; competitive in consumer credit
0.85	0.70	Very strong; typical of well-tuned fraud or specialty lending models
1.00	1.00	Perfect ranking (essentially never observed on out-of-time data)

Like AUC, the Gini coefficient captures only the ranking quality of a model, that is, its ability to score positives higher than negatives. It does not measure calibration, the accuracy of the predicted probabilities themselves. A model can score a high Gini while systematically over- or under-predicting default rates within score bands, which is why monitoring practices typically pair Gini with a calibration measure such as Brier score or expected calibration error[9][10]. Mathematically the binary-classifier Gini ranges over [−1, 1] and is negative for a model that ranks worse than random, but credit-scoring models in practice live in [0, 1].
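The gap between ranking and calibration can be made concrete with a toy example. In this sketch the helper names `gini` and `brier_score` and the six-case sample are invented for illustration. Dividing every predicted probability by ten preserves the ranking, so the Gini is unchanged, while the Brier score deteriorates because the shrunken probabilities badly under-predict defaults.

```python
def gini(probs, labels):
    """Binary-classifier Gini, computed as 2 * AUC - 1 from pairwise comparisons."""
    positives = [p for p, y in zip(probs, labels) if y == 1]
    negatives = [p for p, y in zip(probs, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a win
    auc = wins / (len(positives) * len(negatives))
    return 2.0 * auc - 1.0


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


labels = [1, 1, 0, 1, 0, 0]
original = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]   # hypothetical predicted default probabilities
shrunk = [p / 10 for p in original]         # same ordering, systematic under-prediction
reversed_rank = [1 - p for p in original]   # ordering flipped: worse than random

print(gini(original, labels), gini(shrunk, labels))            # identical values
print(brier_score(original, labels), brier_score(shrunk, labels))
print(gini(reversed_rank, labels))                             # negative
```

On this sample the Brier score rises from roughly 0.19 to roughly 0.43 while the Gini does not move, and flipping the ranking yields a negative Gini of the same magnitude.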

Sources

[1]Variabilità e mutabilità: contributo allo studio delle distribuzioni e delle relazioni statistiche — Gini, C. — C. Cuppini, Bologna, 1912 (retrieved 2026-05-15)
[2]Methods of Measuring the Concentration of Wealth — Lorenz, M.O. — Publications of the American Statistical Association 9(70), 1905 (retrieved 2026-05-15)
[3]A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems — Hand, D.J. & Till, R.J. — Machine Learning 45, 2001 (retrieved 2026-05-15)
“Gini + 1 = 2·AUC.”
[4]The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve — Hanley, J.A. & McNeil, B.J. — Radiology 143(1), 1982 (retrieved 2026-05-15)
“AUC equals the probability that a randomly chosen positive case is ranked higher than a randomly chosen negative case (equivalent to the Wilcoxon / Mann-Whitney U statistic).”
[5]Parameters behind 'nonparametric' statistics: Kendall's tau, Somers' D and median differences — Newson, R. — Stata Journal 2(1), 2002 (retrieved 2026-05-15)
[6]Measuring the Discriminative Power of Rating Systems — Engelmann, Hayden & Tasche — Deutsche Bundesbank Discussion Paper Series 2, No. 01/2003 (also Risk 16), 2003 (retrieved 2026-05-15)
[7]Studies on the Validation of Internal Rating Systems (revised) — Basel Committee on Banking Supervision — Working Paper No. 14, May 2005 (retrieved 2026-05-15)
[8]The Use of the Area under the ROC Curve in the Evaluation of Machine Learning Algorithms — Bradley, A.P. — Pattern Recognition 30(7), 1997 (retrieved 2026-05-15)
[9]Open Risk Manual — Accuracy Ratio — Open Risk Manual, current (retrieved 2026-05-15)
[10]Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards (2nd ed.) — Siddiqi, N. — Wiley/SAS, 2017 (retrieved 2026-05-15)