A lender that retrains a credit model only on previously approved applicants is training on a censored sample. The accepted population is, by construction, the population the prior model considered good enough to fund, which means the new model never sees the performance of applicants the prior model rejected. The result is a model that learns to predict default within the booked (accepted) population rather than within the through-the-door applicant population it will actually score[1][5]. Reject inference is the family of techniques that attempts to correct for this sample-selection bias. Crucially, the empirical literature, including the title paper of the field, finds that gains from reject inference are typically modest and that no single technique dominates[3][4][8].
The methods fall into a small number of families, each making a different assumption about how rejected applicants would have performed had they been funded. Hand & Henley (1993) introduced the canonical three-class taxonomy: extrapolation-based methods, distribution-based methods, and methods using supplementary information[2]. Industry practitioner taxonomies (Siddiqi 2017; Anderson 2007) extend this with named techniques used in commercial scorecard tooling[9][10]. A terminology note: in the academic literature (Banasik & Crook), “augmentation” refers to inverse-probability re-weighting of accepts; in Siddiqi/SAS practitioner usage, “augmentation” often refers to the augmented dataset that includes rejects with imputed labels. The same word denotes different operations in the two literatures.
## Common methods compared
| Method | What it does | Trade-off |
|---|---|---|
| Augmentation (re-weighting) | Reweights accepted applicants by the inverse probability of acceptance so the booked sample represents the through-the-door applicant pool. | Simple and defensible; provides no signal on the rejected segment itself. (Sketched below.) |
| Parcelling | Bins rejects by score band and assigns a binary good/bad label using the bin’s empirical bad rate (often with a prudence adjustment). Fuzzy parcelling instead duplicates each reject into two rows with fractional weights P(good) and P(bad). | Uses reject information; inherits the prior model’s bias. (Fuzzy variant sketched below.) |
| Extrapolation | Trains a model on accepts only, predicts proxy labels on rejects, and retrains on the combined dataset. | Aggressively uses the reject segment; sensitive to extrapolation error in the off-distribution reject region. (Sketched below as the single-pass case of iterative reclassification.) |
| Heckman bivariate-probit (selection model) | Jointly models the acceptance decision and the default outcome, exploiting an exclusion restriction to identify selection bias[7]. | Theoretically principled; sensitive to the validity of the exclusion restriction. (Two-step variant sketched below.) |
| Iterative reclassification | EM-style algorithm that iteratively re-estimates reject labels using the current model and refits[6]. | Can be unstable; convergence properties depend on initialization. (Sketched below.) |
| Hard cutoff | Treats all rejects below a score threshold as bad. | Easy to implement; assumes the prior cutoff was correctly calibrated. |
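To make the table concrete, the sketches below work through four of the methods in Python. The data and variable names (`X`, `accepted`, `default`, `X_acc`, `y_acc`, `X_rej`) are synthetic placeholders invented for illustration, not anything prescribed by the sources, and each sketch is a minimal version of the technique rather than production scorecard code. First, augmentation in the academic re-weighting sense of Banasik & Crook: model the acceptance decision on the full through-the-door population, then weight each booked applicant by the inverse of its estimated acceptance probability.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic through-the-door population, purely for illustration.
n = 5000
X = rng.normal(size=(n, 4))
p_bad_true = 1.0 / (1.0 + np.exp(-(X @ np.array([1.0, -0.8, 0.5, 0.3]) - 1.0)))
default_all = rng.binomial(1, p_bad_true)
accepted = (X[:, 0] + rng.normal(scale=0.5, size=n) < 0.5).astype(int)  # prior model's decision
default = np.where(accepted == 1, default_all, -1)  # outcome censored for rejects

# Step 1: model the acceptance decision on the full applicant pool.
accept_model = LogisticRegression(max_iter=1000).fit(X, accepted)
p_accept = accept_model.predict_proba(X)[:, 1]

# Step 2: refit the default model on accepts only, weighting each booked
# applicant by 1 / P(accept); clip the probability to cap extreme weights.
mask = accepted == 1
weights = 1.0 / np.clip(p_accept[mask], 0.01, None)
score_model = LogisticRegression(max_iter=1000).fit(
    X[mask], default[mask], sample_weight=weights
)
```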
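Next, parcelling in its fuzzy variant: rather than binning rejects by score band, each reject enters the training set twice, once as good and once as bad, weighted by the accepts-only model’s P(good) and P(bad). The 1.25 prudence multiplier below is an assumed illustration of the common practice of grading reject bad rates upward, not a prescribed value.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Reuses X, accepted, default from the re-weighting sketch above.
X_acc, y_acc = X[accepted == 1], default[accepted == 1]
X_rej = X[accepted == 0]

base = LogisticRegression(max_iter=1000).fit(X_acc, y_acc)
p_bad = np.clip(base.predict_proba(X_rej)[:, 1] * 1.25, 0.0, 1.0)  # prudence bump (assumed factor)

# Fuzzy parcelling: each reject appears twice, once as good and once as bad,
# with fractional weights P(good) and P(bad).
X_aug = np.vstack([X_acc, X_rej, X_rej])
y_aug = np.concatenate([y_acc, np.zeros(len(X_rej)), np.ones(len(X_rej))])
w_aug = np.concatenate([np.ones(len(X_acc)), 1.0 - p_bad, p_bad])
parcelled_model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug, sample_weight=w_aug)
```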
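Extrapolation and iterative reclassification share a skeleton: train on accepts, impute hard labels for rejects, and refit on the combined data. Stopping after the first pass gives extrapolation; looping until the imputed labels stabilize gives the EM-style reclassification of [6]. A minimal sketch, reusing the arrays above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# First pass = hard-label extrapolation.
model = LogisticRegression(max_iter=1000).fit(X_acc, y_acc)
y_rej = (model.predict_proba(X_rej)[:, 1] > 0.5).astype(int)

# Iterating the impute-and-refit step gives EM-style reclassification.
X_all = np.vstack([X_acc, X_rej])
for _ in range(20):  # cap iterations; convergence is not guaranteed
    model = LogisticRegression(max_iter=1000).fit(
        X_all, np.concatenate([y_acc, y_rej])
    )
    y_new = (model.predict_proba(X_rej)[:, 1] > 0.5).astype(int)
    if np.array_equal(y_new, y_rej):  # reject labels stable -> stop
        break
    y_rej = y_new
```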
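The full bivariate-probit selection model of [7] estimates the acceptance and default equations jointly by maximum likelihood; the sketch below instead uses the classic Heckman two-step idea (probit selection equation, inverse Mills ratio added as a control in the outcome model), which is a common practical approximation rather than the estimator in [7]. The `instrument` variable stands in for an exclusion restriction and is simulated here purely for illustration; in practice it must genuinely shift acceptance but not default.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

# Reuses X, accepted, default from the first sketch. The instrument is an
# assumed exclusion-restriction variable (e.g. a marketing-campaign flag);
# here it is simulated, so it carries no real identifying power.
rng = np.random.default_rng(1)
instrument = rng.binomial(1, 0.5, size=len(X))

Z = sm.add_constant(np.column_stack([X, instrument]))  # selection design matrix
sel = sm.Probit(accepted, Z).fit(disp=False)
xb = Z @ sel.params
imr = norm.pdf(xb) / norm.cdf(xb)                      # inverse Mills ratio

# Outcome model on accepts only, with the inverse Mills ratio as a control.
mask = accepted == 1
X_out = sm.add_constant(np.column_stack([X[mask], imr[mask]]))
outcome = sm.Logit(default[mask], X_out).fit(disp=False)
```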
Model validators following SR 11-7-style model risk management expectations want to see reject inference addressed explicitly in development documentation: the method chosen, the rationale for the choice, and a sensitivity analysis showing how model performance and disparate impact would change under an alternative approach. The choice of method can have material implications for adverse impact ratios in thin-file segments where the rejected population is structurally different from the accepted one, though the empirical evidence on that point is still emerging[9][10]. A minimal sketch of such a sensitivity check follows.
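As an illustration of the kind of sensitivity analysis a validator might ask for, the sketch below compares adverse impact ratios under two of the candidate models from the earlier sketches. `X_hold`, `protected_hold`, and `cutoff` are assumed holdout inputs, not anything specified by the sources.

```python
import numpy as np

def adverse_impact_ratio(approved, protected):
    # AIR = approval rate of the protected group / approval rate of the
    # reference group; values below ~0.8 are commonly flagged (four-fifths rule).
    return approved[protected == 1].mean() / approved[protected == 0].mean()

# Hypothetical holdout: X_hold (features), protected_hold (0/1 group flag),
# cutoff (approval threshold on predicted P(bad)).
for name, m in {"re-weighting": score_model, "parcelling": parcelled_model}.items():
    approved = (m.predict_proba(X_hold)[:, 1] < cutoff).astype(int)
    print(f"{name}: AIR = {adverse_impact_ratio(approved, protected_hold):.3f}")
```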
## Sources
- [1] Statistical Classification Methods in Consumer Credit Scoring: A Review — Hand, D.J. & Henley, W.E. — JRSS Series A 160(3), 1997 (retrieved 2026-05-15)
- [2] Can reject inference ever work? — Hand, D.J. & Henley, W.E. — IMA Journal of Mathematics Applied in Business & Industry 5(1), 1993 (retrieved 2026-05-15)
- [3] Does reject inference really improve the performance of application scoring models? — Crook, J.N. & Banasik, J. — Journal of Banking & Finance 28(4), 2004 (retrieved 2026-05-15)
  > “Reject inference is the process whereby one attempts to incorporate characteristics of rejected applicants into the process of calibrating a scorecard, based primarily on the repayment behaviour of accepted applicants.”
- [4] Reject inference, augmentation, and sample selection — Banasik, J. & Crook, J.N. — European Journal of Operational Research 183(3), 2007 (retrieved 2026-05-15)
- [5] Sample Selection Bias as a Specification Error — Heckman, J.J. — Econometrica 47(1), 1979 (retrieved 2026-05-15)
- [6] Reject inference applied to logistic regression for credit scoring — Joanes, D.N. — IMA Journal of Mathematics Applied in Business & Industry 5(1), 1993 (retrieved 2026-05-15)
- [7] Sample selection in credit-scoring models — Greene, W.H. — Japan and the World Economy 10(3), 1998 (retrieved 2026-05-15)
- [8] Reject inference methods in credit scoring — Ehrhardt, Biernacki, Vandewalle, Heinrich & Beben — Journal of Applied Statistics, 2021 (retrieved 2026-05-15)
- [9] Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards (2nd ed.) — Siddiqi, N. — Wiley/SAS, 2017 (retrieved 2026-05-15)
- [10] The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management — Anderson, R. — Oxford University Press, 2007 (retrieved 2026-05-15)