A lender that retrains a credit model only on previously approved applicants is training on a censored sample. The accepted population is, by construction, the population the prior model considered good enough to fund, which means the new model never sees the performance of applicants the prior model rejected. The result is a model that learns to predict default within the booked (accepted) population rather than within the through-the-door applicant population it will actually score[1][5]. Reject inference is the family of techniques that attempts to correct for this sample-selection bias. Crucially, the empirical literature, including the title paper of the field, finds that gains from reject inference are typically modest and that no single technique dominates[3][4][8].
The methods fall into a small number of families, each making a different assumption about how rejected applicants would have performed had they been funded. Hand & Henley (1993) introduced the canonical three-class taxonomy: extrapolation-based methods, distribution-based methods, and methods using supplementary information[2]. Industry practitioner taxonomies (Siddiqi 2017; Anderson 2007) extend this with named techniques used in commercial scorecard tooling[9][10]. A terminology note: in the academic literature (Banasik & Crook), “augmentation” refers to inverse-probability re-weighting of accepts; in Siddiqi/SAS practitioner usage, “augmentation” often refers to the augmented dataset that includes rejects with imputed labels. The same word denotes different operations in the two literatures.
## Common methods compared
| Method | What it does | Trade-off |
|---|---|---|
| Augmentation (re-weighting) | Reweights accepted applicants by the inverse probability of acceptance so the booked sample represents the through-the-door applicant pool. | Simple and defensible; provides no signal on the rejected segment itself. (Sketched below.) |
| Parcelling | Bins rejects by score band and assigns a binary good/bad label using the bin’s empirical bad rate (often with a prudence adjustment). Fuzzy parcelling instead duplicates each reject into two rows with fractional weights P(good) and P(bad). | Uses reject information; inherits the prior model’s bias. (Fuzzy variant sketched below.) |
| Extrapolation | Trains a model on accepts only, predicts proxy labels on rejects, and retrains on the combined dataset. | Aggressively uses the reject segment; sensitive to extrapolation error in the off-distribution reject region. (Sketched below as the single-pass case of iterative reclassification.) |
| Heckman bivariate-probit (selection model) | Jointly models the acceptance decision and the default outcome, exploiting an exclusion restriction to identify selection bias[7]. | Theoretically principled; sensitive to the validity of the exclusion restriction. (Two-step variant sketched below.) |
| Iterative reclassification | EM-style algorithm that iteratively re-estimates reject labels using the current model and refits[6]. | Can be unstable; convergence properties depend on initialization. (Sketched below.) |
| Hard cutoff | Treats all rejects below a score threshold as bad. | Easy to implement; assumes the prior cutoff was correctly calibrated. |
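To make the table concrete, the sketches below work through four of the methods in Python. The data and variable names (`X`, `accepted`, `default`, `X_acc`, `y_acc`, `X_rej`) are synthetic placeholders invented for illustration, not anything prescribed by the sources, and each sketch is a minimal version of the technique rather than production scorecard code. First, augmentation in the academic re-weighting sense of Banasik & Crook: model the acceptance decision on the full through-the-door population, then weight each booked applicant by the inverse of its estimated acceptance probability.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic through-the-door population, purely for illustration.
n = 5000
X = rng.normal(size=(n, 4))
p_bad_true = 1.0 / (1.0 + np.exp(-(X @ np.array([1.0, -0.8, 0.5, 0.3]) - 1.0)))
default_all = rng.binomial(1, p_bad_true)
accepted = (X[:, 0] + rng.normal(scale=0.5, size=n) < 0.5).astype(int)  # prior model's decision
default = np.where(accepted == 1, default_all, -1)  # outcome censored for rejects

# Step 1: model the acceptance decision on the full applicant pool.
accept_model = LogisticRegression(max_iter=1000).fit(X, accepted)
p_accept = accept_model.predict_proba(X)[:, 1]

# Step 2: refit the default model on accepts only, weighting each booked
# applicant by 1 / P(accept); clip the probability to cap extreme weights.
mask = accepted == 1
weights = 1.0 / np.clip(p_accept[mask], 0.01, None)
score_model = LogisticRegression(max_iter=1000).fit(
    X[mask], default[mask], sample_weight=weights
)
```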
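Next, parcelling in its fuzzy variant: rather than binning rejects by score band, each reject enters the training set twice, once as good and once as bad, weighted by the accepts-only model’s P(good) and P(bad). The 1.25 prudence multiplier below is an assumed illustration of the common practice of grading reject bad rates upward, not a prescribed value.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Reuses X, accepted, default from the re-weighting sketch above.
X_acc, y_acc = X[accepted == 1], default[accepted == 1]
X_rej = X[accepted == 0]

base = LogisticRegression(max_iter=1000).fit(X_acc, y_acc)
p_bad = np.clip(base.predict_proba(X_rej)[:, 1] * 1.25, 0.0, 1.0)  # prudence bump (assumed factor)

# Fuzzy parcelling: each reject appears twice, once as good and once as bad,
# with fractional weights P(good) and P(bad).
X_aug = np.vstack([X_acc, X_rej, X_rej])
y_aug = np.concatenate([y_acc, np.zeros(len(X_rej)), np.ones(len(X_rej))])
w_aug = np.concatenate([np.ones(len(X_acc)), 1.0 - p_bad, p_bad])
parcelled_model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug, sample_weight=w_aug)
```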
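Extrapolation and iterative reclassification share a skeleton: train on accepts, impute hard labels for rejects, and refit on the combined data. Stopping after the first pass gives extrapolation; looping until the imputed labels stabilize gives the EM-style reclassification of [6]. A minimal sketch, reusing the arrays above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# First pass = hard-label extrapolation.
model = LogisticRegression(max_iter=1000).fit(X_acc, y_acc)
y_rej = (model.predict_proba(X_rej)[:, 1] > 0.5).astype(int)

# Iterating the impute-and-refit step gives EM-style reclassification.
X_all = np.vstack([X_acc, X_rej])
for _ in range(20):  # cap iterations; convergence is not guaranteed
    model = LogisticRegression(max_iter=1000).fit(
        X_all, np.concatenate([y_acc, y_rej])
    )
    y_new = (model.predict_proba(X_rej)[:, 1] > 0.5).astype(int)
    if np.array_equal(y_new, y_rej):  # reject labels stable -> stop
        break
    y_rej = y_new
```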
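The full bivariate-probit selection model of [7] estimates the acceptance and default equations jointly by maximum likelihood; the sketch below instead uses the classic Heckman two-step idea (probit selection equation, inverse Mills ratio added as a control in the outcome model), which is a common practical approximation rather than the estimator in [7]. The `instrument` variable stands in for an exclusion restriction and is simulated here purely for illustration; in practice it must genuinely shift acceptance but not default.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

# Reuses X, accepted, default from the first sketch. The instrument is an
# assumed exclusion-restriction variable (e.g. a marketing-campaign flag);
# here it is simulated, so it carries no real identifying power.
rng = np.random.default_rng(1)
instrument = rng.binomial(1, 0.5, size=len(X))

Z = sm.add_constant(np.column_stack([X, instrument]))  # selection design matrix
sel = sm.Probit(accepted, Z).fit(disp=False)
xb = Z @ sel.params
imr = norm.pdf(xb) / norm.cdf(xb)                      # inverse Mills ratio

# Outcome model on accepts only, with the inverse Mills ratio as a control.
mask = accepted == 1
X_out = sm.add_constant(np.column_stack([X[mask], imr[mask]]))
outcome = sm.Logit(default[mask], X_out).fit(disp=False)
```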
Model validators following SR 11-7-style model risk management expectations want to see reject inference addressed explicitly in development documentation: the method chosen, the rationale for the choice, and a sensitivity analysis showing how model performance and disparate impact would change under an alternative approach. The choice of method can have material implications for adverse impact ratios in thin-file segments where the rejected population is structurally different from the accepted one, though the empirical evidence on that point is still emerging[9][10]. A minimal sketch of such a sensitivity check follows.
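As an illustration of the kind of sensitivity analysis a validator might ask for, the sketch below compares adverse impact ratios under two of the candidate models from the earlier sketches. `X_hold`, `protected_hold`, and `cutoff` are assumed holdout inputs, not anything specified by the sources.

```python
import numpy as np

def adverse_impact_ratio(approved, protected):
    # AIR = approval rate of the protected group / approval rate of the
    # reference group; values below ~0.8 are commonly flagged (four-fifths rule).
    return approved[protected == 1].mean() / approved[protected == 0].mean()

# Hypothetical holdout: X_hold (features), protected_hold (0/1 group flag),
# cutoff (approval threshold on predicted P(bad)).
for name, m in {"re-weighting": score_model, "parcelling": parcelled_model}.items():
    approved = (m.predict_proba(X_hold)[:, 1] < cutoff).astype(int)
    print(f"{name}: AIR = {adverse_impact_ratio(approved, protected_hold):.3f}")
```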
## Sources
- [1] Statistical Classification Methods in Consumer Credit Scoring: A Review — Hand, D.J. & Henley, W.E. — JRSS Series A 160(3), 1997 (retrieved 2026-05-15)
- [2] Can reject inference ever work? — Hand, D.J. & Henley, W.E. — IMA Journal of Mathematics Applied in Business & Industry 5(1), 1993 (retrieved 2026-05-15)
- [3] Does reject inference really improve the performance of application scoring models? — Crook, J.N. & Banasik, J. — Journal of Banking & Finance 28(4), 2004 (retrieved 2026-05-15)
  > “Reject inference is the process whereby one attempts to incorporate characteristics of rejected applicants into the process of calibrating a scorecard, based primarily on the repayment behaviour of accepted applicants.”
- [4] Reject inference, augmentation, and sample selection — Banasik, J. & Crook, J.N. — European Journal of Operational Research 183(3), 2007 (retrieved 2026-05-15)
- [5] Sample Selection Bias as a Specification Error — Heckman, J.J. — Econometrica 47(1), 1979 (retrieved 2026-05-15)
- [6] Reject inference applied to logistic regression for credit scoring — Joanes, D.N. — IMA Journal of Mathematics Applied in Business & Industry 5(1), 1993 (retrieved 2026-05-15)
- [7] Sample selection in credit-scoring models — Greene, W.H. — Japan and the World Economy 10(3), 1998 (retrieved 2026-05-15)
- [8] Reject inference methods in credit scoring — Ehrhardt, Biernacki, Vandewalle, Heinrich & Beben — Journal of Applied Statistics, 2021 (retrieved 2026-05-15)
- [9] Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards (2nd ed.) — Siddiqi, N. — Wiley/SAS, 2017 (retrieved 2026-05-15)
- [10] The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management — Anderson, R. — Oxford University Press, 2007 (retrieved 2026-05-15)