API Reference

BinaryClassifier

The main entry point. Construct from (y_true, y_score) or from bare confusion matrix counts via from_cm.

classifier_uncertainty._classifier.BinaryClassifier

Uncertainty-aware binary classifier evaluator.

Implements Bayesian uncertainty quantification for classifier metrics, following Tötsch & Hoffmann (2020). Metrics are computed from posterior samples of the confusion-matrix probability matrix (θ), which are built from three independent Beta posteriors for prevalence (φ), TPR, and TNR.
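The sampling scheme can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the function name `sample_theta` is hypothetical, and the posterior updates (prior plus observed counts) are assumed from the usual Beta-binomial conjugacy described by Tötsch & Hoffmann (2020).

```python
import random


def sample_theta(tp, fn, tn, fp, prior=(1.0, 1.0), n_samples=5, seed=0):
    """Draw confusion-matrix proportions from three independent Beta posteriors.

    Hypothetical helper for illustration; assumes each posterior is the
    shared Beta(a, b) prior updated with the relevant observed counts.
    """
    rng = random.Random(seed)
    a, b = prior
    draws = []
    for _ in range(n_samples):
        phi = rng.betavariate(a + tp + fn, b + tn + fp)  # prevalence
        tpr = rng.betavariate(a + tp, b + fn)            # true positive rate
        tnr = rng.betavariate(a + tn, b + fp)            # true negative rate
        draws.append({
            "tp": phi * tpr,
            "fn": phi * (1 - tpr),
            "tn": (1 - phi) * tnr,
            "fp": (1 - phi) * (1 - tnr),
        })
    return draws


samples = sample_theta(tp=40, fn=10, tn=930, fp=20)
```

Each draw is a set of four proportions that sum to 1, so any metric evaluated on the same draw shares the same φ, TPR, and TNR values.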

Parameters:

Name Type Description Default
y_true ndarray

Ground-truth binary labels (bool or 0/1).

required
y_score ndarray

Classifier scores; higher values indicate a more positive prediction.

required
n_samples int

Number of posterior CM samples. Default is 20_000.

20000
prior tuple[float, float]

Beta(α, β) prior shared by φ, TPR, and TNR. Default is the Laplace (uniform) prior, (1.0, 1.0). Per-distribution priors are not currently supported; open an issue on GitHub if you need them.

(1.0, 1.0)
seed int

Random seed for reproducibility.

None

at_threshold(threshold=0.5)

Return metric distributions at a fixed score threshold.

Parameters:

Name Type Description Default
threshold float

Decision boundary applied to y_score. Default is 0.5.

0.5

Returns:

Type Description
ThresholdResult

Posterior metric distributions at this threshold.
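As a rough illustration of what a fixed threshold does to the inputs, the sketch below turns labels and scores into confusion matrix counts. The helper `confusion_counts` is hypothetical, and treating a score equal to the threshold as a positive prediction is an assumption; the reference above does not say which side the boundary falls on.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count (tp, fn, tn, fp) at a fixed decision threshold.

    Hypothetical helper; assumes score >= threshold means "predict positive".
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label and predicted_positive:
            tp += 1
        elif label:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.4, 0.6, 0.1, 0.5, 0.3]
counts = confusion_counts(y_true, y_score)  # (tp, fn, tn, fp)
```

Counts obtained this way have the same form as the arguments accepted by from_cm.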

from_cm(tp, fn, tn, fp, n_samples=20000, prior=(1.0, 1.0), seed=None) classmethod

Construct from observed confusion matrix counts.

Parameters:

Name Type Description Default
tp int

True positive count.

required
fn int

False negative count.

required
tn int

True negative count.

required
fp int

False positive count.

required
n_samples int

Number of posterior CM samples. Default is 20_000.

20000
prior tuple[float, float]

Beta(α, β) prior shared by φ, TPR, and TNR. Default is the Laplace (uniform) prior, (1.0, 1.0). Per-distribution priors are not currently supported; open a GitHub issue if you need them.

(1.0, 1.0)
seed int

Random seed for reproducibility.

None

Returns:

Type Description
BinaryClassifier

Instance with a fixed CM; roc_curve() and pr_curve() are not available.

pr_curve(n_thresholds=50)

Return an uncertainty-aware PR curve over a quantile-spaced threshold grid.

Parameters:

Name Type Description Default
n_thresholds int

Number of thresholds in the grid. Default is 50.

50

Returns:

Type Description
PRResult

PR curve with per-threshold posterior uncertainty ellipses.
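A quantile-spaced grid places thresholds where the scores are dense instead of spacing them evenly on the score axis. The sketch below shows one way to build such a grid; `quantile_thresholds` is a hypothetical name, and the choice of evenly spaced quantile levels from 0 to 1 with linear interpolation is an assumption, not a description of the library's grid.

```python
def quantile_thresholds(y_score, n_thresholds=50):
    """Return thresholds at evenly spaced quantile levels of the scores.

    Hypothetical helper; quantile levels 0..1 and linear interpolation
    between order statistics are assumptions made for illustration.
    """
    if not y_score:
        raise ValueError("y_score must not be empty")
    ordered = sorted(y_score)
    last = len(ordered) - 1
    grid = []
    for i in range(n_thresholds):
        q = i / (n_thresholds - 1) if n_thresholds > 1 else 0.5
        pos = q * last                 # fractional index into the sorted scores
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        grid.append(ordered[lo] * (1 - frac) + ordered[hi] * frac)
    return grid


scores = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.20, 0.50, 0.90, 0.99]
grid = quantile_thresholds(scores, n_thresholds=4)
```

With skewed scores like these, most thresholds fall in the crowded low-score region, where an evenly spaced grid would waste points.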

roc_curve(n_thresholds=50)

Return an uncertainty-aware ROC curve over a quantile-spaced threshold grid.

Parameters:

Name Type Description Default
n_thresholds int

Number of thresholds in the grid. Default is 50.

50

Returns:

Type Description
ROCResult

ROC curve with per-threshold posterior uncertainty ellipses.


ThresholdResult

Returned by BinaryClassifier.at_threshold(). All metrics share the same posterior CM samples, preserving correlations.

classifier_uncertainty._results.ThresholdResult

Metric distributions at a fixed classification threshold.

All metrics share the same CM samples, preserving their correlations. Custom metrics receive CM entry proportions (θ values) as numpy arrays summing to ~1 per sample, so standard ratio metrics work unchanged.

accuracy()

Return the posterior distribution of accuracy: (TP + TN) / N.

at_prevalence(phi, seed=None)

Return a new ThresholdResult with prevalence replaced by phi.

Re-uses the TPR and TNR posterior samples from this result unchanged, replacing only the prevalence (φ). This implements the prevalence-exchange technique from Tötsch & Hoffmann (2020): because TPR and TNR are sampled independently of φ, swapping φ is exact.

Parameters:

Name Type Description Default
phi float or tuple[float, float]

New prevalence. A float fixes φ exactly (e.g. the known population rate); an (α, β) tuple draws φ from Beta(α, β) to encode uncertainty over the production prevalence (e.g. (2, 398), whose mean is 2 / 400 = 0.005).

required
seed int

Random seed used when phi is a tuple. Ignored for float.

None

Returns:

Type Description
ThresholdResult

New result sharing the same TPR/TNR posterior but with the specified φ.

Raises:

Type Description
ValueError

If phi is a float outside the open interval (0, 1).
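The prevalence exchange can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `exchange_prevalence` is a hypothetical name, and the TPR and TNR draws are synthetic stand-ins for the posterior samples an existing result would hold.

```python
import random

rng = random.Random(0)
# Synthetic stand-ins for posterior TPR and TNR samples of an existing result.
tpr = [rng.betavariate(41, 11) for _ in range(2000)]
tnr = [rng.betavariate(931, 21) for _ in range(2000)]


def exchange_prevalence(tpr, tnr, phi, seed=None):
    """Rebuild (tp, fn, tn, fp) proportions with a new prevalence.

    Hypothetical helper: a float fixes phi, a (alpha, beta) tuple draws it.
    """
    if isinstance(phi, tuple):
        prng = random.Random(seed)
        phis = [prng.betavariate(*phi) for _ in tpr]
    else:
        if not 0.0 < phi < 1.0:
            raise ValueError("phi must lie in the open interval (0, 1)")
        phis = [phi] * len(tpr)
    return [
        (p * t, p * (1 - t), (1 - p) * s, (1 - p) * (1 - s))
        for p, t, s in zip(phis, tpr, tnr)
    ]


theta = exchange_prevalence(tpr, tnr, phi=0.005)
precision = [tp / (tp + fp) for tp, fn, tn, fp in theta]
mean_precision = sum(precision) / len(precision)
```

TPR and TNR are untouched, so prevalence-independent metrics stay the same, while prevalence-dependent ones such as precision can change sharply when φ is small.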

balanced_accuracy()

Return the posterior distribution of balanced accuracy: (TPR + TNR) / 2.

bookmaker_informedness()

Return the posterior distribution of bookmaker informedness: TPR + TNR − 1.

f1()

Return the posterior distribution of F1: 2TP / (2TP + FP + FN).

mcc()

Return the posterior distribution of Matthews correlation coefficient.
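For reference, the standard definition of the Matthews correlation coefficient is (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)). The sketch below evaluates it for a single confusion matrix. The function name and the choice to return 0.0 when the denominator is zero are assumptions; the reference above does not state how the library treats that case.

```python
import math


def matthews_corrcoef(tp, fn, tn, fp):
    """Standard MCC formula for one confusion matrix (counts or proportions).

    Hypothetical helper; returning 0.0 for a zero denominator is a common
    convention and an assumption here.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


mcc_value = matthews_corrcoef(tp=40, fn=10, tn=930, fp=20)
```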

mean_expense(cost, loss)

Return the posterior distribution of mean expense per observation.

Protective actions (TP and FP) each incur cost; missed events (FN) incur loss; correct negatives (TN) have no cost.

The formula is (TP + FP) * cost + FN * loss, evaluated on CM entry proportions. This equals the total expense on counts, (hits + false_alarms) * cost + misses * loss, divided by N.

Parameters:

Name Type Description Default
cost float

Cost of a protective action (incurred for both hits and false alarms).

required
loss float

Loss incurred for a missed event (false negative).

required

Returns:

Type Description
MetricResult

Posterior distribution of mean expense per observation.
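A short worked example of the formula, using made-up counts and made-up cost and loss values, shows that evaluating on proportions gives the total expense divided by N. The standalone function below is an illustration only, not the library's method.

```python
def mean_expense(tp, fn, tn, fp, cost, loss):
    """(TP + FP) * cost + FN * loss; TN contributes nothing."""
    return (tp + fp) * cost + fn * loss


# Illustrative counts (not from the documentation).
counts = {"tp": 40, "fn": 10, "tn": 930, "fp": 20}
n = sum(counts.values())
theta = {name: value / n for name, value in counts.items()}

per_observation = mean_expense(**theta, cost=1.0, loss=10.0)  # on proportions
total = mean_expense(**counts, cost=1.0, loss=10.0)           # on raw counts
```

Here 60 protective actions at cost 1 plus 10 misses at loss 10 give a total of 160 over 1000 observations, or 0.16 per observation.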

metric(func)

Compute a custom metric from CM entry proportions.

Parameters:

Name Type Description Default
func callable

A function f(tp, fn, tn, fp) -> array where each argument is a numpy array of CM entry proportions (θ values summing to ~1 per sample). Standard ratio metrics require no rescaling.

required

Returns:

Type Description
MetricResult

Posterior distribution of the custom metric.
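A custom metric only needs the f(tp, fn, tn, fp) signature described above. The sketch below defines the critical success index (also called the threat score) as an example; the function name is illustrative. It uses only arithmetic operators, so the same function works on plain floats and, element-wise, on NumPy arrays.

```python
def critical_success_index(tp, fn, tn, fp):
    """TP / (TP + FN + FP): a ratio metric, so counts and proportions agree."""
    return tp / (tp + fn + fp)


from_counts = critical_success_index(40, 10, 930, 20)
from_proportions = critical_success_index(0.04, 0.01, 0.93, 0.02)
```

Both calls give the same value (40 / 70) because the normalising factor N cancels in any ratio of CM entries. A function with this signature is what metric(func) expects as its argument.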

npv()

Return the posterior distribution of NPV: TN / (TN + FN).

precision()

Return the posterior distribution of precision (PPV): TP / (TP + FP).

relative_value(cost_loss_ratio)

Return the Value Score distribution at a given cost/loss ratio (Wilks 2001).

Parameters:

Name Type Description Default
cost_loss_ratio float

C/L in the open interval (0, 1). Cost of protective action divided by loss suffered when the event occurs without protection.

required

Returns:

Type Description
MetricResult

Posterior distribution of the Value Score at the given C/L.

Raises:

Type Description
ValueError

If cost_loss_ratio is not in (0, 1).
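The sketch below shows the usual cost/loss formulation of the Value Score for a single set of CM proportions: the expense saved relative to the better of "always protect" and "never protect", divided by the saving a perfect classifier would achieve. It is assumed, not confirmed by this reference, that the library follows this standard form; the function name is hypothetical.

```python
def value_score(tp, fn, tn, fp, cost_loss_ratio):
    """Value Score for CM proportions summing to 1 (standard cost/loss model).

    Hypothetical helper; expenses are expressed in units of the loss L.
    """
    r = cost_loss_ratio
    if not 0.0 < r < 1.0:
        raise ValueError("cost_loss_ratio must lie in the open interval (0, 1)")
    base_rate = tp + fn
    expense_classifier = (tp + fp) * r + fn   # protect on alarms, lose on misses
    expense_baseline = min(r, base_rate)      # always protect vs never protect
    expense_perfect = base_rate * r           # protect exactly when needed
    return (expense_baseline - expense_classifier) / (
        expense_baseline - expense_perfect
    )


vs = value_score(tp=0.04, fn=0.01, tn=0.93, fp=0.02, cost_loss_ratio=0.1)
```

A value of 1 corresponds to a perfect classifier and 0 to doing no better than the cheaper fixed strategy; negative values mean the classifier is worse than that baseline.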

tnr()

Return the posterior distribution of TNR: TN / (TN + FP).

tpr()

Return the posterior distribution of TPR: TP / (TP + FN).

value_score_curve(n_cl=100)

Return the Value Score curve across all cost/loss ratios (Wilks 2001).

Parameters:

Name Type Description Default
n_cl int

Number of C/L grid points in the open interval (0, 1). Default is 100.

100

Returns:

Type Description
ValueScoreCurve

VS posterior distributions over the C/L grid.


MetricResult

Returned by every metric method. Wraps posterior samples and provides credible intervals and plotting.

classifier_uncertainty._results.MetricResult

Posterior distribution of a scalar classifier metric.

Attributes:

Name Type Description
samples ndarray

Raw posterior samples of shape (n_samples,).

point_estimate float

Posterior mean.

metric_uncertainty float

Length of the 95 % HPDI — the metric uncertainty (MU) of Tötsch & Hoffmann (2020).

metric_uncertainty property

Length of the 95 % HPDI — metric uncertainty (MU) of Tötsch & Hoffmann.

point_estimate property

Posterior mean.

samples property

Raw posterior samples of shape (n_samples,).

credible_interval(level=0.95)

Return the highest posterior density interval (HPDI).

Parameters:

Name Type Description Default
level float

Probability mass to enclose. Default is 0.95.

0.95

Returns:

Type Description
tuple[float, float]

(lower, upper) bounds of the HPDI.
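For a set of samples, an HPDI can be approximated as the narrowest interval that contains the requested fraction of the sorted draws. The sketch below is one common way to compute it and is not necessarily how the library does it; `hpdi` is a hypothetical name and the draws are synthetic.

```python
import math
import random


def hpdi(samples, level=0.95):
    """Narrowest interval containing at least `level` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(level * n))  # samples the interval must cover
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


rng = random.Random(0)
draws = [rng.betavariate(41, 11) for _ in range(5000)]  # synthetic posterior
lower, upper = hpdi(draws)
interval_length = upper - lower
```

The length of the 95 % interval computed this way corresponds to the quantity that metric_uncertainty reports.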

plot(ax=None, level=0.95, **kwargs)

Plot a histogram of posterior samples with HPDI shading.

Parameters:

Name Type Description Default
ax Axes

Axes to draw on. Uses plt.gca() if None.

None
level float

HPDI level to shade. Default is 0.95.

0.95
**kwargs

Forwarded to ax.hist.

{}

Returns:

Type Description
Axes

The axes with the plot.


ValueScoreCurve

Returned by ThresholdResult.value_score_curve().

classifier_uncertainty._results.ValueScoreCurve

Value Score as a function of cost/loss ratio, with posterior uncertainty.

Produced by ThresholdResult.value_score_curve(). The VS curve (Wilks 2001) shows the relative economic value of a classifier as a function of the decision-maker's cost/loss ratio.

plot(ax=None, level=0.95, color='C0', alpha=0.25)

Plot the VS curve with a posterior credible band.

Parameters:

Name Type Description Default
ax Axes

Axes to draw on. Uses plt.gca() if None.

None
level float

HPDI level for the shaded band. Default is 0.95.

0.95
color str

Line and fill colour. Default is "C0".

'C0'
alpha float

Fill opacity. Default is 0.25.

0.25

Returns:

Type Description
Axes

The axes with the plot.


ROCResult

Returned by BinaryClassifier.roc_curve().

classifier_uncertainty._curves.ROCResult

Uncertainty-aware ROC curve.

Produced by BinaryClassifier.roc_curve(). Uncertainty is shown as a 95 % HPDI band computed by interpolating TPR samples onto a fixed FPR grid.
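Each posterior sample yields its own set of (FPR, TPR) points, so the samples must be brought onto a common FPR grid before a band can be computed. The sketch below shows that step for one sample using linear interpolation; the helper name, the grid, and the curve points are all illustrative assumptions.

```python
import bisect


def interpolate(x_grid, xs, ys):
    """Linearly interpolate the points (xs, ys) onto x_grid.

    xs must be sorted in non-decreasing order; values outside the range
    of xs are clamped to the first or last y value.
    """
    out = []
    for x in x_grid:
        if x <= xs[0]:
            out.append(ys[0])
        elif x >= xs[-1]:
            out.append(ys[-1])
        else:
            j = bisect.bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
            weight = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
            out.append(ys[j - 1] + weight * (ys[j] - ys[j - 1]))
    return out


fpr_grid = [0.0, 0.25, 0.5, 0.75, 1.0]
# One posterior sample of the curve (made-up points).
fpr_sample = [0.0, 0.1, 0.4, 1.0]
tpr_sample = [0.0, 0.6, 0.9, 1.0]
tpr_on_grid = interpolate(fpr_grid, fpr_sample, tpr_sample)
```

Repeating this for every sample gives, at each grid point, a set of TPR values from which an interval can be taken.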

Attributes:

Name Type Description
auc MetricResult

Posterior distribution of AUC-ROC, computed via per-sample trapezoid integration.

auc property

Posterior distribution of AUC-ROC via per-sample trapezoid integration.
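Per-sample trapezoid integration means that each posterior sample of the curve is integrated separately, giving one AUC value per sample. A minimal sketch with two made-up curve samples follows; the names are illustrative.

```python
def trapezoid_area(xs, ys):
    """Area under the piecewise-linear curve through (xs, ys)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


# Two made-up posterior samples of a ROC curve, as (fpr, tpr) point lists.
curve_samples = [
    ([0.0, 0.1, 0.4, 1.0], [0.0, 0.6, 0.9, 1.0]),
    ([0.0, 0.2, 0.5, 1.0], [0.0, 0.5, 0.8, 1.0]),
]
auc_samples = [trapezoid_area(fpr, tpr) for fpr, tpr in curve_samples]
```

The resulting list of AUC values plays the role of the samples behind the auc MetricResult.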

plot(ax=None, level=0.95, color='C0', alpha=0.3)

Plot the ROC curve with a posterior HPDI band.

Parameters:

Name Type Description Default
ax Axes

Axes to draw on. Uses plt.gca() if None.

None
level float

HPDI level for the shaded band. Default is 0.95.

0.95
color str

Curve and band colour. Default is "C0".

'C0'
alpha float

Band opacity. Default is 0.3.

0.3

Returns:

Type Description
Axes

The axes with the plot.


PRResult

Returned by BinaryClassifier.pr_curve().

classifier_uncertainty._curves.PRResult

Uncertainty-aware Precision-Recall curve.

Produced by BinaryClassifier.pr_curve(). Uncertainty is shown as a 95 % HPDI band computed by interpolating Precision samples onto a fixed Recall grid.

Attributes:

Name Type Description
auc MetricResult

Posterior distribution of AUC-PR (area under the precision-recall curve), computed via per-sample trapezoid integration.

auc property

Posterior distribution of AUC-PR via per-sample trapezoid integration.

plot(ax=None, level=0.95, color='C0', alpha=0.3)

Plot the PR curve with a posterior HPDI band.

Parameters:

Name Type Description Default
ax Axes

Axes to draw on. Uses plt.gca() if None.

None
level float

HPDI level for the shaded band. Default is 0.95.

0.95
color str

Curve and band colour. Default is "C0".

'C0'
alpha float

Band opacity. Default is 0.3.

0.3

Returns:

Type Description
Axes

The axes with the plot.