
Detects anomalous observations in a numeric vector using an IQR-based rule, with optional verbose reporting. The width of the cutoff limits is set by a tunable IQR multiplier, \((0.15/\alpha)\).

Usage

iqr2(x, alpha = 0.05, max_anoms = 0.2, verbose = FALSE)

Arguments

x

A numeric vector.

alpha

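A numeric (default: 0.05). Controls the sensitivity of outlier detection through the IQR multiplier 0.15 / alpha. Smaller values widen the limits, so fewer points are flagged; the default corresponds to a multiplier of 3.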

max_anoms

A numeric between 0 and 1 (default: 0.2). Maximum fraction of data points to be flagged as outliers.

verbose

Logical (default: FALSE). If TRUE, returns a detailed outlier report; otherwise returns a binary vector.
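To make the effect of alpha concrete, the following short sketch tabulates the IQR multiplier 0.15 / alpha for a few values. The alpha values other than the default are chosen only for illustration.

```r
# IQR multiplier implied by several alpha values (0.15 / alpha).
# 0.05 is the documented default; the other values are illustrative.
alpha <- c(0.025, 0.05, 0.1)
multiplier <- 0.15 / alpha

# Small lookup table; results are subject to floating-point rounding.
alpha_table <- data.frame(alpha = alpha, multiplier = multiplier)
print(alpha_table)
```

The multipliers come out at roughly 6, 3, and 1.5. An alpha of 0.1 therefore reproduces the conventional 1.5 × IQR boxplot fences, while the default of 0.05 is twice as wide.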

Value

If verbose = FALSE, returns an integer vector of 0s and 1s (same length as x), where 1 indicates an outlier.

If verbose = TRUE, returns a list with the following elements:

outlier

Binary vector of outlier flags (1 = outlier).

outlier_idx

Indices of detected outliers.

outlier_vals

Values of detected outliers.

outlier_direction

Direction of anomaly ("Up" or "Down").

critical_limits

Named vector with lower and upper bounds used for outlier detection.

outlier_report

Tibble containing values, limits, and direction annotations.
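As a rough illustration of how the verbose elements relate to one another, the sketch below derives them by hand in base R from a vector and a pair of limits. It does not call iqr2(); the limits are picked manually, the mapping of "Up" to points above the upper limit is an assumption, and a plain data.frame stands in for the tibble, whose exact columns are not documented here.

```r
# Illustrative data and hand-picked limits (not computed by iqr2()).
x <- c(10, 11, 12, 11, 10, 12, 11, 10, 50, -30)
limits <- c(lower = 5.75, upper = 16.25)

# Binary flags: 1 = outside the limits, 0 = inside.
outlier <- as.integer(x < limits[["lower"]] | x > limits[["upper"]])

# Positions and values of the flagged points.
outlier_idx <- which(outlier == 1L)
outlier_vals <- x[outlier_idx]

# Assumed convention: above the upper limit is "Up", below the lower is "Down".
outlier_direction <- ifelse(outlier_vals > limits[["upper"]], "Up", "Down")

# Stand-in for the report table; column names here are hypothetical.
report <- data.frame(
  index = outlier_idx,
  value = outlier_vals,
  limit_lower = limits[["lower"]],
  limit_upper = limits[["upper"]],
  direction = outlier_direction
)
print(report)
```

Here the ninth and tenth values fall outside the limits, one in each direction, so the report has two rows.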

Details

This function follows the IQR approach used in anomalize, but returns binary flags (1 = outlier, 0 = normal) instead of string labels ("Yes", "No"). It also removes the dependency on pipe operators, expressing the logic with explicit data manipulation functions for clarity and standalone use.

The detection limits are defined as: $$[Q_1 - (0.15 / \alpha) \cdot \mathrm{IQR}, \; Q_3 + (0.15 / \alpha) \cdot \mathrm{IQR}]$$ where \(\mathrm{IQR} = Q_3 - Q_1\). Points outside this range are outlier candidates. Of these, at most max_anoms × length(x) are retained, keeping those with the largest deviation.
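The rule above can be sketched in base R as follows. This is a minimal illustration, not the package source: the helper name iqr_flag_sketch is hypothetical, deviation is assumed to mean the distance beyond the nearest limit, the cap is assumed to round down, and missing values are not handled.

```r
# Hypothetical helper illustrating the rule; not the package's implementation.
iqr_flag_sketch <- function(x, alpha = 0.05, max_anoms = 0.2) {
  # Quartiles and IQR (default quantile type 7).
  q <- stats::quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]

  # Limits: quartiles pushed outward by (0.15 / alpha) * IQR.
  k <- 0.15 / alpha
  limits <- c(lower = q[1] - k * iqr, upper = q[2] + k * iqr)

  # Distance beyond the nearest limit; 0 for points inside the range.
  excess <- pmax(limits[["lower"]] - x, x - limits[["upper"]], 0)

  # Rank candidates by excess and keep at most floor(max_anoms * n) of them
  # (the rounding rule is an assumption).
  cap <- floor(max_anoms * length(x))
  candidates <- which(excess > 0)
  ranked <- candidates[order(excess[candidates], decreasing = TRUE)]
  keep <- ranked[seq_len(min(cap, length(ranked)))]

  flags <- integer(length(x))
  flags[keep] <- 1L
  list(outlier = flags, critical_limits = limits)
}

x <- c(10, 11, 12, 11, 10, 12, 11, 10, 50, 11)
res <- iqr_flag_sketch(x)
print(res$outlier)
print(res$critical_limits)
```

For this vector the quartiles are 10.25 and 11.75, so with the default alpha the limits are about 5.75 and 16.25, and only the ninth value (50) is flagged. Lowering max_anoms to 0.05 would set the cap to zero for ten points, so nothing would be flagged even though 50 lies outside the limits.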

References

This implementation is adapted from the anomalize::iqr() function: https://business-science.github.io/anomalize/index.html