
Classifies numeric values into "high" and "low" categories based on a threshold, while enforcing minimum run lengths for both categories. Values exceeding the threshold are classified as "high", others as "low". Short runs that don't meet the minimum length requirement are reclassified into the opposite category.

Usage

classify_by_threshold(
  values,
  threshold,
  min_low_frames,
  min_high_frames,
  return_type = c("numeric", "factor")
)

Arguments

values

Numeric vector to be classified

threshold

Numeric value used as classification boundary between "high" and "low"

min_low_frames

Minimum number of consecutive frames required for a "low" sequence

min_high_frames

Minimum number of consecutive frames required for a "high" sequence

return_type

Output format: "factor" (levels "high"/"low") or "numeric" (1/0). Defaults to "numeric".
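The default in the usage signature is the vector c("numeric", "factor"), which is the usual R idiom for an argument resolved with match.arg(). The sketch below assumes that idiom is used here (the source does not confirm it) and that "high" maps to 1 and "low" to 0. pick_return_type is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: shows how match.arg() resolves a vector default.
pick_return_type <- function(return_type = c("numeric", "factor")) {
  match.arg(return_type)
}

default_type <- pick_return_type()          # first choice: "numeric"
factor_type  <- pick_return_type("factor")  # explicit choice: "factor"

# Assumed mapping between the two output formats (high = 1, low = 0).
labels     <- c("low", "high", NA)
as_numeric <- ifelse(labels == "high", 1, 0)  # NA stays NA
```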

Value

Vector of the same length as the input, with each value classified as "high" or "low": a numeric vector (1/0) or a factor ("high"/"low"), depending on return_type. NA values in the input remain NA in the output.

Details

The classification process occurs in two steps:

  1. Initial classification based on threshold

  2. Reclassification of sequences that don't meet minimum length requirements
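The two steps above can be sketched with base R's rle() and inverse.rle(). This is a minimal illustration, not the package's implementation: flip_short_runs and sketch_classify are hypothetical names, the result is kept as a character vector for readability, and the sketch assumes that each NA breaks a run.

```r
# Hypothetical helper: relabel runs of `target` shorter than `min_len` as `other`.
flip_short_runs <- function(x, target, other, min_len) {
  runs <- rle(x)  # rle() puts each NA in its own run of length 1
  short <- !is.na(runs$values) & runs$values == target & runs$lengths < min_len
  runs$values[short] <- other
  inverse.rle(runs)
}

sketch_classify <- function(values, threshold, min_low_frames, min_high_frames) {
  # Step 1: initial classification; NA input stays NA
  cls <- ifelse(values > threshold, "high", "low")
  # Step 2: reclassify short runs, "low" first, then "high"
  cls <- flip_short_runs(cls, "low", "high", min_low_frames)
  cls <- flip_short_runs(cls, "high", "low", min_high_frames)
  cls
}

demo <- sketch_classify(c(1, 3, 1, 1, 3, 3, 3, 1), threshold = 2,
                        min_low_frames = 2, min_high_frames = 2)
# demo: "high" "high" "low" "low" "high" "high" "high" "high"
```

In this input the isolated "low" values at positions 1 and 8 are too short and are absorbed into the neighbouring "high" runs, while the two-frame "low" run in the middle survives.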

The function first processes "low" sequences, then "high" sequences. Because reclassifying a short run changes the lengths of its neighbouring runs, this order can affect the final classification when the two minimum length requirements compete.
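To see why the order matters, the following sketch applies the two reclassification passes in both orders to the same initial labels. flip_short_runs is a hypothetical helper written for this illustration, and both minimum lengths are set to 2.

```r
# Hypothetical helper: relabel runs of `target` shorter than `min_len` as `other`.
flip_short_runs <- function(x, target, other, min_len) {
  runs <- rle(x)
  short <- !is.na(runs$values) & runs$values == target & runs$lengths < min_len
  runs$values[short] <- other
  inverse.rle(runs)
}

initial <- c("low", "high", "low", "low", "high", "high", "high", "low")

# "low" pass first, then "high" (the order described above)
low_first <- flip_short_runs(flip_short_runs(initial, "low", "high", 2),
                             "high", "low", 2)
# low_first:  "high" "high" "low" "low" "high" "high" "high" "high"

# "high" pass first, then "low" (reversed, for comparison only)
high_first <- flip_short_runs(flip_short_runs(initial, "high", "low", 2),
                              "low", "high", 2)
# high_first: "low" "low" "low" "low" "high" "high" "high" "high"
```

The first two frames end up "high" in one order and "low" in the other, because whichever pass runs first decides which of the two adjacent single-frame runs is absorbed.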

Examples

# Basic usage
values <- c(1, 1.5, 2.8, 3.2, 3.0, 2.9, 1.2, 1.1)
result <- classify_by_threshold(values,
                                threshold = 2.5,
                                min_low_frames = 2,
                                min_high_frames = 3)

# Handling NAs
values_with_na <- c(1, NA, 3, 3.2, NA, 1.2)
result <- classify_by_threshold(values_with_na,
                                threshold = 2.5,
                                min_low_frames = 2,
                                min_high_frames = 2)