Debugging a Tukey Outlier Function in R

In this exercise, I reproduced and fixed a logical bug in an R function that is supposed to flag the rows of a numeric matrix that are outliers in every column under the Tukey rule.
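The function relies on a helper, tukey.outlier, whose definition is not shown in this post. As a rough illustration of the Tukey rule, here is a minimal sketch of what such a helper might look like. The name tukey_outlier_sketch, the parameter k, and the 1.5 * IQR fences are assumptions for illustration, not the original helper.

```r
# Hypothetical stand-in for tukey.outlier (the original helper is not shown).
# Assumes the conventional Tukey fences: Q1 - k * IQR and Q3 + k * IQR, k = 1.5.
tukey_outlier_sketch <- function(v, k = 1.5) {
  q <- quantile(v, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  # Element-wise OR: one TRUE/FALSE per value in v.
  v < q[1] - k * iqr | v > q[2] + k * iqr
}

tukey_outlier_sketch(c(1, 2, 3, 4, 100))
# [1] FALSE FALSE FALSE FALSE  TRUE
```

The helper returns a logical vector as long as its input, which is why the combining step inside the loop has to be element-wise.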

First, I ran the original code on a test matrix:

set.seed(123)
test_mat <- matrix(rnorm(50), nrow = 10)
tukey_multiple(test_mat)

The function produced a warning similar to the one below (in R 4.3.0 and later, the same condition is raised as an error instead):

'length(x) = 10 > 1' in coercion to 'logical(1)'

The issue came from this line inside the loop:

outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])

The operator && is designed for length-one operands and returns a single TRUE or FALSE. Older versions of R silently used only the first element of each vector; newer versions warn or stop when an operand is longer than one. That behavior is useful for control flow, but it is incorrect here because the function needs to compare every row element-wise. Since both sides are vectors, the correct operator is &.
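The difference between the two operators can be seen on a pair of small vectors. This is a minimal sketch; the vectors a and b are made up for illustration, and the && call on vectors is wrapped in tryCatch because its outcome depends on the R version.

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

# Element-wise AND: one result per position.
a & b
# [1]  TRUE FALSE FALSE

# Scalar AND: intended for length-one conditions, such as if () tests.
TRUE && FALSE
# [1] FALSE

# With vector operands the result depends on the R version (a silent
# first-element comparison, a warning, or an error), so capture whatever happens.
res <- tryCatch(a && b,
                warning = function(w) "warning",
                error   = function(e) "error")
res
```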

I fixed the bug by replacing the line with:

outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])

I also improved the function by adding defensive programming checks to make sure the input is actually a numeric matrix before running the main logic.

Here is the corrected function:

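corrected_tukey <- function(x) {
  # Defensive checks: fail early with a clear message on bad input.
  if (!is.matrix(x)) {
    stop("x must be a matrix.")
  }
  if (!is.numeric(x)) {
    stop("x must be a numeric matrix.")
  }

  # One flag per cell, filled in column by column.
  outliers <- array(TRUE, dim = dim(x))

  for (j in seq_len(ncol(x))) {
    # Element-wise &, so every row of column j is compared.
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }

  # A row is flagged only if it is an outlier in every column.
  outlier.vec <- logical(nrow(x))
  for (i in seq_len(nrow(x))) {
    outlier.vec[i] <- all(outliers[i, ])
  }

  outlier.vec
}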

After rerunning the corrected version on the test matrix, it returned a logical vector of length 10 without errors:

[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
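An all-FALSE result shows that the function runs, but not that it can flag a row. One way to check that is to plant a row that is extreme in every column. The block below is a self-contained sketch: tukey_outlier_sketch is a hypothetical stand-in for tukey.outlier (1.5 * IQR fences assumed), and flag_rows_sketch mirrors the logic of corrected_tukey using apply() and stopifnot() rather than the loops and stop() calls above.

```r
# Hypothetical stand-in for tukey.outlier (1.5 * IQR fences assumed).
tukey_outlier_sketch <- function(v) {
  q <- quantile(v, probs = c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  v < q[1] - fence | v > q[2] + fence
}

# Same idea as corrected_tukey, written so this block stands alone.
# Assumes x has at least two rows, so apply() returns a matrix of flags.
flag_rows_sketch <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  flags <- apply(x, 2, tukey_outlier_sketch)  # one logical column per input column
  apply(flags, 1, all)                        # TRUE only if every column flags the row
}

set.seed(123)
planted <- matrix(rnorm(50), nrow = 10)
planted[4, ] <- 100  # make row 4 extreme in every column

flag_rows_sketch(planted)[4]  # TRUE: row 4 is above the upper fence in every column

# Non-matrix input is rejected by the defensive check.
tryCatch(flag_rows_sketch(1:10), error = function(e) "rejected")
```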

This debugging process showed the importance of knowing the difference between && and & in R. The bug was a logical error rather than a syntax error, so the code parsed and ran; that is what makes defensive checks and careful testing especially important.
