Skip to content

PRL-PRG/mutator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

104 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

R-CMD-check Codecov test coverage

mutator

Overview

mutator is an automated mutation testing tool for the R language. It applies mutation testing principles to help developers improve test suite quality by introducing small, systematic changes (mutations) to source code and verifying if tests can detect these changes.

Features

  • Comprehensive Mutation Testing: Applies various mutation operators to R source code
  • AST-Based Mutation: Uses Abstract Syntax Tree analysis for intelligent code mutations
  • Parallel Test Execution: Runs tests in parallel for improved performance
  • Equivalent Mutant Detection: Uses AI to identify mutants that are functionally equivalent to the original code
  • Detailed Reporting: Provides mutation scores and analysis of test suite effectiveness

Package Structure

mutator/
├── R/                      # R source code
│   ├── mutator.R           # Core functionality
│   └── init.R
├── src/                    # C++ source + Catch2 C++ tests (testthat integration)
├── tests/                  # R tests (includes compiled C++ test entrypoint)
└── DESCRIPTION             # Package metadata

Installation

# Install dependencies
install.packages(c("pkgload", "testthat", "httr", "jsonlite", "future", "furrr"))

# Install from GitHub (if available)
devtools::install_github("PRL-PRG/mutator")

# Or install from local source
# In R console
setwd("path/to/mutator")
devtools::install()

Quick Start Guide

library(mutator)

# Mutate a single file
mutants <- mutate_file("path/to/your/file.R")

# Optional: cap returned mutants by random selection
mutants <- mutate_file("path/to/your/file.R", max_mutants = 20)

# Mutate an entire package and run tests
result <- mutate_package("path/to/your/package")

# Optional: cap tested mutants across the whole package
result <- mutate_package("path/to/your/package", max_mutants = 100)

# Optional: set a fixed timeout (seconds) per mutant test run
result <- mutate_package("path/to/your/package", timeout_seconds = 60)

# Optional: control where mutant files are written
result <- mutate_package("path/to/your/package", mutation_dir = tempdir())

Testing

mutator selects a package test strategy automatically:

  • If tests/testthat/ exists, mutator loads the mutant in-process with pkgload::load_all() and runs its tests the way the package's own tests/testthat.R harness does — forwarding the same arguments (notably any filter) that the harness passes to testthat::test_check() to testthat::test_dir(). This means mutator runs exactly the tests the package author (and R CMD check) run, without paying for an install per mutant.
  • Otherwise, if tests/ exists, mutator falls back to tools::testInstalledPackage(..., types = "tests") after installing each mutant with --install-tests.

The fallback path supports non-testthat layouts (for example tinytest-driven packages that run through tests/ scripts). To avoid recompiling C/C++ on every mutant, mutator installs the unmutated package — compiling its shared objects — once into a template library, then installs each mutant with --no-libs (R code only) and restores the template's prebuilt shared objects before running its tests. This is safe because mutator does not mutate compiled code, so the shared object is identical for every mutant; it also means concurrent mutant installs no longer write into a shared src/ (see "Parallel execution and isolation" below).

Each mutant test run uses a timeout. By default, mutator runs the baseline suite first and derives the per-mutant timeout as baseline_elapsed_seconds * 1.5. You can override this by setting timeout_seconds explicitly.

Mutant outcomes are reported as:

  • SURVIVED: tests passed for the mutant
  • KILLED: tests failed (or execution error)
  • HANG: mutant exceeded timeout

mutator itself uses testthat for its own R tests and testthat + Catch2 for C++ tests.

  • C++ tests are located in src/test-*.cpp
  • The C++ test runner is src/test-runner.cpp
  • C++ tests are executed from tests/testthat/test-cpp.R via run_cpp_tests("mutator")

Run the full test suite with:

devtools::test()

Configuration

Timeouts and the contended baseline

Each mutant is run with a wall-clock timeout; exceeding it is reported as HANG. The timeout is derived from how long the package's own test suite takes, unless you pass timeout_seconds explicitly.

Because mutants run in parallel (cores at a time), the timeout must account for contention: with many workers, each test run is slower than it would be alone — for packages that load many dependencies, or do heavy per-mutant install work, dramatically so. A timeout based on a single, uncontended baseline run would then fire on nearly every mutant (a wave of false HANGs).

To avoid this, mutate_package() first runs the baseline suite once on its own (to confirm it passes and fail fast otherwise), then runs it cores times concurrently and takes the slowest of those as the contended baseline. The timeout is max(contended_baseline * 1.5, 5s). This self-calibrates to the machine, the chosen parallelism, and the package's real load/compile cost — no manual tuning. Pass timeout_seconds to override it entirely. (When cores = 1, or when forking is unavailable, the solo baseline is used.)

CRAN mode (test selection)

By default mutate_package() runs each mutant's tests in CRAN mode (cran = TRUE): the NOT_CRAN environment variable is set to "false" in the test subprocess, so testthat::skip_on_cran() and skip_if_offline() guards take effect and mutator runs the same tests CRAN would. This skips the slow, flaky, or network-dependent tests that packages mark as CRAN-skippable — which keeps mutation runs fast and avoids spurious timeouts/kills from, e.g., tests that hit the network.

Set cran = FALSE to run the full suite instead (NOT_CRAN = "true", the behaviour of devtools::test()):

mutate_package("path/to/pkg", cran = FALSE)

This applies to both test strategies (the testthat strategy and the installed-tests fallback). Note that it only affects tests the package explicitly guards with skip_on_cran() / skip_if_offline(); a package whose network test has no such guard will still run that test in either mode.

Fail-fast (stop at the first failing test)

A mutant is KILLED the instant any one test detects it, so running the rest of its suite is wasted work. By default (fail_fast = TRUE) each mutant's test run stops at the first failing test instead of finishing the suite, which speeds up the test-running phase — often substantially for packages with large suites — without changing any mutant's verdict (the early-aborted run still reports the failure, so the mutant is still KILLED).

Set fail_fast = FALSE to run the full suite for every mutant:

mutate_package("path/to/pkg", fail_fast = FALSE)

This applies to the testthat strategy (it sets TESTTHAT_MAX_FAILS = 1 in the test subprocess and uses the progress reporter, which aborts the run at the first failing test_that() block). SURVIVED mutants are unaffected — they have no failure to short-circuit on — so baseline timing and timeout calibration are unchanged. The installed-tests fallback already stops at the first failing test file regardless of this flag.

Parallel execution and isolation (isolate)

By default each mutant package copy symlinks the unchanged directories of the original package (only the mutated R/ file is materialised), so all parallel workers point at the same physical src/, tests/, etc. This is fast, and two design choices keep it correct:

  • Compiled code is built once, not per mutant. Earlier, the installed strategy recompiled into src/ on every R CMD INSTALL; with several installs running at once against the same (symlinked) src/ they clobbered each other's .o/.so (shared object not found), so genuine survivors were misreported as KILLED/HANG. The strategy now compiles once into a template library and installs each mutant with --no-libs, which never writes into src/, so the race is gone. (The testthat strategy was always immune: pkgload::load_all() reuses the baseline's compiled src/ since C code is never mutated.)

The one remaining hazard is non-hermetic tests that write files into a shared directory (most often tests/). When parallel workers' tests fight over the same files you can still see spurious KILLED/HANG. Two ways to attenuate it:

  1. Run without parallelism: cores = 1. No contention, no extra disk — but the slowest option.

  2. Isolate file state: isolate = TRUE. Each mutant gets its own deep copy of src/ and tests/ instead of a symlink, so file-writing tests can't collide:

    mutate_package("path/to/pkg", isolate = TRUE)

The default (isolate = FALSE) is fast and correct for hermetic test suites; reach for isolate = TRUE (or cores = 1) only when a package's tests are not hermetic and you see parallel-only KILLED/HANG results.

Equivalent Mutant Detection

Equivalent-mutant detection calls an OpenAI-compatible Chat Completions API. Configure it in any of these ways (listed highest precedence first; each setting is resolved independently, so they can be mixed):

  1. Programmatically, in your R session:

    set_openai_config(
      api_key  = "sk-...",
      model    = "gpt-4",
      base_url = "https://api.openai.com/v1" # any OpenAI-compatible endpoint
    )
  2. A .openai_config file in the working directory (or in a directory you pass via get_openai_config(dir = ...)). It is a plain, human-readable file of field: value lines and is parsed, never executed:

    api_key: sk-...
    model: gpt-4
    base_url: https://api.openai.com/v1
    

    Only the given directory is consulted — parent directories are not searched.

  3. Environment variables OPENAI_API_KEY, OPENAI_MODEL and OPENAI_BASE_URL.

If nothing is configured, the model defaults to gpt-4 and the base URL to the public OpenAI API. Set base_url to target a self-hosted or alternative OpenAI-compatible service (for example http://localhost:11434/v1).

Survived mutants are analyzed in bounded batches (default 25 per request), and each mutant is shown to the model as a small unified diff of its edit (plus a short change label) rather than its full mutated source — compact, unambiguous, and in a format LLMs read natively. When mutate_package() runs with cores > 1 the batches are sent concurrently, which (with the bounded size) keeps the equivalence pass fast and avoids the truncated responses that otherwise drop verdicts.

For mutants the model flags as EQUIVALENT (the rare, high-stakes calls — they are excluded from the adjusted mutation score), it also returns a one-sentence reason, stored as equivalence_reason on the mutant so the call can be audited. No reason is requested for NOT_EQUIVALENT/DONT_KNOW, keeping responses small.

Progress and the results summary are emitted via message() (so they can be silenced with suppressMessages()). The full prompts and model responses are not printed by default; set options(mutator.verbose = TRUE) to log them.

Mutation Operators

mutator implements a wide range of mutation operators to thoroughly test your code:

Arithmetic and Logical Operator Replacements

Operator Description Example
cxx_add_to_sub Replaces + with - a + ba - b
cxx_sub_to_add Replaces - with + a - ba + b
cxx_mul_to_div Replaces * with / a * ba / b
cxx_div_to_mul Replaces / with * a / ba * b
cxx_eq_to_ne Replaces == with != a == ba != b
cxx_ne_to_eq Replaces != with == a != ba == b
cxx_gt_to_ge Replaces > with >= a > ba >= b
cxx_gt_to_le Replaces > with <= a > ba <= b
cxx_lt_to_le Replaces < with <= a < ba <= b
cxx_lt_to_ge Replaces < with >= a < ba >= b
cxx_ge_to_gt Replaces >= with > a >= ba > b
cxx_ge_to_lt Replaces >= with < a >= ba < b
cxx_le_to_lt Replaces <= with < a <= ba < b
cxx_le_to_gt Replaces <= with > a <= ba > b
cxx_and_to_or Replaces & with | a & ba | b
cxx_or_to_and Replaces | with & a | ba & b
cxx_logical_and_to_or Replaces && with || a && ba || b
cxx_logical_or_to_and Replaces || with && a || ba && b

Unary Operator Mutations

Operator Description Example
cxx_minus_to_noop Removes unary minus -xx
cxx_remove_negation Removes logical negation !xx

Assignment and Value Mutations

Operator Description Example
cxx_assign_const Replaces assignment with constant a = ba = 42
cxx_replace_scalar_call Replaces function call with constant f(x)42
scalar_value_mutator Replaces constants 042, non-zero → 0
negate_mutator Negates conditionals x!x, !xx

Dependencies

mutator depends on:

  • R Packages:
    • pkgload: For loading mutated package copies
    • testthat: For test execution
    • xml2: For running C++ tests through testthat::run_cpp_tests()
    • future and furrr: For parallel execution
    • httr and jsonlite: For OpenAI API integration
  • LinkingTo: testthat (for Catch2 C++ test headers)
  • C++17: For native mutation engine implementation

About

Automated mutation testing tool for R language

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors