mutator is an automated mutation testing tool for the R language. It applies mutation testing principles to help developers improve test suite quality by introducing small, systematic changes (mutations) to source code and verifying if tests can detect these changes.
- Comprehensive Mutation Testing: Applies various mutation operators to R source code
- AST-Based Mutation: Uses Abstract Syntax Tree analysis for intelligent code mutations
- Parallel Test Execution: Runs tests in parallel for improved performance
- Equivalent Mutant Detection: Uses AI to identify mutants that are functionally equivalent to the original code
- Detailed Reporting: Provides mutation scores and analysis of test suite effectiveness
mutator/
├── R/ # R source code
│ ├── mutator.R # Core functionality
│ └── init.R
├── src/ # C++ source + Catch2 C++ tests (testthat integration)
├── tests/ # R tests (includes compiled C++ test entrypoint)
└── DESCRIPTION # Package metadata
# Install dependencies
install.packages(c("pkgload", "testthat", "httr", "jsonlite", "future", "furrr"))
# Install from GitHub (if available)
devtools::install_github("PRL-PRG/mutator")
# Or install from local source
# In R console
setwd("path/to/mutator")
devtools::install()library(mutator)
# Mutate a single file
mutants <- mutate_file("path/to/your/file.R")
# Optional: cap returned mutants by random selection
mutants <- mutate_file("path/to/your/file.R", max_mutants = 20)
# Mutate an entire package and run tests
result <- mutate_package("path/to/your/package")
# Optional: cap tested mutants across the whole package
result <- mutate_package("path/to/your/package", max_mutants = 100)
# Optional: set a fixed timeout (seconds) per mutant test run
result <- mutate_package("path/to/your/package", timeout_seconds = 60)
# Optional: control where mutant files are written
result <- mutate_package("path/to/your/package", mutation_dir = tempdir())mutator selects a package test strategy automatically:
- If
tests/testthat/exists, mutator loads the mutant in-process withpkgload::load_all()and runs its tests the way the package's owntests/testthat.Rharness does — forwarding the same arguments (notably anyfilter) that the harness passes totestthat::test_check()totestthat::test_dir(). This means mutator runs exactly the tests the package author (andR CMD check) run, without paying for an install per mutant. - Otherwise, if
tests/exists, mutator falls back totools::testInstalledPackage(..., types = "tests")after installing each mutant with--install-tests.
The fallback path supports non-testthat layouts (for example tinytest-driven packages that run through tests/ scripts). To avoid recompiling C/C++ on every mutant, mutator installs the unmutated package — compiling its shared objects — once into a template library, then installs each mutant with --no-libs (R code only) and restores the template's prebuilt shared objects before running its tests. This is safe because mutator does not mutate compiled code, so the shared object is identical for every mutant; it also means concurrent mutant installs no longer write into a shared src/ (see "Parallel execution and isolation" below).
Each mutant test run uses a timeout. By default, mutator runs the baseline suite first and derives the per-mutant timeout as baseline_elapsed_seconds * 1.5. You can override this by setting timeout_seconds explicitly.
Mutant outcomes are reported as:
SURVIVED: tests passed for the mutantKILLED: tests failed (or execution error)HANG: mutant exceeded timeout
mutator itself uses testthat for its own R tests and testthat + Catch2 for C++ tests.
- C++ tests are located in
src/test-*.cpp - The C++ test runner is
src/test-runner.cpp - C++ tests are executed from
tests/testthat/test-cpp.Rviarun_cpp_tests("mutator")
Run the full test suite with:
devtools::test()Each mutant is run with a wall-clock timeout; exceeding it is reported as
HANG. The timeout is derived from how long the package's own test suite takes,
unless you pass timeout_seconds explicitly.
Because mutants run in parallel (cores at a time), the timeout must account
for contention: with many workers, each test run is slower than it would be
alone — for packages that load many dependencies, or do heavy per-mutant install
work, dramatically so. A timeout based on a single, uncontended baseline run
would then fire on nearly every mutant (a wave of false HANGs).
To avoid this, mutate_package() first runs the baseline suite once on its own
(to confirm it passes and fail fast otherwise), then runs it cores times
concurrently and takes the slowest of those as the contended baseline. The
timeout is max(contended_baseline * 1.5, 5s). This self-calibrates to the
machine, the chosen parallelism, and the package's real load/compile cost — no
manual tuning. Pass timeout_seconds to override it entirely. (When cores = 1,
or when forking is unavailable, the solo baseline is used.)
By default mutate_package() runs each mutant's tests in CRAN mode
(cran = TRUE): the NOT_CRAN environment variable is set to "false" in the
test subprocess, so testthat::skip_on_cran() and skip_if_offline() guards
take effect and mutator runs the same tests CRAN would. This skips the slow,
flaky, or network-dependent tests that packages mark as CRAN-skippable — which
keeps mutation runs fast and avoids spurious timeouts/kills from, e.g., tests
that hit the network.
Set cran = FALSE to run the full suite instead (NOT_CRAN = "true", the
behaviour of devtools::test()):
mutate_package("path/to/pkg", cran = FALSE)This applies to both test strategies (the testthat strategy and the
installed-tests fallback). Note that it only affects tests the package
explicitly guards with skip_on_cran() / skip_if_offline(); a package whose
network test has no such guard will still run that test in either mode.
A mutant is KILLED the instant any one test detects it, so running the rest of
its suite is wasted work. By default (fail_fast = TRUE) each mutant's test run
stops at the first failing test instead of finishing the suite, which speeds
up the test-running phase — often substantially for packages with large suites —
without changing any mutant's verdict (the early-aborted run still reports the
failure, so the mutant is still KILLED).
Set fail_fast = FALSE to run the full suite for every mutant:
mutate_package("path/to/pkg", fail_fast = FALSE)This applies to the testthat strategy (it sets TESTTHAT_MAX_FAILS = 1 in the
test subprocess and uses the progress reporter, which aborts the run at the first
failing test_that() block). SURVIVED mutants are unaffected — they have no
failure to short-circuit on — so baseline timing and timeout calibration are
unchanged. The installed-tests fallback already stops at the first failing test
file regardless of this flag.
By default each mutant package copy symlinks the unchanged directories of the
original package (only the mutated R/ file is materialised), so all parallel
workers point at the same physical src/, tests/, etc. This is fast, and two
design choices keep it correct:
- Compiled code is built once, not per mutant. Earlier, the
installedstrategy recompiled intosrc/on everyR CMD INSTALL; with several installs running at once against the same (symlinked)src/they clobbered each other's.o/.so(shared object not found), so genuine survivors were misreported asKILLED/HANG. The strategy now compiles once into a template library and installs each mutant with--no-libs, which never writes intosrc/, so the race is gone. (Thetestthatstrategy was always immune:pkgload::load_all()reuses the baseline's compiledsrc/since C code is never mutated.)
The one remaining hazard is non-hermetic tests that write files into a shared
directory (most often tests/). When parallel workers' tests fight over the same
files you can still see spurious KILLED/HANG. Two ways to attenuate it:
-
Run without parallelism:
cores = 1. No contention, no extra disk — but the slowest option. -
Isolate file state:
isolate = TRUE. Each mutant gets its own deep copy ofsrc/andtests/instead of a symlink, so file-writing tests can't collide:mutate_package("path/to/pkg", isolate = TRUE)
The default (isolate = FALSE) is fast and correct for hermetic test suites;
reach for isolate = TRUE (or cores = 1) only when a package's tests are not
hermetic and you see parallel-only KILLED/HANG results.
Equivalent-mutant detection calls an OpenAI-compatible Chat Completions API. Configure it in any of these ways (listed highest precedence first; each setting is resolved independently, so they can be mixed):
-
Programmatically, in your R session:
set_openai_config( api_key = "sk-...", model = "gpt-4", base_url = "https://api.openai.com/v1" # any OpenAI-compatible endpoint )
-
A
.openai_configfile in the working directory (or in a directory you pass viaget_openai_config(dir = ...)). It is a plain, human-readable file offield: valuelines and is parsed, never executed:api_key: sk-... model: gpt-4 base_url: https://api.openai.com/v1Only the given directory is consulted — parent directories are not searched.
-
Environment variables
OPENAI_API_KEY,OPENAI_MODELandOPENAI_BASE_URL.
If nothing is configured, the model defaults to gpt-4 and the base URL to the
public OpenAI API. Set base_url to target a self-hosted or alternative
OpenAI-compatible service (for example http://localhost:11434/v1).
Survived mutants are analyzed in bounded batches (default 25 per request),
and each mutant is shown to the model as a small unified diff of its edit
(plus a short change label) rather than its full mutated source — compact,
unambiguous, and in a format LLMs read natively. When mutate_package() runs
with cores > 1 the batches are sent concurrently, which (with the bounded
size) keeps the equivalence pass fast and avoids the truncated responses that
otherwise drop verdicts.
For mutants the model flags as EQUIVALENT (the rare, high-stakes calls — they
are excluded from the adjusted mutation score), it also returns a one-sentence
reason, stored as equivalence_reason on the mutant so the call can be
audited. No reason is requested for NOT_EQUIVALENT/DONT_KNOW, keeping
responses small.
Progress and the results summary are emitted via message() (so they can be
silenced with suppressMessages()). The full prompts and model responses are
not printed by default; set options(mutator.verbose = TRUE) to log them.
mutator implements a wide range of mutation operators to thoroughly test your code:
| Operator | Description | Example |
|---|---|---|
| cxx_add_to_sub | Replaces + with - |
a + b → a - b |
| cxx_sub_to_add | Replaces - with + |
a - b → a + b |
| cxx_mul_to_div | Replaces * with / |
a * b → a / b |
| cxx_div_to_mul | Replaces / with * |
a / b → a * b |
| cxx_eq_to_ne | Replaces == with != |
a == b → a != b |
| cxx_ne_to_eq | Replaces != with == |
a != b → a == b |
| cxx_gt_to_ge | Replaces > with >= |
a > b → a >= b |
| cxx_gt_to_le | Replaces > with <= |
a > b → a <= b |
| cxx_lt_to_le | Replaces < with <= |
a < b → a <= b |
| cxx_lt_to_ge | Replaces < with >= |
a < b → a >= b |
| cxx_ge_to_gt | Replaces >= with > |
a >= b → a > b |
| cxx_ge_to_lt | Replaces >= with < |
a >= b → a < b |
| cxx_le_to_lt | Replaces <= with < |
a <= b → a < b |
| cxx_le_to_gt | Replaces <= with > |
a <= b → a > b |
| cxx_and_to_or | Replaces & with | |
a & b → a | b |
| cxx_or_to_and | Replaces | with & |
a | b → a & b |
| cxx_logical_and_to_or | Replaces && with || |
a && b → a || b |
| cxx_logical_or_to_and | Replaces || with && |
a || b → a && b |
| Operator | Description | Example |
|---|---|---|
| cxx_minus_to_noop | Removes unary minus | -x → x |
| cxx_remove_negation | Removes logical negation | !x → x |
| Operator | Description | Example |
|---|---|---|
| cxx_assign_const | Replaces assignment with constant | a = b → a = 42 |
| cxx_replace_scalar_call | Replaces function call with constant | f(x) → 42 |
| scalar_value_mutator | Replaces constants | 0 → 42, non-zero → 0 |
| negate_mutator | Negates conditionals | x → !x, !x → x |
mutator depends on:
- R Packages:
- pkgload: For loading mutated package copies
- testthat: For test execution
- xml2: For running C++ tests through
testthat::run_cpp_tests() - future and furrr: For parallel execution
- httr and jsonlite: For OpenAI API integration
- LinkingTo:
testthat(for Catch2 C++ test headers) - C++17: For native mutation engine implementation