feat: add output_length_guard plugin by msureshkumar88 · Pull Request #24 · IBM/cpex-plugins

msureshkumar88 · 2026-04-10T09:13:51Z

Pull Request #3926: Rust Acceleration for Output Length Guard Plugin

Status: CLOSED (Recreated in PR #4104)
Closed Date: April 9, 2026

📋 Overview

This pull request introduces a PyO3-based Rust execution engine for the output length guard plugin in the IBM/mcp-context-forge repository. The implementation creates a hybrid Python-Rust architecture that runs roughly 3x to 19x faster than the pure-Python path in the benchmarks reported below, while maintaining backward compatibility.

Architecture Design

The hybrid approach divides responsibilities between Python and Rust:

Python Layer Handles:

Plugin lifecycle management
Hook integration with the framework
MCP content dictionary handling
Fallback behavior when Rust is unavailable

Rust Layer Handles:

High-performance string truncation
Recursive list/dict traversal
Violation detection
Passthrough optimization

The Rust engine exposes a high-level process() API that reduces Python-Rust boundary crossing to a single call per tool_post_invoke invocation, minimizing FFI overhead.
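
The single-call design can be sketched in pure Python. This is an illustrative stand-in, not the PR's Rust code: the `process` signature, the `mode` values, the `(payload, violated)` return shape, and the ellipsis marker are all assumptions made for the example.

```python
from typing import Any, Tuple

ELLIPSIS = "…"  # assumed truncation marker


def process(payload: Any, max_chars: int, mode: str = "truncate") -> Tuple[Any, bool]:
    """Walk the payload once and return (new_payload, violated)."""
    if isinstance(payload, str):
        if len(payload) <= max_chars:
            return payload, False  # passthrough: nothing to do
        if mode == "block":
            return payload, True  # report the violation, leave the text untouched
        return payload[:max_chars] + ELLIPSIS, True
    if isinstance(payload, list):
        results = [process(item, max_chars, mode) for item in payload]
        return [r[0] for r in results], any(r[1] for r in results)
    if isinstance(payload, dict):
        results = {k: process(v, max_chars, mode) for k, v in payload.items()}
        return (
            {k: r[0] for k, r in results.items()},
            any(r[1] for r in results.values()),
        )
    return payload, False  # numbers, None, etc. pass through unchanged
```

The caller makes one call per hook invocation and the recursion stays on one side of the boundary, which is the property the paragraph above describes.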

🎯 Problems Solved

Gap 1: No Rust Acceleration Path (MEDIUM Priority)

Problem:
The output length guard was the only post-invoke plugin without Rust optimization, creating a performance bottleneck in the plugin ecosystem.

Solution:
Introduced OutputLengthGuardEngine with automatic detection and graceful fallback to the Python implementation when the Rust module is unavailable.
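
A minimal sketch of the detect-and-fall-back pattern follows. The module name `output_length_guard_rust`, the `Guard` wrapper, the engine constructor arguments, and the ellipsis handling are hypothetical; only the class name OutputLengthGuardEngine and the process() method name come from the description above.

```python
import importlib


def load_rust_engine(module_name: str = "output_length_guard_rust"):
    """Return the engine class if the extension imports, else None."""
    try:
        module = importlib.import_module(module_name)  # hypothetical module name
    except ImportError:
        return None
    return getattr(module, "OutputLengthGuardEngine", None)


def truncate_python(text: str, max_chars: int, ellipsis: str = "…") -> str:
    """Pure-Python fallback used when no native engine is available."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + ellipsis


class Guard:
    """Uses the native engine when one is supplied, otherwise the fallback."""

    def __init__(self, max_chars: int, engine_cls=None) -> None:
        self.max_chars = max_chars
        # Assumed constructor signature: engine_cls(max_chars)
        self._engine = engine_cls(max_chars) if engine_cls is not None else None

    def apply(self, text: str) -> str:
        if self._engine is not None:
            return self._engine.process(text)
        return truncate_python(text, self.max_chars)
```

Resolving the engine once at construction time keeps the per-call path free of import checks.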

Gap 2: O(n) Character Counting Performance (MEDIUM Priority)

Problem:
The initial Rust implementation was 124x slower than Python for large strings due to inefficient character counting.

Solution:
Implemented count_chars_capped() with:

Early-exit optimization at limit + 1
Byte-length fast path for ASCII strings
Zero-copy string borrowing
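
The early-exit and ASCII fast-path ideas can be illustrated on UTF-8 bytes. This is a Python rendering written for this description, not the Rust function itself; the signature and the choice to return at most `limit + 1` are assumptions.

```python
def count_chars_capped(data: bytes, limit: int) -> int:
    """Count UTF-8 code points in data, never returning more than limit + 1."""
    cap = limit + 1
    # Fast path: in pure ASCII every character is exactly one byte,
    # so the byte length is the character count.
    if data.isascii():
        return min(len(data), cap)
    count = 0
    for byte in data:
        # Continuation bytes look like 0b10xxxxxx; anything else starts a code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count >= cap:
                return cap  # early exit: the caller only needs "over the limit"
    return count
```

A result of `limit + 1` is enough to tell the caller that the string exceeds the limit, so the rest of a large string is never scanned.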

Gap 3: Per-item FFI Overhead (LOW Priority)

Problem:
Processing lists required crossing the Python-Rust boundary for each item, causing significant overhead.

Solution:
Added batch fast path for all-string lists in truncate mode, processing entire lists in a single Rust call.
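
The effect of the batch fast path can be shown with a stand-in engine that counts simulated boundary crossings. `CountingEngine`, `guard_list`, and the method names are hypothetical and exist only for this sketch.

```python
class CountingEngine:
    """Stand-in for a native engine; `calls` counts simulated boundary crossings."""

    def __init__(self, max_chars: int) -> None:
        self.max_chars = max_chars
        self.calls = 0

    def _cut(self, text: str) -> str:
        if len(text) <= self.max_chars:
            return text
        return text[: self.max_chars] + "…"

    def truncate(self, text: str) -> str:
        self.calls += 1  # one crossing per item
        return self._cut(text)

    def truncate_batch(self, texts: list) -> list:
        self.calls += 1  # one crossing for the whole list
        return [self._cut(t) for t in texts]


def guard_list(engine: CountingEngine, items: list) -> list:
    """Use the batch call for all-string lists, per-item calls otherwise."""
    if items and all(isinstance(item, str) for item in items):
        return engine.truncate_batch(items)
    return [engine.truncate(i) if isinstance(i, str) else i for i in items]
```

An all-string list of any length costs one call, while a mixed list still pays one call per string item.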

🚀 Performance Results

Comprehensive benchmarks (1000 iterations + 50 warmup, character mode, max_chars=500):

| Scenario | Python Time | Rust Time | Speedup |
|---|---|---|---|
| Short list passthrough (4 items) | 2.88 μs | 0.15 μs | 18.9x faster ⚡ |
| Short string passthrough (11 chars) | 0.62 μs | 0.06 μs | 9.8x faster ⚡ |
| Wide nested dict (d=2, b=20, 400 leaves) | 651 μs | 76 μs | 8.5x faster ⚡ |
| Deep nested dict (d=5, b=3, 243 leaves) | 426 μs | 61 μs | 7.0x faster ⚡ |
| Block mode (10 KB string) | 10.4 μs | 2.0 μs | 5.1x faster ⚡ |
| List of 10 x 10KB strings | 105 μs | 35 μs | 3.0x faster ⚡ |
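
For readers who want to reproduce the methodology, a timing loop with the same warmup and iteration counts might look like the sketch below. It is not the PR's compare_performance.py; the `bench` and `speedup` helpers are hypothetical.

```python
import time


def bench(fn, *args, iterations: int = 1000, warmup: int = 50) -> float:
    """Return the mean time per call in microseconds after a warmup phase."""
    for _ in range(warmup):
        fn(*args)  # warm caches before timing
    start = time.perf_counter()
    for _ in range(iterations):
        fn(*args)
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6


def speedup(python_us: float, rust_us: float) -> float:
    """Ratio of Python time to Rust time, rounded to one decimal place."""
    return round(python_us / rust_us, 1)
```

For instance, `speedup(105, 35)` gives `3.0`, matching the last row of the table.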

Key Takeaways

Consistent speedups across all scenarios (3x - 19x)
Largest gains in passthrough scenarios (minimal processing)
Significant improvements for nested structures (7x - 8.5x)
Substantial gains even for large string processing (3x - 5x)

⚡ Optimizations Implemented

1. O(1) Python len() Pre-check
   - Skip Rust extraction for strings already under limit
   - Eliminates unnecessary FFI calls
2. Zero-copy PyString::to_str() Borrow
   - Replaces full string copy with borrow
   - Reduces memory allocation overhead
3. count_chars_capped() Early-exit
   - Stops counting at limit + 1
   - Avoids processing entire large strings
4. byte_offset_of_char() Direct Slicing
   - Zero-copy truncation using byte offsets
   - Maintains UTF-8 character boundaries
5. String::with_capacity() Pre-sized Allocation
   - Eliminates reallocation during string building
   - Improves memory efficiency
6. Batch List Processing
   - Process all-string lists in one Rust call
   - Reduces FFI overhead significantly
7. Numeric String Skip
   - Skip character counting for strings > 50 bytes
   - Optimizes common numeric data patterns
8. MCP Content Dict Exclusion
   - Preserve Python-side logic for MCP structures
   - Maintains compatibility with framework
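
The byte-offset slicing idea can be illustrated on UTF-8 bytes in Python. This is a sketch of the technique, not the Rust helper: the signature and the behavior for an out-of-range index are assumptions.

```python
def byte_offset_of_char(data: bytes, char_index: int) -> int:
    """Byte offset where code point number char_index starts.

    Returns len(data) when the text has char_index or fewer code points.
    """
    seen = 0
    for offset, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # a non-continuation byte starts a code point
            if seen == char_index:
                return offset
            seen += 1
    return len(data)


def truncate_utf8(data: bytes, max_chars: int) -> bytes:
    """Keep the first max_chars code points by slicing at a character boundary."""
    return data[: byte_offset_of_char(data, max_chars)]
```

Because the cut always lands on the first byte of a code point, the truncated bytes remain valid UTF-8 even when the text contains multi-byte characters.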

✅ Test Results

Rust Tests

47/47 unit tests passed ✅
Clippy clean (no warnings) ✅
rustfmt clean (formatting verified) ✅

Python Tests

331/331 tests passed ✅
1 expected skip (intentional)
0 failures ✅

Performance Verification

All benchmark scenarios show expected improvements ✅
No regressions detected ✅
Fallback behavior verified ✅

⚠️ Critical Issues Identified & Resolved

Issue 1: Broader Processing Scope

Problem:
Rust fast path processes more payload types than the original Python implementation:

Python only mutates dict["text"], list[str], and MCP text items
Rust can now truncate/block string-valued metadata (type, mimeType, IDs, URLs, annotations)

Impact:
Could potentially modify critical metadata fields that should remain intact.

Mitigation:
Added METADATA_KEYS list to explicitly preserve critical fields and maintain semantic integrity.
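
A minimal sketch of how such a key list can protect metadata during traversal is shown below. The contents of `METADATA_KEYS` here are hypothetical, loosely based on the field kinds named above; the PR's actual list is not reproduced in this description.

```python
from typing import Any

# Hypothetical contents; the real METADATA_KEYS list may differ.
METADATA_KEYS = frozenset({"type", "mimeType", "id", "uri", "url", "annotations"})


def truncate_payload(value: Any, max_chars: int) -> Any:
    """Truncate long strings recursively, leaving metadata fields untouched."""
    if isinstance(value, str):
        return value if len(value) <= max_chars else value[:max_chars] + "…"
    if isinstance(value, list):
        return [truncate_payload(item, max_chars) for item in value]
    if isinstance(value, dict):
        return {
            # Values under a metadata key are copied through without inspection.
            key: val if key in METADATA_KEYS else truncate_payload(val, max_chars)
            for key, val in value.items()
        }
    return value
```

With `max_chars=4`, a `text` field is shortened while a long `uri` in the same dict is preserved.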

Issue 2: Token-Mode Semantics Divergence

Problem:
Different behavior based on whether Rust module loads:

Python path: Ignores token limits for plain str/dict/list
Rust path: Enforces token bounds for ALL shapes
Same configuration produces different results depending on Rust availability

Impact:
Inconsistent behavior across environments could lead to unexpected truncation.

Status:
Documented for awareness; requires architectural decision on desired behavior.
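
Until that decision is made, a parity check can at least surface the divergence in tests. The sketch below uses two toy stand-ins that mimic the behavior described above; the function names and the whitespace tokenization are assumptions, not the plugin's real token counting.

```python
def python_path(payload, max_tokens):
    """Toy stand-in: leaves plain strings alone, as described for the Python path."""
    return payload


def rust_path(payload, max_tokens):
    """Toy stand-in: enforces the token bound on plain strings."""
    if isinstance(payload, str):
        tokens = payload.split()  # assumed whitespace tokenization
        return " ".join(tokens[:max_tokens])
    return payload


def find_divergences(impl_a, impl_b, payloads, max_tokens):
    """Return the payloads for which the two implementations disagree."""
    return [p for p in payloads if impl_a(p, max_tokens) != impl_b(p, max_tokens)]
```

Running both paths over the same sample payloads in CI would flag any configuration whose result depends on whether the Rust module loaded.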

Issue 3: Bug Fix - Structured Content Display

Problem:
When a structuredContent value was truncated, content[0].text showed only the bare value instead of the full JSON representation.

Before:

content[0].text = "Helloasds…"  ❌

After:

content[0].text = "{\"message\":\"Helloasds…\"}"  ✅

Solution:
Removed single-key dict value extraction logic (lines 529-536 in src/lib.rs) to preserve full JSON context.
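
The difference between the two behaviors can be reproduced in a few lines of Python. The function names are hypothetical, and the compact JSON separators are an assumption chosen to match the "After" string shown above; the actual fix lives in the Rust source.

```python
import json


def render_before(structured: dict) -> str:
    """Old behavior as described: a single-key dict collapses to its bare value."""
    if len(structured) == 1:
        (value,) = structured.values()
        if isinstance(value, str):
            return value
    return json.dumps(structured, ensure_ascii=False, separators=(",", ":"))


def render_after(structured: dict) -> str:
    """New behavior: always serialize the whole object so the key is kept."""
    return json.dumps(structured, ensure_ascii=False, separators=(",", ":"))
```

Keeping the key means a consumer that parses content[0].text as JSON still sees the same shape as structuredContent.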

📁 Files Changed

| File | Type | Description |
|---|---|---|
| `plugins_rust/output_length_guard/Cargo.toml` | New | Rust crate configuration with dependencies |
| `plugins_rust/output_length_guard/pyproject.toml` | New | Maturin build configuration for Python packaging |
| `plugins_rust/output_length_guard/Makefile` | New | Build, test, and install automation targets |
| `plugins_rust/output_length_guard/src/lib.rs` | New | Core Rust implementation (1,297 lines + 47 tests) |
| `plugins_rust/output_length_guard/src/bin/stub_gen.rs` | New | Python type stub generator for IDE support |
| `plugins_rust/output_length_guard/compare_performance.py` | New | Comprehensive benchmark script |
| `plugins/output_length_guard/output_length_guard.py` | Modified | +70/-1 lines (Rust integration layer) |

👥 Contributors

gandhipratik203 - PR Author & Primary Developer
msureshkumar88 (Suresh Kumar Moharajan) - Contributor
lucarlig - Reviewer (requested changes)
jonpspri - Maintainer (closed PR, recreated in #4104)

📝 Commit History

Total Commits: 28
Timeline: March 24 - April 9, 2026

Commit Categories:

Feature development and initial implementation
Performance optimizations and benchmarking
Bug fixes and edge case handling
Test coverage improvements
Documentation updates
Code review feedback integration

🔄 Next Steps

This PR was closed and the work was recreated in PR #4104 for a clean implementation history. The recreation allows for:

Clean commit history without experimental iterations
Incorporation of all review feedback
Proper documentation of final design decisions
Streamlined merge process

📚 Technical Details

Dependencies Added

PyO3 - Python-Rust FFI bindings
pyo3-build - Build-time Python integration
unicode-segmentation - Proper Unicode character handling

Build System

Maturin - Rust-Python package builder
uv - Fast Python package installer
Automated wheel building and installation

Testing Strategy

Unit tests for all Rust functions
Integration tests for Python-Rust boundary
Performance benchmarks for regression detection
Fallback behavior verification

🎓 Lessons Learned

FFI Overhead Matters: Minimizing boundary crossings is critical for performance
Early Optimization Pays Off: The character-counting optimization eliminated the 124x slowdown measured in the initial Rust implementation
Batch Processing Wins: Processing collections in bulk reduces overhead significantly
Fallback is Essential: Graceful degradation ensures reliability across environments
Testing is Critical: Comprehensive test coverage caught semantic divergence issues

End of PR Description

For the latest updates and continued work, see PR #4104

Add new output_length_guard plugin with Rust+Python implementation. Includes core library, build configuration, and performance comparison script. Signed-off-by: Suresh Kumar Moharajan <suresh.kumar.m@ibm.com>

feat: add output_length_guard plugin

e420ea9


msureshkumar88 mentioned this pull request Apr 10, 2026

feat(plugin): Rust acceleration for output length guard IBM/mcp-context-forge#3926

Closed

msureshkumar88 wants to merge 1 commit into main from feat/output-length-guard-plugin
