Add new output_length_guard plugin with Rust+Python implementation. Includes core library, build configuration, and performance comparison script. Signed-off-by: Suresh Kumar Moharajan <suresh.kumar.m@ibm.com>
Pull Request #3926: Rust Acceleration for Output Length Guard Plugin
PR Link: IBM/mcp-context-forge#3926
Status: CLOSED (Recreated in PR #4104)
Closed Date: April 9, 2026
📋 Overview
This pull request introduces a PyO3-based Rust execution engine for the output length guard plugin in the IBM/mcp-context-forge repository. The implementation creates a hybrid Python-Rust architecture that significantly improves performance while maintaining backward compatibility.
Architecture Design
The hybrid approach divides responsibilities between Python and Rust:
Python Layer Handles:
Rust Layer Handles:
The Rust engine exposes a high-level process() API that reduces Python-Rust boundary crossings to a single call per tool_post_invoke invocation, minimizing FFI overhead.
🎯 Problems Solved
Gap 1: No Rust Acceleration Path (MEDIUM Priority)
Problem:
The output length guard was the only post-invoke plugin without Rust optimization, creating a performance bottleneck in the plugin ecosystem.
Solution:
Introduced OutputLengthGuardEngine with automatic detection and graceful fallback to the Python implementation when the Rust module is unavailable.
Gap 2: O(n) Character Counting Performance (MEDIUM Priority)
Problem:
Initial Rust implementation was 124x slower than Python for large strings due to inefficient character counting.
Solution:
Implemented count_chars_capped() with:
Gap 3: Per-item FFI Overhead (LOW Priority)
Problem:
Processing lists required crossing the Python-Rust boundary for each item, causing significant overhead.
Solution:
Added batch fast path for all-string lists in truncate mode, processing entire lists in a single Rust call.
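The fallback from Gap 1 and the batch fast path from Gap 3 can be sketched together as follows. This is a minimal illustration, not the PR's code: the module name output_length_guard_rust, the truncate_batch method, the helper names, and the choice to count the ellipsis toward the limit are all assumptions made for the example.

```python
from typing import Any, List

ELLIPSIS = "…"

try:
    # Hypothetical import path; the real module name is not given in the PR text.
    from output_length_guard_rust import OutputLengthGuardEngine  # type: ignore
except ImportError:
    # Graceful fallback: without the Rust module, only the Python path is used.
    OutputLengthGuardEngine = None


def truncate_py(text: str, max_chars: int) -> str:
    """Pure-Python fallback. Assumes max_chars >= 1 and that the ellipsis counts toward the limit."""
    if len(text) <= max_chars:
        return text
    return text[: max_chars - len(ELLIPSIS)] + ELLIPSIS


def truncate_list(items: List[Any], max_chars: int, engine: Any = None) -> List[Any]:
    """Truncate every string in a list, using one engine call when possible."""
    if engine is not None and all(isinstance(item, str) for item in items):
        # Batch fast path: the whole list crosses the Python-Rust boundary once.
        return engine.truncate_batch(items, max_chars)  # hypothetical method name
    # Mixed lists or no engine: fall back to per-item Python processing.
    return [truncate_py(item, max_chars) if isinstance(item, str) else item for item in items]
```

A caller would build the engine once when OutputLengthGuardEngine is not None and pass it to truncate_list; passing engine=None exercises the Python path only.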
🚀 Performance Results
Comprehensive benchmarks (1000 iterations + 50 warmup, character mode, max_chars=500):
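For readers who want to reproduce a measurement of this shape, a warmup-then-measure loop might look like the sketch below. It is not the PR's compare_performance.py script; the function names, payload sizes, and the stand-in truncate_chars function are assumptions, and only the iteration counts and max_chars value come from the description above.

```python
import time
from typing import Callable


def bench(fn: Callable[[str], str], payload: str, iterations: int = 1000, warmup: int = 50) -> float:
    """Return the mean wall-clock time per call in microseconds."""
    for _ in range(warmup):
        fn(payload)  # warmup calls are executed but not timed
    start = time.perf_counter()
    for _ in range(iterations):
        fn(payload)
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6


def truncate_chars(text: str, max_chars: int = 500) -> str:
    """Stand-in for the function under test (character mode, max_chars=500)."""
    return text if len(text) <= max_chars else text[:max_chars]


if __name__ == "__main__":
    # Payload sizes are illustrative only.
    for size in (100, 10_000, 1_000_000):
        mean_us = bench(truncate_chars, "x" * size)
        print(f"{size:>9} chars: {mean_us:.3f} us/call")
```

Running the same loop against both the Rust-backed and the pure-Python implementation, with identical payloads, gives directly comparable per-call means.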
Key Takeaways
⚡ Optimizations Implemented
O(1) Python len() Pre-check
Zero-copy PyString::to_str() Borrow
count_chars_capped() Early-exit
byte_offset_of_char() Direct Slicing
String::with_capacity() Pre-sized Allocation
Batch List Processing
Numeric String Skip
MCP Content Dict Exclusion
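The early-exit counting and direct slicing ideas can be modelled in Python over UTF-8 bytes, which is how a Rust string is laid out in memory. The Rust source is not reproduced in this description, so the signatures and return conventions below are assumptions; only the function names count_chars_capped and byte_offset_of_char come from the list above, and truncate_utf8 is a hypothetical helper.

```python
def count_chars_capped(data: bytes, cap: int) -> int:
    """Count UTF-8 characters, stopping as soon as the count exceeds cap.

    Returns the exact count when it is <= cap, otherwise cap + 1.
    """
    count = 0
    for byte in data:
        if (byte & 0xC0) != 0x80:  # not a continuation byte, so a new character starts here
            count += 1
            if count > cap:
                break  # early exit: the rest of the input is never scanned
    return count


def byte_offset_of_char(data: bytes, char_index: int) -> int:
    """Byte offset where the 0-based character char_index starts, or len(data) if past the end."""
    seen = 0
    for offset, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:
            if seen == char_index:
                return offset
            seen += 1
    return len(data)


def truncate_utf8(data: bytes, max_chars: int) -> bytes:
    """Keep at most max_chars characters by slicing at a character boundary."""
    if count_chars_capped(data, max_chars) <= max_chars:
        return data  # already within the limit, return unchanged
    return data[: byte_offset_of_char(data, max_chars)]
```

With a cap of 500, a multi-megabyte input that is over the limit is scanned only up to its 501st character, which is the point of the capped count.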
✅ Test Results
Rust Tests
Python Tests
Performance Verification
Issue 1: Broader Processing Scope
Problem:
Rust fast path processes more payload types than the original Python implementation:
dict["text"], list[str], and MCP text items
Impact:
Could potentially modify critical metadata fields that should remain intact.
Mitigation:
Added a METADATA_KEYS list to explicitly preserve critical fields and maintain semantic integrity.
Issue 2: Token-Mode Semantics Divergence
Problem:
Different behavior based on whether Rust module loads:
Impact:
Inconsistent behavior across environments could lead to unexpected truncation.
Status:
Documented for awareness; requires architectural decision on desired behavior.
Issue 3: Bug Fix - Structured Content Display
Problem:
When a structuredContent value was truncated, content[0].text showed only the value instead of the full JSON representation.
Before:
content[0].text = "Helloasds…" ❌
After:
content[0].text = "{\"message\":\"Helloasds…\"}" ✅
Solution:
Removed single-key dict value extraction logic (lines 529-536 in src/lib.rs) to preserve full JSON context.
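The before and after behavior can be illustrated in Python. The actual change was made in src/lib.rs and is not shown here, so both functions are reconstructions with hypothetical names; the compact JSON separators are an assumption chosen to match the output shown above.

```python
import json
from typing import Any, Dict


def legacy_text(structured: Dict[str, Any]) -> str:
    """Assumed shape of the removed logic: a single-key dict collapses to its value."""
    if len(structured) == 1:
        value = next(iter(structured.values()))
        if isinstance(value, str):
            return value  # the key and the JSON context are lost
    return json.dumps(structured, ensure_ascii=False, separators=(",", ":"))


def fixed_text(structured: Dict[str, Any]) -> str:
    """After the fix: always serialize the full structured content."""
    return json.dumps(structured, ensure_ascii=False, separators=(",", ":"))


truncated = {"message": "Helloasds…"}
print(legacy_text(truncated))  # Helloasds…
print(fixed_text(truncated))   # {"message":"Helloasds…"}
```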
📁 Files Changed
plugins_rust/output_length_guard/Cargo.toml
plugins_rust/output_length_guard/pyproject.toml
plugins_rust/output_length_guard/Makefile
plugins_rust/output_length_guard/src/lib.rs
plugins_rust/output_length_guard/src/bin/stub_gen.rs
plugins_rust/output_length_guard/compare_performance.py
plugins/output_length_guard/output_length_guard.py
👥 Contributors
📝 Commit History
Total Commits: 28
Timeline: March 24 - April 9, 2026
Commit Categories:
🔄 Next Steps
This PR was closed and the work was recreated in PR #4104 for a clean implementation history. The recreation allows for:
📚 Technical Details
Dependencies Added
Build System
Testing Strategy
🎓 Lessons Learned
End of PR Description
For the latest updates and continued work, see PR #4104