A complete, high-performance parser library for Visual Basic 6 code and project files.
Project Documentation & Resources
VB6 Library Reference
Code Coverage Report
Performance Benchmarks
VB6Parse is designed as a foundational library for tools that analyze, convert, or process Visual Basic 6 code. While capable of supporting real-time syntax highlighting and language servers, its primary focus is on offline analysis, legacy code utilities, and migration tools.
Key Features:
- Fast, efficient parsing with minimal allocations
- Full support for VB6 project files, modules, classes, forms, and resources
- Concrete Syntax Tree (CST) with complete source fidelity
- 160+ built-in VB6 library functions and 42 statements
- Comprehensive error handling with detailed failure information
- Zero-copy tokenization and streaming parsing
Add VB6Parse to your Cargo.toml:
[dependencies]
vb6parse = "0.5.1"
use vb6parse::*;
let input = r#"Type=Exe
Reference=*\G{00020430-0000-0000-C000-000000000046}#2.0#0#...\stdole2.tlb#OLE Automation
Module=Module1; Module1.bas
Form=Form1.frm
"#;
// Build a source file from an in-memory string (for Windows-1252 bytes read from disk, see decode_with_replacement())
let source = SourceFile::from_string("Project1.vbp", input);
// Parse the project
let result = ProjectFile::parse(&source);
// Handle results
let (project, failures) = result.unpack();
if let Some(project) = project {
println!("Project type: {:?}", project.project_type);
println!("Modules: {}", project.modules().count());
println!("Forms: {}", project.forms().count());
}
// Print any parsing errors
for failure in failures {
failure.print();
}
use vb6parse::*;
let code = r#"Attribute VB_Name = "MyModule"
Public Sub HelloWorld()
MsgBox "Hello, World!"
End Sub
"#;
let source = SourceFile::from_string("MyModule.bas", code);
let result = ModuleFile::parse(&source);
let (module, failures) = result.unpack();
if let Some(module) = module {
println!("Module name: {}", module.name);
}
use vb6parse::*;
let mut source_stream = SourceStream::new("test.bas", "Dim x As Integer");
let (token_stream, _failures) = tokenize(&mut source_stream).unpack();
if let Some(tokens) = token_stream {
for (text, token) in tokens {
println!("{:?}: {:?}", text, token);
}
}
use vb6parse::*;
let contents = "Sub Test()\n x = 5\nEnd Sub";
let (tree_opt, _failures) = ConcreteSyntaxTree::from_text("test.bas", contents).unpack();
// Print the tree
if let Some(tree) = tree_opt {
println!("{}", tree.debug_tree());
}
The CST provides rich navigation capabilities for traversing and querying the tree structure:
use vb6parse::*;
use vb6parse::parsers::SyntaxKind;
let source = "Sub Test()\n Dim x As Integer\n x = 42\nEnd Sub";
let cst = ConcreteSyntaxTree::from_text("test.bas", source).unwrap();
let root = cst.to_root_node();
// Basic navigation
let child_count = root.child_count();
let first = root.first_child();
// Find by kind
let sub_stmt = root.find(SyntaxKind::SubStatement); // First match
let all_dims = root.find_all(SyntaxKind::DimStatement); // All matches
// Filter children
let non_tokens: Vec<_> = root.non_token_children().collect();
let significant: Vec<_> = root.significant_children().collect();
// Custom search with predicates
let keywords = root.find_all_if(|n| n.kind().to_string().ends_with("Keyword"));
let complex = root.find_all_if(|n| !n.is_token() && n.children().len() > 5);
// Iterate all nodes depth-first
for node in root.descendants() {
if node.is_significant() {
println!("{:?}: {}", node.kind(), node.text());
}
// Convenience checkers
if node.is_comment() || node.is_whitespace() {
// Skip trivia
}
}
Available Navigation Methods:
Both ConcreteSyntaxTree and CstNode provide:
- Basic: child_count(), first_child(), last_child(), child_at()
- By Kind: children_by_kind(), first_child_by_kind(), contains_kind()
- Recursive: find(), find_all()
- Filtering: non_token_children(), token_children(), significant_children()
- Predicates: find_if(), find_all_if()
- Traversal: descendants(), depth_first_iter()
CstNode also provides: is_whitespace(), is_newline(), is_comment(), is_trivia(), is_significant()
See also: examples/cst_navigation.rs for comprehensive examples.
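To make the recursive and predicate-based methods concrete, here is a minimal std-only sketch of a pre-order walk over a toy tree. The `Node` type, `leaf`, `sample_tree`, and most of the kind names are hypothetical and are not vb6parse's API. The sketch also assumes that `descendants()` includes the starting node, which may differ from the crate's behaviour.

```rust
/// Toy tree node; a hypothetical stand-in for a CST node.
#[derive(Debug)]
struct Node {
    kind: &'static str,
    children: Vec<Node>,
}

impl Node {
    /// Pre-order (depth-first) walk. This sketch includes the starting node.
    fn descendants(&self) -> Vec<&Node> {
        let mut out = Vec::new();
        let mut stack = vec![self];
        while let Some(node) = stack.pop() {
            out.push(node);
            // Push children in reverse so the leftmost child is visited first.
            stack.extend(node.children.iter().rev());
        }
        out
    }

    /// Collect every visited node for which `predicate` returns true.
    fn find_all_if(&self, predicate: impl Fn(&Node) -> bool) -> Vec<&Node> {
        self.descendants()
            .into_iter()
            .filter(|n| predicate(*n))
            .collect()
    }
}

fn leaf(kind: &'static str) -> Node {
    Node { kind, children: Vec::new() }
}

/// Rough shape of "Sub Test() / Dim x / x = 42 / End Sub" with trivia omitted.
fn sample_tree() -> Node {
    Node {
        kind: "Root",
        children: vec![Node {
            kind: "SubStatement",
            children: vec![
                leaf("SubKeyword"),
                leaf("Identifier"),
                Node { kind: "DimStatement", children: vec![leaf("DimKeyword"), leaf("Identifier")] },
                Node { kind: "AssignmentStatement", children: vec![leaf("Identifier"), leaf("NumericLiteral")] },
                leaf("EndKeyword"),
            ],
        }],
    }
}

fn main() {
    let root = sample_tree();
    for node in root.find_all_if(|n| n.kind.ends_with("Keyword")) {
        println!("{}", node.kind);
    }
}
```

An explicit stack keeps the walk iterative, so deeply nested input cannot overflow the call stack.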
For common use cases, import everything with:
use vb6parse::*;
This brings in:
- I/O Layer: SourceFile, SourceStream
- Lexer: tokenize(), Token, TokenStream
- File Parsers: ProjectFile, ClassFile, ModuleFile, FormFile, FormResourceFile
- Syntax Parsers: parse(), ConcreteSyntaxTree, SyntaxKind, SerializableTree
- Error Handling: ErrorDetails, ParseResult, all error kind enums
For advanced use cases, access specific layers:
use vb6parse::io::{SourceFile, SourceStream, Comparator};
use vb6parse::lexer::{tokenize, Token, TokenStream};
use vb6parse::parsers::{parse, ConcreteSyntaxTree};
use vb6parse::language::controls::{Control, ControlKind};
use vb6parse::errors::{ProjectErrorKind, FormErrorKind};
Bytes/String/File → SourceFile → SourceStream → TokenStream → CST → Object Layer
                    (Windows-1252) (Characters) (Tokens)     (Tree) (Structured)
Layers:
- I/O Layer (io): Character decoding and stream access
- Lexer Layer (lexer): Tokenization with keyword lookup
- Syntax Layer (syntax): VB6 language constructs and library functions
- Parsers Layer (parsers): CST construction from tokens
- Files Layer (files): High-level file format parsers
- Language Layer (language): VB6 types, colors, controls
- Errors Layer (errors): Comprehensive error types
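The keyword lookup mentioned for the Lexer Layer can be pictured with a small std-only sketch. The `Keyword` enum, the `KEYWORDS` table, and `lookup_keyword` are hypothetical names chosen for illustration and do not describe the crate's internals. The only VB6 fact relied on is that keywords are case-insensitive.

```rust
/// Hypothetical keyword kinds; a real lexer distinguishes many more.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Keyword {
    As,
    Dim,
    End,
    Function,
    Sub,
}

/// Sorted by lowercase spelling so the table can be binary-searched.
const KEYWORDS: [(&str, Keyword); 5] = [
    ("as", Keyword::As),
    ("dim", Keyword::Dim),
    ("end", Keyword::End),
    ("function", Keyword::Function),
    ("sub", Keyword::Sub),
];

/// Case-insensitive lookup: returns `None` for ordinary identifiers.
fn lookup_keyword(word: &str) -> Option<Keyword> {
    // Lowercasing allocates; it keeps the sketch short at the cost of speed.
    let lowered = word.to_ascii_lowercase();
    KEYWORDS
        .binary_search_by(|entry| entry.0.cmp(lowered.as_str()))
        .ok()
        .map(|index| KEYWORDS[index].1)
}

fn main() {
    for word in ["Dim", "x", "AS", "Integer"] {
        println!("{word:?} -> {:?}", lookup_keyword(word));
    }
}
```

A sorted table with binary search is only one option; a perfect-hash table built at compile time serves the same purpose without the per-lookup comparisons.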
src/
├── io/                               # I/O Layer - Character streams and decoding
│   ├── mod.rs                        # SourceFile, SourceStream
│   ├── comparator.rs                 # Case-sensitive/insensitive comparison
│   └── decode.rs                     # Windows-1252 decoding
│
├── lexer/                            # Lexer Layer - Tokenization
│   ├── mod.rs                        # tokenize() function, keyword lookup
│   └── token_stream.rs               # TokenStream implementation
│
├── syntax/                           # Syntax Layer - VB6 Language constructs
│   ├── library/                      # VB6 built-in library unit tests and documentation
│   │   ├── functions/                # 160+ VB6 functions (14 categories)
│   │   │   ├── array/                # Array, Filter, Join, Split, etc.
│   │   │   ├── conversion/           # CBool, CInt, CLng, Str, Val, etc.
│   │   │   ├── datetime/             # Date, Now, Time, Year, Month, etc.
│   │   │   ├── file_system/          # Dir, EOF, FileLen, LOF, etc.
│   │   │   ├── financial/            # FV, IPmt, IRR, NPV, PV, Rate, etc.
│   │   │   ├── interaction/          # MsgBox, InputBox, Shell, etc.
│   │   │   ├── math/                 # Abs, Cos, Sin, Tan, Log, Sqr, etc.
│   │   │   ├── miscellaneous/        # Environ, RGB, QBColor, etc.
│   │   │   ├── string/               # Left, Right, Mid, Len, Trim, etc.
│   │   │   └── ...
│   │   └── statements/               # VB6 statement unit tests and documentation (7 categories)
│   │       ├── file_operations/      # Open, Close, Get, Put, etc.
│   │       ├── filesystem/           # FileCopy, Kill, MkDir, RmDir, etc.
│   │       ├── runtime_control/      # DoEvents, Stop, End, etc.
│   │       ├── runtime_state/        # Date, Time assignment, etc.
│   │       ├── string_manipulation/  # Mid statement, etc.
│   │       ├── system_interaction/   # Beep, etc.
│   │       └── ...
│   ├── statements/                   # Statement parsing logic
│   │   ├── control_flow/             # If, Select Case, For, While parsers
│   │   ├── declarations/             # Dim, ReDim, Const, Enum parsers
│   │   └── objects/                  # Set, With, RaiseEvent parsers
│   └── expressions/                  # Expression parsing utilities
│
├── parsers/                          # Parsers Layer - CST construction
│   ├── cst/                          # Concrete Syntax Tree implementation
│   │   ├── mod.rs                    # parse(), ConcreteSyntaxTree, CstNode
│   │   └── rowan_wrapper.rs          # Red-green tree wrapper
│   ├── parseresults.rs               # ParseResult<T, E> type
│   └── syntaxkind.rs                 # SyntaxKind enum (all token types)
│
├── files/                            # Files Layer - VB6 file format parsers
│   ├── common/                       # Shared parsing utilities
│   │   ├── properties.rs             # Property bag, PropertyGroup
│   │   ├── attributes.rs             # Attribute statement parsing
│   │   └── references.rs             # Object reference parsing
│   ├── project/                      # VBP - Project files
│   │   ├── mod.rs                    # ProjectFile struct and parser
│   │   ├── properties.rs             # Project properties
│   │   ├── references.rs             # Reference types
│   │   └── compilesettings.rs        # Compilation settings
│   ├── class/                        # CLS - Class modules
│   ├── module/                       # BAS - Code modules
│   ├── form/                         # FRM - Forms
│   └── resource/                     # FRX - Form resources
│
├── language/                         # Language Layer - VB6 types and definitions
│   ├── color.rs                      # VB6 color constants and Color type
│   ├── controls/                     # VB6 control definitions (50+ controls)
│   │   ├── mod.rs                    # Control, ControlKind enums
│   │   ├── form.rs                   # FormProperties
│   │   ├── textbox.rs                # TextBoxProperties
│   │   ├── label.rs                  # LabelProperties
│   │   └── ...                       # 50+ control types
│   └── tokens.rs                     # Token enum definition
│
├── errors/                           # Errors Layer - Error types
│   ├── mod.rs                        # ErrorDetails, error printing
│   ├── decode.rs                     # SourceFileErrorKind
│   ├── tokenize.rs                   # CodeErrorKind
│   ├── project.rs                    # ProjectErrorKind
│   ├── class.rs                      # ClassErrorKind
│   ├── module.rs                     # ModuleErrorKind
│   ├── form.rs                       # FormErrorKind
│   ├── property.rs                   # PropertyError
│   └── resource.rs                   # ResourceErrorKind
│
└── lib.rs                            # Public API surface
use vb6parse::language::Control;
use vb6parse::*;
fn extract_controls(form_path: &str) -> Vec<String> {
let source = SourceFile::from_file(form_path).unwrap();
let result = FormFile::parse(&source);
let (form, _) = result.unpack();
let mut control_names = Vec::new();
if let Some(formfile) = form {
fn visit_control(control: &Control, names: &mut Vec<String>) {
names.push(control.name().to_string());
// Recursively visit children
if let Some(children) = control.kind().children() {
for child in children {
visit_control(child, names);
}
}
}
for control in formfile.form.children().unwrap() {
visit_control(control, &mut control_names);
}
}
control_names
}
use vb6parse::*;
fn count_identifiers(code: &str, function_name: &str) -> usize {
let mut source_stream = SourceStream::new("temp.bas", code);
let result = tokenize(&mut source_stream);
let (tokens, _) = result.unpack();
tokens
.map(|ts| {
ts.filter(|(text, token)| {
*token == language::Token::Identifier && text.eq_ignore_ascii_case(function_name)
})
.count()
})
.unwrap_or(0)
}
VB6Parse uses a custom ParseResult<T, E> type that separates successful results from recoverable errors:
use vb6parse::*;
let result = ProjectFile::parse(&source);
// Option 1: Unpack into result and failures
let (project_opt, failures) = result.unpack();
// Option 2: Check for failures first
if result.has_failures() {
for failure in result.failures() {
eprintln!("Error at offset {}: {:?}", failure.error_offset, failure.kind);
}
}
// Option 3: Convert to Result<T, Vec<ErrorDetails>>
let std_result = result.ok_or_errors();
See also:
- src/parsers/parseresults.rs - ParseResult implementation
- src/errors/mod.rs - Error types and ErrorDetails
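The idea of returning a value together with recoverable errors can be sketched without the crate. `PartialResult` and `sum_fields` below are hypothetical stand-ins, not vb6parse's `ParseResult` implementation. In particular, the choice that `ok_or_errors` treats any recorded failure as an error is an assumption of this sketch.

```rust
/// Hypothetical stand-in: an optional value plus every diagnostic collected
/// while producing it.
#[derive(Debug)]
struct PartialResult<T, E> {
    value: Option<T>,
    failures: Vec<E>,
}

impl<T, E> PartialResult<T, E> {
    /// Split into the parsed value (if any) and the collected failures.
    fn unpack(self) -> (Option<T>, Vec<E>) {
        (self.value, self.failures)
    }

    /// True when at least one diagnostic was recorded.
    fn has_failures(&self) -> bool {
        !self.failures.is_empty()
    }

    /// Collapse into a standard `Result`; any failure makes it an `Err`.
    fn ok_or_errors(self) -> Result<T, Vec<E>> {
        let (value, failures) = (self.value, self.failures);
        match value {
            Some(v) if failures.is_empty() => Ok(v),
            _ => Err(failures),
        }
    }
}

/// Toy parser: sums comma-separated integers, skipping and reporting bad fields
/// instead of stopping at the first one.
fn sum_fields(input: &str) -> PartialResult<i64, String> {
    let mut total = 0;
    let mut failures = Vec::new();
    for (index, field) in input.split(',').enumerate() {
        match field.trim().parse::<i64>() {
            Ok(n) => total += n,
            Err(_) => failures.push(format!("field {index}: invalid integer")),
        }
    }
    PartialResult { value: Some(total), failures }
}

fn main() {
    let result = sum_fields("1, 2, x, 4");
    println!("has failures: {}", result.has_failures());
    let (value, failures) = result.unpack();
    println!("value = {value:?}, failures = {failures:?}");
}
```

Keeping the value and the failures side by side lets a tool report every problem in a file while still working with whatever parsed successfully.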
The Concrete Syntax Tree preserves all source information including whitespace and comments:
use vb6parse::*;
let tree = parse(token_stream);
// Navigate the tree
let root = tree.to_root_node();
for child in root.children() {
println!("Node: {:?}", child.kind());
println!("Text: {}", child.text());
}
// Serialize for debugging
let serializable = tree.to_serializable();
println!("{:#?}", serializable);
See also:
- src/parsers/cst/mod.rs - CST documentation
- examples/cst_parse.rs - CST parsing example
- examples/debug_cst.rs - CST debugging
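Source fidelity means the original text can be rebuilt by concatenating the text of every leaf, trivia included. The following std-only sketch shows that round trip on a toy tree. `Kind`, `Element`, `lex`, and `parse_lossless` are hypothetical names, and the lexer is deliberately naive (for example, it treats any apostrophe as the start of a comment), so it is not a model of the crate's tokenizer.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Whitespace,
    Newline,
    Comment,
    Word,
}

#[derive(Debug)]
enum Element {
    Token { kind: Kind, text: String },
    Node { children: Vec<Element> },
}

impl Element {
    /// Rebuild the exact source text by concatenating every leaf in order.
    fn text(&self) -> String {
        match self {
            Element::Token { text, .. } => text.clone(),
            Element::Node { children } => children.iter().map(Element::text).collect(),
        }
    }
}

/// Split the input into tokens without discarding a single character.
fn lex(source: &str) -> Vec<Element> {
    let mut tokens = Vec::new();
    let mut rest = source;
    while let Some(first) = rest.chars().next() {
        let (kind, len) = if first == '\n' {
            (Kind::Newline, 1)
        } else if first.is_whitespace() {
            let end = rest.find(|c: char| !c.is_whitespace() || c == '\n');
            (Kind::Whitespace, end.unwrap_or(rest.len()))
        } else if first == '\'' {
            (Kind::Comment, rest.find('\n').unwrap_or(rest.len()))
        } else {
            let end = rest.find(|c: char| c.is_whitespace() || c == '\'');
            (Kind::Word, end.unwrap_or(rest.len()))
        };
        let (text, tail) = rest.split_at(len);
        tokens.push(Element::Token { kind, text: text.to_string() });
        rest = tail;
    }
    tokens
}

/// Wrap all tokens in a single root node; a real parser would nest them.
fn parse_lossless(source: &str) -> Element {
    Element::Node { children: lex(source) }
}

/// Count the tokens that are not whitespace, newlines, or comments.
fn significant_tokens(element: &Element) -> usize {
    match element {
        Element::Node { children } => children.iter().map(significant_tokens).sum(),
        Element::Token { kind, .. } => usize::from(*kind == Kind::Word),
    }
}

fn main() {
    let source = "Sub Test()\n    x = 5 ' set x\nEnd Sub\n";
    let root = parse_lossless(source);
    println!("round trip ok: {}", root.text() == source);
    println!("significant tokens: {}", significant_tokens(&root));
}
```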
VB6 uses Windows-1252 encoding. Always use decode_with_replacement() for file content:
use vb6parse::*;
// From bytes (e.g., file read)
let bytes = std::fs::read("file.bas")?;
let source = SourceFile::decode_with_replacement("file.bas", &bytes).unwrap();
// From UTF-8 string (testing/programmatic)
let source = SourceFile::from_string("test.bas", "Dim x As Integer");
See also:
- src/io/decode.rs - Decoding implementation
- examples/parse_class.rs - Byte-level parsing
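For readers unfamiliar with the code page, the following std-only sketch shows what decoding Windows-1252 "with replacement" can look like. `HIGH_TABLE` and `decode_cp1252_lossy` are hypothetical names, and mapping the five unassigned bytes to U+FFFD is this sketch's policy; the crate's `decode_with_replacement()` may handle them differently.

```rust
/// Code points for bytes 0x80..=0x9F in Windows-1252. `None` marks the five
/// bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) that the code page leaves unassigned.
const HIGH_TABLE: [Option<char>; 32] = [
    Some('\u{20AC}'), None,             Some('\u{201A}'), Some('\u{0192}'),
    Some('\u{201E}'), Some('\u{2026}'), Some('\u{2020}'), Some('\u{2021}'),
    Some('\u{02C6}'), Some('\u{2030}'), Some('\u{0160}'), Some('\u{2039}'),
    Some('\u{0152}'), None,             Some('\u{017D}'), None,
    None,             Some('\u{2018}'), Some('\u{2019}'), Some('\u{201C}'),
    Some('\u{201D}'), Some('\u{2022}'), Some('\u{2013}'), Some('\u{2014}'),
    Some('\u{02DC}'), Some('\u{2122}'), Some('\u{0161}'), Some('\u{203A}'),
    Some('\u{0153}'), None,             Some('\u{017E}'), Some('\u{0178}'),
];

/// Decode Windows-1252 bytes into a `String`, never failing.
fn decode_cp1252_lossy(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| match b {
            // The 0x80..=0x9F block is where Windows-1252 departs from Latin-1.
            0x80..=0x9F => HIGH_TABLE[(b - 0x80) as usize].unwrap_or('\u{FFFD}'),
            // Every other byte has the same code point as its byte value.
            _ => char::from(b),
        })
        .collect()
}

fn main() {
    // "Café" followed by a euro sign, as a VB6 editor would have saved it.
    let bytes = [b'C', b'a', b'f', 0xE9, b' ', 0x80];
    println!("{}", decode_cp1252_lossy(&bytes));
}
```

Reading such bytes as UTF-8 would fail or corrupt every character above 0x7F, which is why file content should be decoded from bytes instead of being read directly into a string.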
VB6Parse includes full definitions for 160+ VB6 library functions organized into 14 categories:
// Access function metadata
use vb6parse::syntax::library::functions::string::left;
use vb6parse::syntax::library::functions::math::sin;
use vb6parse::syntax::library::functions::conversion::cint;
// Each module includes:
// - Full VB6 documentation
// - Function signatures
// - Parameter descriptions
// - Usage examples
// - Related functions
Categories:
- Array manipulation (Array, Filter, Join, Split, UBound, LBound)
- Conversion (CBool, CDate, CInt, CLng, CStr, Val, Str)
- Date/Time (Date, Time, Now, Year, Month, Day, Hour, DateAdd, DateDiff)
- File System (Dir, EOF, FileLen, FreeFile, LOF, Seek)
- Financial (FV, IPmt, IRR, NPV, PV, Rate)
- Formatting (Format, FormatCurrency, FormatDateTime, FormatNumber, FormatPercent)
- Interaction (MsgBox, InputBox, Shell, CreateObject, GetObject)
- Inspection (IsArray, IsDate, IsEmpty, IsNull, IsNumeric, TypeName, VarType)
- Math (Abs, Atn, Cos, Exp, Log, Rnd, Sgn, Sin, Sqr, Tan)
- String (Left, Right, Mid, Len, InStr, Replace, Trim, UCase, LCase)
- And more...
See also: src/syntax/library/functions/
Form resource files contain binary data for controls (images, icons, property blobs):
use vb6parse::*;
// Option 1: load bytes and hand to FormResourceFile to handle.
let bytes = std::fs::read("Form1.frx")?;
let result = FormResourceFile::parse("Form1.frx", bytes);
// Option 2: Load directly from file.
let result = FormResourceFile::from_file("Form1.frx")?;
let (resource, _failures) = result.unpack();
if let Some(resource) = resource {
for (offset, data) in resource.iter_entries() {
println!(
"Resource at offset {}: {} bytes",
offset,
data.as_bytes().unwrap().len()
);
}
}
See also:
- docs/technical/frx-format.html - FRX format specification
- examples/debug_resource.rs - Resource file debugging
VB6Parse has comprehensive test coverage.
View Test Coverage Report
# Clone test data (required for integration tests)
git submodule update --init --recursive
# Run all tests
cargo test
# Run only library tests
cargo test --lib
# Run only integration tests
cargo test --test '*'
# Run documentation tests
cargo test --doc
Integration tests use insta for snapshot testing:
# Review snapshot changes
cargo insta review
# Accept all snapshots
cargo insta accept
Test data location: tests/data/ (git submodules of real VB6 projects)
See also:
- tests/ - Test files
- tests/snapshots/ - Snapshot files
- View Test Coverage Report
VB6Parse includes criterion benchmarks for performance testing:
# Run all benchmarks
cargo bench
# Run specific benchmark
cargo bench bulk_parser_load
# Generate HTML reports
# Results saved to target/criterion/
Benchmarks:
- bulk_parser_load - Parsing multiple large VB6 projects
- Token stream generation
- CST construction
See also:
- benches/ - Benchmark source code
- View Benchmark Results
VB6Parse uses cargo-llvm-cov to track test coverage and ensure comprehensive testing across all modules.
# Install cargo-llvm-cov
cargo install cargo-llvm-cov
# Generate coverage report (terminal output)
cargo llvm-cov
# Generate HTML report
cargo llvm-cov --html
# Open target/llvm-cov/html/index.html in your browser
# Generate the HTML report and open it in your browser
cargo llvm-cov --open
# Generate detailed coverage for specific packages
cargo llvm-cov --package vb6parse
# Include tests in coverage
cargo llvm-cov --all-targets
# Generate LCOV format (for CI/CD integration)
cargo llvm-cov --lcov --output-path lcov.info
Coverage reports are saved to:
- HTML reports: target/llvm-cov/html/
- Terminal summary: Displays percentage coverage after running cargo llvm-cov
- LCOV files: lcov.info (when using the --lcov flag)
Current Coverage:
- Library tests: 5,467 tests covering VB6 library functions
- Integration tests: 31 tests with real-world VB6 projects
- Documentation tests: 83 tests ensuring examples work
- Coverage focus: Parsers, tokenization, error handling, and file format support
Contributions are welcome! Please see the CONTRIBUTING.md file for more information.
# Clone repository
git clone https://github.com/scriptandcompile/vb6parse
cd vb6parse
# Get test data
git submodule update --init --recursive
# Run tests
cargo test
# Run benchmarks
cargo bench
# Check for issues
cargo clippy
# Format code
cargo fmt
- Layer Separation: Keep clear boundaries between layers
- Windows-1252 Handling: Always use SourceFile::decode_with_replacement()
- Error Recovery: Parsers should recover from errors when possible
- CST Fidelity: Preserve all source text including whitespace and comments
- Documentation: Include doc tests for public APIs
VB6 Library Functions:
- Add to the appropriate category in src/syntax/library/functions/
- Include full VB6 documentation
- Add comprehensive tests
- Update category mod.rs
Control Types:
- Add to src/language/controls/
- Define properties struct
- Add to ControlKind enum
- Include property validation
Error Types:
- Add to the appropriate error module in src/errors/
- Ensure Display implementation
- Add context information
- Use zero-copy where possible (string slices, not String)
- Avoid unnecessary allocations (use iterators)
- Leverage rowan's red-green tree for CST memory efficiency
- Use the phf crate for compile-time lookup tables
See also:
- CHANGELOG.md - Version history
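The zero-copy and iterator guidelines above can be illustrated with a short std-only sketch in which tokens borrow their text from the source buffer and results are computed without intermediate collections. `Tok`, `words`, and `count_matches` are hypothetical names for illustration only, and splitting on whitespace is a simplification, not the crate's tokenization rule.

```rust
/// A token that borrows its text from the source buffer instead of owning it.
#[derive(Debug, PartialEq)]
struct Tok<'src> {
    /// Byte offset of the token within the source.
    offset: usize,
    /// Slice of the source; no `String` is allocated per token.
    text: &'src str,
}

/// Lazily yield whitespace-separated tokens as slices of `source`.
fn words<'src>(source: &'src str) -> impl Iterator<Item = Tok<'src>> + 'src {
    source.split_whitespace().map(move |text| Tok {
        // Both pointers lie inside `source`, so the difference is the offset.
        offset: text.as_ptr() as usize - source.as_ptr() as usize,
        text,
    })
}

/// Count case-insensitive matches without building a token vector.
fn count_matches(source: &str, needle: &str) -> usize {
    words(source)
        .filter(|tok| tok.text.eq_ignore_ascii_case(needle))
        .count()
}

fn main() {
    let source = "Dim x As Integer";
    for tok in words(source) {
        println!("{:>2}: {}", tok.offset, tok.text);
    }
    println!("matches for 'dim': {}", count_matches(source, "dim"));
}
```

The lifetime parameter ties every token to the buffer it was cut from, so the compiler rejects any use of a token after its source has been dropped.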
| Extension | Description | Status |
|---|---|---|
| .vbp | Project files | Complete |
| .cls | Class modules | Complete |
| .bas | Code modules | Complete |
| .frm | Forms | |
| .frx | Form resources | |
| .ctl | User controls | Parsed as forms |
| .dob | User documents | Parsed as forms |
| .vbw | IDE window state | Not yet implemented |
| .dsx | Data environments | Not yet implemented |
| .dsr | Data env. resources | Not yet implemented |
| .ttx | Crystal reports | Not yet implemented |
- Core Parsing: Fully implemented for VBP, CLS, BAS files
- Tokenization: Complete with keyword lookup
- CST Construction: Full syntax tree with source fidelity
- Error Handling: Comprehensive error types and recovery
- VB6 Library: 160+ functions, 42 statements documented
- FRX Resources: Binary loading complete, property mapping partial
- FRM Properties: Most FRM properties load properly (icon, background, and font mapping partial)
- AST: Not yet implemented (CST available)
- Testing: 5,500+ tests across unit, integration, and doc tests
- Benchmarking: Criterion-based performance testing
- Fuzz Testing: Coverage-guided fuzzing with cargo-fuzz
- Documentation: Comprehensive API docs and examples
VB6Parse includes comprehensive fuzz testing using cargo-fuzz and libFuzzer to discover edge cases, crashes, and undefined behavior.
Available Fuzz Targets:
- sourcefile_decode - Tests Windows-1252 decoding with arbitrary bytes
- sourcestream - Tests low-level character stream operations
- tokenize - Tests tokenization with malformed VB6 code
- cst_parse - Tests Concrete Syntax Tree parsing with invalid syntax
Quick Start:
# Install cargo-fuzz (requires nightly)
cargo install cargo-fuzz
# Run a fuzzer for 60 seconds
cargo +nightly fuzz run sourcefile_decode -- -max_total_time=60
# List all fuzz targets
cargo +nightly fuzz list
Learn More: See fuzz/README.md for detailed usage.
All examples are located in the examples/ directory:
| Example | Description |
|---|---|
| audiostation_parse.rs | Parse a complete real-world VB6 project |
| cst_navigation.rs | Navigate and query the Concrete Syntax Tree |
| cst_parse.rs | Parse tokens directly to CST |
| debug_cst.rs | Display CST debug representation |
| debug_resource.rs | Inspect FRX resource files |
| parse_class.rs | Parse class files from bytes |
| parse_control_only.rs | Parse individual form controls |
| parse_form.rs | Parse VB6 forms |
| parse_module.rs | Parse code modules |
| parse_project.rs | Parse project files |
| sourcestream.rs | Work with character streams |
| tokenstream.rs | Tokenize VB6 code |
Run any example with:
cargo run --example parse_project
- Documentation: docs.rs/vb6parse
- Repository: github.com/scriptandcompile/vb6parse
- Crates.io: crates.io/crates/vb6parse
- License: MIT
- Encoding: Designed primarily for "predominantly English" source code; Windows-1252 encoding detection has limitations
- AST: Abstract Syntax Tree is not yet implemented (Concrete Syntax Tree is available)
- FRX Mapping: Binary resources are loaded but not all are mapped to control properties
- Real-time Use: While capable, not optimized for real-time highlighting or LSP (focus is on offline analysis)
MIT License - See LICENSE file for details.
Built with ❤️ by ScriptAndCompile