Commit ac4a210

itamarga and claude authored
Code quality, Python 3.11 compat, and perf optimizations (#11)
* Enforce coding standards and add docstrings across entire codebase

- Expand all short variable names (init, expr, stmt, prop, obj, args, etc.)
- Add StrEnum for constant strings (BindingKind, node types, operators)
- Replace if/elif chains with match statements (~20 files)
- Reduce nesting via early exits and extracted helper functions
- Add type hints to all function signatures using '|' union syntax
- Add from __future__ import annotations to all modules
- Add concise docstrings to every function, class, and method
- Enforce single quotes for strings, double quotes for docstrings
- Update tests to match renamed symbols and heuristic outputs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

black && isort

* Fix Python 3.11 compatibility: replace type statement with plain aliases

The `type X = ...` syntax requires Python 3.12+, but CI tests on 3.11.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix review issues: revert over-engineering, remove perf regressions

- Revert IntEnum+match/case in traverser hot path back to plain int
  constants with if/elif (avoids enum dispatch overhead)
- Revert sys.modules indirection in _run_pre_passes to direct calls
- Remove BindingKind StrEnum; revert to plain string comparisons
- Revert unnecessary variable renames (parser, __main__, _fast_to_dict)
- Remove duplicate _write_output helpers; inline the I/O
- Remove ~56 trivial docstrings that just restate function names
- Restore sample.deobfuscated.js to match main (revert VariableRenamer
  behavioral change that renamed str→string, arr→array, etc.)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Optimize ConstantProp, ExpressionSimplifier, and _count_nodes

Profiled 525 files (25 regression + 500 random from dataset).
Total time improved from 20.6s to 18.4s (11% faster).

ConstantProp (30x faster on large files):
- _find_and_remove_declarator did a full AST traversal per removed
  binding (O(n*k)). Batched into single pass with set lookup (O(n)).
- vue.esm.browser.js: 0.561s → 0.012s for this transform alone.

ExpressionSimplifier (2.2x faster):
- Merged 5 separate traverse() calls into 2 (one combined pass for
  unary/binary/conditional/await/comma + one for method calls).

_count_nodes:
- Replaced callback-based simple_traverse with direct iterative loop,
  eliminating per-node function call overhead.

_fast_to_dict (parser):
- Converted recursive esprima-to-dict conversion to stack-based
  iteration to avoid recursion overhead on large ASTs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* bump version

* Add .gitignore and reference Kaggle dataset in README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update snapshot to match new VariableRenamer heuristics

The coding standards commit added response-like, path-like, and other
usage-based rename hints to VariableRenamer. Regenerate the expected
output snapshot to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
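
The ConstantProp change described above (one traversal per removed binding replaced by a single pass with a set lookup) could look roughly like the sketch below. This is not the code from the commit: it assumes an ESTree-style AST stored as nested dicts and lists, and the names `remove_declarators_batched` and `_is_removable` are hypothetical. Cleaning up declarations left with no declarators is deliberately omitted.

```python
from __future__ import annotations


def _is_removable(node: dict, names_to_remove: set[str]) -> bool:
    """Return True for a VariableDeclarator whose identifier name is in the set."""
    if node.get('type') != 'VariableDeclarator':
        return False
    identifier = node.get('id')
    return (
        isinstance(identifier, dict)
        and identifier.get('type') == 'Identifier'
        and identifier.get('name') in names_to_remove
    )


def remove_declarators_batched(program: dict, names_to_remove: set[str]) -> int:
    """Remove all matching declarators in one iterative walk; return the count.

    Hypothetical helper: each node is visited once and each declarator costs
    one O(1) set lookup, so the total work is O(n) rather than O(n*k).
    """
    removed = 0
    stack = [program]
    while stack:
        node = stack.pop()
        for value in node.values():
            if isinstance(value, dict):
                stack.append(value)
                continue
            if not isinstance(value, list):
                continue
            kept = []
            for child in value:
                if isinstance(child, dict):
                    if _is_removable(child, names_to_remove):
                        removed += 1
                        continue
                    stack.append(child)
                kept.append(child)
            # Mutate the list in place so the parent node keeps its reference.
            value[:] = kept
    return removed
```

Collecting every dead binding name first and calling the helper once is what turns k full traversals into one; the set makes the per-declarator membership check constant time.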
1 parent 3190899 commit ac4a210

60 files changed

Lines changed: 5574 additions & 4044 deletions

.gitignore

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+__pycache__/
+*.pyc
+*.egg-info/
+dist/
+.idea/
+.nodeenv/
+.venv/
+uv.lock
+tests/resources/jsimplifier
+tests/resources/obfuscated-javascript-dataset

README.md

Lines changed: 2 additions & 1 deletion
@@ -94,5 +94,6 @@ See [THIRD_PARTY_LICENSES.md](THIRD_PARTY_LICENSES.md) and
 [NOTICE](NOTICE) for full attribution.

 Test samples include obfuscated JavaScript from the
-[JSIMPLIFIER dataset](https://zenodo.org/records/17531662) (GPL-3.0),
+[JSIMPLIFIER dataset](https://zenodo.org/records/17531662) (GPL-3.0)
+and the [Obfuscated JavaScript Dataset](https://www.kaggle.com/datasets/fanbyprinciple/obfuscated-javascript-dataset),
 used solely for evaluation purposes.

pyjsclear/__init__.py

Lines changed: 17 additions & 8 deletions
@@ -5,14 +5,18 @@
 Python package.
 """

+from pathlib import Path
+
 from .deobfuscator import Deobfuscator


-__version__ = '0.1.4'
+__all__ = ['Deobfuscator', 'deobfuscate', 'deobfuscate_file']
+
+__version__ = '0.1.5'


 def deobfuscate(code: str, max_iterations: int = 50) -> str:
-    """Deobfuscate JavaScript code. Returns cleaned source.
+    """Deobfuscate JavaScript code and return cleaned source.

     Args:
         code: JavaScript source code string.
@@ -24,7 +28,11 @@ def deobfuscate(code: str, max_iterations: int = 50) -> str:
     return Deobfuscator(code, max_iterations=max_iterations).execute()


-def deobfuscate_file(input_path: str, output_path: str | None = None, max_iterations: int = 50) -> str | bool:
+def deobfuscate_file(
+    input_path: str | Path,
+    output_path: str | Path | None = None,
+    max_iterations: int = 50,
+) -> str | bool:
     """Deobfuscate a JavaScript file.

     Args:
@@ -40,8 +48,9 @@ def deobfuscate_file(input_path: str, output_path: str | None = None, max_iterat

     result = deobfuscate(code, max_iterations=max_iterations)

-    if output_path:
-        with open(output_path, 'w') as output_file:
-            output_file.write(result)
-        return result != code
-    return result
+    if not output_path:
+        return result
+
+    with open(output_path, 'w') as output_file:
+        output_file.write(result)
+    return result != code
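
The return contract of `deobfuscate_file` after this change is easy to misread: with no `output_path` it returns the cleaned source as a `str`, and with one it writes the file and returns a `bool` saying whether anything changed. The sketch below mirrors only that control flow so it runs without pyjsclear installed. `process_file` and its `transform` parameter are hypothetical stand-ins, and how the input file is read is an assumption, since that part of the function falls outside the hunks shown.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def process_file(
    input_path: str | Path,
    output_path: str | Path | None = None,
    transform: Callable[[str], str] = str.strip,
) -> str | bool:
    """Stand-in that mirrors the early-exit shape of deobfuscate_file."""
    # Assumption: the real function reads the input somewhere above the hunk.
    code = Path(input_path).read_text()
    result = transform(code)

    # No output path: hand the transformed text straight back.
    if not output_path:
        return result

    # Output path given: write the result and report whether it differs.
    Path(output_path).write_text(result)
    return result != code
```

Callers that pass an output path should therefore test the result with `is True` or `is False` rather than treating it as text.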

pyjsclear/__main__.py

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@


 def main() -> None:
+    """Parse CLI arguments and run the deobfuscator."""
     parser = argparse.ArgumentParser(description='Deobfuscate JavaScript files.')
     parser.add_argument('input', help='Input JS file (use - for stdin)')
     parser.add_argument('-o', '--output', help='Output file (default: stdout)')
