Rules/add logic clone detection fuzzy to Development#27
Merged
noelsaw1 merged 6 commits intodevelopmentfrom Jan 3, 2026
Merged
Rules/add logic clone detection fuzzy to Development#27noelsaw1 merged 6 commits intodevelopmentfrom
noelsaw1 merged 6 commits intodevelopmentfrom
Conversation
Hash-based detection of duplicate function definitions across files - New pattern: `dist/patterns/duplicate-functions.json` - Detects exact function clones (Type 1) - New function: `process_clone_detection()` - Extracts functions, normalizes code, computes MD5 hashes - Thresholds: min 5 lines, min 2 files, min 2 occurrences - Normalization: Strips comments and whitespace before hashing - **Impact:** Catches copy-paste violations where identical functions exist in multiple files - **Coverage:** 60-70% of all clones (Type 1 exact copies only) - **False Positive Rate:** < 5% (proven hash-based approach)
There was a problem hiding this comment.
Pull request overview
This PR implements Tier 1 function clone detection functionality using hash-based matching to identify duplicate function definitions across PHP files. The implementation follows a proven pattern-based approach similar to the existing magic string detection system (v1.0.73).
Key Changes
- Added hash-based function clone detector that normalizes code (strips comments/whitespace) and computes MD5 hashes to identify exact duplicates
- Integrated clone detection into the existing pattern infrastructure with configurable thresholds (min 5 lines, 2+ files, 2+ occurrences)
- Updated HTML reporting to rebrand "Magic Strings" as "DRY Violations" encompassing both string literals and function clones
Reviewed changes
Copilot reviewed 3 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
dist/patterns/duplicate-functions.json |
New pattern definition specifying clone detection configuration, normalization rules, and thresholds |
dist/tests/fixtures/dry/*.php |
Test fixtures with duplicate functions across multiple files for validation |
dist/bin/check-performance.sh |
Added process_clone_detection() function (~150 lines) for extracting, normalizing, hashing, and aggregating function definitions |
dist/bin/templates/report-template.html |
Updated section titles from "Magic Strings" to "DRY Violations" with clarifying subtitle |
PROJECT/3-COMPLETED/*.md |
Comprehensive documentation of implementation approach, testing results, and completion status |
PROJECT/1-INBOX/TRIAGE-2026-01-02.md |
Project management document for inbox triage and task organization |
CHANGELOG.md |
Version 1.0.78 release notes documenting new features and changes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| ### Observation | ||
| - Lists 8+ pattern ideas with details (severity, rule IDs, proposed grep expressions). | ||
| - Some entries already marked as COMPLETE or ENHANCED (e.g., unsanitized `$_GET`, admin capability checks). |
There was a problem hiding this comment.
The word 'IRL' is used as an abbreviation in documentation. In professional documentation, it's better to spell out 'in real life' or use 'real-world' for clarity and professionalism.
Add plans to split off patterns into individual files.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.