Skip to content

Rules/add logic clone detection fuzzy to Development#27

Merged
noelsaw1 merged 6 commits intodevelopmentfrom
rules/add-logic-clone-detection-fuzzy
Jan 3, 2026
Merged

Rules/add logic clone detection fuzzy to Development#27
noelsaw1 merged 6 commits intodevelopmentfrom
rules/add-logic-clone-detection-fuzzy

Conversation

@noelsaw1
Copy link
Copy Markdown
Contributor

@noelsaw1 noelsaw1 commented Jan 2, 2026

No description provided.

Hash-based detection of duplicate function definitions across files
  - New pattern: `dist/patterns/duplicate-functions.json` - Detects exact function clones (Type 1)
  - New function: `process_clone_detection()` - Extracts functions, normalizes code, computes MD5 hashes
  - Thresholds: min 5 lines, min 2 files, min 2 occurrences
  - Normalization: Strips comments and whitespace before hashing
  - **Impact:** Catches copy-paste violations where identical functions exist in multiple files
  - **Coverage:** 60-70% of all clones (Type 1 exact copies only)
  - **False Positive Rate:** < 5% (proven hash-based approach)
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements Tier 1 function clone detection functionality using hash-based matching to identify duplicate function definitions across PHP files. The implementation follows a proven pattern-based approach similar to the existing magic string detection system (v1.0.73).

Key Changes

  • Added hash-based function clone detector that normalizes code (strips comments/whitespace) and computes MD5 hashes to identify exact duplicates
  • Integrated clone detection into the existing pattern infrastructure with configurable thresholds (min 5 lines, 2+ files, 2+ occurrences)
  • Updated HTML reporting to rebrand "Magic Strings" as "DRY Violations" encompassing both string literals and function clones

Reviewed changes

Copilot reviewed 3 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
dist/patterns/duplicate-functions.json New pattern definition specifying clone detection configuration, normalization rules, and thresholds
dist/tests/fixtures/dry/*.php Test fixtures with duplicate functions across multiple files for validation
dist/bin/check-performance.sh Added process_clone_detection() function (~150 lines) for extracting, normalizing, hashing, and aggregating function definitions
dist/bin/templates/report-template.html Updated section titles from "Magic Strings" to "DRY Violations" with clarifying subtitle
PROJECT/3-COMPLETED/*.md Comprehensive documentation of implementation approach, testing results, and completion status
PROJECT/1-INBOX/TRIAGE-2026-01-02.md Project management document for inbox triage and task organization
CHANGELOG.md Version 1.0.78 release notes documenting new features and changes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


### Observation
- Lists 8+ pattern ideas with details (severity, rule IDs, proposed grep expressions).
- Some entries already marked as COMPLETE or ENHANCED (e.g., unsanitized `$_GET`, admin capability checks).
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word 'IRL' is used as an abbreviation in documentation. In professional documentation, it's better to spell out 'in real life' or use 'real-world' for clarity and professionalism.

Copilot uses AI. Check for mistakes.
@noelsaw1 noelsaw1 changed the title Rules/add logic clone detection fuzzy Rules/add logic clone detection fuzzy to Development Jan 3, 2026
@noelsaw1 noelsaw1 merged commit 9e5cc43 into development Jan 3, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants