🏗️ PDB2JSON Architecture & System Design

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#ff6b6b','primaryTextColor':'#fff','primaryBorderColor':'#ff6b6b','lineColor':'#4ecdc4','secondaryColor':'#45b7d1','tertiaryColor':'#96ceb4','noteTextColor':'#2d3748','noteBkgColor':'#ffd93d','noteBorderColor':'#ffd93d'}}}%%
graph TB
    subgraph "🖥️ Client Systems"
        PS[PowerShell Scripts<br/>🔷 Test-AllVirtualMemory.ps1<br/>🔷 Invoke-Parallel.ps1<br/>🔷 Out-SquarifiedTreeMap.ps1]
        PY[Python Plugin<br/>🐍 inVteroJitHash.py<br/>🔬 Volatility Framework]
        SH[Bash Scripts<br/>🐚 dt.sh<br/>⚙️ Symbol Extraction]
    end
    
    subgraph "🌐 Network Communication"
        HTTPS[HTTPS/TLS<br/>🔒 Encrypted Channel<br/>📊 JSON Payloads]
    end
    
    subgraph "☁️ Azure Cloud Services"
        AF[Azure Functions<br/>⚡ Serverless Backend<br/>🌍 pdb2json.azurewebsites.net]
        
        subgraph "🔌 API Endpoints"
            API1[📋 /api/typedef/x<br/>Type Definitions]
            API2[🔍 /api/SymFromName/x<br/>Symbol Lookup]
            API3[📍 /api/SymFromAddr/x<br/>Address Resolution]
            API4[🔄 /api/Relocs/x<br/>Relocation Data]
            API5[#️⃣ /api/PageHash/x<br/>JIT Hash Validation]
        end
        
        DB[(🗄️ Hash Database<br/>5TB+ Microsoft Software<br/>SHA256 Checksums<br/>PDB Symbol Data)]
    end
    
    subgraph "💾 Local Memory/Files"
        MEM[💻 Process Memory<br/>Running Code Pages<br/>Virtual Address Space]
        DUMP[📦 Memory Dumps<br/>.vmem files<br/>.raw files]
        BIN[📁 PE Binaries<br/>.exe / .dll files<br/>Debug Directories]
    end
    
    PS -->|SHA256 Hashes| HTTPS
    PY -->|SHA256 Hashes| HTTPS
    SH -->|PDB Queries| HTTPS
    
    HTTPS -->|JSON Request| AF
    AF -->|Query| DB
    DB -->|JIT Hash Calculation| AF
    AF -->|JSON Response| HTTPS
    
    HTTPS -->|Validation Results| PS
    HTTPS -->|Validation Results| PY
    HTTPS -->|Symbol Data| SH
    
    MEM -.->|Read Pages| PS
    DUMP -.->|Parse Memory| PY
    BIN -.->|Extract Info| SH
    
    style PS fill:#667eea,stroke:#764ba2,color:#fff
    style PY fill:#f093fb,stroke:#f5576c,color:#fff
    style SH fill:#4facfe,stroke:#00f2fe,color:#fff
    style AF fill:#fa709a,stroke:#fee140,color:#fff
    style DB fill:#30cfd0,stroke:#330867,color:#fff
    style HTTPS fill:#a8edea,stroke:#fed6e3,color:#333
    style MEM fill:#ffecd2,stroke:#fcb69f,color:#333
    style DUMP fill:#ff9a9e,stroke:#fecfef,color:#fff
    style BIN fill:#ffdde1,stroke:#ee9ca7,color:#333

⚡ Serverless · 🔒 Secure · 🚀 Scalable · 🆓 Free


📊 System Overview

The PDB2JSON system is a cloud-native memory integrity validation platform that combines:

  • 🎯 Just-In-Time (JIT) Hashing - Expected hashes are computed on demand for the virtual address at which the client loaded each module
  • 🔐 Privacy-First Design - Only SHA256 hashes are transmitted (no binary data)
  • 🌐 Global Scale - Serverless architecture on Azure Functions
  • 📚 Massive Coverage - 5TB+ of Microsoft software pre-indexed

🔄 Data Flow Architecture

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#667eea','primaryTextColor':'#fff','primaryBorderColor':'#764ba2','lineColor':'#f093fb','secondaryColor':'#4facfe','tertiaryColor':'#30cfd0'}}}%%
sequenceDiagram
    participant C as 👤 Client System
    participant M as 💾 Memory Scanner
    participant H as #️⃣ Hash Engine
    participant N as 🌐 Network Layer
    participant A as ☁️ Azure Functions
    participant D as 🗄️ Hash Database
    participant J as ⚙️ JIT Calculator

    rect rgb(102, 126, 234, 0.1)
        Note over C,M: 1️⃣ Memory Acquisition Phase
        C->>M: Enumerate Processes
        M->>C: Process List
        C->>M: Read Virtual Memory Pages
        M-->>C: Raw Memory Data (4KB pages)
    end

    rect rgb(240, 147, 251, 0.1)
        Note over C,J: 2️⃣ Hash Generation Phase
        C->>H: Process Memory Pages
        H->>H: Calculate SHA256
        H-->>C: Base64-Encoded Hashes
    end

    rect rgb(79, 172, 254, 0.1)
        Note over C,A: 3️⃣ Network Transmission Phase
        C->>N: Build JSON Request
        Note right of N: Only hashes sent<br/>No binary data!
        N->>A: HTTPS POST /api/PageHash/x
    end

    rect rgb(48, 207, 208, 0.1)
        Note over A,J: 4️⃣ Server Processing Phase
        A->>D: Query Module Info
        D-->>A: PDB Data + Relocations
        A->>J: Calculate JIT Hash
        Note right of J: Adjust for<br/>virtual address<br/>& relocations
        J-->>A: Computed Hash Value
        A->>A: Compare Hashes
    end

    rect rgb(102, 252, 241, 0.1)
        Note over A,C: 5️⃣ Response Phase
        A-->>N: JSON Response (Valid/Invalid)
        N-->>C: Validation Results
        C->>C: Generate Report
    end

    rect rgb(255, 154, 158, 0.1)
        Note over C: 6️⃣ Visualization Phase
        C->>C: Create TreeMap Visualization
        C->>C: Highlight Suspicious Regions
    end
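Phases 2 and 3 of the flow above (hash generation and request building) can be sketched in Python as follows. This is an illustrative sketch only: the JSON field names (`ModuleName`, `BaseAddress`, `HashSet`, `Address`, `Hash`) and the sample module name are assumptions made for the example, not the documented request schema of `/api/PageHash/x`.

```python
import base64
import hashlib
import json

PAGE_SIZE = 4096  # 4KB pages, as in the memory acquisition phase


def hash_pages(memory: bytes, base_address: int) -> list:
    """Split a memory region into 4KB pages and SHA256-hash each one."""
    entries = []
    for offset in range(0, len(memory), PAGE_SIZE):
        page = memory[offset:offset + PAGE_SIZE]
        if len(page) < PAGE_SIZE:
            break  # ignore a trailing partial page
        digest = hashlib.sha256(page).digest()
        entries.append({
            "Address": base_address + offset,                  # hypothetical field name
            "Hash": base64.b64encode(digest).decode("ascii"),  # hypothetical field name
        })
    return entries


def build_request(module_name: str, base_address: int, memory: bytes) -> str:
    """Serialize hashes only; page contents never enter the payload."""
    return json.dumps({
        "ModuleName": module_name,    # hypothetical field name
        "BaseAddress": base_address,  # hypothetical field name
        "HashSet": hash_pages(memory, base_address),
    })


# Two synthetic pages standing in for memory read from a process.
region = bytes(PAGE_SIZE) + b"\x90" * PAGE_SIZE
request_body = build_request("example.dll", 0x7FF800000000, region)
```

The resulting payload carries one Base64-encoded SHA256 digest per page, which is the property the "only hashes sent" note relies on.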

🎯 Component Deep Dive

1️⃣ PowerShell Client Layer

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#f093fb','primaryTextColor':'#fff','primaryBorderColor':'#f5576c','lineColor':'#4facfe','secondaryColor':'#30cfd0'}}}%%
graph LR
    subgraph "🔷 Test-AllVirtualMemory.ps1"
        A[Process<br/>Enumeration] --> B[Memory<br/>Reading]
        B --> C[SHA256<br/>Hashing]
        C --> D[JSON<br/>Serialization]
    end
    
    subgraph "🔷 Invoke-Parallel.ps1"
        E[Runspace<br/>Pool] --> F[Thread<br/>Management]
        F --> G[Parallel<br/>Execution]
    end
    
    subgraph "🔷 Out-SquarifiedTreeMap.ps1"
        H[Data<br/>Processing] --> I[TreeMap<br/>Algorithm]
        I --> J[WPF<br/>Rendering]
    end
    
    D --> E
    G --> H
    
    style A fill:#f093fb,stroke:#f5576c,color:#fff
    style B fill:#f093fb,stroke:#f5576c,color:#fff
    style C fill:#f093fb,stroke:#f5576c,color:#fff
    style D fill:#f093fb,stroke:#f5576c,color:#fff
    style E fill:#4facfe,stroke:#00f2fe,color:#fff
    style F fill:#4facfe,stroke:#00f2fe,color:#fff
    style G fill:#4facfe,stroke:#00f2fe,color:#fff
    style H fill:#30cfd0,stroke:#330867,color:#fff
    style I fill:#30cfd0,stroke:#330867,color:#fff
    style J fill:#30cfd0,stroke:#330867,color:#fff

Key Features:

  • 🔄 Parallel Processing: Up to 512 concurrent threads
  • 🎨 Rich Visualization: Interactive TreeMap with heat mapping
  • 🔐 Privilege Escalation: Optional SYSTEM-level access
  • 🌐 Remote Scanning: PSRemoting for network-wide deployment

2️⃣ Python Volatility Plugin

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#fa709a','primaryTextColor':'#fff','primaryBorderColor':'#fee140','lineColor':'#4facfe','secondaryColor':'#30cfd0'}}}%%
stateDiagram-v2
    [*] --> Initialize: Load Plugin
    Initialize --> ParseDump: Open Memory Dump
    ParseDump --> EnumerateProcesses: Identify Processes
    EnumerateProcesses --> CheckPages: For Each Process
    CheckPages --> ValidateNX: Check NX Bit
    
    ValidateNX --> SkipPage: NX Set (Not Executable)
    ValidateNX --> HashPage: NX Clear (Executable)
    
    SkipPage --> CheckPages: Next Page
    HashPage --> SendRequest: Batch Hashes
    SendRequest --> ReceiveResponse: Await Server
    ReceiveResponse --> ValidateHash: Compare Results
    
    ValidateHash --> UpdateMetrics: Record Statistics
    UpdateMetrics --> CheckPages: Continue
    
    CheckPages --> GenerateReport: All Pages Done
    GenerateReport --> [*]: Display Results
    
    note right of ValidateNX
        Only executable pages
        are validated to
        reduce false positives
    end note
    
    note right of SendRequest
        Retry logic with
        exponential backoff
        16 attempts max
    end note

Advanced Features:

  • 🔍 NX Bit Awareness: Only executable pages are validated
  • 🎨 Color-Coded Output: ANSI terminal support
  • 📊 Progress Tracking: Real-time tqdm progress bars
  • 🔄 Retry Logic: Exponential backoff, up to 16 attempts
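The NX filter and the retry behaviour described above might look roughly like the sketch below. It is not taken from inVteroJitHash.py: the function names, the delay values, and the choice to retry on `OSError` are assumptions for illustration. Only the 16-attempt limit and the x86-64 page table bit positions (bit 0 for present, bit 63 for no-execute) are fixed facts.

```python
import random
import time

NX_BIT = 1 << 63      # bit 63 of an x86-64 page table entry: no-execute
PRESENT_BIT = 1 << 0  # bit 0: page is present


def is_executable(pte: int) -> bool:
    """A page is a validation candidate when it is present and NX is clear."""
    return bool(pte & PRESENT_BIT) and not (pte & NX_BIT)


def with_backoff(send, max_attempts=16, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call send() until it succeeds, doubling the delay after each failure.

    Network errors in Python (ConnectionError, TimeoutError, urllib's
    URLError) derive from OSError, so that is what gets retried here.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))  # small jitter
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting, and capping the delay stops sixteen doublings from growing into hours.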

3️⃣ Bash Symbol Extraction

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#4facfe','primaryTextColor':'#fff','primaryBorderColor':'#00f2fe','lineColor':'#f093fb','secondaryColor':'#30cfd0'}}}%%
flowchart TD
    A[📁 PE Binary Input] --> B{Parse PE Headers}
    B --> C[Extract Debug Directory]
    C --> D[Read PDB GUID]
    C --> E[Read TimeDateStamp]
    C --> F[Read PDB Filename]
    
    D --> G[Format GUID<br/>Reverse Byte Order]
    E --> H[Convert to Decimal]
    F --> I[Extract Path]
    
    G --> J{Query Type}
    H --> J
    I --> J
    
    J -->|typedef| K[🔍 Structure Definition Query]
    J -->|SymFromName| L[🔎 Symbol Name Lookup]
    J -->|SymFromAddr| M[📍 Address Resolution]
    J -->|Relocs| N[🔄 Relocation Data]
    
    K --> O[📡 cURL HTTPS Request]
    L --> O
    M --> O
    N --> O
    
    O --> P[☁️ Azure Functions API]
    P --> Q[📋 JSON Response]
    Q --> R[✅ Display Results]
    
    style A fill:#4facfe,stroke:#00f2fe,color:#fff
    style G fill:#f093fb,stroke:#f5576c,color:#fff
    style H fill:#f093fb,stroke:#f5576c,color:#fff
    style I fill:#f093fb,stroke:#f5576c,color:#fff
    style K fill:#30cfd0,stroke:#330867,color:#fff
    style L fill:#30cfd0,stroke:#330867,color:#fff
    style M fill:#30cfd0,stroke:#330867,color:#fff
    style N fill:#30cfd0,stroke:#330867,color:#fff
    style P fill:#fa709a,stroke:#fee140,color:#fff
    style R fill:#96ceb4,stroke:#618833,color:#fff
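The "Read PDB GUID" and "Format GUID, Reverse Byte Order" steps can be illustrated in Python. The sketch below decodes a CodeView `RSDS` record, the PDB 7.0 entry that a PE debug directory points to. It is not a port of dt.sh: the function name and the returned keys are hypothetical, and the sample record is synthetic.

```python
import ntpath
import struct
import uuid


def parse_rsds(record: bytes) -> dict:
    """Decode a CodeView 'RSDS' record: signature, GUID, age, PDB path."""
    if record[:4] != b"RSDS":
        raise ValueError("not a PDB 7.0 CodeView record")
    # The first three GUID fields are stored little-endian on disk;
    # bytes_le reverses them into the usual textual order.
    guid = uuid.UUID(bytes_le=record[4:20])
    (age,) = struct.unpack_from("<I", record, 20)
    path = record[24:].split(b"\x00", 1)[0].decode("utf-8", errors="replace")
    return {
        "guid": guid.hex.upper(),      # hypothetical key names
        "age": age,
        "pdb": ntpath.basename(path),  # file name without the build path
    }


# Synthetic record: signature + 16-byte GUID + 4-byte age + NUL-terminated path.
raw_guid = bytes.fromhex("00112233445566778899aabbccddeeff")
record = b"RSDS" + raw_guid + struct.pack("<I", 2) + b"C:\\build\\ntkrnlmp.pdb\x00"
info = parse_rsds(record)
# info["guid"] == "33221100554477668899AABBCCDDEEFF"
```

Only the first three GUID fields change byte order; the trailing eight bytes are kept as stored.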

🔐 Security Architecture

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#ff6b6b','primaryTextColor':'#fff','primaryBorderColor':'#c92a2a','lineColor':'#4ecdc4','secondaryColor':'#45b7d1'}}}%%
graph TB
    subgraph "🛡️ Client-Side Security"
        A1[🔒 TLS 1.2+ Required]
        A2[✅ Certificate Validation]
        A3[#️⃣ Hash-Only Transmission]
        A4[🚫 No Binary Upload]
    end
    
    subgraph "🌐 Network Security"
        B1[🔐 HTTPS Encryption]
        B2[📜 Certificate Pinning]
        B3[🔄 Retry with Backoff]
        B4[⏱️ Request Timeout]
    end
    
    subgraph "☁️ Server-Side Security"
        C1[🔑 API Authentication]
        C2[🚦 Rate Limiting]
        C3[🧹 Input Validation]
        C4[📊 Audit Logging]
    end
    
    subgraph "🗄️ Data Protection"
        D1[🔒 Encrypted at Rest]
        D2[🔐 Encrypted in Transit]
        D3[🔄 Regular Backups]
        D4[🚫 No PII Storage]
    end
    
    A1 --> B1
    A2 --> B2
    A3 --> B1
    A4 --> B1
    
    B1 --> C1
    B2 --> C2
    B3 --> C3
    B4 --> C4
    
    C1 --> D1
    C2 --> D2
    C3 --> D3
    C4 --> D4
    
    style A1 fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style A2 fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style A3 fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style A4 fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style B1 fill:#4ecdc4,stroke:#2c7873,color:#fff
    style B2 fill:#4ecdc4,stroke:#2c7873,color:#fff
    style B3 fill:#4ecdc4,stroke:#2c7873,color:#fff
    style B4 fill:#4ecdc4,stroke:#2c7873,color:#fff
    style C1 fill:#45b7d1,stroke:#2b6777,color:#fff
    style C2 fill:#45b7d1,stroke:#2b6777,color:#fff
    style C3 fill:#45b7d1,stroke:#2b6777,color:#fff
    style C4 fill:#45b7d1,stroke:#2b6777,color:#fff
    style D1 fill:#96ceb4,stroke:#618833,color:#fff
    style D2 fill:#96ceb4,stroke:#618833,color:#fff
    style D3 fill:#96ceb4,stroke:#618833,color:#fff
    style D4 fill:#96ceb4,stroke:#618833,color:#fff

🔑 Security Principles

  1. 🚫 Privacy by Design: No binary data ever leaves the client
  2. 🔒 Defense in Depth: Multiple security layers
  3. ✅ Verify Always: Certificate and signature validation
  4. 📊 Audit Trail: Comprehensive logging and monitoring
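On the client side, the "TLS 1.2+ Required" and "Certificate Validation" items could be enforced in Python with the standard `ssl` module, as in this sketch. The helper name is hypothetical, and how the shipped clients actually configure TLS is not stated in this document.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Client context that validates certificates and refuses TLS < 1.2."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

The context can then be passed to `urllib.request.urlopen(url, context=context)` so that every request inherits the same policy.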

⚡ Performance Optimization

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#ffd93d','primaryTextColor':'#333','primaryBorderColor':'#f4b41a','lineColor':'#667eea','secondaryColor':'#f093fb'}}}%%
graph LR
    subgraph "🚀 Client Optimizations"
        A[Parallel Processing<br/>Up to 512 Threads]
        B[Shared Page<br/>Deduplication]
        C[Batch Requests<br/>Multiple Hashes]
    end
    
    subgraph "🌐 Network Optimizations"
        D[Connection Pooling<br/>Keep-Alive]
        E[HTTP/2 Support<br/>Multiplexing]
        F[Compression<br/>gzip/deflate]
    end
    
    subgraph "☁️ Server Optimizations"
        G[In-Memory Caching<br/>5GB Cache Layer]
        H[Async Processing<br/>Non-Blocking I/O]
        I[Load Balancing<br/>Auto-Scaling]
    end
    
    A --> D
    B --> D
    C --> D
    
    D --> G
    E --> H
    F --> I
    
    style A fill:#ffd93d,stroke:#f4b41a,color:#333
    style B fill:#ffd93d,stroke:#f4b41a,color:#333
    style C fill:#ffd93d,stroke:#f4b41a,color:#333
    style D fill:#667eea,stroke:#764ba2,color:#fff
    style E fill:#667eea,stroke:#764ba2,color:#fff
    style F fill:#667eea,stroke:#764ba2,color:#fff
    style G fill:#f093fb,stroke:#f5576c,color:#fff
    style H fill:#f093fb,stroke:#f5576c,color:#fff
    style I fill:#f093fb,stroke:#f5576c,color:#fff
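Two of the client optimizations in the diagram, shared page deduplication and batched requests, are simple to express. The sketch below shows one possible shape in Python: the function names and the batch size are assumptions, and the hashes are placeholder strings rather than real digests.

```python
def deduplicate(pages):
    """Group addresses by hash so each distinct page is validated once."""
    unique = {}
    for address, page_hash in pages:
        unique.setdefault(page_hash, []).append(address)
    return unique


def batches(items, size):
    """Yield fixed-size chunks so one request carries many hashes."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


# The same DLL page mapped into two processes produces the same hash.
pages = [(0x1000, "h1"), (0x2000, "h2"), (0x9000, "h1"), (0xA000, "h3")]
unique = deduplicate(pages)
requests = list(batches(unique, 2))
```

Here four scanned pages collapse to three distinct hashes, sent in two requests. The address lists kept in `unique` let a single verdict be applied back to every mapping of that page.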

📊 Performance Metrics

| Component | Throughput | Latency | Concurrency |
|---|---|---|---|
| 💾 Memory Read | ~100 MB/s | <1 ms | N/A |
| #️⃣ Hash Generation | ~500 MB/s | <1 ms | Multi-core |
| 🌐 Network Request | Varies | 50-200 ms | 512 parallel |
| ☁️ Server Processing | 10K req/s | <50 ms | Auto-scale |

🎨 Visualization Layer

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#96ceb4','primaryTextColor':'#fff','primaryBorderColor':'#618833','lineColor':'#ffd93d','secondaryColor':'#ff6b6b'}}}%%
graph TB
    subgraph "📊 Data Preparation"
        A[Validation Results] --> B[Calculate Statistics]
        B --> C[Compute Heat Values]
        C --> D[Build Hierarchy]
    end
    
    subgraph "🎨 TreeMap Algorithm"
        D --> E[Squarified Layout]
        E --> F[Size by Memory]
        F --> G[Color by Validation%]
    end
    
    subgraph "🖼️ WPF Rendering"
        G --> H[Create Canvas]
        H --> I[Draw Rectangles]
        I --> J[Add Labels]
        J --> K[Apply Heat Map]
    end
    
    subgraph "🖱️ Interactivity"
        K --> L[Hover Tooltips]
        L --> M[Click to Drill Down]
        M --> N[Right-Click for Diff]
        N --> O[Context Menu]
    end
    
    O --> P[🔍 Binary Diff Viewer]
    
    style A fill:#96ceb4,stroke:#618833,color:#fff
    style B fill:#96ceb4,stroke:#618833,color:#fff
    style E fill:#ffd93d,stroke:#f4b41a,color:#333
    style F fill:#ffd93d,stroke:#f4b41a,color:#333
    style G fill:#ffd93d,stroke:#f4b41a,color:#333
    style H fill:#667eea,stroke:#764ba2,color:#fff
    style I fill:#667eea,stroke:#764ba2,color:#fff
    style J fill:#667eea,stroke:#764ba2,color:#fff
    style K fill:#667eea,stroke:#764ba2,color:#fff
    style L fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style M fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style N fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style O fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style P fill:#4ecdc4,stroke:#2c7873,color:#fff

🎨 Color Coding Scheme

| Validation % | Color | Meaning |
|---|---|---|
| 100% | 🟦 Blue | ✅ Fully Validated |
| 80-99% | 🟩 Green | ⚠️ Mostly Valid |
| 60-79% | 🟨 Yellow | ⚠️ Some Issues |
| 40-59% | 🟧 Orange | 🚨 Multiple Issues |
| 20-39% | 🟥 Red | 🚨 Serious Issues |
| <20% | 🟪 Purple | 💀 Critical Issues |
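The bands in the table translate directly into a lookup. The Python sketch below is illustrative rather than the TreeMap script's actual logic. One assumption is worth flagging: fractional values between 99% and 100% are placed in the Green band, since the table does not say where they belong.

```python
# Lower bound of each band, highest first, taken from the table above.
COLOR_BANDS = [
    (100, "Blue"),
    (80, "Green"),
    (60, "Yellow"),
    (40, "Orange"),
    (20, "Red"),
]


def validation_color(valid_pages: int, total_pages: int) -> str:
    """Map a module's share of validated pages to its TreeMap color."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    percent = 100.0 * valid_pages / total_pages
    for floor, color in COLOR_BANDS:
        if percent >= floor:
            return color
    return "Purple"  # below 20%
```

For example, a module with 79 of 100 pages validated falls in the Yellow band, while one with every page validated is Blue.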

🔬 JIT Hashing Explained

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#4ecdc4','primaryTextColor':'#fff','primaryBorderColor':'#2c7873','lineColor':'#ffd93d','secondaryColor':'#ff6b6b'}}}%%
sequenceDiagram
    autonumber
    participant C as 💻 Client
    participant S as ☁️ Server
    participant DB as 🗄️ Database
    participant J as ⚙️ JIT Engine

    Note over C: Module loaded at 0x7FF800000000
    C->>C: Read page at virtual address
    C->>C: SHA256(page) = "abc123..."
    C->>S: Send: {BaseAddr, Hash, Module}
    
    Note over S,DB: Server calculates expected hash
    S->>DB: Query module relocations
    DB-->>S: Relocation table + base image
    
    S->>J: Calculate hash for VA 0x7FF800000000
    
    rect rgb(255, 217, 61, 0.1)
        Note right of J: JIT Calculation Steps:
        J->>J: 1. Load base image from DB
        J->>J: 2. Apply relocations for VA
        J->>J: 3. Adjust for image base
        J->>J: 4. SHA256(adjusted page)
    end
    
    J-->>S: Expected hash = "abc123..."
    S->>S: Compare hashes
    
    alt Hashes Match
        S-->>C: ✅ Valid
        Note over C: Page is authentic!
    else Hashes Don't Match
        S-->>C: ❌ Invalid
        Note over C: Possible modification!
    end
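The four JIT calculation steps can be sketched in Python as follows. This is a simplified model, not the server's implementation. It assumes every relocation is a 64-bit absolute address fixup (as with `IMAGE_REL_BASED_DIR64`), that no fixup straddles a page boundary, and that the page offsets needing fixups are already known. The function name, the preferred base `0x180000000`, and the sample page contents are hypothetical.

```python
import hashlib
import struct

PAGE_SIZE = 4096
MASK64 = 0xFFFFFFFFFFFFFFFF


def jit_page_hash(base_page: bytes, reloc_offsets, preferred_base: int,
                  load_base: int) -> str:
    """Rebase one on-disk page to load_base and return its SHA256 hex digest."""
    delta = load_base - preferred_base
    page = bytearray(base_page)                   # step 1: the stored base image page
    for offset in reloc_offsets:                  # step 2: apply each relocation
        (value,) = struct.unpack_from("<Q", page, offset)
        struct.pack_into("<Q", page, offset, (value + delta) & MASK64)  # step 3
    return hashlib.sha256(bytes(page)).hexdigest()  # step 4: hash the adjusted page


PREFERRED, LOADED = 0x180000000, 0x7FF800000000

# On-disk page holding one absolute pointer at offset 0x10.
on_disk = bytearray(PAGE_SIZE)
struct.pack_into("<Q", on_disk, 0x10, PREFERRED + 0x2000)

# What the client sees after the loader rebased the module.
in_memory = bytearray(PAGE_SIZE)
struct.pack_into("<Q", in_memory, 0x10, LOADED + 0x2000)

client_hash = hashlib.sha256(bytes(in_memory)).hexdigest()
expected_hash = jit_page_hash(bytes(on_disk), [0x10], PREFERRED, LOADED)
is_valid = client_hash == expected_hash
```

Because the server reproduces the loader's fixups for the reported base address, an unmodified page hashes identically on both sides, while a single patched byte in memory yields a different digest.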

⚙️ Why JIT Hashing?

Traditional Approach ❌

  • Pre-compute hashes for every possible load address
  • Storage: ~100TB for Windows alone
  • Inflexible: New addresses = recalculate everything

JIT Approach ✅

  • Calculate hashes on-demand
  • Storage: ~5TB base images + relocations
  • Flexible: Works for any load address instantly

📈 Deployment Architecture

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#667eea','primaryTextColor':'#fff','primaryBorderColor':'#764ba2','lineColor':'#4ecdc4','secondaryColor':'#f093fb'}}}%%
C4Context
    title System Context - PDB2JSON Ecosystem
    
    Person(analyst, "Security Analyst", "Investigates memory integrity")
    Person(ir, "Incident Responder", "Rapid threat assessment")
    
    System(pdb2json, "PDB2JSON", "Memory integrity validation platform")
    
    System_Ext(azure, "Azure Functions", "Serverless compute")
    System_Ext(storage, "Azure Storage", "5TB hash database")
    System_Ext(cdn, "Azure CDN", "Global distribution")
    
    Rel(analyst, pdb2json, "Uses", "PowerShell/Python")
    Rel(ir, pdb2json, "Uses", "Remote scanning")
    
    Rel(pdb2json, azure, "Executes on", "HTTPS/JSON")
    Rel(azure, storage, "Queries", "Hash lookups")
    Rel(cdn, analyst, "Delivers", "Static assets")

🚀 Future Enhancements

%%{init: {'theme':'dark', 'themeVariables': { 'primaryColor':'#ff6b6b','primaryTextColor':'#fff','primaryBorderColor':'#c92a2a','lineColor':'#96ceb4','secondaryColor':'#ffd93d'}}}%%
mindmap
  root((🚀 Future<br/>Roadmap))
    🎯 Coverage Expansion
      Google Chrome
      Mozilla Firefox
      Adobe Products
      Linux Binaries
    ⚡ Performance
      GPU Acceleration
      Distributed Processing
      Edge Caching
      Compression
    🔐 Security
      Blockchain Audit Trail
      Multi-Factor Auth
      Zero-Knowledge Proofs
    🎨 Visualization
      3D Memory Maps
      Real-Time Updates
      AR/VR Interface
      Mobile App
    🔧 Integration
      SIEM Connectors
      EDR Integration
      Cloud Security
      Container Support

📚 Additional Resources


Made with 💙 by the Security Community

"Because memory integrity matters"

License: AGPL Azure Functions PowerShell Python