From 80ef236ce5ec44bc8a00f96850f81f8353fd1047 Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Thu, 12 Feb 2026 20:37:14 -0600 Subject: [PATCH 1/8] adjust font size and add docs 1. Font size will improve readability of the chart 2. Docs will make it easier for agents to start working, reducing context load --- CLAUDE.md | 82 ---- docs/ARCHITECTURE.md | 477 ++++++++++++++++++++++ docs/CLAUDE.md | 221 ++++++++++ docs/DATA_EXPANSION_PLAN.md | 449 ++++++++++++++++++++ docs/DECISIONS.md | 421 +++++++++++++++++++ docs/DEVELOPMENT_GUIDE.md | 467 +++++++++++++++++++++ docs/JAVASCRIPT_REFACTORING.md | 306 ++++++++++++++ docs/PRD.md | 290 +++++++++++++ docs/ROADMAP.md | 235 +++++++++++ js/config/theme.js | 10 +- js/render/repoCard.js | 8 +- js/render/shapes.js | 8 +- js/render/text.js | 6 +- js/render/tooltip.js | 102 ++--- js/simulations/collaborationSimulation.js | 2 +- 15 files changed, 2934 insertions(+), 150 deletions(-) delete mode 100644 CLAUDE.md create mode 100644 docs/ARCHITECTURE.md create mode 100644 docs/CLAUDE.md create mode 100644 docs/DATA_EXPANSION_PLAN.md create mode 100644 docs/DECISIONS.md create mode 100644 docs/DEVELOPMENT_GUIDE.md create mode 100644 docs/JAVASCRIPT_REFACTORING.md create mode 100644 docs/PRD.md create mode 100644 docs/ROADMAP.md diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index 93b448e..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,82 +0,0 @@ -# CLAUDE.md - -This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. - -## Project Overview - -This is a Python CLI tool with a D3.js frontend that generates interactive network visualizations of GitHub contributors. The CLI fetches data from GitHub, generates CSVs, and builds a static site for deployment. 
- -## Commands - -### Development - -```bash -# Install dependencies -uv sync - -# Run CLI commands -uv run contributor-network data # Fetch contribution data from GitHub -uv run contributor-network csvs # Generate CSVs from JSON -uv run contributor-network build # Build static site to dist/ -uv run contributor-network discover # Find new repositories -uv run contributor-network list-contributors # Display all contributors -``` - -### Quality Checks - -```bash -# Run all checks (as in CI) -uv run ruff format --check . -uv run ruff check . -uv run mypy - -# Auto-fix formatting and lint issues -uv run ruff format . -uv run ruff check --fix . - -# Run tests -uv run pytest -uv run pytest python/tests/test_config.py::test_function_name # Single test -``` - -## Architecture - -**Data flow**: GitHub API → Python CLI → JSON files → CSV files → D3.js visualization - -### Folder Structure - -``` -python/ # Python backend - contributor_network/ # CLI package - tests/ # Python tests - templates/ # Jinja2 HTML templates -js/ # JavaScript frontend - chart.js # Main D3.js visualization - config/ # Theme and scale configuration - data/ # Data preparation and filtering - render/ # Canvas rendering modules - simulations/ # D3 force simulations - ... 
-assets/ # Static assets - css/ # Stylesheets - data/ # CSV data files - img/ # Images - lib/ # Vendored D3 libraries -``` - -### Key Files - -- `python/contributor_network/cli.py` - Click-based CLI with 5 subcommands -- `python/contributor_network/config.py` - Pydantic models for TOML configuration -- `python/contributor_network/models.py` - Data models (Link, Repository) -- `python/contributor_network/client.py` - GitHub API client wrapper -- `js/chart.js` - D3.js visualization entry point -- `python/templates/` - Jinja2 HTML templates -- `config.toml`, `veda.toml` - Repository and contributor configuration - -## Code Style - -- Use `ruff` for linting and formatting -- Type hints required (mypy runs in CI) -- Pydantic for data validation -- Click for CLI commands diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..0e48807 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,477 @@ +# Architecture Overview + +How the codebase is organized and how the different pieces fit together. + +--- + +## High-Level Data Flow + +``` +GitHub API + ↓ +Python CLI (client.py) + ↓ +JSON Files (assets/data/) + ↓ +CSV Generation (csvs command) + ↓ +D3.js Visualization (index.html) + ↓ +Interactive Web App +``` + +--- + +## Project Organization + +### Python Backend + +**Purpose:** Fetch data from GitHub, validate it, generate CSVs, and build the static site. 
+ +**Key files:** +- `python/contributor_network/cli.py` - All CLI commands (Click-based) +- `python/contributor_network/client.py` - GitHub API wrapper (uses PyGithub) +- `python/contributor_network/config.py` - Configuration models (Pydantic) +- `python/contributor_network/models.py` - Data models (Repository, Link, Contributor) + +**Architecture:** +``` +CLI (cli.py) + ↓ +Client (client.py) ← Queries GitHub API + ↓ +Models (models.py) ← Validates & structures data + ↓ +Config (config.py) ← Loads from config.toml + ↓ +JSON files / CSVs / Templates +``` + +**Data Flow for `data` command:** +1. Load repositories from `config.toml` +2. For each repo, query GitHub API (client.py) +3. Fetch contributions, commit dates, repository metadata +4. Validate data with Pydantic models (models.py) +5. Save to JSON files in `assets/data/` + +### JavaScript Frontend + +**Purpose:** Load CSV data, prepare it for visualization, render with D3.js, handle interactions. + +**Architecture Status:** Modular ES6 modules + +**Organization:** +``` +src/js/ +├── config/ # Configuration (theme, scales, constants) +├── data/ # Data loading, filtering, preparation +├── interaction/ # Mouse/keyboard event handlers +├── layout/ # Canvas sizing, node positioning +├── render/ # Drawing functions (shapes, text, labels, etc.) +├── simulations/ # D3 force simulations +├── state/ # State management (filters, hover/click state) +└── utils/ # Helpers (formatters, validation, debugging) +``` + +**Current State:** +- ✅ 29 modules extracted +- ✅ 4,642 lines in modular files +- 🟡 Main orchestrator still contains ~2,059 lines +- 🟡 Largest remaining extraction: `prepareData()` (~515 lines) + +--- + +## Key Concepts + +### Nodes + +Three types of nodes in the visualization: + +1. **Contributors** - Team members (arranged in a circle) + - Position: Alphabetically around outer ring + - Color: Based on organization + - Size: Based on total contributions + +2. 
**Repositories** - GitHub projects + - Position: Determined by force simulation (depends on collaboration pattern) + - Color: Coded by ownership type + - Grouped by: Single owner, single contributor, multiple contributors + +3. **Owners** - Repository owners (intermediary nodes when repos grouped by owner) + - Position: Calculated to be between repos and contributors + - Purpose: Organize related repos visually + +### Links + +Connections between nodes: + +- **Contributor → Owner → Repository** (when repos grouped by owner) +- **Contributor → Repository** (direct links) +- Width: Based on commit count +- Opacity: Based on recency of contribution + +### Force Simulations + +D3.js force simulations position nodes without overlapping: + +1. **Owner Simulation** - Repos with single owner (25% of repos) + - Pulls repos toward owner + - Prevents overlap + +2. **Contributor Simulation** - Repos with single DevSeed contributor (50%) + - Pulls repos toward contributor + - Uses strong charge force to prevent clustering + +3. **Collaboration Simulation** - Repos with multiple contributors (20%) + - Centers at origin + - Balances contributors' influence + - Tighter clustering + +4. 
**Remaining Simulation** - Contributors outside main circle + - Positioned in outer ring + - Separated from main visualization + +--- + +## Configuration + +### `config.toml` Structure + +```toml +[repositories] +"owner/repo-name" = "Display Name" +"another-owner/project" = "Another Project" + +[contributors.devseed] +github_username = "Display Name" +another_username = "Another Name" + +[contributors.alumni] +old_member = "Old Member Name" +``` + +### Data Models (`models.py`) + +**Repository:** +- ID, full name, URL +- Metrics: stars, forks, watchers, open issues +- Metadata: languages, topics, license, created/updated dates +- Community: total contributors, DevSeed contributors, community ratio + +**Link** (Contributor → Repository relationship): +- Contributor & repo IDs +- Commit count and dates (first/last) +- Contribution span (days) +- Recency flag + +--- + +## State Management + +### Filter State + +Managed in `src/js/state/filterState.js` + +```javascript +{ + organizations: [], // Selected org filters + starsMin: null, + forksMin: null, + watchersMin: null, + language: null +} +``` + +### Interaction State + +Managed in `src/js/state/interactionState.js` + +```javascript +{ + hoverActive: false, + hoveredNode: null, + clickActive: false, + clickedNode: null, + delaunay: null // For mouse position detection +} +``` + +--- + +## Data Processing Pipeline + +### 1. Loading (`src/js/visualization/index.js`) +- Fetch CSV files (repositories.csv, contributors.csv) +- Parse into objects +- Build node and link objects + +### 2. Preparation (`src/js/data/prepare.js` - currently in progress to extract) +- Create node objects from data +- Build link arrays +- Calculate positions +- Determine colors based on metadata + +### 3. Filtering (`src/js/data/filter.js`) +- Apply active filters (organization, stars, language, etc.) +- Cascade: filter repos → filter links → filter contributors +- Rebuild only affected links + +### 4. 
Simulation (`src/js/simulations/`) +- Run D3 force simulations to position nodes +- Multiple simulations based on repo grouping +- Calculate contributor ring positions + +### 5. Rendering (`src/js/render/`) +- Draw nodes (circles with optional patterns) +- Draw links (curved paths with gradients) +- Draw labels (rotated for contributor ring) +- Draw tooltips on hover/click + +### 6. Interaction (`src/js/interaction/`) +- Track mouse position with Delaunay triangulation +- Trigger hover state on node entry +- Handle click for selection +- Filter links shown on hover + +--- + +## Rendering Pipeline + +### Canvas Layers + +The visualization uses multiple canvas layers (composited in HTML): + +1. **Main canvas** - Nodes and links (performance-critical) +2. **Tooltip canvas** - Hover/click information cards +3. **Label canvas** - Node labels +4. **Hover canvas** - Temporary highlighting + +### Drawing Performance + +**Why Canvas instead of SVG?** +- 200+ interactive nodes + 500+ links would be slow in SVG +- Canvas provides better performance for this density +- D3 force simulation updates positions ~60 times per second + +**Optimization strategies:** +- Request animation frame batching +- Partial redraws (only affected regions) +- Delaunay triangulation for fast node detection + +--- + +## Refactoring Status + +### What's Been Modularized ✅ + +| Area | Lines | Status | +|------|-------|--------| +| Config | 240 | ✅ Complete | +| Data filtering | 217 | ✅ Complete | +| State | 173 | ✅ Complete | +| Simulations | 529 | ✅ Complete | +| Interaction | 239 | ✅ Complete | +| Render (shapes, text, tooltips) | 1,474 | ✅ Complete | +| Layout | 329 | ✅ Complete | +| Utils | 606 | ✅ Complete | +| **Total Modular** | **4,642** | ✅ **Complete** | + +### What Still Needs Work 🟡 + +| Task | Lines | Priority | +|------|-------|----------| +| Extract `prepareData()` | ~515 | High | +| Extract `positionContributorNodes()` | ~117 | High | +| Simplify main `draw()` | ~166 | High | +| 
Extract helper functions | ~100 | Medium | +| **Total Remaining** | **~898** | | + +**Target:** Main `index.js` from 2,059 lines → ~300-400 lines (thin orchestrator) + +--- + +## JavaScript Module Structure + +### Current Organization + +``` +src/js/ +├── config/ +│ ├── theme.js (119 lines) # Colors, fonts, layout constants +│ └── scales.js (121 lines) # D3 scale factories +├── data/ +│ └── filter.js (217 lines) # Filtering logic +├── state/ +│ ├── filterState.js (67 lines) # Filter state +│ └── interactionState.js (106 lines) # Hover/click state +├── simulations/ +│ ├── ownerSimulation.js (125 lines) +│ ├── contributorSimulation.js (132 lines) +│ ├── collaborationSimulation.js (188 lines) +│ ├── remainingSimulation.js (84 lines) +│ └── index.js (12 lines) # Re-exports +├── interaction/ +│ ├── hover.js (87 lines) +│ ├── click.js (85 lines) +│ └── findNode.js (67 lines) +├── layout/ +│ └── resize.js (122 lines) +├── render/ +│ ├── canvas.js (207 lines) +│ ├── shapes.js (277 lines) +│ ├── text.js (275 lines) +│ ├── tooltip.js (533 lines) # Largest module +│ ├── labels.js (141 lines) +│ └── repoCard.js (248 lines) +├── utils/ +│ ├── helpers.js (121 lines) +│ ├── formatters.js (153 lines) +│ ├── validation.js (185 lines) +│ └── debug.js (147 lines) +└── visualization/ + └── index.js (14 lines) # Exports +``` + +--- + +## Theme & Customization + +### Colors (in `src/js/config/theme.js`) + +**Brand Colors:** +- Grenadier Orange (#CF3F02) +- Aquamarine Blue (#2E86AB) +- Base Gray (#443F3F) + +**Node Colors:** +- Contributors: Varied by organization (from color palette) +- Repositories: Coded by ownership pattern (single owner, single contributor, shared) +- Owners: Gray/neutral + +**Link Colors:** +- Gradient from contributor color to repo color +- Opacity: Based on recency (recent = more opaque) + +### Font Configuration + +```javascript +FONTS = { + family: "...", + baseSizeContributor: 11, // To be increased to ~14 + baseSizeRepo: 10, // To be increased to ~13 + 
baseSizeOwner: 12 // To be increased to ~15 +} +``` + +--- + +## Dependencies + +### Python +- `click` - CLI framework +- `pydantic` - Data validation +- `pygithub` - GitHub API client +- `requests` - HTTP library +- `tomli` - TOML parsing +- `pytest` - Testing + +### JavaScript +- `d3` - Visualization and force simulations +- `vitest` - Testing framework +- ~~esbuild~~ - Bundling (in package.json, but not active build) + +--- + +## How It All Fits Together + +**User visits the site:** + +1. **index.html** loads JavaScript modules from `src/js/` +2. **visualization/index.js** creates a chart function +3. Chart function: + - Loads CSV data from `assets/data/` + - Calls `prepareData()` to transform raw data into nodes/links + - Runs force simulations to position nodes + - Sets up event handlers (hover, click) + - Starts animation loop + +4. **Animation loop** (`draw()` function): + - Updates node positions (from force simulation) + - Redraws canvas + - Shows/hides tooltips based on interaction state + +5. **User interaction**: + - Mouse move → detect node via Delaunay triangulation + - Mouse over node → highlight and show tooltip + - Click node → select for detailed view + - Change filter → re-run cascade, rebuild visualization + +6. **Data refresh** (happens offline): + - User runs `uv run contributor-network data` + - GitHub data fetched and validated + - User runs `uv run contributor-network csvs` + - CSV files updated in `assets/data/` + - Next page refresh loads new data + +--- + +## Common Patterns + +### Module Pattern + +All modules export functions, not classes: + +```javascript +// src/js/utils/formatters.js +export function formatDate(timestamp) { /* ... */ } +export function formatNumber(num) { /* ... 
*/ } +``` + +```javascript +// Usage in another module +import { formatDate } from '../utils/formatters.js'; +const dateStr = formatDate(timestamp); +``` + +### State Management + +Simple, predictable state updates: + +```javascript +// Create initial state +let state = createInteractionState(); + +// Update immutably +state = setHovered(state, hoveredNode); +state = setClicked(state, clickedNode); +``` + +### Configuration + +All magic numbers and constants centralized: + +```javascript +// In config/theme.js +export const COLORS = { /* ... */ }; +export const LAYOUT = { /* ... */ }; +export const FONTS = { /* ... */ }; +``` + +--- + +## Next Steps for Refactoring + +**High Priority:** +1. Extract `prepareData()` → `data/prepare.js` (~515 lines) +2. Extract `positionContributorNodes()` → `layout/positioning.js` (~117 lines) +3. Simplify main `draw()` function + +**Medium Priority:** +4. Extract `drawHoverState()` → `render/hoverState.js` +5. Extract remaining helper functions + +**Result:** Main orchestrator becomes ~300 lines (thin coordinating layer) + +--- + +**Last Updated**: February 2026 diff --git a/docs/CLAUDE.md b/docs/CLAUDE.md new file mode 100644 index 0000000..bba0c69 --- /dev/null +++ b/docs/CLAUDE.md @@ -0,0 +1,221 @@ +# Contributor Network - Developer Guide + +**Start here.** This file provides quick orientation for anyone working with this codebase. + +## What Is This? + +An interactive D3.js web visualization of Development Seed's contributions to open-source projects. Shows the relationships between team members, repositories, and collaborators. + +**Live**: https://developmentseed.org/contributor-network + +**Repo**: https://github.com/developmentseed/contributor-network + +--- + +## For New Developers + +**First read**: [`PRD.md`](./PRD.md) (5 min) - Understand what this product is and why it exists. + +**Then read**: [`DEVELOPMENT_GUIDE.md`](./DEVELOPMENT_GUIDE.md) (10 min) - Set up your local environment. 
+ +**Then explore**: [`ARCHITECTURE.md`](./ARCHITECTURE.md) (15 min) - Understand how the code is organized. + +--- + +## Quick Start + +### Prerequisites +- [uv](https://docs.astral.sh/uv/getting-started/installation/) for Python +- [Node.js](https://nodejs.org/) 18+ for JavaScript +- GitHub personal access token with `public_repo` scope + +### Installation +```bash +uv sync # Install Python dependencies +npm install # Install JavaScript dependencies +``` + +### View Locally +```bash +python -m http.server 8000 +# Open http://localhost:8000/ +``` + +### Fetch Data & Build +```bash +export GITHUB_TOKEN="your_token_here" +uv run contributor-network data # Fetch from GitHub +uv run contributor-network csvs # Generate CSVs +uv run contributor-network build assets/data dist # Build static site +``` + +--- + +## Key Commands + +### Development +```bash +# Run CLI commands +uv run contributor-network data # Fetch contribution data from GitHub +uv run contributor-network csvs # Generate CSVs from JSON +uv run contributor-network build assets/data dist # Build static site to dist/ +uv run contributor-network discover # Find new repositories to track +uv run contributor-network list-contributors # Display all configured contributors + +# JavaScript testing +npm test # Run Vitest +npm run build # Bundle JavaScript +``` + +### Quality Checks +```bash +# Python: as in CI +uv run ruff format --check . +uv run ruff check . +uv run mypy +uv run pytest + +# Auto-fix issues +uv run ruff format . +uv run ruff check --fix . 
+``` + +--- + +## Project Structure + +``` +python/ # Python backend (CLI) + contributor_network/ # Main package + cli.py # Click CLI commands + client.py # GitHub API wrapper + config.py # Pydantic config models + models.py # Data models + tests/ # Python tests + templates/ # Jinja2 HTML templates + +src/js/ # JavaScript frontend (modular) + index.js # Barrel exports + config/ # Theme, scales, constants + data/ # Data filtering and prep + interaction/ # Hover, click handlers + layout/ # Sizing, positioning + render/ # Drawing (shapes, text, labels, tooltips) + simulations/ # D3 force simulations + state/ # State management + utils/ # Helpers, formatters, validation + __tests__/ # Unit tests + +assets/ + data/ # JSON data files (generated) + css/ # Stylesheets + img/ # Images + +index.html # Main entry point +config.toml # Repository and contributor config +``` + +--- + +## Key Files + +- **`python/contributor_network/cli.py`** - Click-based CLI with 5 subcommands +- **`python/contributor_network/client.py`** - GitHub API client wrapper +- **`python/contributor_network/models.py`** - Pydantic data models (Repo, Link, etc.) 
+- **`src/js/index.js`** - Main visualization orchestrator (still being refactored) +- **`config.toml`** - Configuration: which repos to track and who the contributors are +- **`index.html`** - Static HTML that loads the visualization + +--- + +## Code Standards + +### Python +- Type hints required (mypy validates in CI) +- Formatted with `ruff` (not black) +- Pydantic for data validation +- Click for CLI commands +- Docstrings on public functions + +### JavaScript +- ES6 modules (no transpilation) +- Modular architecture: each module <300 lines +- JSDoc comments on exported functions +- Tests with Vitest +- No external build step for development (changes auto-available in browser) + +--- + +## Documentation Structure + +| Document | Purpose | Read When | +|----------|---------|-----------| +| **PRD.md** | Product requirements and vision | First - understand the *why* | +| **DEVELOPMENT_GUIDE.md** | Setup, workflows, local development | Setting up your environment | +| **ARCHITECTURE.md** | Code organization, current state | Understanding code structure | +| **JAVASCRIPT_REFACTORING.md** | JS modularization progress and roadmap | Working on frontend code | +| **ROADMAP.md** | Project status, planned features, and implementation status | Planning new work | +| **DATA_EXPANSION_PLAN.md** | Data collection phases (1-5) with details | Adding new data fields | +| **DECISIONS.md** | Architectural decisions and tradeoffs | Curious about design choices | + +--- + +## Branding + +Development Seed colors: +- **Grenadier** (#CF3F02): Primary orange accent +- **Aquamarine** (#2E86AB): Secondary blue +- **Base** (#443F3F): Text color + +Configured in `src/js/config/theme.js`. + +--- + +## Common Tasks + +### Add a New Repository to Track +1. Edit `config.toml` - add repo to `[repositories]` section +2. Run `uv run contributor-network data` to fetch GitHub data +3. Run `uv run contributor-network csvs` to generate CSVs +4. 
Run `uv run contributor-network build assets/data dist` to rebuild the site + +### Add a New Contributor +1. Edit `config.toml` - add to `[contributors.devseed]` or `[contributors.alumni]` +2. Re-run data fetch and build (above) + +### Debug Visualization Issues +- Open DevTools (F12) +- Look for `debug-contributor-network` flag in console +- Check network tab to see what data was loaded +- See `src/js/utils/debug.js` for debug utilities + +### Run Tests +```bash +# Python +uv run pytest +uv run pytest python/tests/test_config.py::test_function_name # Single test + +# JavaScript +npm test +npm test -- --watch +``` + +--- + +## Current Project Status + +See [`ROADMAP.md`](./ROADMAP.md) for full project status, planned features, and roadmap details. + +--- + +## Need Help? + +- **Setting up?** → `DEVELOPMENT_GUIDE.md` +- **How does the code work?** → `ARCHITECTURE.md` +- **What are we building next?** → `ROADMAP.md` or `DATA_EXPANSION_PLAN.md` +- **Why was a decision made?** → `DECISIONS.md` +- **What's the product for?** → `PRD.md` + +--- + +**Last Updated**: February 2026 diff --git a/docs/DATA_EXPANSION_PLAN.md b/docs/DATA_EXPANSION_PLAN.md new file mode 100644 index 0000000..f4407d0 --- /dev/null +++ b/docs/DATA_EXPANSION_PLAN.md @@ -0,0 +1,449 @@ +# GitHub Data Expansion Plan + +## Progress Tracker + +| Phase | Status | Completed | +|-------|--------|-----------| +| **Phase 1: Quick Wins** | ✅ Complete | Jan 2026 | +| **Phase 2: Community Metrics** | ✅ Complete | Jan 2026 | +| **Phase 3: Timeline Data** | 🔲 Not started | - | +| **Phase 4: PR/Issues** | 🔲 Not started | - | +| **Phase 5: Advanced** | 🔲 Not started | - | + +--- + +## Goals + +1. **Showcase OSS Impact** - Demonstrate the value and reach of repositories DevSeed contributes to +2. **Prove Community Effort** - Show that these are truly community-driven projects, not just DevSeed initiatives +3. **Track Contributions Over Time** - Visualize when and how DevSeed has contributed to the ecosystem +4. 
**Identify Active vs Stale Projects** - Help prioritize where effort is being spent + +## Current State + +### Repository Data Collected +| Field | Source | Purpose | +|-------|--------|---------| +| `repo_stars` | `repo.stargazers_count` | Popularity metric | +| `repo_forks` | `repo.forks_count` | Ecosystem reach | +| `repo_createdAt` | `repo.created_at` | Project age | +| `repo_updatedAt` | `repo.updated_at` | Recent activity indicator | +| `repo_total_commits` | `repo.get_commits().totalCount` | Development effort | +| `repo_languages` | `repo.get_languages()` | Tech stack | +| `repo_description` | `repo.description` | Context | + +### Contributor Link Data Collected +| Field | Source | Purpose | +|-------|--------|---------| +| `commit_count` | `contributor.contributions` | Individual contribution size | +| `commit_sec_min` | First commit timestamp | When contributor started | +| `commit_sec_max` | Last commit timestamp | Most recent activity | + +--- + +## Phase 1: Quick Wins ✅ COMPLETE + +**Effort: Minimal** - Same API calls, just extracting more fields +**Value: High** - Immediately enriches the visualization +**Status: Implemented in `models.py`** + +### Repository Fields + +| Field | PyGithub Property | Value | Notes | +|-------|-------------------|-------|-------| +| `watchers_count` | `repo.subscribers_count` | Shows sustained interest beyond "star and forget" | Free with existing call | +| `open_issues_count` | `repo.open_issues_count` | Indicates active project with ongoing work | Free with existing call | +| `license` | `repo.license.spdx_id` | OSS credibility, helps filtering | Free with existing call | +| `topics` | `repo.get_topics()` | Categorization, ecosystem mapping | 1 extra lightweight call | +| `has_discussions` | `repo.has_discussions` | Community engagement indicator | Free with existing call | +| `has_wiki` | `repo.has_wiki` | Documentation investment | Free with existing call | +| `default_branch` | `repo.default_branch` | Useful for 
links | Free with existing call | +| `archived` | `repo.archived` | Filter out inactive projects | Free with existing call | + +### Contributor Link Fields + +| Field | Derivation | Value | Notes | +|-------|------------|-------|-------| +| `contribution_span_days` | `commit_sec_max - commit_sec_min` | Shows long-term stewardship | Computed from existing data | +| `is_recent_contributor` | `commit_sec_max > (now - 90 days)` | Identifies active vs historical | Computed from existing data | + +### Implementation + +```python +# models.py - Repository additions +repo_watchers: int # repo.subscribers_count +repo_open_issues: int # repo.open_issues_count +repo_license: str | None # repo.license.spdx_id if repo.license else None +repo_topics: str # ",".join(repo.get_topics()) +repo_has_discussions: bool # repo.has_discussions +repo_archived: bool # repo.archived + +# models.py - Link additions (computed) +contribution_span_days: int # (commit_sec_max - commit_sec_min) // 86400 +is_recent_contributor: bool # commit_sec_max > (now - 90 days).timestamp() +``` + +--- + +## Phase 2: Community Metrics ✅ COMPLETE + +**Effort: Low-Medium** - One additional API call per repo +**Value: Very High** - Directly addresses "community effort" goal +**Status: Implemented in `models.py` and `client.py`** + +### Repository Fields + +| Field | PyGithub Property | Value | Notes | +|-------|-------------------|-------|-------| +| `total_contributors` | `repo.get_contributors().totalCount` | Proves community involvement | 1 call per repo | +| `devseed_contributor_count` | Count from existing links | Context for DevSeed's role | Computed | +| `external_contributor_count` | `total - devseed` | Community health metric | Computed | +| `community_ratio` | `external / total` | Key "community effort" metric | Computed | + +### Derived Metrics + +| Metric | Calculation | Value | +|--------|-------------|-------| +| **Bus Factor Indicator** | If `devseed_contributor_count == 1` and that person has >80% 
commits | Risk indicator | +| **Community Health Score** | `(external_contributors / total) * 100` | Higher = more community-driven | + +### Implementation + +```python +# models.py additions +repo_total_contributors: int +repo_devseed_contributors: int +repo_external_contributors: int +repo_community_ratio: float # external / total + +# client.py - new method +def get_contributor_stats(self, repo: Repo, devseed_usernames: set[str]) -> dict: + contributors = list(repo.get_contributors()) + total = len(contributors) + devseed = sum(1 for c in contributors if c.login in devseed_usernames) + return { + "total": total, + "devseed": devseed, + "external": total - devseed, + "ratio": (total - devseed) / total if total > 0 else 0 + } +``` + +--- + +## Phase 3: Timeline Data + +**Effort: Medium** - Uses GitHub Statistics API, may need retry logic +**Value: Very High** - Enables rich temporal visualizations + +### GitHub Statistics API Overview + +GitHub pre-computes repository statistics and caches them. First request may return `202 Accepted` (computing), requiring a retry after a few seconds. 
+ +### Repository Timeline Fields + +| Field | PyGithub Method | Value | Notes | +|-------|-----------------|-------|-------| +| `weekly_commits` | `repo.get_stats_commit_activity()` | Activity heatmap, trend lines | 52 weeks of data | +| `owner_vs_community_weekly` | `repo.get_stats_participation()` | Shows community growth over time | 52 weeks, split by owner | +| `code_frequency` | `repo.get_stats_code_frequency()` | Additions/deletions over time | Shows sustained development | + +### Contributor Timeline Fields + +| Field | PyGithub Method | Value | Notes | +|-------|-----------------|-------|-------| +| `weekly_activity` | `repo.get_stats_contributors()` | Per-contributor commit timeline | Full history | +| `lines_added_total` | Sum from weekly data | Code contribution size | More meaningful than commits | +| `lines_deleted_total` | Sum from weekly data | Refactoring/maintenance work | Shows cleanup effort | +| `active_weeks_count` | Count weeks with commits > 0 | Consistency of contribution | Sustained vs burst | +| `first_contribution_week` | First week with activity | When they joined | More precise than first commit | +| `last_contribution_week` | Last week with activity | Current status | More precise than last commit | + +### Data Structures + +```python +# Weekly commit activity (repo-level) +weekly_commits: list[WeeklyCommit] # stored as JSON string in CSV + +class WeeklyCommit(BaseModel): + week: int # Unix timestamp (start of week) + total: int # Total commits that week + days: list[int] # Commits per day [Sun, Mon, ..., Sat] + +# Participation split (repo-level) +class ParticipationStats(BaseModel): + owner_total: int + community_total: int + owner_weekly: list[int] # 52 weeks + community_weekly: list[int] # 52 weeks + +# Contributor timeline (link-level) +class ContributorWeeklyStats(BaseModel): + week: int # Unix timestamp + commits: int + additions: int + deletions: int +``` + +### Implementation Notes + +```python +# client.py - with retry logic 
for stats API +import time + +def get_stats_with_retry(self, repo: Repo, stat_method: str, max_retries: int = 3): + """GitHub stats API returns 202 while computing. Retry until ready.""" + for attempt in range(max_retries): + result = getattr(repo, stat_method)() + if result is not None: + return result + time.sleep(2 ** attempt) # Exponential backoff: 1s, 2s, 4s + return None +``` + +--- + +## Phase 4: PR and Issue Activity + +**Effort: Medium-High** - Requires additional API calls, potentially many for active repos +**Value: High** - PRs often more meaningful than raw commits + +### Repository Fields + +| Field | PyGithub Method | Value | Notes | +|-------|-----------------|-------|-------| +| `total_prs` | `repo.get_pulls(state='all').totalCount` | Development activity | 1 call | +| `open_prs` | `repo.get_pulls(state='open').totalCount` | Current activity | 1 call | +| `merged_prs_30d` | Search API with date filter | Recent momentum | More expensive | +| `pr_merge_rate` | `merged / total` | Project health | Computed | +| `avg_pr_time_to_merge` | Requires iterating PRs | Maintainer responsiveness | Expensive | + +### Contributor Fields + +| Field | Method | Value | Notes | +|-------|--------|-------|-------| +| `prs_opened` | Search API: `author:{user} type:pr` | Contribution beyond commits | Per-user search | +| `prs_merged` | Search API with `is:merged` | Accepted contributions | Per-user search | +| `issues_opened` | Search API: `author:{user} type:issue` | Community engagement | Per-user search | +| `reviews_given` | GraphQL API | Quality contribution | Complex | + +### API Cost Considerations + +- **Search API**: 30 requests/minute (authenticated) +- **GraphQL**: More efficient for complex queries, 5000 points/hour +- **Recommendation**: Batch contributor queries, cache aggressively + +### Implementation Approach + +```python +# Use search API for contributor PR/issue counts +def get_contributor_activity(self, username: str, repo_full_name: str) -> dict: + # 
PRs authored in this repo + prs = self.github.search_issues( + f"repo:{repo_full_name} author:{username} type:pr" + ) + + # Issues authored in this repo + issues = self.github.search_issues( + f"repo:{repo_full_name} author:{username} type:issue" + ) + + return { + "prs_total": prs.totalCount, + "issues_total": issues.totalCount, + } +``` + +--- + +## Phase 5: Advanced Metrics + +**Effort: High** - Complex calculations, GraphQL, or external data +**Value: Medium-High** - Nice-to-have polish + +### Repository Health Metrics + +| Metric | Calculation | Value | +|--------|-------------|-------| +| **Release Frequency** | `repo.get_releases()` + date analysis | Project maturity | +| **Issue Response Time** | Avg time from issue open to first comment | Maintainer engagement | +| **PR Review Turnaround** | Avg time from PR open to first review | Community responsiveness | +| **Documentation Score** | Check for README length, CONTRIBUTING, CODE_OF_CONDUCT | Project professionalism | + +### Contributor Impact Metrics + +| Metric | Calculation | Value | +|--------|-------------|-------| +| **Code Review Activity** | GraphQL: `pullRequestReviews` | Quality contribution | +| **Cross-Repo Presence** | Count repos contributed to | Ecosystem influence | +| **Mentorship Indicator** | Reviews given vs commits made | Senior contributor signal | + +### External Data Sources + +| Source | Data Available | Integration | +|--------|----------------|-------------| +| **npm/PyPI downloads** | Package popularity | API calls to registries | +| **GitHub Sponsors** | Funding status | GraphQL API | +| **Dependent repos** | Downstream impact | `repo.network_count` (fork network size, a rough proxy only) | + +--- + +## Implementation Phases Summary + +| Phase | New Fields | API Calls Added | Effort | Value | +|-------|------------|-----------------|--------|-------| +| **1: Quick Wins** | 8 repo, 2 link | ~1 per repo | 1-2 hours | High | +| **2: Community** | 4 repo | 1 per repo | 2-3 hours | Very High | +| **3: 
Timeline** | 3 repo, 6 link | 3 per repo (with retry) | 4-6 hours | Very High | +| **4: PR/Issues** | 5 repo, 4 link | 2-4 per repo + per contributor | 1-2 days | High | +| **5: Advanced** | Variable | Variable | 2-3 days | Medium | + +--- + +## Rate Limit Considerations + +| API Type | Limit (Authenticated) | Mitigation | +|----------|----------------------|------------| +| REST API | 5,000/hour | Cache responses, batch where possible | +| Search API | 30/minute | Queue searches, respect rate limits | +| GraphQL | 5,000 points/hour | Use for complex queries only | +| Statistics API | Subject to REST limit | Implement retry logic for 202 responses | + +### Recommendations + +1. **Cache aggressively** - Repository data changes slowly, cache for 24h minimum +2. **Incremental updates** - Only fetch new data since last run +3. **Batch operations** - Group API calls, respect rate limits +4. **Store raw responses** - Keep JSON files for debugging and re-processing + +--- + +## Visualization Opportunities + +With expanded data, new visualization options become possible: + +| Data | Visualization | Goal Addressed | +|------|---------------|----------------| +| Timeline data | Stacked area chart of commits over time | Show sustained contribution | +| Community ratio | Pie/donut chart per repo | Prove community effort | +| Contributor spans | Gantt-style timeline | Show long-term stewardship | +| Cross-repo activity | Network graph thickness | Show ecosystem presence | +| Activity heatmap | Calendar view | Identify active periods | + +--- + +## Next Steps + +1. ~~**Phase 1**: Add quick-win fields to models, update CLI~~ ✅ +2. ~~**Phase 2**: Add contributor counting, compute community ratios~~ ✅ +3. **Phase 3**: Integrate statistics API with retry logic +4. Review visualization needs before Phase 4+ + +--- + +## When to Consider a Database + +The current architecture uses JSON files for individual records and CSVs for aggregated output. 
This works well for the current scale but has tradeoffs worth considering. + +### Current Approach: JSON + CSV Files + +**Pros:** +- Simple, no infrastructure to maintain +- Easy to debug (human-readable files) +- Version controllable with git +- No setup required for new contributors +- Works offline + +**Cons:** +- No query capability (must load all data into memory) +- No relationships between entities +- Duplicate API calls if data isn't cached properly +- File I/O overhead scales linearly with data size + +### When to Switch: Decision Matrix + +| Trigger | Threshold | Recommendation | +|---------|-----------|----------------| +| **Number of repositories** | > 200 | Consider SQLite | +| **Number of contributors** | > 500 | Consider SQLite | +| **Timeline data (Phase 3)** | 52 weeks × repos × contributors | Likely need SQLite | +| **Query complexity** | Need JOINs, aggregations, filtering | Need a database | +| **Multiple consumers** | Dashboard + API + reports | Consider PostgreSQL | +| **Real-time updates** | Live data refresh | Consider PostgreSQL + cache | + +### Recommended Migration Path + +**Stage 1: SQLite (Local, Single-file)** +- When: Phase 3 implementation or >100 repos +- Why: Still simple, no server, but enables SQL queries +- Schema: Normalize repos, contributors, links, weekly_stats tables +- Effort: ~1 day to migrate + +**Stage 2: PostgreSQL (If needed)** +- When: Multiple users/services need access, or need advanced features +- Why: Concurrent access, better JSON support, full-text search +- Effort: ~2-3 days + infrastructure + +### Suggested SQLite Schema (for reference) + +```sql +-- Core entities +CREATE TABLE repositories ( + id INTEGER PRIMARY KEY, + full_name TEXT UNIQUE NOT NULL, + stars INTEGER, + forks INTEGER, + total_contributors INTEGER, + community_ratio REAL, + -- ... 
other fields + fetched_at TIMESTAMP +); + +CREATE TABLE contributors ( + id INTEGER PRIMARY KEY, + login TEXT UNIQUE NOT NULL, + display_name TEXT, + is_devseed BOOLEAN +); + +-- Relationships +CREATE TABLE contributions ( + id INTEGER PRIMARY KEY, + repo_id INTEGER REFERENCES repositories(id), + contributor_id INTEGER REFERENCES contributors(id), + commit_count INTEGER, + first_commit_at TIMESTAMP, + last_commit_at TIMESTAMP, + contribution_span_days INTEGER, + UNIQUE(repo_id, contributor_id) +); + +-- Timeline data (Phase 3) +CREATE TABLE weekly_stats ( + id INTEGER PRIMARY KEY, + repo_id INTEGER REFERENCES repositories(id), + contributor_id INTEGER REFERENCES contributors(id), -- NULL for repo-level + week_start TIMESTAMP, + commits INTEGER, + additions INTEGER, + deletions INTEGER +); + +-- Indexes for common queries +CREATE INDEX idx_contributions_repo ON contributions(repo_id); +CREATE INDEX idx_weekly_stats_repo_week ON weekly_stats(repo_id, week_start); +``` + +### Current Recommendation + +**Stay with JSON/CSV for now.** The current ~50 repositories and ~30 contributors are well within the file-based approach's sweet spot. Re-evaluate when: + +1. You implement Phase 3 (timeline data significantly increases data volume) +2. You want to query data in new ways (e.g., "show me all repos where DevSeed contribution dropped in the last 6 months") +3. Build times become noticeably slow (>5 minutes for full refresh) + +--- + +*Document created: January 2026* +*Last updated: January 2026* +*For: Development Seed Contributor Network Tool* diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md new file mode 100644 index 0000000..9cb6195 --- /dev/null +++ b/docs/DECISIONS.md @@ -0,0 +1,421 @@ +# Architectural Decisions + +Record of key decisions made in the project and their rationale. 
+ +--- + +## Canvas Rendering (Not SVG) ✅ DECIDED + +**Decision:** Use HTML5 Canvas for rendering instead of SVG + +**Context:** +- Visualization has 200+ interactive nodes and 500+ links +- Need 60 FPS interaction response (hover, drag simulations) +- SVG would create DOM nodes for each element + +**Rationale:** +- Canvas provides superior performance for high-density visualizations +- SVG with D3 would create 700+ DOM elements (too slow) +- Canvas allows per-pixel control with animation frame batching +- Accepted tradeoff: Canvas requires custom tooltip/interaction logic (not automatic) + +**Consequences:** +- Responsible for all rendering logic (no automatic redraw) +- Must implement custom hit detection (solved with Delaunay triangulation) +- More complex tooltip positioning +- Better performance (60 FPS achievable) + +**Alternatives Considered:** +- WebGL: Overkill for this use case, harder to debug +- Hybrid (SVG + Canvas): Complexity not worth it + +--- + +## JSON/CSV Data Files (Not Database) ✅ DECIDED + +**Decision:** Store data as JSON files and export to CSV (not database) + +**Context:** +- Currently ~50 repositories, ~30 contributors +- Data collected offline (GitHub API → files → visualization) +- Need simple deployment (static site to CDN) + +**Rationale:** +- No infrastructure to manage (no database server) +- Easy to debug (human-readable files) +- Version controllable with git +- Simple to onboard new contributors (just `uv sync`) +- Works offline for local development + +**Consequences:** +- Must load all data into memory (works fine at current scale) +- No real-time filtering/querying capabilities +- File I/O overhead (negligible at current size) + +**When to Reconsider:** +- > 200 repositories +- > 500 contributors +- Need real-time updates +- Multiple services need access + +**Migration Path:** +- Stage 1: SQLite (local, single-file) when data volume warrants +- Stage 2: PostgreSQL (if multiple consumers needed) + +See `DATA_EXPANSION_PLAN.md` 
for detailed database discussion. + +--- + +## Modular JavaScript Architecture ✅ DECIDED + +**Decision:** Refactor JavaScript from monolith to modular structure + +**Context:** +- Original codebase: 3,400+ line `index.js` (from ORCA template) +- Hard to review, test, extend +- Multiple responsibilities: data prep, layout, rendering, interaction + +**Rationale:** +- Each module has single responsibility +- Easier to test in isolation +- Improves code review process (<300 lines per file) +- Supports future contributors and maintainability +- Each module becomes focused and reusable + +**Progress:** +- 60% complete (29 modules extracted) +- 4,642 lines in modular code +- Main file reduced from 3,400 → 2,059 lines +- ~900 lines remaining to extract + +**Target State:** +- Main orchestrator: ~300 lines (thin coordinating layer) +- All other modules: <300 lines each +- Clear data flow: Load → Prepare → Simulate → Render → Interact + +**Implementation Approach:** +- Extract gradually (don't rewrite from scratch) +- Keep tests passing after each extraction +- Each commit focuses on one extraction +- No rewrite of logic, just reorganization + +--- + +## Pydantic for Data Validation ✅ DECIDED + +**Decision:** Use Pydantic models for all data structures (Python) + +**Context:** +- Need to validate GitHub API responses +- Multiple sources of data (GitHub API, config files, generated) +- Type safety and runtime validation needed + +**Rationale:** +- Validates structure at entry point +- Type hints catch errors early +- Clear error messages when validation fails +- Automatic JSON serialization +- Works with mypy for static type checking + +**Implementation:** +- `models.py` defines Pydantic models for Repository, Link, etc. 
+- `client.py` converts GitHub API responses to our models +- `config.py` validates TOML configuration + +**Alternatives Considered:** +- Dataclasses: Less validation capability +- Plain dicts: No type safety, error-prone + +--- + +## Click for CLI Framework ✅ DECIDED + +**Decision:** Use Click for Python CLI commands + +**Context:** +- Need multiple subcommands (data, csvs, build, discover, list-contributors) +- Must be easy to use and document +- Should work well with automated workflows + +**Rationale:** +- Simple decorator-based command definition +- Automatic help text generation +- Type-safe argument/option handling +- Easy to test + +**Commands Implemented:** +- `data` - Fetch from GitHub +- `csvs` - Generate CSV exports +- `build` - Build static site +- `discover` - Find new repos +- `list-contributors` - Show configured contributors + +**Alternatives Considered:** +- argparse: More verbose +- Typer: Newer, nice syntax but less mature at time of decision + +--- + +## D3.js Force Simulations (Not Manual Layout) ✅ DECIDED + +**Decision:** Use D3 force-directed simulations to position nodes + +**Context:** +- Visualization shows relationships between nodes +- Hundreds of edges between nodes +- Need to avoid overlap and show structure + +**Rationale:** +- Force simulations naturally cluster related items +- Prevents overlap automatically (collision detection) +- Produces intuitive, readable layouts +- Interactive repositioning possible in future +- D3 provides proven, tested implementation + +**Architecture:** +- Four separate simulations for different repo grouping patterns +- Each simulation optimized for its use case +- Contributes to final layout naturally + +**Alternatives Considered:** +- Hierarchical tree layout: Doesn't fit network structure +- Grid layout: Too regular, loses relationship information +- Manual positioning: Not scalable as data grows + +--- + +## TypeScript Not Used ✅ DECIDED + +**Decision:** Use vanilla JavaScript (ES6 modules) without 
TypeScript + +**Context:** +- Relatively small frontend codebase +- Team comfort with JavaScript +- Deployment to static site (no build step needed) +- Fast iteration during development + +**Rationale:** +- No build step overhead during development +- Files immediately available in browser +- Changes visible after a browser refresh (no rebuild) +- Simpler development workflow +- Small module size keeps files manageable + +**Trade-offs:** +- No compile-time type checking +- Rely on JSDoc for type hints +- Rely on testing for correctness + +**Alternatives Considered:** +- TypeScript: Good for larger projects, but adds complexity +- Flow: Similar issues to TypeScript + +--- + +## No Transpilation (ES6 Modules) ✅ DECIDED + +**Decision:** Use modern ES6 modules directly (no Babel transpilation) + +**Context:** +- All modern browsers support ES6 modules +- Simplifies development workflow +- Reduces build complexity + +**Rationale:** +- Developers can see their changes immediately +- No build step required during development +- Lower cognitive overhead +- Works fine for this project's scale + +**Requirements:** +- Users must have modern browsers (works with all current major browsers) +- IE11 is not supported, since it lacks ES module support (acceptable for this project) + +**Deployment:** +- esbuild for production bundling (if needed) +- Currently deployed as static modules + +--- + +## Separate Simulations by Repo Type ✅ DECIDED + +**Decision:** Use different D3 force simulations based on how repos are grouped + +**Context:** +- Some repos belong to single owner +- Some repos have single DevSeed contributor +- Some repos have multiple collaborators +- Positioning needs differ for each type + +**Rationale:** +- Optimized force parameters for each scenario +- Natural clustering by collaboration pattern +- Cleaner visual hierarchy +- Prevents one type dominating layout + +**Implementation:** +``` +- ownerSimulation: Repos with single owner +- contributorSimulation: Repos with single DevSeed contributor +- 
collaborationSimulation: Repos shared between multiple contributors +- remainingSimulation: Contributors outside main circle +``` + +**Alternatives Considered:** +- Single universal simulation: Would require compromise on parameters +- Manual positioning: Doesn't scale, not reusable + +--- + +## Configuration via TOML ✅ DECIDED + +**Decision:** Use TOML for configuration (repositories, contributors) + +**Context:** +- Need to specify: + - Which repos to track + - Which contributors to include + - How to group/filter data + +**Rationale:** +- Human-readable and writable +- Better than JSON for configuration (supports comments, less punctuation) +- Python stdlib support (`tomllib` in Python 3.11+, with `tomli` as the backport) +- Easy to edit without breaking structure + +**Example:** +```toml +[repositories] +"owner/repo" = "Display Name" + +[contributors.devseed] +github_username = "Display Name" +``` + +**Alternatives Considered:** +- JSON: More formal, harder to write +- YAML: Whitespace sensitivity can be error-prone +- Python file: Security concerns, harder to review + +--- + +## Removed ORCA Code ✅ DECIDED + +**Decision:** Remove ORCA-specific code and rebrand visualization + +**Context:** +- Project started as ORCA (top-contributor-network) +- Needed to make it DevSeed-specific +- ORCA code was a foundation, not meant to be kept + +**Changes Made:** +- ✅ Renamed `createORCAVisual` → `createContributorNetworkVisual` +- ✅ Removed ORCA-specific UI elements +- ✅ Updated debug flags (`orca-debug` → `debug-contributor-network`) +- ✅ Removed ORCA theming logic +- ✅ Updated branding to Development Seed colors +- ✅ Kept original MPL license and attribution + +**Rationale:** +- Make it clear this is the DevSeed visualization +- Avoid confusion with original ORCA project +- Simplify codebase (removed unused features) +- Establish clear ownership + +**Attribution:** +- License: MPL 2.0 (from ORCA) +- Credit: Original ORCA by Nadieh Bremer +- Link: https://github.com/nbremer/ORCA + +--- + +## Ruff for Python Linting/Formatting ✅ DECIDED + +**Decision:** 
Use Ruff instead of Black + Flake8 + isort + +**Context:** +- Python code quality tooling landscape fragmented +- Want unified approach +- Need fast, reliable tools + +**Rationale:** +- Single tool for format + lint + import sorting +- Very fast (written in Rust) +- Zero-config setup +- Compatible with Black formatting +- Better than individual tools + +**Configuration:** +- `pyproject.toml` defines settings +- CI runs `ruff format --check`, `ruff check`, `mypy` + +**Alternatives Considered:** +- Black + Flake8: Works but fragmented +- Pylint: Slower, more false positives +- Autopep8: Older approach + +--- + +## Vitest for JavaScript Testing ✅ DECIDED + +**Decision:** Use Vitest for unit testing JavaScript + +**Context:** +- Need to test modules independently +- Want fast test execution +- Want to test in Node (not browser) + +**Rationale:** +- Vitest is fast (built on Vite) +- Compatible with Jest syntax +- Zero-config with Vite setup +- Great for module testing + +**Test Coverage:** +- 75+ tests across modules +- Tests for filtering, validation, formatting, helpers +- More tests added as new modules extracted + +**Alternatives Considered:** +- Jest: Slower, larger +- Mocha: More setup required +- QUnit: Older approach + +--- + +## No Component Libraries ✅ DECIDED + +**Decision:** Build UI with vanilla HTML/CSS, no React/Vue/etc. + +**Context:** +- Small focused app (visualization + tooltip + controls) +- Performance critical +- No complex state management needs + +**Rationale:** +- Minimal dependencies +- Full control over rendering +- Better performance (Canvas + minimal DOM) +- Simpler deployment (static files) + +**Trade-off:** +- More manual DOM management for tooltips +- Build tooltips from scratch + +**When to Reconsider:** +- If dashboard complexity grows significantly +- If multiple views needed beyond visualization + +--- + +## Summary of Key Principles + +1. **Performance First** - Canvas rendering, modular code, fast tooling +2. 
**Simplicity** - No unnecessary frameworks, JSON/CSV data, static deployment +3. **Maintainability** - Modularization, testing, clear separation of concerns +4. **Scalability** - Design for growth, but don't over-engineer prematurely +5. **Attribution** - Respect original creators, use proper licensing + +--- + +**Last Updated**: February 2026 diff --git a/docs/DEVELOPMENT_GUIDE.md b/docs/DEVELOPMENT_GUIDE.md new file mode 100644 index 0000000..0b0185a --- /dev/null +++ b/docs/DEVELOPMENT_GUIDE.md @@ -0,0 +1,467 @@ +# Development Guide + +How to set up your local environment, run the project, and make changes. + +--- + +## Prerequisites + +Before getting started, install: + +- **[uv](https://docs.astral.sh/uv/getting-started/installation/)** - Fast Python package manager (required) +- **[Node.js](https://nodejs.org/)** 18+ - For JavaScript tooling +- **[Git](https://git-scm.com/)** - For version control +- **GitHub personal access token** - With `public_repo` scope (for fetching data) + +### Getting a GitHub Token + +1. Go to https://github.com/settings/tokens +2. Click "Generate new token (classic)" +3. Give it a name (e.g., "contributor-network") +4. Check `public_repo` scope +5. Generate and copy the token +6. Store it somewhere safe (you'll use it for the `data` command) + +--- + +## Installation + +### 1. Clone the Repository + +```bash +git clone https://github.com/developmentseed/contributor-network.git +cd contributor-network +``` + +### 2. Install Python Dependencies + +```bash +uv sync +``` + +This installs all Python dependencies specified in `pyproject.toml` into a virtual environment. + +### 3. Install JavaScript Dependencies + +```bash +npm install +``` + +This installs D3.js, Vitest, and other frontend tooling. 
+ +--- + +## Running Locally + +### Option A: View the Current Build + +If you already have data in `assets/data/`, you can view the built site locally: + +```bash +python -m http.server 8000 +``` + +Then open http://localhost:8000/ in your browser. + +### Option B: Full Workflow - Fetch Data & Build + +To update the visualization with fresh GitHub data: + +```bash +# 1. Set your GitHub token +export GITHUB_TOKEN="your_token_here" + +# 2. (Optional) Discover new repos that multiple team members contribute to +uv run contributor-network discover --min-contributors 2 + +# 3. Edit config.toml to add/remove repos or contributors + +# 4. Fetch data from GitHub +uv run contributor-network data assets/data assets/data + +# 5. Generate CSV files +uv run contributor-network csvs assets/data + +# 6. Build the static site +uv run contributor-network build assets/data dist + +# 7. View the built site +cd dist && python -m http.server 8000 +# Open http://localhost:8000/ +``` + +--- + +## Development Workflows + +### Making Frontend Changes + +The JavaScript files in `src/js/` are auto-available to the browser without a build step during development. + +**Workflow:** +1. Make changes to files in `src/js/` +2. Refresh http://localhost:8000/ in your browser +3. See changes immediately (no build required) +4. Run tests to verify: `npm test` + +**Special cases:** +- If you modify `src/js/chart.js` (the main visualization), it compiles to `chart.js` in the root +- If you add new modules to `src/js/`, export them from `src/js/index.js` + +### Making Backend Changes + +Python CLI changes take effect immediately (no build step needed). + +**Workflow:** +1. Make changes to `python/contributor_network/` +2. Re-run the CLI command: `uv run contributor-network ` +3. Changes are reflected +4. Run tests: `uv run pytest` + +### Adding a New Repository to Track + +1. **Get the repo URL** - e.g., `owner/repo-name` +2. 
**Edit `config.toml`**: + ```toml + [repositories] + "owner/repo-name" = "Display Name" + ``` +3. **Fetch fresh data**: + ```bash + export GITHUB_TOKEN="your_token_here" + uv run contributor-network data assets/data assets/data + ``` +4. **Regenerate CSVs**: + ```bash + uv run contributor-network csvs assets/data + ``` +5. **Rebuild the site**: + ```bash + uv run contributor-network build assets/data dist + ``` + +### Adding a New Contributor + +1. **Edit `config.toml`**: + ```toml + [contributors.devseed] + github_username = "Display Name" + + # Or for alumni/external: + [contributors.alumni] + github_username = "Display Name" + ``` +2. **Fetch data and rebuild** (same as above) + +### Customizing the Visualization + +**Colors & Fonts**: Edit `src/js/config/theme.js` + +**Layout Constants**: Edit the `LAYOUT` object in `src/js/config/theme.js` + +**Filters Available**: Check `src/js/state/filterState.js` to see what filters are available + +**Data Filtering Logic**: See `src/js/data/filter.js` for how filters are applied + +--- + +## Quality Checks & Tests + +### Python Quality Checks + +```bash +# Format check (no changes) +uv run ruff format --check . + +# Lint check (no changes) +uv run ruff check . + +# Type checking +uv run mypy + +# Run tests +uv run pytest + +# Run a specific test +uv run pytest python/tests/test_config.py::test_function_name +``` + +### Auto-Fix Python Issues + +```bash +# Auto-format all files +uv run ruff format . + +# Auto-fix fixable lint issues +uv run ruff check --fix . +``` + +### JavaScript Tests + +```bash +# Run all tests +npm test + +# Run tests in watch mode (re-run on file changes) +npm test -- --watch + +# Run a specific test file +npm test -- src/js/__tests__/filter.test.js +``` + +### Run All Checks (As in CI) + +```bash +# Python +uv run ruff format --check . +uv run ruff check . 
+uv run mypy +uv run pytest + +# JavaScript +npm test +``` + +--- + +## CLI Commands Reference + +### `list-contributors` + +List all configured contributors by category: + +```bash +uv run contributor-network list-contributors +``` + +**Output**: Shows current DevSeed, alumni, and external contributors. + +### `discover` + +Find new repositories where multiple DevSeed employees have contributed: + +```bash +export GITHUB_TOKEN="your_token_here" +uv run contributor-network discover --min-contributors 2 --limit 50 +``` + +**Options:** +- `--min-contributors N` - Minimum number of DevSeed contributors in repo (default: 2) +- `--limit N` - Limit results to N repos (default: 50) + +**Output**: Shows repos not yet in `config.toml` where DevSeed has activity. + +### `data` + +Fetch contribution data from GitHub for all configured repositories: + +```bash +export GITHUB_TOKEN="your_token_here" +uv run contributor-network data assets/data assets/data +``` + +**Arguments:** +- First arg: Input directory (where to save JSON data) - usually `assets/data` +- Second arg: Output directory (for this command, same as input) - usually `assets/data` + +**Options:** +- `--all-contributors` - Include alumni/friends, not just current DevSeed employees + +**Output**: Creates JSON files for each repository with contribution data. + +### `csvs` + +Generate CSV files from the fetched JSON data: + +```bash +uv run contributor-network csvs assets/data +``` + +**Argument:** Directory containing JSON files (usually `assets/data`) + +**Output:** Creates: +- `repositories.csv` - Repository metadata +- `contributors.csv` - Contributor-to-repo relationships + +### `build` + +Build the static site to deploy: + +```bash +uv run contributor-network build assets/data dist +``` + +**Arguments:** +- First arg: Data directory (`assets/data`) +- Second arg: Output directory (`dist`) + +**Output:** Creates `dist/` with static HTML/CSS/JS ready to deploy. 
+ +--- + +## Project Structure for Developers + +### Python Backend + +``` +python/ +├── contributor_network/ +│ ├── __init__.py +│ ├── cli.py # Click CLI commands (entry point) +│ ├── client.py # GitHub API wrapper (uses PyGithub) +│ ├── config.py # Pydantic config models +│ ├── models.py # Data models (Repo, Link, etc.) +│ └── __main__.py # CLI entry point +├── templates/ +│ └── index.html.j2 # Jinja2 template for index.html +└── tests/ + └── test_*.py # Unit tests +``` + +**Key entry points:** +- `python/contributor_network/cli.py` - All CLI commands defined here +- `python/contributor_network/client.py` - GitHub API integration +- `python/contributor_network/models.py` - Data structure definitions + +### JavaScript Frontend + +``` +src/js/ +├── index.js # Barrel exports (re-exports all modules) +├── visualization/ +│ └── index.js # Main visualization factory +├── config/ # Configuration +│ ├── theme.js # Colors, fonts, layout constants +│ └── scales.js # D3 scale factories +├── data/ # Data operations +│ └── filter.js # Filtering logic +├── interaction/ # Event handlers +│ ├── hover.js # Hover event handling +│ ├── click.js # Click event handling +│ └── findNode.js # Node detection via Delaunay +├── layout/ # Layout & positioning +│ └── resize.js # Canvas resize handling +├── render/ # Drawing functions +│ ├── canvas.js # Canvas setup +│ ├── shapes.js # Shape drawing utilities +│ ├── text.js # Text rendering +│ ├── tooltip.js # Tooltip rendering +│ ├── labels.js # Node labels +│ └── repoCard.js # Repo details card +├── simulations/ # D3 force simulations +│ ├── ownerSimulation.js +│ ├── contributorSimulation.js +│ ├── collaborationSimulation.js +│ └── remainingSimulation.js +├── state/ # State management +│ ├── filterState.js # Filter state +│ └── interactionState.js # Hover/click state +├── utils/ # Utilities +│ ├── helpers.js # Math utilities +│ ├── formatters.js # Date/number formatting +│ ├── validation.js # Data validation +│ └── debug.js # Debug logging 
+└── __tests__/ # Unit tests +``` + +--- + +## Debugging Tips + +### JavaScript Debugging + +**In Browser DevTools:** +1. Open DevTools (F12) +2. Check the Console for errors +3. Look for debug output (the code logs with `debug-contributor-network` flag) +4. Inspect the Network tab to see what data was loaded +5. Check the Elements tab to inspect the canvas and DOM + +**Enable Debug Logging:** +```javascript +// In the browser console +localStorage.setItem('debug', 'debug-contributor-network'); +// Reload the page +``` + +**Check Loaded Data:** +```javascript +// In the browser console +// After the visualization loads, you can access: +console.log(window.data); // Raw data +console.log(window.nodes); // Processed nodes +``` + +### Python Debugging + +**Add print statements:** +```python +# In Python code +print(f"Debug: {variable_name}") # Will show in terminal + +# Or use logging +import logging +logger = logging.getLogger(__name__) +logger.debug(f"Debug message: {value}") +``` + +**Run a specific test with output:** +```bash +uv run pytest python/tests/test_file.py -v -s +``` + +The `-s` flag shows print statements and logging output. + +--- + +## Common Issues & Solutions + +### Issue: "GitHub API rate limit exceeded" + +**Solution:** +- Make sure you're using a GitHub token: `export GITHUB_TOKEN="your_token"` +- Unauthenticated requests have a much lower limit (60/hour vs 5,000/hour) +- Wait an hour for the limit to reset, or wait for exponential backoff retry logic + +### Issue: `uv: command not found` + +**Solution:** +```bash +# Install uv if you haven't +curl -LsSf https://astral.sh/uv/install.sh | sh + +# Or on macOS with homebrew +brew install uv +``` + +### Issue: Changes to `src/js/` aren't showing up + +**Solution:** +1. Make sure you're running `python -m http.server 8000` +2. Hard-refresh your browser: Ctrl+Shift+R (or Cmd+Shift+R on Mac) +3. 
Check that the file was actually saved + +### Issue: Tests are failing + +**Solution:** +```bash +# Run tests with verbose output +npm test -- --reporter=verbose + +# Or for Python +uv run pytest -v +``` + +--- + +## Where to Get Help + +- **Setup problems**: Check this file first +- **Code structure questions**: See `ARCHITECTURE.md` +- **What should I work on next?**: See `ROADMAP.md` +- **How does filtering work?**: See the code comments in `src/js/data/filter.js` +- **Data expansion ideas**: See `DATA_EXPANSION_PLAN.md` + +--- + +**Last Updated**: February 2026 diff --git a/docs/JAVASCRIPT_REFACTORING.md b/docs/JAVASCRIPT_REFACTORING.md new file mode 100644 index 0000000..3338db1 --- /dev/null +++ b/docs/JAVASCRIPT_REFACTORING.md @@ -0,0 +1,306 @@ +# JavaScript Refactoring Progress & Roadmap + +Current status of the JavaScript modularization effort. + +--- + +## Overall Progress: 60% Complete ✅ + +| Metric | Before | Current | Target | Status | +|--------|--------|---------|--------|--------| +| Main file size | 3,400+ lines | 2,059 lines | ~300-400 lines | 🟡 In Progress | +| Modular files | 0 | 29 modules | 29+ modules | ✅ Complete | +| Total modular code | 0 lines | 4,642 lines | ~4,500 lines | ✅ Complete | +| Largest module | N/A | 533 lines (tooltip.js) | <300 lines | 🟡 Needs work | + +--- + +## What's Been Done ✅ + +### Phase 1: Configuration & Constants +- ✅ `src/js/config/theme.js` - Colors, fonts, layout constants (119 lines) +- ✅ `src/js/config/scales.js` - D3 scale factories (121 lines) +- **Result:** Centralized configuration, easy to customize + +### Phase 2: State Management +- ✅ `src/js/state/filterState.js` - Filter state management (67 lines) +- ✅ `src/js/state/interactionState.js` - Hover/click state (106 lines) +- **Result:** Clear separation of state concerns + +### Phase 3: Force Simulations +- ✅ `src/js/simulations/ownerSimulation.js` (125 lines) +- ✅ 
`src/js/simulations/collaborationSimulation.js` (188 lines) +- ✅ `src/js/simulations/remainingSimulation.js` (84 lines) +- **Result:** 529 lines extracted, easier to test and modify + +### Phase 4: Interaction Handlers +- ✅ `src/js/interaction/hover.js` - Mouse hover handling (87 lines) +- ✅ `src/js/interaction/click.js` - Click handling (85 lines) +- ✅ `src/js/interaction/findNode.js` - Node detection via Delaunay (67 lines) +- **Result:** Separated event logic from rendering + +### Phase 5: Render Functions +- ✅ `src/js/render/shapes.js` - Shape drawing (277 lines) +- ✅ `src/js/render/text.js` - Text utilities (275 lines) +- ✅ `src/js/render/tooltip.js` - Tooltip rendering (533 lines) ⚠️ +- ✅ `src/js/render/labels.js` - Node labels (141 lines) +- ✅ `src/js/render/repoCard.js` - Repo card rendering (248 lines) +- ✅ `src/js/render/canvas.js` - Canvas setup (207 lines) +- **Result:** Rendering logic broken down by component + +### Phase 6: Layout & Utilities +- ✅ `src/js/layout/resize.js` - Resize handling (122 lines) +- ✅ `src/js/utils/helpers.js` - Math utilities (121 lines) +- ✅ `src/js/utils/formatters.js` - Date/number formatting (153 lines) +- ✅ `src/js/utils/validation.js` - Data validation (185 lines) +- ✅ `src/js/utils/debug.js` - Debug logging (147 lines) +- **Result:** Utilities organized by concern + +### Phase 7: Data Management +- ✅ `src/js/data/filter.js` - Filtering logic (217 lines) +- **Result:** Pure functions for data transformation + +--- + +## What Still Needs Work 🟡 + +### High Priority + +**1. Extract `prepareData()` function** (~515 lines) +- **Location:** Currently in main `index.js`, lines 683-1198 +- **Should move to:** `src/js/data/prepare.js` +- **What it does:** Transforms raw CSV data into nodes and links +- **Complexity:** High - depends on many local variables +- **Effort:** 4-6 hours +- **Why important:** Largest single extraction remaining, holds up refactoring + +**2. 
Extract `positionContributorNodes()` function** (~117 lines) +- **Location:** Currently in main `index.js`, lines 1212-1310 +- **Should move to:** `src/js/layout/positioning.js` +- **What it does:** Calculates contributor ring positions +- **Complexity:** Medium - clear inputs/outputs +- **Effort:** 2-3 hours +- **Why important:** Separates layout logic from main orchestrator + +**3. Simplify `draw()` function** (~166 lines) +- **Location:** Currently in main `index.js`, lines 448-514 +- **Should move to:** `src/js/render/draw.js` or keep as thin orchestrator +- **What it does:** Main drawing loop, calls render functions +- **Complexity:** Medium - mostly orchestration +- **Effort:** 2-3 hours +- **Why important:** Main loop should be readable at a glance + +### Medium Priority + +**4. Extract `drawHoverState()` function** (~130 lines) +- **Location:** Currently in main `index.js`, lines 1538-1668 +- **Should move to:** `src/js/render/hoverState.js` +- **Complexity:** Medium +- **Effort:** 2 hours + +**5. Extract helper functions** (~100 lines spread across code) +- `isValidContributor()` - Validation helper +- `syncDelaunayVars()` - Delaunay state sync +- `calculateEdgeCenters()` - Link path calculations +- `calculateLinkGradient()` - Gradient colors +- **Should move to:** `src/js/utils/` modules + +**6. Remove wrapper functions** (~50 lines) +- Temporary compatibility layer after migration +- Can be removed once main extraction complete + +### Lower Priority + +**7. Extract canvas setup code** (~60 lines) +- Already partially in `canvas.js` +- Can move remaining setup to initialization module + +--- + +## Detailed Extraction Roadmap + +### Step 1: Extract `prepareData()` → `src/js/data/prepare.js` + +**What to do:** +1. Create new file `src/js/data/prepare.js` +2. Move `prepareData()` function from main `index.js` +3. Extract helper functions it depends on +4. Import it in main `index.js` +5. Update tests to test the module directly +6. 
Verify build still works + +**Dependencies to handle:** +- Uses colors from `theme.js` (import them) +- Uses scales from `scales.js` (import them) +- Uses validation from `validation.js` (import them) +- Creates objects with specific structure (document in JSDoc) + +**Expected after extraction:** +- Main `index.js` shrinks by ~515 lines +- Easier to test data transformation separately +- Clearer data flow from CSV → nodes/links + +### Step 2: Extract `positionContributorNodes()` → `src/js/layout/positioning.js` + +**What to do:** +1. Create new file `src/js/layout/positioning.js` +2. Move `positionContributorNodes()` function +3. Import from main `index.js` +4. Add unit tests for positioning logic + +**Dependencies:** +- Layout constants from `theme.js` (already exported) +- Math utilities from `helpers.js` (import as needed) + +**Expected after extraction:** +- Clear separation of layout vs rendering +- Easier to test position calculations +- Could support alternative layout algorithms in future + +### Step 3: Simplify `draw()` function + +**What to do:** +1. Review current `draw()` - what's orchestration vs. logic? +2. Extract logic into separate modules where applicable +3. Reduce `draw()` to simple: get state → call render functions → schedule next frame +4. 
Consider creating `src/js/render/draw.js` for orchestration + +**Expected after extraction:** +- Main loop readable in 5-10 lines +- Each frame clearly shows: what updates, what renders +- Easier to understand frame-by-frame flow + +--- + +## Module Size Targets + +After all extractions, target <300 lines per file: + +| File | Current | Target | +|------|---------|--------| +| `src/js/data/prepare.js` | N/A | ~400 lines | +| `src/js/layout/positioning.js` | N/A | ~100 lines | +| `src/js/render/draw.js` | N/A | ~100 lines | +| `src/js/render/hoverState.js` | N/A | ~120 lines | +| Main `index.js` | 2,059 | ~300-400 lines | +| All other modules | Various | <300 lines ✅ | + +--- + +## Testing Strategy + +### Current Test Coverage + +- 75 tests across extracted modules +- Tests for filtering, validation, formatting, helpers +- Vitest framework + +### Testing New Extractions + +When extracting new functions: +1. Write unit tests *first* if not already present +2. Test the extracted function in isolation +3. Test integration with dependent modules +4. Run full test suite to ensure no regressions + +**Example for `prepareData()`:** +```javascript +// src/js/data/__tests__/prepare.test.js +import { prepareData } from '../prepare.js'; + +describe('prepareData', () => { + it('transforms raw data into nodes and links', () => { + const raw = { /* ... 
*/ }; + const result = prepareData(raw, config); + expect(result.nodes).toBeDefined(); + expect(result.links).toBeDefined(); + }); +}); +``` + +--- + +## Implementation Timeline + +| Task | Effort | Estimated Duration | +|------|--------|-------------------| +| Extract `prepareData()` | High | 4-6 hours | +| Extract `positionContributorNodes()` | Medium | 2-3 hours | +| Simplify `draw()` | Medium | 2-3 hours | +| Extract `drawHoverState()` | Medium | 2 hours | +| Extract helpers & cleanup | Low-Med | 2-3 hours | +| **Total** | **High** | **12-18 hours** | + +**Recommended breakdown:** +- Session 1: Extract `prepareData()` (largest impact) +- Session 2: Extract positioning & simplify draw +- Session 3: Polish remaining functions + +--- + +## Migration Checklist + +For each extraction, follow this checklist: + +- [ ] Create new module file +- [ ] Copy code to new file +- [ ] Identify and resolve dependencies +- [ ] Add JSDoc comments +- [ ] Write or update unit tests +- [ ] Update main `index.js` to import +- [ ] Run full test suite: `npm test` +- [ ] Verify build: `npm run build` +- [ ] Test in browser (visual regression) +- [ ] Commit with clear message + +**Commit message template:** +``` +refactor(js): Extract [function_name] to [new_module] + +- Moved [function_name] from index.js to [new_module] +- Reduces index.js by [X] lines +- Adds [Y] lines to [new_module] +- All tests passing +``` + +--- + +## Benefits of Completing This + +✅ **Main orchestrator becomes readable** - ~300 lines instead of 2,000+ + +✅ **Each module has single responsibility** - Easier to understand + +✅ **Improved testability** - Can test functions in isolation + +✅ **Better git history** - Smaller, focused commits + +✅ **Easier code review** - <300 lines per module is reviewable in 10-15 min + +✅ **Reduced maintenance burden** - Clearer code, fewer dependencies per file + +✅ **Foundation for future work** - Makes adding features easier + +--- + +## How to Track Progress + +1. 
Check the line count in main `index.js`: `wc -l src/js/index.js` +2. Run tests after each extraction: `npm test` +3. Check module sizes: `wc -l src/js/*/*.js` +4. Monitor with git history: `git log --oneline src/js/` + +--- + +## Questions to Ask When Extracting + +- **Does this function have a single, clear responsibility?** ✓ +- **Can it be tested independently?** (If not, break it down) +- **Does it depend on many external variables?** (If yes, pass as parameters) +- **Is it more than 200 lines?** (If yes, consider breaking further) +- **Will other modules want to use this?** (If yes, export clearly) + +--- + +**Last Updated**: February 2026 diff --git a/docs/PRD.md b/docs/PRD.md new file mode 100644 index 0000000..d67473e --- /dev/null +++ b/docs/PRD.md @@ -0,0 +1,290 @@ +# Product Requirements Document: Contributor Network Visualization + +## Executive Summary + +**Contributor Network** is an interactive web visualization that showcases Development Seed's contributions to open-source projects. It displays the relationships between Development Seed team members, the repositories they contribute to, and the broader ecosystem of collaborators on those projects. + +The tool serves three core purposes: +1. **Showcase OSS Impact** - Demonstrate the value and reach of repositories DevSeed contributes to +2. **Prove Community Effort** - Show that these are truly community-driven projects, not just DevSeed initiatives +3. 
**Track Contributions Over Time** - Visualize when and how DevSeed has contributed to the ecosystem + +--- + +## Product Overview + +### What It Is + +A D3.js-based interactive network visualization that: +- Displays DevSeed contributors as nodes arranged in a circle +- Shows repositories as nodes positioned based on collaboration patterns +- Visualizes connections (links) between contributors and repos +- Provides filtering by organization and repository metrics +- Offers rich tooltips showing contributor and repository statistics +- Enables interactive exploration through hover and click interactions + +### Live Demo + +https://developmentseed.org/contributor-network + +### Core Features + +#### 1. **Network Visualization** +- Contributors arranged alphabetically around a central ring +- Repositories grouped by ownership pattern (single owner, multiple contributors, collaborations) +- Force-directed layout creates natural clustering of related projects +- Visual flows show which contributors have worked on which repos + +#### 2. **Filtering & Discovery** +- Filter by organization (e.g., show only repos where "Conservation Labs" contributed) +- Filter repositories by: + - Minimum stars + - Minimum forks + - Minimum watchers + - Programming language +- Clear all filters with one click + +#### 3. **Interactive Exploration** +- **Hover**: Highlight a contributor or repo to see its connections +- **Click**: Select a contributor to see detailed stats about their contributions +- **Hover + Click**: When a contributor is selected, hover over repos to see the specific link details (commits, dates) + +#### 4. **Rich Information Display** +- **Contributor Tooltips**: Name, organization, contribution count, date range +- **Repository Tooltips**: Name, stars, forks, watchers, languages, open issues, community metrics +- **Statistics**: Shows commit counts, contribution spans, community involvement ratios + +#### 5. 
**Visual Design** +- Uses Development Seed brand colors +- Clear typography with readable labels +- Responsive canvas that adapts to window size +- Optimized for both desktop exploration and presentation use + +--- + +## Technical Stack + +### Backend (Python) +- **Language**: Python 3.10+ +- **Package Management**: `uv` (fast Python package installer) +- **CLI Framework**: Click (for command-line interface) +- **Data Validation**: Pydantic +- **GitHub API Client**: PyGithub +- **Data Format**: TOML config, JSON data files, CSV exports + +### Frontend (JavaScript) +- **Visualization**: D3.js (v7) +- **Canvas Rendering**: HTML5 Canvas (for performance) +- **Bundler**: esbuild (via npm scripts) +- **Testing**: Vitest +- **Architecture**: Modular ES6 modules + +### Deployment +- **Hosting**: Static site (GitHub Pages or CDN) +- **Build**: GitHub Actions workflow +- **Source**: GitHub repository (`developmentseed/contributor-network`) + +--- + +## Data Flow + +``` +GitHub API + ↓ +Python CLI (client.py) + ↓ +JSON Files (assets/data/) + ↓ +CSV Generation (csvs command) + ↓ +Configuration (config.toml) + ↓ +D3.js Visualization (index.html) +``` + +### Data Collection Process + +1. **Configuration** (`config.toml`): + - Specify repositories to track: `owner/repo` format + - Define contributors: Current DevSeed team, alumni, external collaborators + - Tag teams/organizations for filtering + +2. **Data Fetching** (`uv run contributor-network data`): + - Queries GitHub API for each configured repo + - Collects: commits, contributors, stars, forks, languages, topics, etc. + - Stores raw JSON in `assets/data/` + +3. **CSV Generation** (`uv run contributor-network csvs`): + - Converts JSON to CSV format for web consumption + - Creates two main files: + - `repositories.csv` - Repo metadata and metrics + - `contributors.csv` - Contributor-to-repo relationships and commit details + +4. 
**Site Build** (`uv run contributor-network build`): + - Bundles JavaScript modules + - Generates static HTML + - Outputs to `dist/` for deployment + +--- + +## Current Capabilities + +### What Works Today + +#### Configuration-Driven +- Edit `config.toml` to add/remove repositories and contributors +- Auto-discover new repositories where multiple DevSeed members contributed +- Support for team/organization grouping + +#### Data Collection +- Fetches commit counts, dates, and contributor lists +- Collects repository metrics (stars, forks, watchers, languages, topics, etc.) +- Calculates community health metrics: + - Total contributors per repo + - DevSeed vs external contributor split + - Community contribution ratio + +#### Visualization +- Force-directed layout with optimized positioning +- Color-coded by contributor type and repository ownership +- Responsive to window resizing +- Hover states with gradient highlighting +- Click to select for detailed stats +- Smooth animations and transitions + +#### Filtering +- Organization-based filtering +- Repository metrics filters (stars, forks, watchers) +- Language filters +- Clear filters button + +--- + +## Target Users & Use Cases + +### Primary Users +- **Development Seed Team**: Show impact and community involvement to stakeholders +- **Potential Clients/Partners**: Demonstrate expertise in open-source ecosystems +- **Community Members**: Discover how to contribute to funded projects +- **Media/Press**: Visual story about DevSeed's open-source commitment + +### Key Use Cases + +1. **Impact Storytelling** + - "DevSeed contributed to 50+ repositories with 300+ external collaborators" + - Show the breadth of ecosystem impact + +2. **Team Highlights** + - Interactive way to showcase team member contributions + - Identify cross-project collaboration patterns + +3. **Community Health Assessment** + - Visualize which projects have active external communities + - Identify projects that need more community investment + +4. 
**Contribution Discovery** + - Help new community members find where to contribute + - See which projects align with their interests + +--- + +## Success Metrics + +### Technical Metrics +- **Load Time**: < 3 seconds on typical broadband +- **Interaction Responsiveness**: < 100ms on hover/click +- **Accessibility**: WCAG 2.1 AA compliance +- **Performance**: 60 FPS on modern browsers + +### Business/Product Metrics +- **Engagement**: Avg session duration on the visualization page +- **Discovery**: Click-through rate to individual repositories +- **Reach**: Views per month, geographic distribution +- **Feedback**: User comments/shares on social media + +--- + +## Technical Constraints & Considerations + +### Rate Limiting +- GitHub API: 5,000 requests/hour (authenticated) +- Search API: 30 requests/minute +- Statistics API: Subject to REST rate limits with 202 retry behavior +- **Solution**: Aggressive caching, incremental updates, batch operations + +### Data Volume +- Current: ~50 repositories, ~30 contributors +- Current approach (JSON/CSV files) works well at this scale +- Migration to SQLite recommended if: + - > 200 repositories + - > 500 contributors + - Need Phase 3+ timeline data + +### Browser Compatibility +- Modern browsers (Chrome, Firefox, Safari, Edge) +- Canvas support required +- ES6 module support required (no transpilation) +- Not optimized for mobile (desktop-first design) + +--- + +## Configuration & Customization + +### Repository Configuration (`config.toml`) +```toml +[repositories] +"owner/repo-name" = "Display Name" + +[contributors.devseed] +github_username = "Display Name" + +[contributors.alumni] +github_username = "Display Name" +``` + +### Visualization Customization +- **Colors**: Defined in `src/js/config/theme.js` +- **Layout**: Force simulation parameters, collision detection +- **Font sizes**: Theme configuration (currently under refactoring to increase) +- **Filters**: Defined in filter state management modules + +--- + +## 
Project Status & Roadmap + +See [`ROADMAP.md`](./ROADMAP.md) for current project status, planned features, and implementation details. The roadmap is the single source of truth for what's been completed, what's in progress, and what's planned next. + +--- + +## Development Guidelines + +### Code Quality Standards +- **Python**: Typed with mypy, formatted with ruff, tested with pytest +- **JavaScript**: Modular architecture, unit tests with Vitest, <300 lines per file (target) +- **Both**: Clear separation of concerns, single responsibility principle + +### Git Workflow +- Main branch: Always deployable +- Feature branches: Descriptive names +- PRs required for all changes +- CI/CD validates tests, linting, type checking before merge + +### Documentation +- Code comments for complex logic +- Docstrings for all public functions (Python) +- JSDoc comments for exported functions (JavaScript) +- Runbooks for common operations + +--- + +## License & Attribution + +**License**: Mozilla Public License (MPL) 2.0 +**Original Work**: [ORCA top-contributor-network](https://github.com/nbremer/ORCA/tree/main/top-contributor-network) by Nadieh Bremer +**Modifications**: Development Seed (2025) + +--- + +*Document Version: 1.0* +*Last Updated: February 2026* +*Maintained by: Development Seed Team* diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md new file mode 100644 index 0000000..12f4c20 --- /dev/null +++ b/docs/ROADMAP.md @@ -0,0 +1,235 @@ +# Roadmap + +Project status, planned features, and verification criteria. 
+ +--- + +## Project Status + +### Completed +- Core visualization and interactions +- Repository and contributor discovery +- Basic filtering (organization, metrics) +- Data expansion phases 1-2 (metadata and community metrics) +- JavaScript modularization and refactoring +- ORCA code removal and rebrand + +### In Progress +- UX and chart readability improvements (font sizes, UI refinement) + +### Planned +- See [Current Implementation Batch](#features-current-implementation-batch) and [Longer-Term Enhancements](#longer-term-enhancements) below + +--- + +## Features: Current Implementation Batch + +These are the next features planned for the visualization. Implementation details are documented in `IMPLEMENTATION_PLAN.md`. + +--- + +### Feature 1: More Repository Filters 🟡 Ready to Design + +**What it does:** +Add filtering UI controls for: +- Minimum stars +- Minimum forks +- Minimum watchers +- Programming language + +**Why:** +- Let users explore by project scale +- Filter by tech stack +- Discover active vs abandoned projects +- Show modern tech preferences + +**Implementation approach:** + +**4a. Extend filter state** (`js/state/filterState.js`) +```javascript +{ + organizations: [], + starsMin: null, + forksMin: null, + watchersMin: null, + language: null +} +``` + +**4b. Extend filtering logic** (`js/chart.js` `applyFilters()`) +```javascript +// After org filter, add: +if (activeFilters.starsMin !== null) { + visibleRepos = visibleRepos.filter(r => r.stars >= activeFilters.starsMin); +} +// Same for forks, watchers, language +``` + +**4c. 
Add UI controls** (`index.html`) +- Range sliders for stars, forks, watchers +- Dropdown for language selection +- "Clear All Filters" button + +**Verification:** +``` +✓ Test each filter independently +✓ Test filters in combination +✓ Verify "Clear All" resets everything +✓ Check language dropdown populated correctly +✓ Test with no repos matching filters +``` + +**Status:** 🟡 Design ready, implementation ready + +--- + +### Feature 2: Visual Flows Target Specific Repo on Hover/Click 🟡 Ready to Design + +**What it does:** +When a contributor is selected (clicked), hovering over different repos shows only the relevant link for that contributor-repo pair, not all their connections. + +**Why:** +- Shows specific collaboration pathways +- Less visual noise for highly collaborative contributors +- Better understanding of individual relationships + +**Current behavior:** +- Click contributor → select them +- Hover repo → see all their links light up + +**Target behavior:** +- Click contributor → select them +- Hover repo → show ONLY link to that specific repo (dimly show others) + +**Implementation:** + +**6a. Track hovered repo during click state** +```javascript +// In interactionState.js, add: +hoveredRepoWhileClicked: null +``` + +**6b. Filter links during hover rendering** +When clicked node is contributor and hovered node is repo: +- Find links connecting them +- Draw targeted links at full opacity +- Draw others at ~0.05 opacity (ghost them) + +**6c. 
Handle owner intermediary** +Some links go: contributor → owner → repo +- Need to highlight both segments +- Owner node's neighbor_links contain owner→repo links + +**Verification:** +``` +✓ Click contributor +✓ Hover different repos +✓ See only relevant link highlighted +✓ Test with owner-grouped repos +✓ Verify clicking away clears state +``` + +**Status:** 🟡 Design ready, implementation ready + +--- + +## Longer-Term Enhancements + +### Additional Metrics 📊 Planned + +**What:** Expand data collection beyond raw commits to provide richer insights +- Weekly commit heatmaps and contributor activity timelines +- Code frequency (additions/deletions over time) +- PR count and merge rates +- Per-contributor PR/issue counts +- Review participation + +**Value:** Very High - Enables temporal visualizations and highlights quality contributions beyond commit counts + +**Implementation:** See `DATA_EXPANSION_PLAN.md` Phases 3-4 + +**Effort:** 2-4 days + +--- + +### Advanced Health Metrics 📈 Planned + +**What:** Health and impact metrics for deeper project insights +- Release frequency +- Issue response time +- Documentation scores +- Cross-repo contributor presence + +**Value:** Medium-High - Polish and deeper insights + +**Implementation:** See `DATA_EXPANSION_PLAN.md` Phase 5 + +**Effort:** 2-3 days + +--- + +### Mobile Responsiveness 📱 Future + +**What:** Optimize for small screens and touch interaction + +**Current:** Desktop-first design, not optimized for mobile + +**Why:** People want to share/view on phones + +**Approach:** +- Responsive canvas sizing +- Touch event handlers (instead of mouse) +- Simplified tooltips for small screens +- Portrait vs landscape support + +**Status:** Not started (lower priority) + +--- + +### Export & Sharing 💾 Future + +**What:** Users can export the visualization or share filtered views + +**Options:** +- Export as PNG/SVG +- Shareable URL with filters preserved +- Embed widget for other sites + +**Status:** Not started (lower 
priority) + +--- + +### Replicable for Other Organizations 🔧 Future + +**What:** Make the tool easy to fork, configure, and deploy for any organization's open-source portfolio + +- Configurable branding (colors, logos, typography) +- Organization-agnostic data pipeline (minimal config to point at a different org) +- Streamlined setup for new deployments +- Clear documentation for customization and deployment +- Potential packaging as a reusable template or library + +**Status:** Not started + +--- + +## How to Contribute + +**Want to work on a feature?** + +1. Read this document for the overview +2. Check `DEVELOPMENT_GUIDE.md` for detailed instructions on how to contribute +3. Read `ARCHITECTURE.md` to understand the codebase +4. Reference the verification criteria when testing + +**Found a bug or have an idea?** + +Open an issue on GitHub with: +- What you observed +- What you expected +- Steps to reproduce +- Suggested fix (if you have one) + +--- + +**Last Updated**: February 2026 diff --git a/js/config/theme.js b/js/config/theme.js index 71be337..9ae2fb7 100644 --- a/js/config/theme.js +++ b/js/config/theme.js @@ -45,9 +45,9 @@ export const FONTS = { bold: 700, // Default sizes (scaled dynamically in visualization) - baseSizeContributor: 11, - baseSizeRepo: 10, - baseSizeOwner: 12 + baseSizeContributor: 14, + baseSizeRepo: 13, + baseSizeOwner: 15 }; /** @@ -76,7 +76,7 @@ export const LAYOUT = { // Contributor ring positioning contributorPadding: 20, // Default, overridden by config - maxContributorWidth: 55, // The maximum width (at SF = 1) of the contributor name before it gets wrapped + maxContributorWidth: 70, // The maximum width (at SF = 1) of the contributor name before it gets wrapped // Canvas sizing defaultSize: 1500, // Default canvas size @@ -86,7 +86,7 @@ export const LAYOUT = { linkWidthExponent: 0.75, // Collision detection - bboxPadding: 2 + bboxPadding: 4 }; /** diff --git a/js/render/repoCard.js b/js/render/repoCard.js index 1b53617..588d161 
100644 --- a/js/render/repoCard.js +++ b/js/render/repoCard.js @@ -16,10 +16,10 @@ import { min } from '../utils/helpers.js'; */ export const REPO_CARD_CONFIG = { lineHeight: 1.4, - sectionSpacing: 20, // Balanced spacing (was 24, reduced to 18, now 20 for better readability) - labelFontSize: 11, - valueFontSize: 11.5, - headerFontSize: 12, + sectionSpacing: 24, // Balanced spacing (scaled up for larger font sizes) + labelFontSize: 14, + valueFontSize: 14, + headerFontSize: 15, labelOpacity: 0.6, valueOpacity: 0.9, warningOpacity: 0.7, diff --git a/js/render/shapes.js b/js/render/shapes.js index 03f1290..4cf8dc2 100644 --- a/js/render/shapes.js +++ b/js/render/shapes.js @@ -284,12 +284,12 @@ export function drawContributorRing(context, SF, RADIUS_CONTRIBUTOR, CONTRIBUTOR const center_x = 0; const center_y = 0; - // Small offset for visual refinement (matches original ORCA implementation) - const O = 4; + // Position the ring so contributor dots sit at 1/3 from the inner edge + // (more ring space on the outer/name side, less empty space inside the dots) const LW = CONTRIBUTOR_RING_WIDTH; - const radius_inner = (RADIUS_CONTRIBUTOR - LW / 2 + O) * SF; - const radius_outer = (RADIUS_CONTRIBUTOR + LW / 2) * SF; + const radius_inner = (RADIUS_CONTRIBUTOR - LW / 3) * SF; + const radius_outer = (RADIUS_CONTRIBUTOR + 2 * LW / 3) * SF; context.save(); diff --git a/js/render/text.js b/js/render/text.js index 151acd3..02b08e1 100644 --- a/js/render/text.js +++ b/js/render/text.js @@ -39,7 +39,7 @@ export function setFont(context, fontSize, fontWeight, fontStyle = 'normal', fon * @param {number} SF - Scale factor * @param {number} fontSize - Base font size */ -export function setRepoFont(context, SF = 1, fontSize = 12) { +export function setRepoFont(context, SF = 1, fontSize = 15) { setFont(context, fontSize * SF, 400, 'normal'); } @@ -61,7 +61,7 @@ export function setCentralRepoFont(context, SF = 1, fontSize = 15) { * @param {number} SF - Scale factor * @param {number} 
fontSize - Base font size */ -export function setOwnerFont(context, SF = 1, fontSize = 12) { +export function setOwnerFont(context, SF = 1, fontSize = 15) { setFont(context, fontSize * SF, 700, 'normal'); } @@ -72,7 +72,7 @@ export function setOwnerFont(context, SF = 1, fontSize = 12) { * @param {number} SF - Scale factor * @param {number} fontSize - Base font size */ -export function setContributorFont(context, SF = 1, fontSize = 13) { +export function setContributorFont(context, SF = 1, fontSize = 16) { setFont(context, fontSize * SF, 700, 'italic'); } diff --git a/js/render/tooltip.js b/js/render/tooltip.js index eaea19c..2782fb2 100644 --- a/js/render/tooltip.js +++ b/js/render/tooltip.js @@ -31,22 +31,22 @@ function calculateRepoTooltipHeight(d, interactionState, SF, formatDate, formatD let height = 0; // Header section - height += 18; // Top padding (balanced) - height += 12 * line_height; // "Repository" label (12px font * 1.2 line height = 14.4px) - height += 18; // Spacing (balanced) + height += 22; // Top padding (balanced) + height += 15 * line_height; // "Repository" label (15px font * 1.2 line height = 18px) + height += 22; // Spacing (balanced) // Title section (owner/name) - two lines - height += 15 * line_height; // Owner line (15px font * 1.2 = 18px) - height += 15 * line_height; // Name line (15px font * 1.2 = 18px) - height += 42; // Spacing to dates (matches render: y += 42 accounts for name at y+18 plus padding) + height += 19 * line_height; // Owner line (19px font * 1.2 = 22.8px) + height += 19 * line_height; // Name line (19px font * 1.2 = 22.8px) + height += 50; // Spacing to dates (matches render: y += 50 accounts for name at y+22.8 plus padding) // Dates section - height += 11 * line_height; // Created date (11px font * 1.2 = 13.2px) - height += 11 * line_height; // Updated date (11px font * 1.2 = 13.2px) - height += 20; // Spacing before stats (balanced) + height += 14 * line_height; // Created date (14px font * 1.2 = 16.8px) + height += 
14 * line_height; // Updated date (14px font * 1.2 = 16.8px) + height += 24; // Spacing before stats (balanced) // Stats line - height += config.headerFontSize * line_height; // Stats line (12px font * 1.2 = 14.4px) + height += config.headerFontSize * line_height; // Stats line (15px font * 1.2 = 18px) // Note: renderLanguages will add its own sectionSpacing (24px) before it // Languages section (if present) @@ -90,12 +90,12 @@ function calculateRepoTooltipHeight(d, interactionState, SF, formatDate, formatD if (interactionState.clickActive && interactionState.clickedNode && interactionState.clickedNode.type === "contributor") { const link = interactionState.clickedNode.data.links_original?.find((l) => l.repo === d.id); if (link) { - height += 28; // Spacing - height += 11 * line_height; // "X commits by" line (11px font * 1.2 = 13.2px) - height += 16; // Spacing - height += 11.5 * line_height; // Contributor name (11.5px font * 1.2 = 13.8px) - height += 18; // Spacing - height += 11 * line_height; // Date range line (11px font * 1.2 = 13.2px) + height += 34; // Spacing + height += 14 * line_height; // "X commits by" line (14px font * 1.2 = 16.8px) + height += 20; // Spacing + height += 14 * line_height; // Contributor name (14px font * 1.2 = 16.8px) + height += 22; // Spacing + height += 14 * line_height; // Date range line (14px font * 1.2 = 16.8px) } } @@ -121,14 +121,14 @@ function calculateRepoTooltipWidth(context, d, interactionState, SF, formatDate, let maxWidth = 0; // Measure title text - setFont(context, 14 * SF, 700, "normal"); + setFont(context, 18 * SF, 700, "normal"); let width = context.measureText(d.data.owner).width * 1.25; if (width > maxWidth) maxWidth = width; width = context.measureText(d.data.name).width * 1.25; if (width > maxWidth) maxWidth = width; // Measure date text - setFont(context, 11 * SF, 400, "normal"); + setFont(context, 14 * SF, 400, "normal"); width = context.measureText(`Created in ${formatDate(d.data.createdAt)}`).width * 1.25; 
if (width > maxWidth) maxWidth = width; width = context.measureText(`Last updated in ${formatDate(d.data.updatedAt)}`).width * 1.25; @@ -197,16 +197,16 @@ function calculateRepoTooltipWidth(context, d, interactionState, SF, formatDate, if (interactionState.clickActive && interactionState.clickedNode && interactionState.clickedNode.type === "contributor") { const link = interactionState.clickedNode.data.links_original?.find((l) => l.repo === d.id); if (link) { - setFont(context, 11 * SF, 400, "italic"); + setFont(context, 14 * SF, 400, "italic"); const commitText = link.commit_count === 1 ? '1 commit by' : `${link.commit_count} commits by`; width = context.measureText(commitText).width * 1.25; if (width > maxWidth) maxWidth = width; - setFont(context, 11.5 * SF, 700, "normal"); + setFont(context, 14 * SF, 700, "normal"); width = context.measureText(interactionState.clickedNode.data.contributor_name).width * 1.25; if (width > maxWidth) maxWidth = width; - setFont(context, 11 * SF, 400, "normal"); + setFont(context, 14 * SF, 400, "normal"); let dateText = ''; if (formatDateExact(link.commit_sec_min) === formatDateExact(link.commit_sec_max)) { dateText = `On ${formatDateExact(link.commit_sec_max)}`; @@ -224,7 +224,7 @@ function calculateRepoTooltipWidth(context, d, interactionState, SF, formatDate, maxWidth = maxWidth / SF + 80; // Ensure minimum width - return Math.max(maxWidth, 280); + return Math.max(maxWidth, 320); } /** @@ -263,12 +263,12 @@ export function drawTooltip(context, d, config, interactionState, central_repo, if (d.type === "contributor") { // Contributor tooltip - H = 80; - W = 280; + H = 100; + W = 320; } else if (d.type === "owner") { // Owner tooltip - keep existing logic for now - H = 93; - W = 280; + H = 116; + W = 320; } else if (d.type === "repo") { // Repository tooltip - use dynamic calculations // Calculate height dynamically based on all content @@ -276,13 +276,13 @@ export function drawTooltip(context, d, config, interactionState, 
central_repo, // Calculate width dynamically based on all text content W = calculateRepoTooltipWidth(context, d, interactionState, SF, formatDate, formatDateExact, formatDigit); } else { - H = 93; - W = 280; + H = 116; + W = 320; } // Write all the repos for the "owner" nodes, but make sure they are not wider than the box and save each line to write out if (d.type === "owner") { - font_size = 11.5; + font_size = 14; setFont(context, font_size * SF, 400, "normal"); d.text_lines = []; text = ""; @@ -308,10 +308,10 @@ export function drawTooltip(context, d, config, interactionState, central_repo, // Recalculate width for owner tooltips based on text lines let tW = 0; - setFont(context, 15 * SF, 700, "normal"); + setFont(context, 20 * SF, 700, "normal"); tW = context.measureText(d.data.owner).width * 1.25; // Check if any of the "repo lines" are longer than the owner's name - setFont(context, 11.5 * SF, 400, "normal"); + setFont(context, 14 * SF, 400, "normal"); d.text_lines.forEach((t) => { let line_width = context.measureText(t).width * 1.25; if (line_width > tW) tW = line_width; @@ -320,7 +320,7 @@ export function drawTooltip(context, d, config, interactionState, central_repo, if (tW + 40 * SF > W * SF) W = tW / SF + 40; } else if (d.type === "contributor") { // Recalculate width for contributor tooltips - setFont(context, 15 * SF, 700, "normal"); + setFont(context, 20 * SF, 700, "normal"); text = d.data ? 
d.data.contributor_name : d.author_name; let tW = context.measureText(text).width * 1.25; // Update the max width if the text is wider @@ -365,8 +365,8 @@ export function drawTooltip(context, d, config, interactionState, central_repo, context.textBaseline = "middle"; // Contributor, owner or repo - y = 18; // Balanced - font_size = 12; + y = 22; // Balanced + font_size = 15; setFont(context, font_size * SF, 400, "italic"); context.fillStyle = COL; text = ""; @@ -376,29 +376,29 @@ export function drawTooltip(context, d, config, interactionState, central_repo, renderText(context, text, x * SF, y * SF, 2.5 * SF); context.fillStyle = COLOR_TEXT; - y += 18; // Balanced + y += 22; // Balanced if (d.type === "contributor") { // The contributor's name - font_size = 16; + font_size = 20; setFont(context, font_size * SF, 700, "normal"); text = d.data ? d.data.contributor_name : d.author_name; renderText(context, text, x * SF, y * SF, 1.25 * SF); } else if (d.type === "owner") { // The name - font_size = 16; + font_size = 20; setFont(context, font_size * SF, 700, "normal"); renderText(context, d.data.owner, x * SF, y * SF, 1.25 * SF); // Which repos fall under this owner in this visual - y += 28; - font_size = 11; + y += 34; + font_size = 14; context.globalAlpha = 0.6; setFont(context, font_size * SF, 400, "italic"); renderText(context, "Included repositories", x * SF, y * SF, 2 * SF); // Write out all the repositories - font_size = 11.5; + font_size = 14; y += font_size * line_height + 4; context.globalAlpha = 0.9; setFont(context, font_size * SF, 400, "normal"); @@ -408,7 +408,7 @@ export function drawTooltip(context, d, config, interactionState, central_repo, }); // forEach } else if (d.type === "repo") { // The repo's name and owner - font_size = 15; + font_size = 19; setFont(context, font_size * SF, 700, "normal"); renderText(context, `${d.data.owner}/`, x * SF, y * SF, 1.25 * SF); renderText( @@ -420,9 +420,9 @@ export function drawTooltip(context, d, config, 
interactionState, central_repo, ); // The creation date - // Note: name was rendered at y + 18, so we need to move past it (18) plus add spacing (24) = 42 - y += 42; - font_size = 11; + // Note: name was rendered at y + line_height*font_size, so we need to move past it plus add spacing + y += 50; + font_size = 14; context.globalAlpha = 0.7; setFont(context, font_size * SF, 400, "normal"); renderText( @@ -447,7 +447,7 @@ export function drawTooltip(context, d, config, interactionState, central_repo, // ============================================================ // Stats line: stars, forks, watchers - y += 20; // Balanced + y += 24; // Balanced renderStatsLine(context, d.data, x, y, SF, formatDigit); // Languages section @@ -476,15 +476,15 @@ export function drawTooltip(context, d, config, interactionState, central_repo, if (!link) return; let num_commits = link.commit_count; - y += 20; // Reduced from 28 - font_size = 11; + y += 24; // Spacing before clicked section + font_size = 14; context.globalAlpha = 0.6; setFont(context, font_size * SF, 400, "italic"); text = num_commits === 1 ? 
"1 commit by" : `${num_commits} commits by`; renderText(context, text, x * SF, y * SF, 2 * SF); - y += 12; // Reduced from 16 - font_size = 11.5; + y += 15; // Spacing to contributor name + font_size = 14; context.globalAlpha = 0.9; setFont(context, font_size * SF, 700, "normal"); renderText( @@ -495,8 +495,8 @@ export function drawTooltip(context, d, config, interactionState, central_repo, 1.25 * SF, ); - y += 14; // Reduced from 18 - font_size = 11; + y += 17; // Spacing to date range + font_size = 14; context.globalAlpha = 0.6; setFont(context, font_size * SF, 400, "normal"); if ( diff --git a/js/simulations/collaborationSimulation.js b/js/simulations/collaborationSimulation.js index d4d6c45..a4e8e04 100644 --- a/js/simulations/collaborationSimulation.js +++ b/js/simulations/collaborationSimulation.js @@ -104,7 +104,7 @@ export function runCollaborationSimulation( let r = d.type === "owner" ? d.max_radius : d.r; let top = max(r, d.r + text_height); - let w = max(r * 2, text_size.width * 1.25) + 10; + let w = max(r * 2, text_size.width * 1.25) + 14; d.bbox = [ [-w / 2, -top], From 353ee8bd9116af3c767606293be4a8c43a2a24b5 Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Thu, 12 Feb 2026 22:11:54 -0600 Subject: [PATCH 2/8] add more filters and docs on nasa feature request 1. filters for stars and forks 2. 
researched feature request for nasa, including docs for future reference --- assets/css/style.css | 16 +- docs/DATE_RANGE_IMPLEMENTATION_PLAN.md | 318 ++++++ docs/ROADMAP.md | 273 +++++ docs/VISUALIZATION_DESIGN_GUIDE.md | 651 ++++++++++++ docs/sponsor_centric/00_READ_ME_FIRST.txt | 365 +++++++ docs/sponsor_centric/ASSESSMENT_INDEX.md | 408 ++++++++ .../sponsor_centric/FEASIBILITY_ASSESSMENT.md | 582 +++++++++++ .../FEATURE_REQUEST_SUMMARY.md | 397 +++++++ .../sponsor_centric/IMPLEMENTATION_ROADMAP.md | 984 ++++++++++++++++++ index.html | 55 +- js/__tests__/filter.test.js | 153 ++- js/chart.js | 34 +- js/data/filter.js | 68 +- js/state/filterState.js | 28 +- 14 files changed, 4311 insertions(+), 21 deletions(-) create mode 100644 docs/DATE_RANGE_IMPLEMENTATION_PLAN.md create mode 100644 docs/VISUALIZATION_DESIGN_GUIDE.md create mode 100644 docs/sponsor_centric/00_READ_ME_FIRST.txt create mode 100644 docs/sponsor_centric/ASSESSMENT_INDEX.md create mode 100644 docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md create mode 100644 docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md create mode 100644 docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md diff --git a/assets/css/style.css b/assets/css/style.css index 9f80868..9ea2aad 100644 --- a/assets/css/style.css +++ b/assets/css/style.css @@ -508,7 +508,9 @@ li#item-remaining::before { margin: 0; } -#org-select { +#org-select, +#stars-select, +#forks-select { padding: 8px 12px; font-size: 14px; border: 1px solid #ddd; @@ -519,11 +521,15 @@ li#item-remaining::before { transition: border-color 0.2s ease; } -#org-select:hover { +#org-select:hover, +#stars-select:hover, +#forks-select:hover { border-color: #CF3F02; } -#org-select:focus { +#org-select:focus, +#stars-select:focus, +#forks-select:focus { outline: none; border-color: #CF3F02; box-shadow: 0 0 0 2px rgba(207, 63, 2, 0.1); @@ -545,7 +551,9 @@ li#item-remaining::before { gap: 10px; } - #org-select { + #org-select, + #stars-select, + #forks-select { width: 100%; max-width: 
300px; } diff --git a/docs/DATE_RANGE_IMPLEMENTATION_PLAN.md b/docs/DATE_RANGE_IMPLEMENTATION_PLAN.md new file mode 100644 index 0000000..d1ad124 --- /dev/null +++ b/docs/DATE_RANGE_IMPLEMENTATION_PLAN.md @@ -0,0 +1,318 @@ +# Date Range Implementation Plan + +Granular commit counts over time, enabling dynamic link/flow sizing based on commit activity within a selected date window. + +--- + +## Goal + +When a user selects a time range, the visualization should reflect the actual commit count within that window — not just whether activity overlapped. Link widths, contributor node sizes, and flow visuals should all scale to the filtered commit counts. This also lays the groundwork for a future animation feature that steps through time. + +--- + +## Current State + +### Data Pipeline + +The Python CLI (`python/contributor_network/cli.py`) fetches commit data via PyGithub. The `data` command calls `Link.from_github()` for each contributor-repo pair, which: + +1. Calls `repo.get_commits(author=contributor.login)` to get all commits +2. Extracts only three values: first commit timestamp, last commit timestamp, total count +3. Discards individual commit dates + +The raw temporal data passes through the code but is not stored. + +### Link Model (`python/contributor_network/models.py`) + +``` +author_name: str +repo: str +commit_count: int +commit_sec_min: int # Unix timestamp of first commit +commit_sec_max: int # Unix timestamp of last commit +contribution_span_days: int +is_recent_contributor: bool +``` + +### CSV Output (`links.csv`) + +One row per contributor-repo pair. 369 rows for the current dataset. Contains the aggregate `commit_count` with no temporal breakdown. + +### JavaScript Consumption + +`d3.csv()` loads `links.csv`. `prepare.js` parses `commit_count` as a number and uses it to scale link widths via `scale_link_width`. Contributor radii are scaled by `total_commits` (sum of their link commit counts). + +--- + +## Changes Required + +### 1. 
Python: Collect Monthly Commit Histograms + +**File:** `python/contributor_network/models.py` + +Add a new field to the `Link` model: + +```python +commit_histogram: dict[str, int] = {} # {"2024-01": 5, "2024-02": 12, ...} +``` + +**File:** `python/contributor_network/cli.py` (inside `Link.from_github()` or `update_links()`) + +The method already iterates all commits to compute the total count. During that iteration, extract each commit's `author.date`, format as `YYYY-MM`, and bucket: + +```python +from collections import defaultdict + +histogram = defaultdict(int) +for commit in repo.get_commits(author=login): + date = commit.commit.author.date # datetime object + month_key = date.strftime("%Y-%m") + histogram[month_key] += 1 + +link.commit_histogram = dict(histogram) +``` + +This adds zero extra API calls. The commit objects are already being fetched. + +**Existing fields are unchanged.** `commit_count`, `commit_sec_min`, `commit_sec_max`, `contribution_span_days`, and `is_recent_contributor` remain as-is for backward compatibility. + +--- + +### 2. New CSV: `commit_activity.csv` + +**File:** `python/contributor_network/cli.py` (inside the `csvs` command) + +Create a fourth CSV file with one row per contributor-repo-month: + +``` +author_name,repo,month,commit_count +Vincent Sarago,developmentseed/titiler,2024-01,12 +Vincent Sarago,developmentseed/titiler,2024-02,8 +Vincent Sarago,developmentseed/titiler,2024-03,15 +Kyle Barron,stac-utils/rustac,2024-01,22 +... 
+``` + +**Columns:** + +| Column | Type | Description | +|--------|------|-------------| +| `author_name` | string | Contributor display name (matches `links.csv`) | +| `repo` | string | Full repo name `owner/repo` (matches `links.csv`) | +| `month` | string | `YYYY-MM` format | +| `commit_count` | int | Number of commits in that month | + +**Generation logic:** + +```python +# Inside the csvs command, after writing links.csv: +activity_rows = [] +for link_file in links_dir.glob("*.json"): + link = Link.model_validate_json(link_file.read_text()) + for month, count in link.commit_histogram.items(): + activity_rows.append({ + "author_name": link.author_name, + "repo": link.repo, + "month": month, + "commit_count": count, + }) + +# Sort for readability +activity_rows.sort(key=lambda r: (r["author_name"], r["repo"], r["month"])) + +# Write CSV +with open(output_dir / "commit_activity.csv", "w", newline="") as f: + writer = csv.DictWriter(f, fieldnames=["author_name", "repo", "month", "commit_count"]) + writer.writeheader() + writer.writerows(activity_rows) +``` + +**Expected size:** ~369 links × ~30 months average = ~11,000 rows. Well within browser limits. + +**Also update the `build` command** to copy `commit_activity.csv` to `assets/data/` alongside the other CSVs. + +--- + +### 3. JavaScript: Load and Index the Activity Data + +**File:** `index.html` + +Add the fourth CSV to the existing `Promise.all`: + +```javascript +const promises = [ + d3.csv('assets/data/top_contributors.csv'), + d3.csv('assets/data/repositories.csv'), + d3.csv('assets/data/links.csv'), + d3.csv('assets/data/commit_activity.csv') // NEW +]; +``` + +Pass `values[3]` through to the chart constructor. 
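
One caveat with adding a fourth entry to `Promise.all`: it rejects as soon as any input promise rejects, so a deployment that has not yet generated `commit_activity.csv` could fail to load the other three files too. Below is a minimal sketch of one way to make the new file optional. The helper name `optionalRows` is hypothetical and not part of the existing code.

```javascript
// Hypothetical helper (not in the existing codebase): resolve to an empty
// row list when an optional data file cannot be loaded, so that one missing
// file does not reject the whole Promise.all.
function optionalRows(promise) {
  return promise.catch(() => []);
}

// Intended use in index.html (assumes d3 is already loaded on the page):
//   const promises = [
//     d3.csv('assets/data/top_contributors.csv'),
//     d3.csv('assets/data/repositories.csv'),
//     d3.csv('assets/data/links.csv'),
//     optionalRows(d3.csv('assets/data/commit_activity.csv')),
//   ];
```

With an empty row list the activity index simply ends up empty, which the chart could treat as a signal to use overlap-based filtering instead of count-based filtering.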
+ +**File:** `js/chart.js` + +Build a lookup map during initialization, keyed by `author_name~repo`: + +```javascript +// Build activity index: "author_name~repo" → Map +const activityIndex = new Map(); +activityData.forEach(row => { + const key = `${row.author_name}~${row.repo}`; + if (!activityIndex.has(key)) { + activityIndex.set(key, new Map()); + } + activityIndex.get(key).set(row.month, +row.commit_count); +}); +``` + +**File:** `js/data/prepare.js` + +During link normalization, attach the histogram to each link: + +```javascript +// After parsing commit_count, commit_sec_min, commit_sec_max: +const linkKey = `${d.contributor_name}~${d.repo}`; +d.commit_histogram = activityIndex.get(linkKey) || new Map(); +``` + +--- + +### 4. Time Range Filtering with Accurate Counts + +**File:** `js/chart.js` (inside `applyFilters()`) + +Replace the overlap-based time filter described in the roadmap (Feature 4) with a count-based filter: + +```javascript +if (activeFilters.timeRangeMin !== null || activeFilters.timeRangeMax !== null) { + visibleLinks = visibleLinks.map(link => { + // Sum commits only within the selected month range + let filteredCount = 0; + for (const [month, count] of link.commit_histogram.entries()) { + // Parse "YYYY-MM" to a comparable date (first of month) + const monthDate = new Date(month + "-01"); + if (activeFilters.timeRangeMin && monthDate < activeFilters.timeRangeMin) continue; + if (activeFilters.timeRangeMax && monthDate > activeFilters.timeRangeMax) continue; + filteredCount += count; + } + // Return a copy with the filtered count + return { ...link, commit_count: filteredCount }; + }).filter(link => link.commit_count > 0); // Remove links with zero commits in range + + // Re-derive visible repos from remaining links + const repoIdsFromLinks = new Set(visibleLinks.map(l => l.repo)); + visibleRepos = visibleRepos.filter(r => repoIdsFromLinks.has(r.repo)); +} +``` + +The existing cascade (Steps 3-4 in `applyFilters()`) then handles filtering 
contributors and re-filtering links. + +--- + +### 5. Dynamic Link Width Scaling + +**File:** `js/data/prepare.js` (inside scale domain calculation) + +The `scale_link_width` domain is currently set once from the global max `commit_count`. After time-range filtering replaces commit counts, the domain must update. + +**Recommended approach — fixed domain with context:** + +Keep the global max as the scale domain so link widths are always relative to the full dataset. A link that had 100 commits total but only 10 in the selected range will appear proportionally thinner. This makes comparisons across time ranges meaningful. + +```javascript +// During prepareData(), the domain is set from current (possibly filtered) data: +const maxCommitCount = d3.max(links, d => d.commit_count); +scale_link_width.domain([1, 10, maxCommitCount]); +``` + +This already happens in `prepareData()` which runs during `chart.rebuild()`. Since `commit_count` values are replaced by the filtered counts before `prepareData()` runs, the scale will use the filtered max. If you want a fixed global reference instead, store `originalMaxCommitCount` before any filtering and use that as the domain ceiling. + +--- + +### 6. Dynamic Contributor Node Sizing + +**File:** `js/data/prepare.js` + +Contributor radius is scaled by `total_commits`, which is calculated as the sum of their link `commit_count` values. Since links now carry filtered counts after Step 4, this recalculation happens naturally during `prepareData()`: + +```javascript +// Already exists in prepare.js — recalculates from current link data: +contributors.forEach(d => { + d.total_commits = d3.sum( + d.links_original.filter(l => /* link is visible */), + l => l.commit_count + ); +}); +``` + +Verify that `links_original` on each contributor is updated to reference the filtered links (not the pre-filter originals). 
If `links_original` still points to unfiltered data, either update it during `applyFilters()` or compute `total_commits` from `visibleLinks` instead. + +--- + +### 7. Owner Node Aggregation + +**File:** `js/data/prepare.js` + +Owner nodes aggregate stats from their child repos. When link counts change, the owner-level aggregations (total commits through that owner) should reflect the filtered values. The existing owner link deduplication and aggregation in `prepareData()` (lines 364-436) already sums `commit_count` from the current links, so this should work automatically. Verify by checking that owner link widths shrink when the time range narrows. + +--- + +## Animation Hook (Future) + +With monthly histograms on each link, animation becomes a UI controller problem: + +```javascript +// Pseudocode for animation controller +const months = getAllMonthsSorted(); // ["2019-06", "2019-07", ...] +let frameIndex = 0; + +function animateFrame() { + const currentMonth = months[frameIndex]; + chart.setTimeRange( + new Date(months[0] + "-01"), // Cumulative: from start + new Date(currentMonth + "-01") // Up to current frame + ); + frameIndex++; + if (frameIndex < months.length) { + requestAnimationFrame(animateFrame); + } +} +``` + +Nodes appear when their first commit month is reached. Links grow as commit counts accumulate. The ring fills up over time. No additional data pipeline changes are needed — only a UI play/pause/scrub control and the animation loop. + +--- + +## Implementation Order + +1. **Python model change** — add `commit_histogram` field to Link +2. **Python CLI change** — bucket commits by month during `Link.from_github()` +3. **CSV generation** — write `commit_activity.csv` in the `csvs` command +4. **Build command** — copy new CSV to `assets/data/` +5. **Re-run data collection** — `python -m contributor_network data` then `csvs` then `build` +6. **JS data loading** — load fourth CSV, build activity index +7. 
**JS data preparation** — attach histograms to link objects +8. **Time range filtering** — implement count-based filtering in `applyFilters()` +9. **Scale updates** — verify link widths and contributor radii update correctly +10. **UI slider** — build the time range control (see Feature 4 in ROADMAP.md) + +Steps 1-5 are Python/data work (~half day). Steps 6-10 are JS/visualization work (~1-2 days), mostly layered on top of the Feature 4 time range slider from the roadmap. + +--- + +## Risks and Considerations + +**API rate limits:** No additional API calls are needed. The commit data is already being fetched — we're just retaining the dates instead of discarding them. + +**Data freshness:** Existing JSON files in `links/` won't have `commit_histogram` until re-fetched. The field defaults to `{}` so old data won't break, but will produce empty histograms. A full re-fetch is needed for complete temporal data. + +**Month boundary precision:** Commits are bucketed by calendar month (UTC). A commit at 11:59pm on Jan 31 and one at 12:01am on Feb 1 land in different buckets. This is acceptable for the visualization's granularity. + +**Scale behavior:** When the time range is very narrow (e.g., one month), most links will have small counts and the scale domain shrinks. This can make thin links appear thick. Consider setting a minimum domain ceiling (e.g., 10) to prevent scale distortion on narrow ranges. + +**Backward compatibility:** `links.csv` is unchanged. The new `commit_activity.csv` is additive. If the JS can't find it, fall back to the overlap-based filtering described in the ROADMAP.md Feature 4 entry. 
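
Two of the risks above are small enough to sketch. The example shows a floor on the link-width domain ceiling, so a narrow window does not stretch small counts to full width, and a range sum that compares `YYYY-MM` keys as strings, which avoids mixing UTC month buckets with local-time `Date` objects. The names `linkWidthDomainMax`, `monthKeyUTC`, and `commitsInMonthRange` are illustrative assumptions, not existing code.

```javascript
// Hypothetical helpers; the names and the floor value are assumptions.

// Upper bound for the link-width scale domain. The floor keeps a narrow
// time range (where every link has only a few commits) from rendering
// thin links at maximum width.
function linkWidthDomainMax(links, floor) {
  let max = 0;
  for (const link of links) {
    if (link.commit_count > max) max = link.commit_count;
  }
  return Math.max(max, floor);
}

// "YYYY-MM" key for a Date, computed in UTC so the result does not depend
// on the viewer's local time zone.
function monthKeyUTC(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${year}-${month}`;
}

// Sum commits whose month key lies inside [minKey, maxKey], inclusive.
// "YYYY-MM" keys sort lexicographically in chronological order, so plain
// string comparison is enough. A null bound means "unbounded".
function commitsInMonthRange(histogram, minKey, maxKey) {
  let total = 0;
  for (const [month, count] of histogram) {
    if (minKey !== null && month < minKey) continue;
    if (maxKey !== null && month > maxKey) continue;
    total += count;
  }
  return total;
}
```

If the floor equals an existing domain stop (the current domain has one at 10), the last two stops coincide, so a somewhat larger floor may be preferable. That is a tuning choice this sketch leaves open.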
+ +--- + +**Last Updated**: February 2026 diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index 12f4c20..7ed3a89 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -133,6 +133,279 @@ Some links go: contributor → owner → repo --- +### Feature 3: Click Action to Hide Irrelevant Contributors/Repos 🟡 Ready to Design + +**What it does:** +When a user clicks a contributor node, the chart hides all unrelated contributors and repos, keeping only the clicked contributor, their linked repos, and any co-contributors on those repos. The details panel expands to show richer information about the selected contributor. + +**Why:** +- Declutters the view for highly connected networks +- Lets users focus on one contributor's ecosystem +- Creates space for showing deeper data (commit timelines, repo breakdowns) in the details panel + +**Current behavior:** +- Click contributor → Delaunay index narrows to neighbors, main canvas fades to 15% opacity, hover canvas shows the contributor's links and neighbor nodes +- All other nodes remain drawn on the faded main canvas + +**Target behavior:** +- Click contributor → the chart rebuilds with only the relevant subset of data visible (similar to how org filtering works), and the details panel shows expanded information +- Click background or press Escape → restore full chart + +**Implementation:** + +**3a. Add click-filter mode to interaction state** (`js/state/interactionState.js`) +```javascript +// Add to state object: +clickFilterActive: false, +clickFilterContributor: null +``` + +**3b. Build a click-filter function** (`js/chart.js`) + +Model this on the existing `applyFilters()` cascade but driven by a clicked contributor rather than UI controls: + +```javascript +function applyClickFilter(contributorNode) { + // 1. Find all repos linked to this contributor + const linkedRepoIds = new Set( + contributorNode.neighbor_links + .map(l => getLinkNodeId(l.target)) + .filter(id => /* is repo or owner */) + ); + // 2. 
Find all contributors who also link to those repos + // 3. Filter visibleRepos, visibleLinks, visibleContributors + // 4. Call chart.rebuild() to re-layout with subset +} +``` + +Key difference from org filtering: this is a *temporary* filter triggered by interaction, not the filter UI. Store the pre-click data snapshot so it can be restored on deselect. The existing `originalContributors/Repos/Links` pattern works here — just avoid overwriting them. + +**3c. Wire click handler** (`js/interaction/click.js`) + +On contributor click: +1. Set `clickFilterActive = true` and store the contributor +2. Call `applyClickFilter(contributorNode)` +3. Optionally animate the transition (fade out irrelevant nodes before rebuild) + +On background click or Escape: +1. Set `clickFilterActive = false` +2. Restore original data arrays +3. Call `chart.rebuild()` + +**3d. Expand the details panel** (`js/render/tooltip.js`) + +When click-filter is active, render a richer panel for the selected contributor: +- Total commits across all visible repos +- List of repos with individual commit counts +- Date range of activity per repo +- Languages across their repos +- Co-contributors (other contributors sharing repos) + +This panel should be drawn on the click canvas so it persists across hover interactions. + +**Risks:** +- Rebuild performance: full `chart.rebuild()` re-runs all force simulations. May need to cache simulation results or skip simulations for small subsets. +- State complexity: two filter systems (UI filters + click filter) must compose correctly. Click filter should operate on already-UI-filtered data, not raw originals. 
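
The composition rule in the last risk can be made concrete with a pure function that takes the already-filtered link list as input rather than the originals. This is a sketch only. `clickFilterSubset` is a hypothetical name, and it assumes each link exposes `author_name` and `repo` as plain strings, as in `links.csv`. The link objects in `chart.js` may hold node references instead.

```javascript
// Hypothetical sketch: derive the click-filter subset from links that have
// already passed the UI filters. Field names follow links.csv (assumption).
function clickFilterSubset(visibleLinks, clickedAuthor) {
  // 1. Repos the clicked contributor links to.
  const repoIds = new Set(
    visibleLinks
      .filter((l) => l.author_name === clickedAuthor)
      .map((l) => l.repo)
  );
  // 2. Links on those repos: the clicked contributor plus co-contributors.
  const links = visibleLinks.filter((l) => repoIds.has(l.repo));
  // 3. Everyone who appears on at least one of those repos.
  const contributorIds = new Set(links.map((l) => l.author_name));
  return { repoIds, contributorIds, links };
}
```

Because the function reads from the UI-filtered list and returns new collections, nothing is overwritten. Deselecting could then restore the chart by re-running the normal filter cascade.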
+ +**Verification:** +``` +✓ Click contributor → only relevant nodes/links remain +✓ Click background → full chart restores +✓ Details panel shows expanded contributor info +✓ Works correctly when org/metric filters are also active +✓ Clicking a different contributor switches the filter +✓ Zoom/pan state preserved across click-filter transitions +``` + +**Status:** 🟡 Design ready, implementation ready + +--- + +### Feature 4: Time Range Filter for Commit Activity 🟡 Ready to Design + +**What it does:** +Add a time range slider (or dual-handle range input) that filters the visualization to only show commit activity within a selected date window. Repos, links, and contributors outside the time range are hidden. + +**Why:** +- Explore how the contributor network evolved over time +- Identify recent vs legacy contributors +- See which repos are actively maintained vs dormant +- Answer questions like "who contributed in the last year?" + +**Current data available:** +Each link already has `commit_sec_min` and `commit_sec_max` (Unix timestamps for earliest and latest commit by that contributor on that repo). Repos have `createdAt` and `updatedAt` dates. This is enough for time-range filtering without new data collection. + +**Limitation:** The CSV stores one `commit_count` per contributor-repo pair with no per-period breakdown. When filtering by time range, you can determine *whether* a contributor was active on a repo during the window (their min/max overlaps), but cannot recalculate the exact commit count within that window. Display the full commit count with a note like "active during this period" or fetch granular data (see below). + +**Implementation:** + +**4a. Extend filter state** (`js/state/filterState.js`) +```javascript +{ + // ... existing fields + timeRangeMin: null, // Date object or null (no filter) + timeRangeMax: null, // Date object or null +} +``` + +Update `hasActiveFilters()` to check these fields. + +**4b. 
Add time-range filtering to the cascade** (`js/chart.js` `applyFilters()`) + +Insert after repo filtering but before the link cascade: +```javascript +// Filter links by time overlap with selected range +if (activeFilters.timeRangeMin !== null || activeFilters.timeRangeMax !== null) { + visibleLinks = visibleLinks.filter(link => { + const linkMin = link.commit_sec_min; + const linkMax = link.commit_sec_max; + // Check overlap: link's range intersects filter range + if (activeFilters.timeRangeMin && linkMax < activeFilters.timeRangeMin) return false; + if (activeFilters.timeRangeMax && linkMin > activeFilters.timeRangeMax) return false; + return true; + }); + // Re-derive visible repos from remaining links + const repoIdsFromLinks = new Set(visibleLinks.map(l => l.repo)); + visibleRepos = visibleRepos.filter(r => repoIdsFromLinks.has(r.repo)); +} +// Existing contributor cascade (Step 3) handles the rest +``` + +Note: the time filter operates on *links* first (not repos), since the temporal data lives on links. Then repos without any visible links are removed. The existing contributor cascade then removes contributors without visible links. + +**4c. Build the UI control** (`index.html`) + +Add a dual-handle range slider below the existing filters: +- Compute global min/max dates from all links' `commit_sec_min`/`commit_sec_max` during data load +- Use two `<input type="range">` elements (or a library like noUiSlider) mapping to the date range +- Display selected range as formatted dates (e.g., "Jan 2020 — Mar 2025") +- On change, call `contributorNetworkVisual.setRepoFilter('timeRangeMin', date)` and `setRepoFilter('timeRangeMax', date)` + +**4d. Add chart API** (`js/chart.js`) + +The existing `chart.setRepoFilter(name, value)` pattern from Feature 1 works here — just ensure it handles Date values. 
Add a convenience method: +```javascript +chart.setTimeRange = function(minDate, maxDate) { + activeFilters.timeRangeMin = minDate; + activeFilters.timeRangeMax = maxDate; + chart.rebuild(); + return chart; +}; +``` + +**Optional enhancement:** For richer granularity, expand the Python data pipeline (`python/client.py`) to fetch weekly or monthly commit counts per contributor-repo pair. This would allow showing *how many* commits occurred in the selected window rather than just whether activity overlapped. This is a separate data expansion task (see `DATA_EXPANSION_PLAN.md` Phases 3-4). + +**Verification:** +``` +✓ Slider range matches actual data timespan +✓ Narrowing range hides repos/contributors with no activity in window +✓ Widening range back to full restores all data +✓ Composes correctly with org and metric filters +✓ Edge case: single-day range still works +✓ Edge case: range that excludes all data shows empty state gracefully +``` + +**Status:** 🟡 Design ready, implementation ready + +--- + +### Feature 5: More Evenly Spaced Orgs/Repos 🔴 Needs Design + +**What it does:** +Improve the positioning of organization and repository nodes so they are more uniformly distributed across the available space, reducing visual clutter and overlap. + +**Why:** +- Current layout produces clusters with large gaps elsewhere +- Owner-grouped repos can pile up near certain contributors +- Shared/collaboration repos in the center can overlap heavily +- Better spacing makes the chart easier to read at a glance + +**Current architecture:** +Repo positioning uses three independent force simulations that run sequentially: + +1. **Owner simulation** (`js/simulations/ownerSimulation.js`): Groups repos by owner. Creates a local force simulation per owner that pulls repos toward an owner centroid. Repos are attracted to their owner node with a centering force. + +2. 
**Contributor simulation** (`js/simulations/contributorSimulation.js`): For repos linked to a single contributor (not shared), runs a per-contributor force simulation that positions repos in a cloud around that contributor. Uses `d3.forceCollide` and `d3.forceRadial` to keep repos near (but not on top of) their contributor. + +3. **Collaboration simulation** (`js/simulations/collaborationSimulation.js`): For repos linked to multiple contributors, positions them in the central area using `d3.forceCenter(0,0)`, `d3.forceCollide`, and link forces pulling toward connected contributor nodes. + +Each simulation runs independently with its own alpha/decay parameters. Results are merged into the final node positions. There is also `d3-bboxCollide` for label collision avoidance. + +**Why this is hard:** +- Three independent simulations don't coordinate — a repo positioned by the contributor sim may overlap with one positioned by the collaboration sim +- Force simulation tuning is iterative and visual — small parameter changes cascade unpredictably +- "Even spacing" is subjective and depends on the dataset + +**Implementation approach:** + +**5a. Unify collision detection across simulations** + +After all three simulations complete, add a final "reconciliation" pass that applies global collision forces to all repo nodes together. In `js/chart.js` after the simulation calls: + +```javascript +// After all simulations complete, run a short global collision pass +const allRepoNodes = nodes.filter(n => n.type === "repo" || n.type === "owner"); +const reconciliation = d3.forceSimulation(allRepoNodes) + .force("collide", d3.forceCollide().radius(d => d.r + 8).strength(0.7)) + .force("containment", d3.forceRadial( + RADIUS_CONTRIBUTOR * 0.85, 0, 0 // keep repos inside the contributor ring + ).strength(0.05)) + .alpha(0.3) + .alphaDecay(0.05) + .stop(); + +for (let i = 0; i < 100; i++) reconciliation.tick(); +// Copy reconciled positions back to nodes +``` + +**5b. 
Tune per-simulation parameters** + +In each simulation file, adjust: +- **Owner sim**: Increase `forceCollide` radius between owner groups to prevent inter-group overlap. Add a weak `forceRadial` to distribute owner groups around a ring between contributors and center. +- **Contributor sim**: Increase the radial band where per-contributor repos sit. Currently repos cluster tightly; widen the angular spread. +- **Collaboration sim**: Add `d3.forceManyBody().strength(-30)` to push shared repos apart from each other. Increase `forceCollide` padding. + +**5c. Consider angular partitioning** + +For a more structured layout, assign each contributor an angular "sector" and constrain their linked repos to that sector: +```javascript +// Each contributor already has an angle from ring positioning +// Use that angle ± half the contributor's angular allocation +// as bounds for a forceRadial + angular constraint +``` +This prevents repos from drifting into other contributors' territory. Implementation requires a custom force function since D3 doesn't have built-in angular constraints. + +**5d. Add `d3-bboxCollide` for all nodes** + +Currently `d3-bboxCollide` is used for label collision. 
Extend it to also handle node-to-node overlap for repos: +```javascript +.force("bbox", d3.bboxCollide(d => { + const pad = 4; + return [[-d.r - pad, -d.r - pad], [d.r + pad, d.r + pad]]; +})) +``` + +**Risks:** +- High iteration cost: tuning forces requires visual feedback loops, not just code changes +- Dataset-dependent: parameters that work for 50 repos may fail for 200 +- Performance: adding a reconciliation simulation increases layout computation time +- May require multiple rounds of parameter adjustment after initial implementation + +**Verification:** +``` +✓ No node-on-node overlaps (repos don't sit on top of each other) +✓ Owner groups visually distinct and separated +✓ Central collaboration repos spread out, not piled in center +✓ Repos stay inside the contributor ring boundary +✓ Layout looks reasonable across different datasets/filter states +✓ Rebuild after filtering still produces even spacing +✓ Performance: layout completes in <2 seconds for current dataset +``` + +**Status:** 🔴 Needs further design iteration and visual tuning + +--- + ## Longer-Term Enhancements ### Additional Metrics 📊 Planned diff --git a/docs/VISUALIZATION_DESIGN_GUIDE.md b/docs/VISUALIZATION_DESIGN_GUIDE.md new file mode 100644 index 0000000..5d5150c --- /dev/null +++ b/docs/VISUALIZATION_DESIGN_GUIDE.md @@ -0,0 +1,651 @@ +# Visualization Design Guide: Sponsored vs. 
Community Contributors + +**Purpose:** Help you visualize and decide on the design for the tiered contributor visualization +**Status:** Design Decision Document +**Date:** February 2026 + +--- + +## Current Visualization (No Tiers) + +``` + All Contributors in Ring + + USER A (15) + + USER B (8) USER C (12) + + USER D (6) [Center] USER E (9) + + USER F (20) USER G (4) + + USER H (2) USER I (18) + + USER J (7) + + [Repositories with links to all contributors] +``` + +**Current Behavior:** +- All tracked contributors shown in fixed ring +- Repositories positioned based on force simulation +- Links show commit relationships +- No distinction between different contributor types + +--- + +## Proposed Design Option A: Outer Scattered Layout + +Uses existing "remaining" simulation to position community contributors. + +``` + Sponsored in Ring + + ALICE (15) + + BOB (8) [SPONSOR] CHARLIE (12) + + DAVE (6) [Center] EVE (9) + + FRANK (20) [REPOS] GRACE (4) + + HENRY (2) [SPONSOR] IRIS (18) + + JACK (7) + + + [Community scattered around/outside:] + + Unknown1 • Unknown2 • Unknown3 • + + Unknown4 • Unknown5 • + + + Unknown6 • Unknown7 • +``` + +**Characteristics:** +- Sponsored contributors: Central ring (prominent) +- Community contributors: Scattered around edges (visible but secondary) +- Visual hierarchy: Ring = important, scattered = supporting +- Reference: Current "extra contributors" positioning + +**Pros:** +- Minimal code changes (reuse existing simulation) +- Quick to implement (1 week) +- Clear visual hierarchy +- Familiar pattern (already used for extras) + +**Cons:** +- Community contributors may feel "random" +- Less organized appearance +- Harder to see all community members at once + +--- + +## Proposed Design Option B: Outer Ring Layout + +Creates second ring for community contributors. 
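A two-ring placement can be sketched as evenly spaced angles at two different radii. This is a minimal illustration, not the project's simulation code: `placeOnRing`, `layoutTiers`, the `tier` field and its `"sponsored"` value are hypothetical names, and the radii are borrowed from the example decision document later in this guide.

```javascript
// Hypothetical helper: place nodes evenly on a circle of the given radius.
// Angles start at the top and proceed clockwise in screen coordinates (y grows downward).
function placeOnRing(nodes, radius, cx = 0, cy = 0) {
  const n = nodes.length;
  return nodes.map((node, i) => {
    const angle = -Math.PI / 2 + (2 * Math.PI * i) / n;
    return {
      ...node,
      x: cx + radius * Math.cos(angle),
      y: cy + radius * Math.sin(angle),
    };
  });
}

// Assumed radii (150px inner, 400px outer), taken from the example decision document.
const SPONSORED_RADIUS = 150;
const COMMUNITY_RADIUS = 400;

// Assumes each contributor carries a `tier` field; anything not "sponsored"
// is treated as community here.
function layoutTiers(contributors) {
  const sponsored = contributors.filter(c => c.tier === "sponsored");
  const community = contributors.filter(c => c.tier !== "sponsored");
  return [
    ...placeOnRing(sponsored, SPONSORED_RADIUS),
    ...placeOnRing(community, COMMUNITY_RADIUS),
  ];
}
```

A fixed-angle placement like this avoids a custom force simulation, at the cost of ignoring where each community contributor's repos sit; a real implementation would probably use these positions only as starting points.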
+ +``` + ╔═════════════════════════════════════════════╗ + ║ ║ + ║ Unknown3 • Unknown2 • Unknown1 • ║ + ║ ║ + ║ Unknown7 • Unknown4 • ║ + ║ ║ + ║ Unknown6 • Unknown5 • ║ + ║ ║ + ║ COMMUNITY RING (Outer) ║ + ║ ║ + ╚═════════════════════════════════════════════╝ + + ALICE (15) + + BOB (8) [SPONSOR] CHARLIE (12) + + DAVE (6) [Center] EVE (9) + + FRANK (20) [REPOS] GRACE (4) + + HENRY (2) [SPONSOR] IRIS (18) + + JACK (7) + + ╔═════════════════════════════════════════════╗ + ║ SPONSORED RING (Inner) ║ + ║ ║ + ║ Named contributors arranged in circle ║ + ║ ║ + ╚═════════════════════════════════════════════╝ +``` + +**Characteristics:** +- Sponsored contributors: Inner ring (very prominent) +- Community contributors: Outer ring (visible, organized) +- Visual hierarchy: Ring position = importance +- Reference: ORCA visualization model + +**Pros:** +- Clear visual distinction (two rings) +- Organized appearance +- Like the ORCA model (recognizable pattern) +- Community members still visible/accessible + +**Cons:** +- More implementation effort (custom simulation) +- Longer development (2 weeks) +- May feel visually cluttered +- Requires tuning for attractive layout + +--- + +## Color & Style Design + +### Sponsored Contributor Styling + +``` +┌─────────────────────┐ +│ Sponsored Contrib │ +│ │ +│ ●●●●●●●● │ ← Circle node +│ ● Anthony ● │ (orange, full opacity) +│ ●●●●●●●● │ +│ │ +│ 14 commits │ ← Label +│ 5 repositories │ +└─────────────────────┘ + +Color: Grenadier Orange (#CF3F02) - Your brand color +Size: Default (e.g., 40px radius) +Opacity: 100% +Border: 2px solid darker orange +Label: Visible in ring +``` + +### Community Contributor Styling + +``` +┌─────────────────────┐ +│ Community Contrib │ +│ │ +│ ○○○○○○○ │ ← Circle node +│ ○ Unknown ○ │ (muted blue, 70% opacity) +│ ○○○○○○○ │ +│ │ +│ 2 commits │ ← Label (lighter) +│ 1 repository │ +└─────────────────────┘ + +Color: Aquamarine Blue (#2E86AB) - Secondary brand color +Size: 85% of default (e.g., 34px radius) +Opacity:
70% (muted appearance) +Border: 1px solid lighter blue +Label: Visible but lighter +``` + +### Comparison Visual + +``` +Sponsored (Full prominence) Community (Secondary prominence) + + ●●●●●●● ○○○○○○○ + ● (40px) ● ○ (34px) ○ + ● 100% ● ○ 70% ○ + ●●●●●●● ○○○○○○○ + Orange Blue + Bold Muted +``` + +--- + +## Link Styling (From Contributors to Repos) + +### Sponsored Contributor Links +``` +[SPONSOR] ═══════════════════ [REPO] + Orange Bold, Full Gray/Blue + Opacity Color +``` + +- Start color: Grenadier orange +- Width: Based on commit count (thicker = more commits) +- Opacity: 100% for recent contributions, 70% for old + +### Community Contributor Links +``` +[COMMUNITY] - - - - - - - - - [REPO] + Blue Dashed, 70% Gray/Blue + Opacity Color +``` + +- Start color: Aquamarine blue (muted) +- Width: Based on commit count +- Opacity: 70% (less prominent) +- Optional: Dashed line to indicate secondary contributor + +--- + +## Tooltip Design + +### Sponsored Contributor Tooltip + +``` +╔════════════════════════════╗ +║ Anthony Boyd ║ +║ ┌──────────────────────┐ ║ +║ │ SPONSORED CONTRIBUTOR│ ║ ← Orange badge +║ └──────────────────────┘ ║ +║ ║ +║ Contributions: 14 commits ║ +║ Repositories: 5 ║ +║ First Commit: Jan 2024 ║ +║ Last Commit: Feb 2026 ║ +╚════════════════════════════╝ +``` + +### Community Contributor Tooltip + +``` +╔════════════════════════════╗ +║ Unknown Contributor ║ +║ ┌──────────────────────┐ ║ +║ │ COMMUNITY CONTRIBUTOR│ ║ ← Blue badge +║ └──────────────────────┘ ║ +║ ║ +║ Contributions: 2 commits ║ +║ Repositories: 1 ║ +║ First Commit: Jun 2024 ║ +║ Last Commit: Oct 2025 ║ +╚════════════════════════════╝ +``` + +--- + +## Layout Comparison: Before vs. 
After + +### Before (Current - No Tiers) + +``` +INPUT: +- List of repos (fixed) +- All contributors to those repos +- No classification + +PROCESSING: +- Fetch all commits +- Identify all contributors +- No separation/grouping + +OUTPUT: +Users in Ring [All same visual style] +● User A +● User B +● User C + [etc - could be 50+ nodes] + +[Repos in center, connected to all] + +RESULT: +- Can't tell which contributors are important +- Hard to see community impact +- Visual clutter with many users +``` + +### After (New - With Tiers) + +``` +INPUT: +- List of repos (fixed) +- All contributors to those repos +- Sponsor list (config) + +PROCESSING: +- Fetch all commits +- Identify all contributors +- Classify as sponsored/community + +OUTPUT: +Sponsored Community +(Central Ring) (Outer Ring) +● Alice ○ Unknown1 +● Bob ○ Unknown2 +● Charlie ○ Unknown3 +[5-10 important] [30-100 rest] + +[Repos connected to both] + +RESULT: +- Clear hierarchy (ring position = importance) +- Easier to see community involvement +- Organized appearance +- Sponsored contributors highlighted +``` + +--- + +## Decision Matrix + +### Option A (Scattered Community) + +| Aspect | Rating | Notes | +|--------|--------|-------| +| Visual Clarity | ⭐⭐⭐ | Good - clear hierarchy | +| Implementation | ⭐⭐⭐⭐⭐ | Very easy - reuse existing | +| Development Time | ⭐⭐⭐⭐⭐ | 1 week | +| Aesthetics | ⭐⭐⭐ | Good but less organized | +| ORCA Similarity | ⭐⭐ | Loose reference | +| Community Recognition | ⭐⭐⭐ | Community still visible | +| Scalability | ⭐⭐⭐ | Fine for 50-100 community | + +### Option B (Outer Ring) + +| Aspect | Rating | Notes | +|--------|--------|-------| +| Visual Clarity | ⭐⭐⭐⭐ | Excellent - two-ring system | +| Implementation | ⭐⭐⭐ | Moderate - custom simulation | +| Development Time | ⭐⭐⭐ | 2 weeks | +| Aesthetics | ⭐⭐⭐⭐⭐ | Beautiful - professional | +| ORCA Similarity | ⭐⭐⭐⭐⭐ | Direct match | +| Community Recognition | ⭐⭐⭐⭐⭐ | Prominent - organized ring | +| Scalability | ⭐⭐⭐⭐ | Better for 100+ 
community | + +--- + +## Mobile Responsiveness + +### Desktop (Current) + +``` +┌──────────────────────────────────┐ +│ Visualization (1200x800) │ +│ ┌──────────────────────────────┐│ +│ │ Central ring layout ││ +│ │ Full visualization visible ││ +│ │ All nodes labeled ││ +│ │ Hover tooltips work ││ +│ └──────────────────────────────┘│ +│ ┌──────────────────────────────┐│ +│ │ Filters & Legend ││ +│ └──────────────────────────────┘│ +└──────────────────────────────────┘ +``` + +### Tablet (Moderate Screen) + +``` +┌─────────────────────────┐ +│ Visualization (600x500)│ +│ ┌───────────────────────┐ +│ │ Smaller nodes │ +│ │ Some labels removed │ +│ │ Zoom still works │ +│ └───────────────────────┘ +│ ┌───────────────────────┐ +│ │ Filters & Legend │ +│ │ (Simplified) │ +│ └───────────────────────┘ +└─────────────────────────┘ +``` + +### Mobile (Small Screen) + +``` +┌────────────┐ +│ MOBILE │ +│ (360x640)│ +│ ┌────────┐│ +│ │ Viz ││ ← Smaller +│ │(scalable) +│ │ Tap=|| +│ │ (no hover) +│ └────────┘│ +│ ┌────────┐│ +│ │Legend ││ ← Stacked +│ │Filters ││ Vertical +│ └────────┘│ +└────────────┘ +``` + +**Recommendation:** Start with desktop/tablet optimization. Mobile can be Phase 2. + +--- + +## Animation & Interaction + +### Hover Behavior + +``` +User hovers on node: + 1. Node: Slight grow animation (5% larger) + 2. Links: Highlight connected links (higher opacity) + 3. Tooltip: Appears near cursor + 4. Related nodes: Fade other nodes to 30% opacity + +User moves away: + 1. Node: Shrink back to normal + 2. Links: Return to default opacity + 3. Tooltip: Fade out + 4. Related nodes: Fade back to 100% + +Duration: 200ms smooth transitions +``` + +### Click Behavior + +``` +User clicks node: + 1. Node: Expand/highlight (visual "selection") + 2. Tooltip: Show detailed info + 3. Links: All links from node highlighted + 4. Related nodes: Highlight connected nodes + 5. Lock state until click elsewhere or Escape + +User clicks elsewhere: + 1. Deselect node + 2. 
Return to default view +``` + +--- + +## Filtering Behavior + +### Current Filtering +- Filter by organization +- Filter by stars +- Filter by language +- Results: Hides repos, cascades to hide links/contributors + +### Proposed Filtering (After New Feature) + +**Option 1: Tier-Aware Filtering** +``` +- Sponsored contributors: Always visible (never filtered) +- Community contributors: Can be hidden if filters exclude their repos +- Repos: Can be filtered +- Links: Show based on visible repo/contributor combination +``` + +**Option 2: Tier Toggle** +``` +- Checkbox: "Show community contributors" (default: ON) +- When OFF: Hide all community nodes, show only sponsored +- Useful for focused view on key contributors +- Fast way to simplify visualization +``` + +**Recommendation:** Implement Option 1 initially, add Option 2 in Phase 2 if requested. + +--- + +## Performance Considerations + +### Data Size Impact +``` +Scenario 1: Small Project +- 10 repos, 30 contributors (20 sponsored) +- Current: ~30 nodes + ~100 links +- After: No change in data size +- Performance: Excellent + +Scenario 2: Medium Project +- 50 repos, 150 contributors (20 sponsored) +- Current: ~150 nodes + ~500 links +- After: No change in data size +- Performance: Good + +Scenario 3: Large Project +- 75 repos, 300+ contributors (20 sponsored) +- Current: ~300 nodes + ~1000 links +- After: No change in data size +- Performance: Monitor, may need optimization +``` + +### Optimization Strategies +1. Lazy load community contributor details +2. Use simplified tooltips for community (load on demand) +3. Batch force simulation calculations +4. 
Render community nodes at lower detail initially + +--- + +## Color Accessibility + +### Current Colors +- Grenadier Orange (#CF3F02) +- Aquamarine Blue (#2E86AB) +- Base Gray (#443F3F) + +### Contrast Ratios +- Orange on white: ~4.8:1 ✅ (WCAG AA) +- Blue on white: ~4.1:1 ⚠️ (below the 4.5:1 AA threshold for normal text; meets 3:1 for large text and graphical elements) +- Gray on white: ~10.4:1 ✅ (WCAG AAA) + +### Colorblind-Friendly Design +- Don't rely on color alone +- Use size/shape/position for distinction +- Add tier badges (text) to distinguish +- Links: Use both color gradient and stroke width + +### Recommended Accessibility Features +1. Tier badge labels (text not just color) +2. High contrast borders on nodes +3. Alternative icons for colorblind users +4. Keyboard navigation support + +--- + +## Design Decision Template + +Use this to document your final choice: + +``` +DESIGN DECISION: [Option A / Option B] + +RATIONALE: +- Why this option? +- What makes it right for your use case? +- What are the key benefits? + +VISUAL SPECIFICATIONS: +- Sponsored node color: [color] +- Community node color: [color] +- Opacity differences: [specs] +- Size differences: [specs] +- Labels: [visible/hidden/conditional] + +COMMUNITY POSITIONING: +- Layout: [scattered/ring/other] +- Distance from center: [radius] +- Interaction behavior: [behavior] + +TOOLTIPS: +- Show tier? [yes/no] +- Tier badge style: [style] +- Information shown: [fields] + +FILTERS: +- Community visible by default? [yes/no] +- Can be filtered out? [yes/no] +- Always show sponsored?
[yes/no] + +TIMELINE: +- Design approval: [date] +- Development start: [date] +- Target launch: [date] +``` + +--- + +## Example: Completed Decision Document + +``` +DESIGN DECISION: Option B (Outer Ring) + +RATIONALE: +- Matches ORCA model (client familiar with it) +- Professional appearance +- Clear visual hierarchy +- Scales well with large communities + +VISUAL SPECIFICATIONS: +- Sponsored node color: #CF3F02 (Grenadier Orange) +- Community node color: #2E86AB (Aquamarine Blue) +- Community opacity: 75% +- Community size: 85% of sponsored +- All nodes labeled (adjustable font size by tier) + +COMMUNITY POSITIONING: +- Layout: Outer ring +- Distance from center: 400px (vs 150px for sponsored) +- Gentle repulsion between community nodes (no overlap) +- Radial positioning (angle-based like sponsored ring) + +TOOLTIPS: +- Show tier? Yes +- Tier badge style: Colored pill with text +- Information shown: Name, Tier, Commits, Repos, Dates + +FILTERS: +- Community visible by default? Yes +- Can be filtered out? Yes (via repo filters) +- Always show sponsored? Yes (never hidden) + +TIMELINE: +- Design approval: Feb 14, 2026 +- Development start: Feb 17, 2026 +- Target launch: Mar 14, 2026 (4 weeks) +``` + +--- + +## Next Steps + +1. **Choose Option A or Option B** + - Or propose a hybrid/custom approach + +2. **Finalize Color Scheme** + - Confirm using existing brand colors + - Or propose alternatives + +3. **Specify Tier Labels/Badging** + - How should tiers be shown? + - Text, icons, colors, or combinations? + +4. **Decide on Community Visibility** + - Always shown? + - Togglable? + - Filtered by default? + +5. **Approve Timeline** + - Option A: 3 weeks + - Option B: 4 weeks + - With feedback loops: add 1 week + +Once these decisions are made, the implementation roadmap can proceed with certainty. 
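When confirming the color scheme in step 2, each candidate color can be checked against the background using the WCAG 2.x contrast formula (AA asks for at least 4.5:1 for normal text and 3:1 for large text). The sketch below assumes `#RRGGBB` hex strings and a white background; the function names are hypothetical and not part of the existing codebase.

```javascript
// Convert one 8-bit sRGB channel to linear light, per the WCAG relative luminance definition.
function channelToLinear(value) {
  const c = value / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#RRGGBB" color (0 = black, 1 = white).
function relativeLuminance(hex) {
  const n = parseInt(hex.slice(1), 16);
  const r = channelToLinear((n >> 16) & 0xff);
  const g = channelToLinear((n >> 8) & 0xff);
  const b = channelToLinear(n & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors; argument order does not matter.
function contrastRatio(hexA, hexB) {
  const la = relativeLuminance(hexA);
  const lb = relativeLuminance(hexB);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// Brand colors listed under "Current Colors", checked against white.
const brandColors = { orange: "#CF3F02", blue: "#2E86AB", gray: "#443F3F" };
for (const [name, hex] of Object.entries(brandColors)) {
  console.log(name, contrastRatio(hex, "#FFFFFF").toFixed(2));
}
```

Running this for any proposed alternative colors gives a quick pass/fail signal before the palette is locked in.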
+ +--- + +**This guide is ready for design discussion and feedback.** + +**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/00_READ_ME_FIRST.txt b/docs/sponsor_centric/00_READ_ME_FIRST.txt new file mode 100644 index 0000000..31fc157 --- /dev/null +++ b/docs/sponsor_centric/00_READ_ME_FIRST.txt @@ -0,0 +1,365 @@ +================================================================================ + FEASIBILITY ASSESSMENT - SPONSORED VS COMMUNITY CONTRIBUTORS +================================================================================ + +STATUS: ✅ APPROVED FOR IMPLEMENTATION (3-4 weeks estimated) + +This folder contains a complete feasibility assessment for your feature request: +"Visualize sponsored contributors in a ring, community contributors scattered" + +================================================================================ + START HERE +================================================================================ + +1. Read this file first (you are here!) +2. Then read: ASSESSMENT_INDEX.md (navigation guide) +3. Then read: FEATURE_REQUEST_SUMMARY.md (quick overview) + +That's all you need for the first round. It takes 30 minutes. 
+ +================================================================================ + WHAT YOU'RE GETTING +================================================================================ + +✅ ASSESSMENT_INDEX.md + - Navigation guide for all 4 documents + - Quick reference tables + - Who should read what + - Key questions answered + +✅ FEATURE_REQUEST_SUMMARY.md + - Executive summary (approvals/stakeholders read this) + - What's being built & why + - Feasibility verdict: HIGHLY FEASIBLE + - Configuration examples + - Next steps for approval + +✅ FEASIBILITY_ASSESSMENT.md + - Technical deep-dive (developers/architects read this) + - Architecture impact analysis + - File-by-file changes required + - Risk assessment + - Testing strategy + - Comparison to ORCA model + +✅ IMPLEMENTATION_ROADMAP.md + - Step-by-step implementation guide + - 6 phases from backend to release + - Code examples for each phase + - Testing procedures + - Validation checkpoints + - Command reference + +✅ VISUALIZATION_DESIGN_GUIDE.md + - Design specs for two options (A & B) + - Option A: Scattered community (simpler, 1 week) + - Option B: Outer ring (more polished, 2 weeks) + - Color schemes and styling + - Wireframes and diagrams + - Decision matrix + - Design decision template + +================================================================================ + KEY FINDINGS +================================================================================ + +FEASIBILITY: ✅ Highly Feasible +Risk Level: 🟡 Low-to-Medium (no high risks) +Effort Required: ~300 lines of code total +Timeline: 3-4 weeks (Option A) to 4-5 weeks (Option B) +Breaking Changes: None - fully backward compatible +Architecture Ready: Yes - modular design supports this well + +BACKEND: ~50 lines Python + - Config parsing for sponsor list + - Contributor classification + - CSV output with tier column + Duration: 4-5 days + +FRONTEND: ~250 lines JavaScript + - Load tier data + - Position nodes by tier + - Render with tier-based 
styling + Duration: 1-2 weeks + +DESIGN DECISION: Option A vs Option B + - A: Simpler, faster (1 week for layout) + - B: More polished, like ORCA (2 weeks for layout) + RECOMMENDATION: Start with A, upgrade to B based on feedback + +================================================================================ + QUICK START GUIDE +================================================================================ + +Step 1: UNDERSTAND (30 min) + → Read: ASSESSMENT_INDEX.md (3 min) + → Read: FEATURE_REQUEST_SUMMARY.md (10 min) + → Read: VISUALIZATION_DESIGN_GUIDE.md - Option A & B (15 min) + +Step 2: DECIDE (15 min) + → Which design option? A (simple) or B (polished)? + → Who needs to approve? + → Create timeline + +Step 3: PLAN (30 min) + → Read: IMPLEMENTATION_ROADMAP.md - Overview + → Assign developers to phases + → Schedule implementation + +Step 4: BUILD (3-4 weeks) + → Execute phases 1-6 using IMPLEMENTATION_ROADMAP.md + → Follow code examples and testing procedures + → Iterate based on feedback + +================================================================================ + DOCUMENT READING ORDER +================================================================================ + +For Executives/Approvers: + 1. This file (README) + 2. FEATURE_REQUEST_SUMMARY.md ← Key document + 3. VISUALIZATION_DESIGN_GUIDE.md (design options only) + +For Developers: + 1. This file (README) + 2. ASSESSMENT_INDEX.md (navigation) + 3. FEATURE_REQUEST_SUMMARY.md (context) + 4. IMPLEMENTATION_ROADMAP.md ← Key document + 5. Code examples in relevant phase + +For Architects/Tech Leads: + 1. This file (README) + 2. ASSESSMENT_INDEX.md (navigation) + 3. FEASIBILITY_ASSESSMENT.md ← Key document + 4. IMPLEMENTATION_ROADMAP.md (phases overview) + +For Designers/Product Managers: + 1. This file (README) + 2. FEATURE_REQUEST_SUMMARY.md (context) + 3. VISUALIZATION_DESIGN_GUIDE.md ← Key document + 4. 
Design decision template (complete before Phase 3) + +================================================================================ + WHAT HAPPENS NEXT +================================================================================ + +Week 1: Approval & Planning + - Stakeholders review FEATURE_REQUEST_SUMMARY.md + - Design team reviews VISUALIZATION_DESIGN_GUIDE.md + - Decision: Option A or Option B? + - Get executive approval to proceed + +Week 2: Development Kickoff + - Review IMPLEMENTATION_ROADMAP.md + - Assign developers to phases + - Phase 1 (Backend) begins + - Dev work: 4-5 days + +Weeks 3-4: Frontend Development + - Phase 2 (Data Loading): 2-3 days + - Phase 3 (Layout/Simulation): 5-7 days (depends on Option A vs B) + - Phase 4 (Rendering): 3-4 days + - Continuous testing + +Weeks 4-5: Testing & Refinement + - Phase 5 (Comprehensive Testing): 3-5 days + - Phase 6 (Polish & Release): 3-5 days + - Design feedback and iterations + - Production deployment + +================================================================================ + DESIGN OPTIONS AT A GLANCE +================================================================================ + +Option A: SCATTERED COMMUNITY LAYOUT (Simpler) +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +┌─────────────────────────────────────────────┐ +│ │ +│ Unknown1 • Unknown2 • Unknown3 • │ +│ │ +│ Unknown4 • Unknown5 • │ +│ │ +│ │ +│ [SPONSORED RING] │ +│ ALICE BOB CHARLIE DAVE │ +│ (Central, prominent) │ +│ │ +│ Unknown6 • Unknown7 • │ +│ │ +│ (Scattered around ring - visible, less │ +│ prominent, uses existing simulation) │ +└─────────────────────────────────────────────┘ + +Pros: Easy to implement (1 week), reuses existing code +Cons: Less organized appearance + +Option B: OUTER RING LAYOUT (More Polished) +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +┌─────────────────────────────────────────────┐ +│ Unknown3 • Unknown2 • Unknown1 • │ +│ │ +│ Unknown7 • Unknown4 • │ +│ │ +│ Unknown6 • Unknown5 • │ +│ │ +│ 
╔═════════════════════════════════════════╗ │ +│ ║ [COMMUNITY RING - Outer] ║ │ +│ ║ (Organized circle of community nodes) ║ │ +│ ╚═════════════════════════════════════════╝ │ +│ │ +│ ALICE BOB CHARLIE │ +│ DAVE EVE │ +│ (SPONSORED RING - Inner) │ +│ [Central Repositories] │ +│ │ +└─────────────────────────────────────────────┘ + +Pros: Professional appearance, like ORCA, clear hierarchy +Cons: Requires custom simulation (2 weeks) + +RECOMMENDATION: Start with Option A, upgrade to B based on feedback + +================================================================================ + KEY METRICS +================================================================================ + +Code Changes: ~300 lines total + Backend: ~50 lines (config + classification) + Frontend: ~250 lines (data loading + layout + rendering) + +Timeline: 3-4 weeks (Option A) to 4-5 weeks (Option B) + Phase 1 Backend: 4-5 days + Phase 2 Frontend: 2-3 days + Phase 3 Layout: 5-7 days (varies by option) + Phase 4 Rendering: 3-4 days + Phase 5 Testing: 3-5 days + Phase 6 Polish: 3-5 days + +Risk Level: 🟡 Low-to-Medium + High Risk: None identified + Medium Risk: Visual design tuning, force simulation performance + Low Risk: Config changes, data model, CSV output + +Backward Compat: ✅ 100% - existing configs work unchanged +Performance: ✅ No degradation expected +Testing: ✅ Comprehensive test procedures included +Documentation: ✅ Code examples for each phase + +================================================================================ + FREQUENTLY ASKED QUESTIONS +================================================================================ + +Q: Is this actually feasible? +A: ✅ YES. Highly feasible. Architecture is perfect for this. + +Q: Will it break existing functionality? +A: ✅ NO. Fully backward compatible. + +Q: How long will it take? +A: 3-4 weeks (Option A) to 4-5 weeks (Option B) + +Q: What are the main risks? 
+A: Visual design tuning and force simulation performance (medium risks). + No high-risk items identified. + +Q: Do we need to refactor existing code? +A: No. Most changes are additive. Minimal refactoring needed. + +Q: Can we do this iteratively? +A: ✅ YES. Option A works as first pass, upgrade to Option B later. + +Q: What about the sponsor list maintenance? +A: Simple - just edit config.toml. No code changes needed. + +Q: Will community contributors feel left out? +A: No - they're still visible, just positioned differently. They're + recognized and connected to repos, just not in the center ring. + +Q: Can we filter by tier? +A: ✅ YES. You can show/hide community contributors as desired. + +Q: Does it work on mobile? +A: ✅ YES. Visualization is responsive. Phase 2 can optimize further. + +================================================================================ + NEXT IMMEDIATE STEPS +================================================================================ + +TODAY: +☐ Read: ASSESSMENT_INDEX.md (3 min) +☐ Read: FEATURE_REQUEST_SUMMARY.md (10 min) +☐ Read: VISUALIZATION_DESIGN_GUIDE.md pages 1-20 (15 min) + +THIS WEEK: +☐ Team discussion: Option A or Option B? 
+☐ Get executive/client approval +☐ Schedule implementation timeline + +NEXT WEEK: +☐ Assign developers (backend + frontend) +☐ Schedule Phase 1 kickoff +☐ Begin Phase 1 development + +================================================================================ + DOCUMENT LOCATIONS +================================================================================ + +In this folder, you'll find: + +• 00_READ_ME_FIRST.txt (this file) +• ASSESSMENT_INDEX.md (navigation guide - start here) +• FEATURE_REQUEST_SUMMARY.md (overview - read 2nd) +• FEASIBILITY_ASSESSMENT.md (technical detail) +• IMPLEMENTATION_ROADMAP.md (how to build) +• VISUALIZATION_DESIGN_GUIDE.md (design specs) + +Plus, these documents from the existing project: +• docs/ARCHITECTURE.md (existing project architecture) +• config.toml (configuration file) +• CLAUDE.md (developer guide) + +================================================================================ + QUICK NAVIGATION +================================================================================ + +Want to know: Read: +─────────────────────────────────────────────────────────────────────────── +"Is this feasible?" FEATURE_REQUEST_SUMMARY.md (2 min) +"What needs to change?" FEASIBILITY_ASSESSMENT.md (15 min) +"How do I build this?" IMPLEMENTATION_ROADMAP.md (30 min) +"What should it look like?" VISUALIZATION_DESIGN_GUIDE.md (20 min) +"Which path do I take?" ASSESSMENT_INDEX.md (5 min) +"When will it be done?" FEATURE_REQUEST_SUMMARY.md (1 min) +"What are the risks?" FEASIBILITY_ASSESSMENT.md - Risk section (5 min) +"How much code is this?" 
FEASIBILITY_ASSESSMENT.md - File changes table (2 min) + +================================================================================ + ASSESSMENT COMPLETION SUMMARY +================================================================================ + +✅ Architecture review: Complete +✅ Feasibility analysis: Complete +✅ Risk assessment: Complete +✅ Implementation planning: Complete +✅ Design options: Complete +✅ Code examples: Complete +✅ Testing procedures: Complete +✅ Timeline estimates: Complete +✅ Resource planning: Complete +✅ Documentation: Complete + +STATUS: Ready for Implementation +DATE: February 2026 + +================================================================================ + +Now proceed to: ASSESSMENT_INDEX.md + +That file will guide you through all the assessment documents and help you +navigate to exactly what you need. + +Questions? Each document has clear sections and a table of contents. + +Good luck! 🚀 + +================================================================================ diff --git a/docs/sponsor_centric/ASSESSMENT_INDEX.md b/docs/sponsor_centric/ASSESSMENT_INDEX.md new file mode 100644 index 0000000..f0826a4 --- /dev/null +++ b/docs/sponsor_centric/ASSESSMENT_INDEX.md @@ -0,0 +1,408 @@ +# Feasibility Assessment Index + +**Project:** Contributor Network - Sponsored vs. Community Contributor Visualization +**Requested:** Feature request to show tiered contributors (sponsored in ring, community scattered/outer) +**Assessment Date:** February 2026 +**Status:** ✅ **APPROVED FOR IMPLEMENTATION** + +--- + +## 📋 Document Overview + +This assessment consists of 4 comprehensive documents. Start here and navigate to what you need. + +### 1. 
📊 **FEATURE_REQUEST_SUMMARY.md** ← **START HERE** +**Best for:** Getting a quick overview of the feature + +Contains: +- Executive summary of what's being built +- High-level feasibility assessment (✅ Highly Feasible) +- Key design decisions +- Impact on existing system +- Configuration examples +- Next steps for approval + +**Read this first if you:** +- Need a quick overview +- Are deciding whether to proceed +- Want to understand client value +- Need to explain to stakeholders + +**Time to read:** 10-15 minutes + +--- + +### 2. 🔧 **FEASIBILITY_ASSESSMENT.md** ← **DETAILED ANALYSIS** +**Best for:** Understanding technical details and risks + +Contains: +- Current state analysis +- What works in your favor (architecture is ready) +- Current constraints +- Proposed implementation model +- File-by-file changes needed +- Detailed risk assessment +- Testing strategy +- Architectural decisions & tradeoffs +- Comparison to ORCA model +- Known unknowns + +**Read this if you:** +- Are technically involved in the project +- Want to understand architecture impact +- Need to assess risks +- Are deciding between implementation options +- Want to know why things are feasible + +**Time to read:** 20-30 minutes + +**Key Finding:** ~50 lines of Python, ~250 lines of JavaScript, 3-4 weeks timeline + +--- + +### 3. 
🛣️ **IMPLEMENTATION_ROADMAP.md** ← **STEP-BY-STEP GUIDE** +**Best for:** Actually building the feature + +Contains: +- 6 phases of development (backend → frontend → testing → release) +- Phase 1: Backend (config + classification) - 4-5 days +- Phase 2: Frontend data loading - 2-3 days +- Phase 3: Layout & simulation - 5-7 days (varies by option) +- Phase 4: Rendering & styling - 3-4 days +- Phase 5: Testing & validation - 3-5 days +- Phase 6: Refinement & release - 3-5 days +- Detailed code examples for each phase +- Testing procedures with sample code +- Risk mitigations +- Command reference + +**Read this if you:** +- Are ready to start implementation +- Need step-by-step instructions +- Want code examples +- Need testing procedures +- Are assigning tasks to developers + +**Time to read:** 30-45 minutes (to understand structure) +**Reference time:** Look up specific phase as needed during development + +--- + +### 4. 🎨 **VISUALIZATION_DESIGN_GUIDE.md** ← **DESIGN SPECIFICATIONS** +**Best for:** Visual design decisions and mockups + +Contains: +- Current visualization diagram +- Option A design: Scattered community layout +- Option B design: Outer ring layout +- Color & style specifications +- Link styling (sponsored vs community) +- Tooltip designs +- Before/after comparison +- Design decision matrix +- Mobile responsiveness considerations +- Animation & interaction specs +- Accessibility & color contrast info +- Design decision template (to document your choice) + +**Read this if you:** +- Are deciding between design options +- Need to specify visual appearance +- Are communicating design to client +- Want accessibility guidelines +- Need mockups/diagrams + +**Time to read:** 20-30 minutes + +--- + +## 🎯 Quick Reference: Who Should Read What? + +### Project Manager / Client +1. **FEATURE_REQUEST_SUMMARY.md** (10 min) +2. **VISUALIZATION_DESIGN_GUIDE.md** - Pages 1-12 (Design options) +3. Decision: Option A or Option B? + +### Frontend Developer +1. 
**FEATURE_REQUEST_SUMMARY.md** (10 min) - context +2. **IMPLEMENTATION_ROADMAP.md** - Phases 2-4 (20 min) +3. **FEASIBILITY_ASSESSMENT.md** - Data Model section (5 min) +4. Start with Phase 2.1 in ROADMAP + +### Backend Developer +1. **FEATURE_REQUEST_SUMMARY.md** (10 min) - context +2. **IMPLEMENTATION_ROADMAP.md** - Phase 1 (15 min) +3. **FEASIBILITY_ASSESSMENT.md** - Backend Changes table (5 min) +4. Start with Phase 1.1 in ROADMAP + +### Architect / Tech Lead +1. **FEASIBILITY_ASSESSMENT.md** (25 min) - entire document +2. **IMPLEMENTATION_ROADMAP.md** - Architecture section (10 min) +3. Review impact on existing systems + +### Designer / Product +1. **VISUALIZATION_DESIGN_GUIDE.md** (25 min) +2. **FEATURE_REQUEST_SUMMARY.md** (10 min) - context +3. Complete design decision template + +--- + +## 📊 Key Metrics at a Glance + +### Feasibility +- ✅ **Highly Feasible** +- Architecture is modular and ready +- No breaking changes required +- Backward compatible + +### Scope +| Component | Effort | Time | +|-----------|--------|------| +| Backend | ~50 lines | 4-5 days | +| Frontend | ~250 lines | 1-2 weeks | +| Testing | Comprehensive | 1 week | +| **Total** | **~300 lines** | **3-4 weeks** | + +### Risk Level +- **Low:** Config changes, CSV generation, data model +- **Medium:** Visual design tuning, force simulation performance +- **High:** None identified + +### Design Options +| Option | Time | Complexity | Visual Distinction | +|--------|------|-----------|-------------------| +| A: Scattered | 1 week | Low | Good | +| B: Outer Ring | 2 weeks | Moderate | Excellent | + +--- + +## 🔄 Document Dependencies + +``` +FEATURE_REQUEST_SUMMARY + ↓ + ├─→ FEASIBILITY_ASSESSMENT (if technical questions) + ├─→ VISUALIZATION_DESIGN_GUIDE (if design questions) + └─→ IMPLEMENTATION_ROADMAP (when ready to build) + +Before Implementation: + 1. Read FEATURE_REQUEST_SUMMARY + 2. Clarify design (VISUALIZATION_DESIGN_GUIDE) + 3. Get approval + 4. 
Start with IMPLEMENTATION_ROADMAP +``` + +--- + +## 🚀 Getting Started: 3 Steps + +### Step 1: Understand the Feature (15 min) +``` +Read: FEATURE_REQUEST_SUMMARY.md +Ask: "Does this solve the client's problem?" +Decide: Proceed or clarify requirements? +``` + +### Step 2: Decide on Design (20 min) +``` +Read: VISUALIZATION_DESIGN_GUIDE.md (Options A & B) +Review: Design decision matrix +Decide: Option A (simpler) or Option B (more polished)? +``` + +### Step 3: Plan Implementation (30 min) +``` +Read: IMPLEMENTATION_ROADMAP.md (Phase overview) +Review: Timeline and phases +Assign: Developers to phases +Start: Phase 1 (backend) first +``` + +--- + +## 📝 Key Questions Answered + +**Q: Is this feature feasible?** +A: ✅ Yes, highly feasible. Architecture supports it well. + +**Q: How long will it take?** +A: 3-4 weeks (Option A) to 4-5 weeks (Option B) + +**Q: Will it break existing functionality?** +A: ✅ No, fully backward compatible. + +**Q: What about performance?** +A: ✅ No degradation expected, data size unchanged. + +**Q: What are the main risks?** +A: Visual design tuning (medium risk), force simulation tuning (medium risk), no high risks identified. + +**Q: Do we need to refactor existing code?** +A: Minimal - mostly additive changes. + +**Q: Can we do this iteratively?** +A: ✅ Yes, Option A works as first pass, upgrade to Option B later. + +**Q: What about backward compatibility?** +A: ✅ Fully compatible - existing configs work without changes. + +--- + +## 🎓 Understanding the Architecture + +**You don't need to read the full ARCHITECTURE.md, but here's the key insight:** + +The visualization has three layers: + +1. **Data Layer** (Python) + - Fetches from GitHub + - Generates CSVs + - **Change Here:** Add tier classification + +2. **Processing Layer** (JavaScript data preparation) + - Loads CSV data + - Prepares nodes/links + - **Change Here:** Load tier field, separate by tier + +3. 
**Visualization Layer** (JavaScript rendering) + - Force simulations (position nodes) + - Canvas drawing (render) + - Interaction (hover/click) + - **Change Here:** Different simulation/positioning by tier + +The beauty of the current architecture is that each layer is separate, so changes to one don't cascade to the others. + +--- + +## ✅ Success Criteria + +The feature is successful when: + +1. ✅ Configuration supports sponsored contributor list +2. ✅ CSV output includes tier column +3. ✅ Frontend loads and displays tier data +4. ✅ Visualization positions contributors by tier +5. ✅ Tooltips show tier designation +6. ✅ All tests pass +7. ✅ No performance regression +8. ✅ Backward compatible with existing configs + +--- + +## 🤔 Common Questions + +**Q: Do we have to do Option B (the harder one)?** +A: No, start with Option A. It works well and ships faster. Upgrade to B based on feedback. + +**Q: How much code needs to change?** +A: ~300 lines total across Python and JavaScript. Most of existing code stays the same. + +**Q: Can we do this in parallel?** +A: Yes - backend (Python) can be done in parallel with frontend (JavaScript) after phase 2. + +**Q: What about the current visualization? Will it change?** +A: Only how contributors are positioned. Repos and links work the same. + +**Q: Do we need client approval for design?** +A: ✅ Yes, show them the two options (A & B) from VISUALIZATION_DESIGN_GUIDE.md. + +**Q: How do we handle the "sponsor list" in config?** +A: Add a new `[contributors.sponsored]` section to config.toml. Defaults to existing `[contributors.devseed]` if not present. 
+ +--- + +## 📚 Documents in This Assessment + +| Document | Purpose | Length | Audience | +|----------|---------|--------|----------| +| FEATURE_REQUEST_SUMMARY.md | Overview & decision | 5 pages | Everyone | +| FEASIBILITY_ASSESSMENT.md | Technical deep-dive | 15 pages | Tech team | +| IMPLEMENTATION_ROADMAP.md | Step-by-step guide | 20 pages | Developers | +| VISUALIZATION_DESIGN_GUIDE.md | Design specs | 15 pages | Designers/PMs | +| ASSESSMENT_INDEX.md (this file) | Navigation | 3 pages | Everyone | + +**Total: ~60 pages of detailed guidance** + +--- + +## 🎯 Next Actions + +### Immediate (This Week) +- [ ] Read FEATURE_REQUEST_SUMMARY.md (stakeholders) +- [ ] Review VISUALIZATION_DESIGN_GUIDE.md options (design/PM) +- [ ] Get client input on Option A vs. B + +### Planning (Next Week) +- [ ] Finalize design choice +- [ ] Get executive approval +- [ ] Assign developers to phases +- [ ] Create sprint plan + +### Development (Following Week) +- [ ] Kick off Phase 1 (Backend) +- [ ] Use IMPLEMENTATION_ROADMAP.md as guide +- [ ] Execute phases 1-6 sequentially + +--- + +## 💡 Pro Tips + +1. **Start with Option A** - simpler, faster, still looks great. Upgrade to B based on feedback. + +2. **Do phases sequentially** - Phase 1 (backend) must complete before Phase 2-3 (frontend). + +3. **Get design approval early** - Before starting Phase 3, finalize visual design. + +4. **Test continuously** - Each phase has testing procedures. Don't skip them. + +5. **Keep backward compatibility** - All changes are additive. Existing configs still work. + +6. **Document decisions** - Use the design decision template in VISUALIZATION_DESIGN_GUIDE.md. 
+ +--- + +## 🔗 Quick Links + +- **Overview:** FEATURE_REQUEST_SUMMARY.md +- **Technical:** FEASIBILITY_ASSESSMENT.md +- **Implementation:** IMPLEMENTATION_ROADMAP.md +- **Design:** VISUALIZATION_DESIGN_GUIDE.md +- **Architecture:** docs/ARCHITECTURE.md (existing project docs) +- **Config Example:** config.toml (in project root) + +--- + +## 📞 Contact & Questions + +For specific questions: + +1. **"How do I build this?"** → Read IMPLEMENTATION_ROADMAP.md Phase 1 +2. **"What are the risks?"** → Read FEASIBILITY_ASSESSMENT.md Risk section +3. **"What should it look like?"** → Read VISUALIZATION_DESIGN_GUIDE.md +4. **"Is this worth doing?"** → Read FEATURE_REQUEST_SUMMARY.md +5. **"How does the architecture work?"** → Read docs/ARCHITECTURE.md + +--- + +## 🎉 Summary + +**Status:** ✅ **READY TO PROCEED** + +This is a well-scoped, low-risk feature that: +- ✅ Solves the client's problem +- ✅ Fits the architecture well +- ✅ Has clear implementation path +- ✅ Can be done in 3-4 weeks +- ✅ Is backward compatible +- ✅ Has documented design options + +**Next Step:** Review FEATURE_REQUEST_SUMMARY.md and VISUALIZATION_DESIGN_GUIDE.md, then decide on design (Option A or B). 
+ +--- + +**Assessment Complete** +**All Documentation Ready** +**Status: Ready for Implementation** + +**Date:** February 2026 +**Assessor:** Claude diff --git a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md new file mode 100644 index 0000000..c0e3299 --- /dev/null +++ b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md @@ -0,0 +1,582 @@ +# Feasibility Assessment: Contributor Network Visualization Redesign + +**Date:** February 2026 +**Requested By:** Client Feature Request +**Assessment Author:** Claude + +--- + +## Executive Summary + +**Status:** ✅ **HIGHLY FEASIBLE** + +The proposed redesign—shifting from a search-based model to a fixed repository list with tiered contributor visualization (inspired by ORCA)—is well-aligned with the current architecture and achievable with moderate effort. + +**Key Findings:** +- Current data pipeline already supports a fixed repository list (via `config.toml`) +- Visualization architecture is modular and flexible enough to accommodate a new layout strategy +- Data model supports contributor classification (sponsored vs. community) +- Primary changes are **isolated to Python backend and JavaScript layout/rendering layers** + +**Estimated Scope:** 3-4 weeks of development +**Risk Level:** Low to Medium +**Technical Debt Impact:** Minimal (improves code organization) + +--- + +## Current State Analysis + +### What Works in Your Favor + +1. **Fixed Repository List Already in Place** + - `config.toml` defines exactly which repositories to track + - Python backend (`client.py`, `cli.py`) already fetches only these repos + - No changes needed to data fetching logic + +2. **Flexible Data Models** + - `Repository` model stores all needed metadata + - `Link` model tracks contributor-to-repo relationships + - Easy to add contributor classification (sponsored/community) + +3. 
**Modular JavaScript Architecture** + - Data preparation is being extracted (`prepareData()`) + - Force simulations are modular (can add new simulation types) + - Render pipeline supports multiple visualization strategies + - State management is clean and isolated + +4. **Clear Separation of Concerns** + - Python: Data fetching, validation, CSV generation + - JavaScript: Visualization, interaction, rendering + - No tight coupling between layers + +### Current Constraints + +1. **Visualization Assumes All Contributors Are "Sponsored"** + - Current layout places all contributors in a fixed ring + - No concept of "community contributors" vs "sponsored contributors" + - Links from all contributors treated equally + +2. **Force Simulations Built for Current Model** + - 4 existing simulations: owner, contributor, collaboration, remaining + - "Remaining" simulation places extra contributors in outer ring + - This is actually close to what you need for community contributors! + +3. **Configuration System** + - Only supports one contributor list: `[contributors.devseed]` + - Would need to add a new section for "sponsored contributors" + - Requires minor refactoring of `config.py` + +--- + +## Proposed Implementation Model + +### Overview + +``` +BEFORE (Current): +┌─────────────────────┐ +│ Fixed Repo List │ +│ (config.toml) │ +└──────────┬──────────┘ + ↓ +┌─────────────────────┐ +│ Fetch All Commits │ +│ From These Repos │ +└──────────┬──────────┘ + ↓ +┌─────────────────────┐ +│ ALL Contributors │ +│ in Ring (Circular) │ +└─────────────────────┘ + + +AFTER (Proposed): +┌──────────────────────────────────────┐ +│ Fixed Repo List + Sponsored List │ +│ (config.toml) │ +└──────────────────┬───────────────────┘ + ↓ +┌──────────────────────────────────────┐ +│ Fetch Commits From Fixed Repos │ +│ (unchanged) │ +└──────────────────┬───────────────────┘ + ↓ + ┌───────────┴───────────┐ + ↓ ↓ +┌──────────────────┐ ┌─────────────────┐ +│ Sponsored Contribs │ Community Contribs │ +│ 
(Ring Layout) │ (Scattered Layout) │ +└──────────────────┘ └─────────────────┘ +``` + +### New Data Structure + +#### 1. Configuration (`config.toml`) + +Add a new section for sponsored contributors (can reference existing `[contributors.devseed]` or create a new one): + +```toml +# Repositories to track (unchanged) +repositories = [ + "developmentseed/titiler", + "developmentseed/lonboard", + # ... rest of repos +] + +# NEW: Define which contributors are "sponsored" (vs. community) +[contributors.sponsored] +# These are the people you want to highlight +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" +# ... other sponsored contributors + +# Existing section (can be repurposed or kept for reference) +[contributors.devseed] +aboydnw = "Anthony Boyd" +# ... all devseed members +``` + +#### 2. Config Model Enhancement (`config.py`) + +Add support for reading sponsored contributors list: + +```python +class Config(BaseModel): + organization_name: str + repositories: list[str] + contributors: dict[str, dict[str, str]] # existing structure + # Could add: + # sponsored_contributors: list[str] # GitHub usernames +``` + +#### 3. CSV Output Enhancement + +Add a `contributor_tier` column to `contributors.csv`: + +```csv +name,tier +Anthony Boyd,sponsored +Unknown Contributor,community +Pete Gadomski,sponsored +``` + +This allows the frontend to classify contributors without additional data fetching. + +--- + +## Implementation Steps + +### Phase 1: Backend (Python) - 1-2 weeks + +**Goal:** Classify contributors as sponsored or community, output classification in CSV. 
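
As a rough sketch of that goal, the snippet below writes a `name,tier` CSV from a username-to-name mapping and a set of sponsored usernames. The function name `write_contributors_csv` and its parameters are assumptions made for illustration; the real `csvs` command in `cli.py` may be organized differently.

```python
import csv
import io


def write_contributors_csv(names_by_username, sponsored_usernames, out):
    """Write one row per contributor with a sponsored/community tier column."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "tier"])
    for username, name in names_by_username.items():
        # Anyone not on the sponsored list is classified as community.
        tier = "sponsored" if username in sponsored_usernames else "community"
        writer.writerow([name, tier])


buffer = io.StringIO()
write_contributors_csv(
    {"aboydnw": "Anthony Boyd", "someone": "Unknown Contributor"},
    {"aboydnw"},
    buffer,
)
print(buffer.getvalue())
```

Classifying by GitHub username rather than display name is a design choice in this sketch: usernames are the keys in `config.toml`, so the lookup does not depend on how a display name is spelled.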
+ +#### Step 1.1: Update Config Model +- Add `sponsored_contributors` field to `Config` class in `config.py` +- Update `config.toml` parser to read new `[contributors.sponsored]` section +- Maintain backward compatibility with existing configs + +**Files to modify:** +- `python/contributor_network/config.py` (+20 lines) +- `config.toml` (+8 lines) + +**Tests to add:** +- Test parsing of new config section +- Test handling of missing sponsored section (fallback) + +#### Step 1.2: Classify Contributors During Data Processing +- In `cli.py`, after fetching contributor data, classify each as: + - ✅ `sponsored` if in `[contributors.sponsored]` list + - ⚠️ `community` if not sponsored but contributed to tracked repos + +**Files to modify:** +- `python/contributor_network/cli.py` (+15 lines in csvs command) +- `python/contributor_network/models.py` (add tier field to Contributor model, +5 lines) + +**New function:** +```python +def classify_contributors(contributors: list[str], sponsored_list: list[str]) -> dict[str, str]: + """Return dict mapping contributor name → tier (sponsored/community)""" + return { + name: "sponsored" if name in sponsored_list else "community" + for name in contributors + } +``` + +#### Step 1.3: Update CSV Generation +- Modify `csvs` command to include `tier` column in output + +**Files to modify:** +- `python/contributor_network/cli.py` (+5 lines in csvs command) + +**Tests:** +- Verify sponsored contributors are marked correctly +- Verify community contributors are marked correctly +- Verify CSV format is valid + +--- + +### Phase 2: Frontend (JavaScript) - 2-3 weeks + +**Goal:** Load tier data and adjust visualization layout/positioning based on contributor tier. 
+ +#### Step 2.1: Load Contributor Tier Data +- Modify data loading to read `tier` column from `contributors.csv` +- Store tier info in contributor node objects + +**Files to modify:** +- `src/js/data/prepare.js` (extract from index.js if not done, +10 lines) + - Add tier field to contributor node: `{ id, name, tier, ... }` + +**Code pattern:** +```javascript +function prepareContributors(csvData) { + return csvData.map(row => ({ + id: row.name, + name: row.name, + tier: row.tier, // NEW: "sponsored" or "community" + // ... existing fields + })); +} +``` + +#### Step 2.2: Create Community Contributor Layout Strategy +This is the **key visualization change**. You have two options: + +**Option A: Use Existing "Remaining" Simulation** (Easier, 1 week) +- Reuse the existing `remainingSimulation.js` for community contributors +- Position sponsored contributors in the ring (as now) +- Community contributors positioned scattered around (as now done for extras) +- **Pros:** Minimal code changes, reuses tested simulation +- **Cons:** Less artistic control over community placement + +**Option B: Create Custom "Community Ring" Simulation** (More Control, 2 weeks) +- Build new force simulation that: + - Places community contributors in a **second, outer ring** + - Creates gentle repulsion between them (no overlap) + - Maintains some visual coherence while showing they're "different" +- **Pros:** Clear visual distinction, looks like ORCA +- **Cons:** New simulation to test and tune + +**Recommendation:** Start with **Option A** (reuse existing), then upgrade to **Option B** if client wants more visual distinction. 
+ +**Files to create/modify:** +- Option A: `src/js/data/prepare.js` (+20 lines to separate sponsored/community) +- Option B: `src/js/simulations/communityContributorSimulation.js` (+150 lines) + +#### Step 2.3: Separate Nodes by Tier +- Modify `prepareData()` to create two arrays: + - `sponsoredNodes`: contributors in tier === "sponsored" + - `communityNodes`: contributors in tier === "community" + +**Files to modify:** +- `src/js/data/prepare.js` (+15 lines) + +#### Step 2.4: Apply Different Simulations +- Run **contributor ring simulation** on sponsored contributors (existing) +- Run **community simulation** on community contributors (new or reused) + +**Files to modify:** +- `src/js/index.js` or simulation orchestrator (~30 lines to manage two simulation groups) + +#### Step 2.5: Update Rendering +- Render both groups with visual distinction (optional): + - Different colors/opacity for community contributors? + - Different node sizes? + - Labels only for sponsored, icons for community? 
+ +**Files to modify:** +- `src/js/render/shapes.js` (+10 lines to handle tier-based styling) +- `src/js/config/theme.js` (add community contributor colors, +5 lines) + +#### Step 2.6: Update Tooltips & Interaction +- Tooltip should show tier: "Community Contributor" vs "Sponsored Contributor" +- Links should still show contribution metrics (unchanged) + +**Files to modify:** +- `src/js/render/tooltip.js` (+5 lines to display tier) + +--- + +## Detailed File-by-File Changes + +### Backend + +| File | Change | Lines | +|------|--------|-------| +| `config.toml` | Add `[contributors.sponsored]` section | +8 | +| `python/contributor_network/config.py` | Add `sponsored_contributors` field | +20 | +| `python/contributor_network/models.py` | Add `tier` field to contributor model | +5 | +| `python/contributor_network/cli.py` | Update `csvs` command to output tier | +15 | +| **Backend Total** | | **~48 lines** | + +### Frontend + +| File | Change | Lines | +|------|--------|-------| +| `src/js/data/prepare.js` | Separate sponsored/community nodes | +25 | +| `src/js/simulations/communitySimulation.js` | NEW: Community contributor layout | +150 (or 0 if reusing existing) | +| `src/js/config/theme.js` | Add community contributor colors | +5 | +| `src/js/render/shapes.js` | Tier-based node styling | +10 | +| `src/js/render/tooltip.js` | Display tier in tooltip | +5 | +| `src/js/index.js` | Orchestrate two simulation groups | +30 | +| **Frontend Total** | | **~225 lines** | + +### Tests + +| File | Testing | +|------|---------| +| `python/tests/test_config.py` | Config parsing with new section | +| `python/tests/test_cli.py` | CSV generation includes tier column | +| `src/js/__tests__/data/prepare.test.js` | Contributor classification (new file) | +| `src/js/__tests__/simulations/community.test.js` | Community simulation behavior (new file) | + +--- + +## Migration Path & Backward Compatibility + +### Option 1: Graceful Degradation (Recommended) +If `[contributors.sponsored]` 
section is missing in `config.toml`: +- Treat **all current contributors as sponsored** (maintains current behavior) +- No community contributors shown (or empty set) +- Existing visualizations continue to work + +```python +def get_sponsored_contributors(config: Config) -> list[str]: + # Fallback to [contributors.devseed] if no [contributors.sponsored]. + # Each section maps GitHub username -> display name, so list() yields the usernames. + section = config.contributors.get("sponsored", config.contributors.get("devseed", {})) + return list(section) +``` + +### Option 2: Version-Gated Feature +Add feature flag: `use_tiered_contributors: bool` in config +- When `false`: Use old model (all contributors in ring) +- When `true`: Use new model (sponsored vs community) + +**Recommendation:** Go with **Option 1** for simplicity. It's backward compatible and requires no flag changes. + +--- + +## Risk Assessment + +### Low Risk ✅ +- **Data model changes:** Well-isolated, tested with existing test suite +- **Config changes:** Simple new section, backwards compatible +- **CSV generation:** Just adding one column + +### Medium Risk 🟡 +- **Force simulation tuning:** Community contributor positioning may need tweaking for aesthetics +- **Visual design:** Different look may need polish/refinement + - What colors for community contributors? + - Should they be smaller/larger? + - Labels for community contribs? + +### High Risk ❌ +- **None identified** — architecture is solid, changes are non-breaking + +--- + +## Testing Strategy + +### Backend Testing (Python) +```bash +# 1. Config parsing +pytest python/tests/test_config.py::test_parse_sponsored_contributors + +# 2. Contributor classification +pytest python/tests/test_cli.py::test_csvs_includes_tier_column + +# 3. CSV format validation +# Verify tier column exists in contributors.csv +# Verify all contributors have a tier value +``` + +### Frontend Testing (JavaScript) +```bash +# 1. Data preparation +npm test -- src/js/__tests__/data/prepare.test.js + +# 2. Simulation +npm test -- src/js/__tests__/simulations/community.test.js + +# 3. 
Manual testing (local) +npm run build +python -m http.server 8000 +# Navigate to http://localhost:8000 +# Inspect: +# - Are sponsored contributors in the ring? +# - Are community contributors scattered/in outer ring? +# - Do tooltips show correct tier? +# - Do links show correctly from community contribs? +``` + +### Integration Testing +```bash +# Full pipeline +uv run contributor-network data # Fetch from GitHub +uv run contributor-network csvs # Generate CSVs with tiers +uv run contributor-network build assets/data dist # Build site +# Open dist/index.html and verify visualization +``` + +--- + +## Success Criteria + +1. ✅ **Configuration:** `config.toml` supports `[contributors.sponsored]` section +2. ✅ **Data:** CSV output includes `tier` column with "sponsored" or "community" values +3. ✅ **Visualization:** + - Sponsored contributors appear in main ring + - Community contributors appear in different layout (outer ring or scattered) +4. ✅ **UX:** Tooltips clearly show contributor tier +5. ✅ **Performance:** No regression in load time or interaction smoothness +6. ✅ **Backward Compatibility:** Existing configs still work without modification + +--- + +## Known Unknowns & Questions + +1. **Visual Design of Community Contributors** + - Should they be in a second ring? Scattered? + - Different colors/sizes? + - Should they have labels or just be nodes? + +2. **Filtering Behavior** + - Should filters (by org, stars, language) apply to community contributors? + - Or should community contributors always be visible? + +3. **Link Styling** + - Should links from community contributors look different? + - Less prominent? Different color gradient? + +4. **Mobile Responsiveness** + - How should community ring render on small screens? + - Should it collapse or reflow? + +**Recommendation:** Implement Phase 1-2 first, then gather feedback on these design questions. 
+ +--- + +## Timeline Estimate + +| Phase | Task | Duration | Dependencies | +|-------|------|----------|---| +| 1 | Backend: Config + Classification | 4-5 days | None | +| 2 | Frontend: Data Loading | 2-3 days | Phase 1 | +| 3 | Frontend: Layout/Simulation | 5-7 days | Phase 2 | +| 4 | Frontend: Rendering | 3-4 days | Phase 3 | +| 5 | Testing & Polish | 3-5 days | Phases 1-4 | +| 6 | Feedback & Refinement | 3-5 days | Phase 5 | +| **Total** | | **3-4 weeks** | | + +--- + +## Architectural Decisions & Tradeoffs + +### Decision 1: Where to Classify Contributors? + +**Options:** +- A) Python (during CSV generation) ✅ **CHOSEN** +- B) JavaScript (during visualization initialization) + +**Reasoning:** +- **A** is better because: + - Classification is deterministic, no need to recalculate in browser + - Reduces JavaScript complexity + - Easier to test and debug + - Data is authoritative at source + +### Decision 2: One Ring or Two? + +**Options:** +- A) Sponsored in ring, community scattered (like ORCA) ✅ **CHOSEN** +- B) Sponsored in inner ring, community in outer ring +- C) Sponsored prominent (larger), community faded + +**Reasoning:** +- **A** follows the ORCA model you referenced +- Visually distinguishes groups while maintaining spatial context +- Easiest to implement (reuses existing "remaining" simulation) + +### Decision 3: Backward Compatibility Strategy + +**Options:** +- A) Graceful degradation (treat all as sponsored if no tier defined) ✅ **CHOSEN** +- B) Strict version check (fail if tier missing) +- C) Feature flag (new flag in config) + +**Reasoning:** +- **A** is safest for existing users +- No migration burden +- Config can be updated at user's pace + +--- + +## Comparison to ORCA Implementation + +The original ORCA visualization (which inspired this request) uses: +- **"ORCA Sponsored" ring:** Core team members +- **"Top Contributors" ring:** Most active community members +- **"Everybody Else" scattered:** Other contributors + +**Your implementation 
will be simpler:** +- **Sponsored ring:** Configured list of key contributors +- **Community scattered:** Everyone else who contributed to your repos + +**Advantages over ORCA:** +- Smaller, faster dataset (only tracked repos vs. all repos) +- Simpler logic (binary tier vs. three tiers + ranking) +- Easier to configure and maintain + +--- + +## Next Steps + +### If You Want to Proceed: + +1. **Clarify Design Questions** (1-2 days) + - How should community contributors look? (colors, sizes, layout) + - Should they have labels? + - Any specific positioning preference? + +2. **Review Proposed Config Structure** (1 day) + - Does the `[contributors.sponsored]` section match your needs? + - Any other metadata to classify contributors? + +3. **Kick Off Development** (3-4 weeks) + - Start with Phase 1 (backend) + - Then Phase 2-3 (frontend layout) + - Iterate on design feedback + +4. **Gather Client Feedback** (Ongoing) + - Show wireframes before implementation + - Iterate on visual design + - Test with real data + +### If You Want to Explore Further: + +- **Spike 1:** Prototype community contributor simulation (2-3 days) + - Build quick proof-of-concept with mock data + - Test different layout strategies + - Show client visual options + +- **Spike 2:** Test with production data (1-2 days) + - Run on actual Development Seed repos + contributors + - Ensure performance is acceptable + - Validate contributor classification accuracy + +--- + +## Summary + +**This feature is highly feasible and aligns well with your architecture.** The main work is: + +1. **Backend:** ~50 lines to add contributor classification to config + CSV +2. **Frontend:** ~250 lines to load tier data, adjust layout, and render differently + +The modular architecture you've built makes it easy to add a new visualization strategy without touching core data pipelines. Force simulations are already modular, so adding/modifying contributor layout is straightforward. 
+ +**Recommendation:** Proceed with implementation. Start with Phase 1 (backend), gather design feedback on visual style, then implement Phase 2-3 (frontend) with clear specs. + +--- + +**Last Updated:** February 2026 +**Status:** Ready for Implementation Planning diff --git a/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md b/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md new file mode 100644 index 0000000..357bb81 --- /dev/null +++ b/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md @@ -0,0 +1,397 @@ +# Feature Request Summary & Assessment + +**Request:** Visualization of Sponsored vs. Community Contributors +**Status:** ✅ **APPROVED FOR IMPLEMENTATION** +**Date:** February 2026 + +--- + +## What We're Building + +A redesign of the Contributor Network visualization that separates contributors into two tiers: + +1. **Sponsored Contributors** - A curated list of key team members + - Displayed in a central ring (like the current implementation) + - Full prominence in the visualization + +2. **Community Contributors** - Everyone else who contributed to your repos + - Displayed separately (scattered or in outer ring) + - Visually distinguished but still present + - Inspired by the ORCA visualization model + +--- + +## Key Design Decisions + +### 1. Fixed Repository List (Already In Place ✅) +- You already have this via `config.toml` +- No changes needed to repo fetching +- Data pipeline stays the same + +### 2. Contribution is Automatic +- We automatically find all contributors to your tracked repos +- No manual maintenance needed +- Scales as repos grow + +### 3. Sponsor List is Configurable +- Simple config option specifies who is "sponsored" +- Can be updated by editing `config.toml` +- Defaults to existing `[contributors.devseed]` section + +### 4. 
Layout Strategy (Two Options) + +**Option A - Reuse Existing Simulation** (Simpler, 1 week) +- Community contributors use existing "remaining" positioning +- Works well, minimal code changes +- Follows current behavior for extras + +**Option B - New Community Ring** (More Design, 2 weeks) +- Community contributors in outer ring +- Clearer visual distinction +- More like the ORCA model you referenced + +**Recommendation:** Start with Option A, upgrade to Option B based on feedback + +--- + +## Feasibility Summary + +| Aspect | Status | Notes | +|--------|--------|-------| +| Architecture | ✅ Ready | Modular design supports this well | +| Backend | ✅ Easy | ~50 lines of config + classification logic | +| Frontend | ✅ Manageable | ~250 lines to load tier data + adjust layout | +| Data Model | ✅ Flexible | Easy to add tier classification | +| Backward Compatible | ✅ Yes | Existing configs still work | +| Performance | ✅ No Issues | Extra tier doesn't add overhead | +| Testing | ✅ Straightforward | Clear test cases for each phase | + +--- + +## What Changes (High-Level) + +### Configuration (`config.toml`) +```toml +# NEW: Specify which contributors are "sponsored" +[contributors.sponsored] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" +# ... 
other key people +``` + +### Data Output (CSV) +``` +name,tier +Anthony Boyd,sponsored +Unknown User,community +Pete Gadomski,sponsored +``` + +### Visualization +- Sponsored contributors: central ring (as before) +- Community contributors: outer location (new) +- Tooltips: show tier designation + +--- + +## Implementation Phases + +### Phase 1: Backend (Config + CSV) - 4-5 days +- Update config system to read sponsored list +- Classify contributors during CSV generation +- Add `tier` column to output + +### Phase 2: Frontend Data Loading - 2-3 days +- Load tier data from CSV +- Add tier field to contributor nodes + +### Phase 3: Layout & Simulation - 5-7 days (varies by option) +- Separate simulation for community contributors +- Adjust positioning based on tier + +### Phase 4: Rendering & Styling - 3-4 days +- Tier-based colors/sizes +- Update tooltips to show tier + +### Phase 5: Testing - 3-5 days +- Unit tests for each component +- Integration testing with real data +- Manual QA and refinement + +### Phase 6: Polish & Release - 3-5 days +- Design feedback loop +- Documentation updates +- Production readiness + +**Total Duration:** 3-4 weeks (Option A) to 4-5 weeks (Option B) + +--- + +## Impact Assessment + +### What Works Better After This Feature +- **Clearer visualization** of community impact +- **Scalability** - easier to see all contributors, not just your team +- **Engagement** - community members feel recognized +- **Client customization** - sponsor list is editable, no code changes needed + +### What Stays the Same +- Repository list (already fixed in config) +- Data fetching pipeline +- Link visualization +- Core interaction/filtering + +### What's New +- Contributor tier classification (sponsored/community) +- Separate positioning for tiers +- Tier display in tooltips +- Optional outer ring visualization + +--- + +## Configuration Example + +### Before (Current) +```toml +# Just lists repos and all contributors +repositories = ["org/repo1", 
"org/repo2"] + +[contributors.devseed] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" +# ... all team members +``` + +### After (Proposed) +```toml +# Same repos, but now specify sponsors +repositories = ["org/repo1", "org/repo2"] + +# NEW: Separate sponsored list +[contributors.sponsored] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" + +# Keep existing for reference +[contributors.devseed] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" +# ... all team members +``` + +--- + +## Data Flow Diagram + +``` +┌─────────────────────────────────┐ +│ GitHub API │ +│ (Fetch commits from tracked repos) +└──────────────┬──────────────────┘ + ↓ +┌─────────────────────────────────┐ +│ Identify Contributors │ +│ (All people who committed) │ +└──────────────┬──────────────────┘ + ↓ +┌─────────────────────────────────┐ +│ Classify Contributors (NEW) │ +│ - Check against sponsor list │ +│ - Mark as sponsored/community │ +└──────────────┬──────────────────┘ + ↓ +┌─────────────────────────────────┐ +│ Generate CSV with Tier (NEW) │ +│ name,tier │ +│ Anthony Boyd,sponsored │ +│ Unknown User,community │ +└──────────────┬──────────────────┘ + ↓ +┌─────────────────────────────────┐ +│ Frontend Loads & Renders (NEW) │ +│ - Separate into two groups │ +│ - Position based on tier │ +│ - Color/size by tier │ +└─────────────────────────────────┘ +``` + +--- + +## Success Metrics + +### Technical +- ✅ All tests pass (unit + integration) +- ✅ No performance regression +- ✅ CSV output includes tier column +- ✅ Visualization renders without errors + +### UX +- ✅ Sponsored/community distinction is clear +- ✅ Tooltips show tier +- ✅ Layout is balanced and attractive +- ✅ No jarring visual changes + +### Business +- ✅ Client can update sponsor list via config +- ✅ Feature supports community engagement goal +- ✅ Scales with growing contributor base + +--- + +## Risk Summary + +### Low Risk ✅ +- Config system changes (backward compatible) +- CSV generation (non-breaking) +- Data 
model extensions (additive only) + +### Medium Risk 🟡 +- Visual design tuning (may need iterations) +- Force simulation tuning (community positioning) +- Performance with large datasets + +### Mitigations +- Backward compatibility built in from start +- Early design feedback before full implementation +- Performance testing on real data +- Comprehensive test suite + +--- + +## Comparison to ORCA + +**ORCA Model (Reference):** +- ORCA Sponsored Contributors: central ring +- Top Contributors: second ring (ranked by activity) +- Everybody Else: scattered around edges + +**Your Implementation (Simpler):** +- Sponsored Contributors: central ring +- Community Contributors: outer/scattered (no ranking) + +**Benefits of Simpler Approach:** +- No need to rank contributors (contentious) +- Cleaner visual distinction +- Easier to configure and maintain +- Faster to implement + +--- + +## Next Steps + +### For Approval +1. **Review this assessment** - Do the proposed changes make sense? +2. **Clarify design questions** - How should community contributors look? + - Outer ring like ORCA? + - Scattered around edges? + - Different colors/sizes? +3. **Approve timeline** - Does 3-4 weeks work for your schedule? + +### For Kick-Off +1. **Finalize sponsor list** - Who should be in `[contributors.sponsored]`? +2. **Design specs** - Visual styling preferences +3. **Test data** - Which repos to test with? + +### For Development +1. **Phase 1** - Backend config + classification (4-5 days) +2. **Phase 2** - Frontend data loading (2-3 days) +3. **Phase 3** - Layout & simulation (5-7 days, varies by option) +4. **Phase 4** - Rendering & styling (3-4 days) +5. **Phase 5+** - Testing, refinement & release (6-10 days) + +--- + +## Documentation Provided + +Three documents have been created for you: + +1. **FEASIBILITY_ASSESSMENT.md** (This folder) + - Detailed technical analysis + - Architecture impact + - Risk assessment + - Timeline estimates + +2. 
**IMPLEMENTATION_ROADMAP.md** (This folder) + - Step-by-step implementation guide + - Code examples for each phase + - Testing procedures + - Validation checkpoints + +3. **This Summary** (You're reading it) + - High-level overview + - Decision points + - Next steps + +--- + +## Key Files to Review + +**If you want to understand the current system:** +- `docs/ARCHITECTURE.md` - How the visualization works +- `config.toml` - Current configuration format +- `src/js/index.js` - Main visualization code + +**Once approved, these will be updated:** +- `config.py` - Config model with tier support +- `cli.py` - CSV generation with tier +- `src/js/data/prepare.js` - Load tier data +- `src/js/simulations/` - Add community positioning + +--- + +## Questions & Clarifications + +Before moving forward, clarify: + +1. **Who are the "sponsored" contributors?** + - All of Development Seed team? + - Subset of key people? + - Option to change by project? + +2. **Visual Design for Community Contributors:** + - Should they be visible at all (or togglable)? + - Outer ring or scattered? + - Different colors, opacity, or sizes? + +3. **Filtering Behavior:** + - Should filters apply to community contributors? + - Or should they always be visible? + +4. **Interactive Behavior:** + - Should community contributors be selectable? + - Show same tooltips as sponsored? + +5. **Timeline:** + - Is 3-4 weeks acceptable? + - Any hard deadline? + +--- + +## Recommendation + +**✅ PROCEED WITH IMPLEMENTATION** + +This feature is: +- **Feasible** - Well-aligned with current architecture +- **Low-risk** - Minimal breaking changes +- **High-value** - Improves visualization and engagement +- **Well-scoped** - Clear phases and deliverables + +The main decision is visual design (Option A vs. B), which should be settled early with client feedback. 
+ +--- + +## Contact & Support + +For questions about this assessment: +- Review the detailed documents (FEASIBILITY_ASSESSMENT.md, IMPLEMENTATION_ROADMAP.md) +- Check the code examples in IMPLEMENTATION_ROADMAP.md +- Reference the current architecture in docs/ARCHITECTURE.md + +Once you're ready to start, the step-by-step implementation roadmap has all the code and testing details needed. + +--- + +**Assessment Complete** +**Status:** Ready for Implementation +**Date:** February 2026 diff --git a/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md b/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md new file mode 100644 index 0000000..94ec9a0 --- /dev/null +++ b/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md @@ -0,0 +1,984 @@ +# Implementation Roadmap: Tiered Contributor Visualization + +**For:** Feature Request - Sponsored vs. Community Contributor Visualization +**Status:** Ready to Start +**Last Updated:** February 2026 + +--- + +## Overview + +This document provides a step-by-step implementation guide for adding tiered contributor visualization to the Contributor Network project. It's based on the Feasibility Assessment and includes specific code examples, testing approaches, and validation checkpoints. + +--- + +## Phase 1: Backend Configuration & Data Model + +### Objective +Enable the system to classify contributors as "sponsored" or "community" and output this classification in the CSV data. + +--- + +### Sprint 1.1: Update Configuration System + +#### Task 1.1.1: Modify `config.py` + +**File:** `python/contributor_network/config.py` + +**Current state:** Config reads `[repositories]` and `[contributors.devseed]`, `[contributors.alumni]` + +**Changes needed:** +1. Add optional `sponsored_contributors` field to Config model +2. 
Provide logic to extract sponsored list from config + +**Implementation:** + +```python +from pydantic import BaseModel, Field + +class Config(BaseModel): + """Project configuration from config.toml""" + title: str + author: str + description: str + organization_name: str + repositories: list[str] + contributors: dict[str, dict[str, str]] # Existing: { "devseed": {...}, "alumni": {...} } + # NEW FIELD: + sponsored_contributor_group: str = Field( + default="devseed", + description="Which contributor group to use as 'sponsored' (e.g., 'devseed', 'sponsored')" + ) + + def get_sponsored_usernames(self) -> list[str]: + """Extract list of sponsored contributor usernames. + + Falls back to devseed if specified group doesn't exist. + """ + group = self.contributors.get(self.sponsored_contributor_group) + if group is None: + # Fallback to devseed if not found + group = self.contributors.get("devseed", {}) + return list(group.keys()) +``` + +**Testing:** +```python +# In python/tests/test_config.py +def test_get_sponsored_usernames(): + config = Config( + title="Test", + author="Test", + description="Test", + organization_name="Test Org", + repositories=["org/repo"], + contributors={ + "devseed": {"user1": "User One", "user2": "User Two"}, + "sponsored": {"user1": "User One"} + }, + sponsored_contributor_group="sponsored" + ) + assert config.get_sponsored_usernames() == ["user1"] + +def test_sponsored_fallback_to_devseed(): + config = Config( + title="Test", + author="Test", # Required field; omitting it raises a ValidationError + description="Test", # Required field + organization_name="Test Org", + repositories=["org/repo"], + contributors={"devseed": {"user1": "User One"}}, + sponsored_contributor_group="nonexistent" # Group doesn't exist + ) + assert config.get_sponsored_usernames() == ["user1"] # Falls back to devseed +``` + +--- + +#### Task 1.1.2: Update `config.toml` + +**File:** `config.toml` + +**Current state:** +```toml +[contributors.devseed] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" +# ... more contributors +``` + +**Changes needed:** +1. 
Add optional `[contributors.sponsored]` section (example) +2. Add config field pointing to it + +**Implementation:** + +```toml +title = "The Development Seed Contributor Network" +author = "Pete Gadomski" +description = "An interactive visualization of contributors to Development Seed code and their connections to other repositories" +organization_name = "Development Seed" + +# NEW: Specify which contributor group to treat as "sponsored" +# Options: "devseed", "sponsored", or any group name in [contributors.*] +sponsored_contributor_group = "devseed" + +repositories = [ + # ... existing repos +] + +# Existing contributors section (used for sponsored if sponsored_contributor_group = "devseed") +[contributors.devseed] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" +# ... rest of existing contributors + +# NEW: Optional separate sponsored group (uncomment to use) +# [contributors.sponsored] +# aboydnw = "Anthony Boyd" +# gadomski = "Pete Gadomski" +# # ... subset of key contributors +``` + +--- + +### Sprint 1.2: Add Contributor Tier to Data Model + +#### Task 1.2.1: Extend `models.py` + +**File:** `python/contributor_network/models.py` + +**Current state:** Has `Repository` and `Link` models, but no explicit Contributor model + +**Changes needed:** +1. Add `Contributor` model with tier field +2. 
Or add tier to existing data structures + +**Implementation:** + +Option A: Add new Contributor model +```python +import datetime +from enum import Enum + +from pydantic import BaseModel + +class ContributorTier(str, Enum): + """Classification of contributor type""" + SPONSORED = "sponsored" + COMMUNITY = "community" + +class Contributor(BaseModel): + """A person who contributed to tracked repositories""" + github_username: str + display_name: str + tier: ContributorTier + total_commits: int = 0 + first_commit_date: datetime.datetime | None = None + last_commit_date: datetime.datetime | None = None + repo_count: int = 0 # Number of repos contributed to +``` + +Option B: Store tier in CSV with simpler structure +```python +# Simple dict for CSV writing +contributor_row = { + "name": "Anthony Boyd", + "tier": "sponsored", + "commit_count": 42, + "repo_count": 8 +} +``` + +**Recommendation:** Use **Option B** (simpler, less refactoring) + +--- + +### Sprint 1.3: Update CLI to Classify Contributors + +#### Task 1.3.1: Modify `csvs` Command in `cli.py` + +**File:** `python/contributor_network/cli.py` + +**Current state:** `csvs` command reads JSON and generates CSV files + +**Changes needed:** +1. Load sponsored contributors list from config +2. Classify each contributor as they're written to CSV +3. 
Add `tier` column to `contributors.csv` + +**Implementation:** + +```python +@cli.command() +@click.argument("data_dir", type=click.Path(exists=True)) +@click.argument("csv_dir", type=click.Path()) +def csvs(data_dir: str, csv_dir: str) -> None: + """Generate CSV files from JSON data.""" + config = Config.from_toml("config.toml") + + # Load JSON data + repos_json = json.loads(Path(f"{data_dir}/repositories.json").read_text()) + links_json = json.loads(Path(f"{data_dir}/links.json").read_text()) + + # NEW: Get sponsored contributors list + sponsored_usernames = config.get_sponsored_usernames() + + # Collect unique contributors and classify them + contributors_map = {} # {username: {"name": str, "tier": str}} + + for link in links_json: + username = link["author_name"] + if username not in contributors_map: + # Classify contributor + tier = "sponsored" if username in sponsored_usernames else "community" + contributors_map[username] = { + "name": username, # Could lookup display name from config + "tier": tier + } + + # Write contributors.csv with tier column + csv_path = Path(csv_dir) / "contributors.csv" + with open(csv_path, "w", newline="") as f: + writer = csv.DictWriter(f, fieldnames=["name", "tier"]) + writer.writeheader() + for contributor in sorted(contributors_map.values(), key=lambda x: x["name"]): + writer.writerow(contributor) + + # ... rest of CSV generation (repositories, links, etc.) 
+ click.echo(f"Generated {csv_path}") +``` + +**Testing:** +```python +# In python/tests/test_cli.py +def test_csvs_includes_tier_column(tmp_path, mock_json_data): + """Verify csvs command outputs tier column""" + # Setup + data_dir = tmp_path / "data" + csv_dir = tmp_path / "csv" + data_dir.mkdir() + csv_dir.mkdir() + + # Create mock data + (data_dir / "repositories.json").write_text(json.dumps([...])) + (data_dir / "links.json").write_text(json.dumps([ + {"author_name": "aboydnw", "repo": "org/repo", ...}, + {"author_name": "unknown_user", "repo": "org/repo", ...} + ])) + + # Run command + runner = CliRunner() + result = runner.invoke(csvs, [str(data_dir), str(csv_dir)]) + assert result.exit_code == 0 + + # Verify output + csv_path = csv_dir / "contributors.csv" + with open(csv_path) as f: + rows = list(csv.DictReader(f)) + + # Check sponsored classification + aboydnw = [r for r in rows if r["name"] == "aboydnw"][0] + assert aboydnw["tier"] == "sponsored" + + # Check community classification + unknown = [r for r in rows if r["name"] == "unknown_user"][0] + assert unknown["tier"] == "community" +``` + +--- + +### Phase 1 Validation Checklist + +- [ ] `config.py` accepts `sponsored_contributor_group` field +- [ ] `config.py` provides `get_sponsored_usernames()` method +- [ ] `config.toml` can be parsed without errors +- [ ] `csvs` command generates `contributors.csv` with `tier` column +- [ ] All tests pass: `pytest python/tests/` +- [ ] Sponsored contributors marked as "sponsored" +- [ ] Community contributors marked as "community" +- [ ] CSV format is valid (parseable by JavaScript) + +--- + +## Phase 2: Frontend Data Loading + +### Objective +Load the new `tier` field from CSV and add it to contributor node objects in the visualization. 
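Before wiring the `tier` field into the frontend, it may help to confirm that the `contributors.csv` produced in Phase 1 has the shape the loader expects. The following is only a minimal sketch: the helper name `check_contributors_csv` and the duplicate-name rule are assumptions made for illustration, not part of the existing CLI.

```python
import csv
import io

# Tier values the frontend is expected to understand (from the Phase 1 design).
VALID_TIERS = {"sponsored", "community"}


def check_contributors_csv(text: str) -> dict[str, int]:
    """Validate contributors.csv content and count rows per tier.

    Hypothetical helper: raises ValueError on a missing column,
    an unknown tier value, or a duplicate contributor name.
    """
    reader = csv.DictReader(io.StringIO(text))
    missing = {"name", "tier"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    counts = {tier: 0 for tier in sorted(VALID_TIERS)}
    seen: set[str] = set()
    for row in reader:
        name, tier = row["name"], row["tier"]
        if tier not in VALID_TIERS:
            raise ValueError(f"unknown tier {tier!r} for {name!r}")
        if name in seen:
            raise ValueError(f"duplicate contributor {name!r}")
        seen.add(name)
        counts[tier] += 1
    return counts


sample = "name,tier\naboydnw,sponsored\nunknown_user,community\n"
print(check_contributors_csv(sample))
```

A check like this could run as a pytest case against the generated file, so that a malformed CSV fails in CI rather than surfacing as a silent misclassification in the browser.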
+ +--- + +### Sprint 2.1: Enhance Data Preparation + +#### Task 2.1.1: Update `prepareData()` (extract if needed) + +**File:** `src/js/data/prepare.js` (or `src/js/index.js` if not yet extracted) + +**Current state:** Reads CSV data, creates node and link objects + +**Changes needed:** +1. Load `tier` column from `contributors.csv` +2. Add `tier` field to contributor nodes +3. Optionally: separate into `sponsoredNodes` and `communityNodes` arrays + +**Implementation:** + +```javascript +/** + * Load and prepare contributor data + * @param {Object} csvData - Parsed CSV data { contributors: [...], links: [...], repositories: [...] } + * @returns {Object} Prepared nodes { sponsoredContributors, communityContributors, repositories, links, totalContributors } + */ +export function prepareContributorTiers(csvData) { + const { contributors, repositories, links } = csvData; + + // Separate contributors by tier + const sponsoredContributors = contributors + .filter(c => c.tier === "sponsored") + .map(c => ({ + id: c.name, + name: c.name, + type: "contributor", + tier: "sponsored", + isSponsored: true, + links: [] // Will be populated by link matching + })); + + const communityContributors = contributors + .filter(c => c.tier === "community" || !c.tier) // Default to community if tier missing + .map(c => ({ + id: c.name, + name: c.name, + type: "contributor", + tier: "community", + isSponsored: false, + links: [] + })); + + return { + sponsoredContributors, + communityContributors, + repositories, // Passed through unchanged so callers can destructure them + links, + totalContributors: { + sponsored: sponsoredContributors.length, + community: communityContributors.length + } + }; +} + +/** + * Classify a contributor by name + * @param {string} name - Contributor name/username + * @param {string[]} sponsoredNames - List of sponsored contributor names + * @returns {string} "sponsored" or "community" + */ +export function classifyContributor(name, sponsoredNames) { + return sponsoredNames.includes(name) ? 
"sponsored" : "community"; +} +``` + +**Testing:** +```javascript +// src/js/__tests__/data/prepare.test.js +import { prepareContributorTiers } from '../../../data/prepare.js'; + +describe('prepareContributorTiers', () => { + test('separates sponsored and community contributors', () => { + const csvData = { + contributors: [ + { name: "aboydnw", tier: "sponsored" }, + { name: "unknown", tier: "community" } + ], + links: [], + repositories: [] + }; + + const result = prepareContributorTiers(csvData); + + expect(result.sponsoredContributors).toHaveLength(1); + expect(result.sponsoredContributors[0].name).toBe("aboydnw"); + expect(result.communityContributors).toHaveLength(1); + expect(result.communityContributors[0].name).toBe("unknown"); + }); + + test('defaults missing tier to community', () => { + const csvData = { + contributors: [ + { name: "user1" } // No tier field + ], + links: [], + repositories: [] + }; + + const result = prepareContributorTiers(csvData); + + expect(result.communityContributors).toHaveLength(1); + expect(result.communityContributors[0].tier).toBe("community"); + }); +}); +``` + +--- + +### Phase 2 Validation Checklist + +- [ ] Data loading includes `tier` column from CSV +- [ ] Contributor nodes have `tier` field populated +- [ ] Tests pass: `npm test` +- [ ] Manual check: Log node data in browser console, verify tier values +- [ ] No console errors when loading visualization + +--- + +## Phase 3: Layout & Simulation + +### Objective +Position sponsored and community contributors differently in the visualization. + +--- + +### Sprint 3.1: Design Community Contributor Layout + +#### Decision: Which Simulation Strategy? 
+ +Before coding, decide between: + +**Option A: Reuse Existing "Remaining" Simulation** +- Community contributors use same `remainingSimulation` as extras +- **Pros:** Minimal code changes, 1 week +- **Cons:** Less visual distinction + +**Option B: Create New Community Ring Simulation** +- Community contributors in outer ring with repulsion +- **Pros:** Clearer visual distinction, looks like ORCA +- **Cons:** 2 weeks development + tuning + +**Recommendation:** Start with **Option A**, upgrade to **Option B** based on feedback. + +--- + +#### Task 3.1.1: Separate Node Groups (Option A) + +**File:** `src/js/data/prepare.js` or `src/js/index.js` + +**Changes:** +1. Create two separate arrays: `sponsoredNodes`, `communityNodes` +2. Run contributor ring simulation on sponsored only +3. Run remaining simulation on community + +**Implementation:** + +```javascript +// In main visualization setup +async function initializeVisualization() { + const data = await loadData(); + const { sponsoredContributors, communityContributors, repositories, links } = + prepareContributorTiers(data); + + // Create node arrays + const allNodes = []; + const nodeMap = new Map(); + + // Add sponsored contributors (ring) + sponsoredContributors.forEach((contributor, index) => { + const node = { + ...contributor, + index: allNodes.length, + x: Math.cos((index / sponsoredContributors.length) * 2 * Math.PI) * RING_RADIUS, + y: Math.sin((index / sponsoredContributors.length) * 2 * Math.PI) * RING_RADIUS + }; + allNodes.push(node); + nodeMap.set(contributor.id, node); + }); + + // Add community contributors (to be positioned by simulation) + communityContributors.forEach(contributor => { + const node = { + ...contributor, + index: allNodes.length, + x: Math.random() * 200 - 100, // Random position, will be adjusted + y: Math.random() * 200 - 100 + }; + allNodes.push(node); + nodeMap.set(contributor.id, node); + }); + + // Add repositories + repositories.forEach(repo => { + const node = { + ...repo, 
+ index: allNodes.length, + x: 0, + y: 0 + }; + allNodes.push(node); + nodeMap.set(repo.id, node); + }); + + // Run simulations + const sponsoredSimulation = runContributorRingSimulation( + allNodes.filter(n => n.tier === "sponsored") + ); + + const communitySimulation = runRemainingSimulation( + allNodes.filter(n => n.tier === "community"), + repositories + ); + + // Store for rendering + return { allNodes, nodeMap, links, simulations: [sponsoredSimulation, communitySimulation] }; +} +``` + +--- + +#### Task 3.1.2: Create Community Ring Simulation (Option B) + +**File:** `src/js/simulations/communitySimulation.js` (NEW) + +**Purpose:** Position community contributors in outer ring with visual separation + +**Implementation:** + +```javascript +import * as d3 from 'd3'; + +/** + * Run force simulation for community contributors + * Places them in outer ring, separated from sponsored contributors and repos + * + * @param {Array} communityNodes - Community contributor nodes + * @param {number} radius - Distance from center (further than sponsored ring) + * @returns {d3.Simulation} + */ +export function runCommunitySimulation(communityNodes, radius = 400) { + if (communityNodes.length === 0) return null; + + const simulation = d3.forceSimulation(communityNodes) + .force('radial', d3.forceRadial(node => { + // Pull community contributors toward outer ring + return radius; + }).strength(0.5)) + .force('collide', d3.forceCollide(40)) // Prevent overlap + .force('charge', d3.forceManyBody().strength(-50)) // Gentle repulsion + .stop(); + + // Run simulation to stable state + for (let i = 0; i < 300; i++) { + simulation.tick(); + } + + return simulation; +} +``` + +**Configuration in theme:** +```javascript +// src/js/config/theme.js +export const LAYOUT = { + // ... 
existing + COMMUNITY_RING_RADIUS: 400, // Further from center than sponsored ring (e.g., 150-200) + COMMUNITY_NODE_RADIUS: 35 // Slightly smaller than sponsored nodes +}; +``` + +--- + +### Phase 3 Validation Checklist + +- [ ] Sponsored contributors render in ring (unchanged from current) +- [ ] Community contributors render in separate location +- [ ] No overlap between nodes +- [ ] Force simulations are stable (not jumping around) +- [ ] Performance is acceptable (60fps on average machine) +- [ ] Manual inspection: Load visualization, inspect node positions in console + +--- + +## Phase 4: Rendering & Styling + +### Objective +Make visual distinction between sponsored and community contributors clear and attractive. + +--- + +### Sprint 4.1: Tier-Based Node Styling + +#### Task 4.1.1: Update `shapes.js` + +**File:** `src/js/render/shapes.js` + +**Changes:** +1. Add tier-based color scheme +2. Optionally: different node sizes based on tier + +**Implementation:** + +```javascript +/** + * Get node color based on tier and other properties + * @param {Object} node - Node object with tier, organization, etc. + * @returns {string} RGB/hex color + */ +export function getNodeColor(node) { + if (node.type === 'contributor') { + if (node.tier === 'sponsored') { + // Use existing color scheme for sponsored + return getContributorColor(node); + } else { + // Community contributors: slightly muted + const baseColor = getContributorColor(node); + return adjustColorOpacity(baseColor, 0.7); // 70% opacity + } + } + + // Repositories and other nodes unchanged + return getRepositoryColor(node); +} + +/** + * Get node radius based on tier and commit count + * @param {Object} node - Node object + * @returns {number} Radius in pixels + */ +export function getNodeRadius(node) { + if (node.type !== 'contributor') { + return RADIUS.REPO; + } + + // Scale by contribution count + const baseRadius = node.tier === 'sponsored' + ? 
RADIUS.CONTRIBUTOR + : RADIUS.CONTRIBUTOR * 0.85; // Community slightly smaller + + return baseRadius * getContributionScale(node.totalCommits); +} +``` + +**Update theme colors:** +```javascript +// src/js/config/theme.js +export const COLORS = { + // ... existing + COMMUNITY_CONTRIBUTOR_OPACITY: 0.7, + COMMUNITY_NODE_STROKE: '#999999' +}; +``` + +--- + +#### Task 4.1.2: Update Tooltip Display + +**File:** `src/js/render/tooltip.js` + +**Changes:** +1. Show tier in tooltip +2. Different styling for community contributors + +**Implementation:** + +```javascript +/** + * Create tooltip content for a node + * @param {Object} node - Node object + * @returns {string} HTML for tooltip + */ +export function createTooltipContent(node) { + if (node.type === 'contributor') { + const tierBadge = node.tier === 'sponsored' + ? '' + : 'Community Contributor'; + + return ` +
+

${node.name}

+ ${tierBadge} +

Commits: ${node.totalCommits}

+

Repositories: ${node.repoCount}

+
+ `; + } + + // ... rest of tooltip logic +} +``` + +**CSS styling:** +```css +/* assets/css/style.css (add to existing) */ +.tier-badge { + display: inline-block; + padding: 4px 8px; + border-radius: 4px; + font-size: 0.85em; + font-weight: bold; + margin: 4px 0; +} + +.tier-badge.sponsored { + background-color: #CF3F02; /* Grenadier orange */ + color: white; +} + +.tier-badge.community { + background-color: #2E86AB; /* Aquamarine blue */ + color: white; +} +``` + +--- + +### Phase 4 Validation Checklist + +- [ ] Sponsored contributors display with primary color scheme +- [ ] Community contributors display with secondary/muted colors +- [ ] Tooltips show correct tier designation +- [ ] Visual distinction is clear but not jarring +- [ ] Node sizes appropriate for both tiers +- [ ] All text renders correctly (no overlaps) + +--- + +## Phase 5: Testing & Validation + +### Objective +Ensure the feature works correctly across the entire pipeline. + +--- + +### Sprint 5.1: Comprehensive Testing + +#### Task 5.1.1: Unit Tests + +```bash +# Backend +pytest python/tests/test_config.py -v +pytest python/tests/test_cli.py::test_csvs_includes_tier_column -v + +# Frontend +npm test -- src/js/__tests__/data/prepare.test.js +npm test -- src/js/__tests__/simulations/community.test.js +``` + +--- + +#### Task 5.1.2: Integration Testing + +**Full pipeline test:** + +```bash +# 1. Set up test data +cp config.toml config.test.toml + +# 2. Update config to include test repos (fewer = faster) +# Modify config.test.toml to test with just 2-3 repos + +# 3. Fetch data +export GITHUB_TOKEN="..." +uv run contributor-network data --config config.test.toml + +# 4. Generate CSVs +uv run contributor-network csvs assets/data assets/csv + +# 5. Verify CSV format +head -5 assets/csv/contributors.csv +# Should see: name,tier + +# 6. Build site +uv run contributor-network build assets/csv dist + +# 7. 
Check generated files +ls -la assets/csv/contributors.csv +grep "sponsored\|community" assets/csv/contributors.csv | head -5 + +# 8. Open in browser +python -m http.server 8000 +# Visit http://localhost:8000/index.html +``` + +--- + +#### Task 5.1.3: Manual Testing Checklist + +**Visual Inspection:** +- [ ] Visualization loads without errors (check console) +- [ ] Sponsored contributors appear in central ring +- [ ] Community contributors appear in different location +- [ ] Node colors are appropriate +- [ ] Links render correctly (from both tiers) +- [ ] Hover tooltips show tier +- [ ] Click tooltips show all details + +**Performance:** +- [ ] Smooth animation on hover (no lag) +- [ ] Panning/zooming responsive +- [ ] 60fps maintained during interaction +- [ ] Initial load < 2 seconds + +**Data Accuracy:** +- [ ] Sponsored contributors match config +- [ ] Community count = total - sponsored +- [ ] All contributors from tracked repos appear +- [ ] No duplicate contributors +- [ ] All links present + +--- + +### Phase 5 Validation Checklist + +- [ ] All unit tests pass +- [ ] Integration test completes without errors +- [ ] CSV output is valid and readable +- [ ] Visualization renders correctly in browser +- [ ] Console has no errors/warnings +- [ ] Sponsored/community distinction is clear +- [ ] Performance is acceptable + +--- + +## Phase 6: Refinement & Polish + +### Objective +Gather feedback and make final adjustments for production readiness. + +--- + +### Sprint 6.1: Design Feedback + +**Review items:** +1. Are community contributors too faded/invisible? +2. Should there be labels for community contributors? +3. Is the outer ring position the best choice? +4. Should community contributors be interactive? + +**Potential refinements:** +- Adjust colors/opacity based on feedback +- Add optional community contributor labels +- Explore alternative positioning (scatter vs. 
ring) +- Add filtering option (show/hide community) + +--- + +### Sprint 6.2: Documentation + +**Update docs:** +1. Update `ARCHITECTURE.md` with new simulation type +2. Add section to `DEVELOPMENT_GUIDE.md` explaining tier system +3. Document config options in `PRD.md` + +--- + +### Sprint 6.3: Production Release Prep + +**Before deploy:** +- [ ] All tests pass in CI +- [ ] Code review completed +- [ ] Performance benchmarked (no regression) +- [ ] Accessibility checked (color contrast, etc.) +- [ ] Backward compatibility verified +- [ ] Update `CHANGELOG.md` +- [ ] Tag release version + +--- + +## Success Criteria (Final) + +### Functional Requirements +- ✅ Configuration system supports sponsored contributor list +- ✅ CSV output includes tier classification +- ✅ Frontend loads and displays tier data +- ✅ Visualization positions contributors based on tier +- ✅ Tooltips show contributor tier + +### Quality Requirements +- ✅ All tests pass (unit + integration) +- ✅ No console errors +- ✅ 60fps performance maintained +- ✅ Backward compatible with existing configs +- ✅ Code documented and maintainable + +### User Experience Requirements +- ✅ Sponsored contributors clearly distinguished from community +- ✅ Visualization remains fast and responsive +- ✅ Data is accurate and complete +- ✅ Tooltips provide helpful information + +--- + +## Risk Mitigation + +### Risk: Performance Degradation +- **Mitigation:** Profile before/after, ensure force simulations run at 60fps +- **Monitoring:** Use DevTools performance tab, test with 100+ contributors + +### Risk: Incorrect Contributor Classification +- **Mitigation:** Add comprehensive tests, manual verification +- **Monitoring:** Export CSV and spot-check classified contributors + +### Risk: Design Doesn't Meet Expectations +- **Mitigation:** Get design approval before Phase 3-4 +- **Monitoring:** Show wireframes/mockups early + +### Risk: Data Pipeline Breaks +- **Mitigation:** Backward compatibility in config parsing +- 
**Monitoring:** Test with multiple config formats + +--- + +## Timeline + +| Phase | Sprint | Duration | Effort | +|-------|--------|----------|---------| +| 1 | 1.1-1.3 | 4-5 days | 2 dev days | +| 2 | 2.1 | 2-3 days | 1 dev day | +| 3 | 3.1 (Option A) | 3-4 days | 1-2 dev days | +| 3 | 3.1 (Option B) | 5-7 days | 3-4 dev days | +| 4 | 4.1-4.2 | 3-4 days | 1-2 dev days | +| 5 | 5.1-5.3 | 3-5 days | 1-2 dev days | +| 6 | 6.1-6.3 | 3-5 days | 1-2 dev days | +| **TOTAL (Option A)** | | **3-4 weeks** | **8-14 dev days** | +| **TOTAL (Option B)** | | **4-5 weeks** | **12-18 dev days** | + +--- + +## Appendix: Command Reference + +### Run Full Data Pipeline +```bash +# Fetch from GitHub +export GITHUB_TOKEN="..." +uv run contributor-network data + +# Generate CSVs with tier classification +uv run contributor-network csvs assets/data assets/csv + +# Build visualization +uv run contributor-network build assets/csv dist + +# Serve locally +python -m http.server 8000 +# Open http://localhost:8000 +``` + +### Run Tests +```bash +# Python tests +pytest python/tests/ -v +pytest python/tests/test_config.py::test_get_sponsored_usernames -v + +# JavaScript tests +npm test +npm test -- --watch +``` + +### Debug Visualization +```javascript +// In browser console: +window.DEBUG_CONTRIBUTOR_NETWORK = true; +// Look for debug logs in console + +// Inspect node data: +console.log(window.vizData.nodes); + +// Check tier classification: +window.vizData.nodes.filter(n => n.type === 'contributor').map(n => ({ name: n.name, tier: n.tier })) +``` + +--- + +**Last Updated:** February 2026 +**Status:** Ready to Implement diff --git a/index.html b/index.html index b7851d9..530edbc 100644 --- a/index.html +++ b/index.html @@ -44,10 +44,31 @@

...

- + + + + + + + +

@@ -167,12 +188,40 @@

...

updateFilterStats(); }); + // Stars filter + const starsSelect = document.getElementById("stars-select"); + starsSelect.addEventListener("change", function () { + const value = this.value === "" ? null : parseInt(this.value, 10); + contributorNetworkVisual.setRepoFilter("starsMin", value); + updateFilterStats(); + }); + + // Forks filter + const forksSelect = document.getElementById("forks-select"); + forksSelect.addEventListener("change", function () { + const value = this.value === "" ? null : parseInt(this.value, 10); + contributorNetworkVisual.setRepoFilter("forksMin", value); + updateFilterStats(); + }); + function updateFilterStats() { const statsElement = document.getElementById("filter-stats"); - if (currentSelectedOrg === null) { + const parts = []; + + if (currentSelectedOrg !== null) { + parts.push(`org: ${currentSelectedOrg}`); + } + if (starsSelect.value !== "") { + parts.push(`stars: ${starsSelect.value}+`); + } + if (forksSelect.value !== "") { + parts.push(`forks: ${forksSelect.value}+`); + } + + if (parts.length === 0) { statsElement.textContent = `Showing all ${sortedOrgs.length} organizations`; } else { - statsElement.textContent = `Filtered to: ${currentSelectedOrg}`; + statsElement.textContent = `Filtered by ${parts.join(", ")}`; } } updateFilterStats(); diff --git a/js/__tests__/filter.test.js b/js/__tests__/filter.test.js index 6b557f9..0413151 100644 --- a/js/__tests__/filter.test.js +++ b/js/__tests__/filter.test.js @@ -10,6 +10,8 @@ import { deepClone, getRepoOwner, filterReposByOrganization, + filterReposByStars, + filterReposByForks, filterLinksByRepos, filterLinksByContributors, filterContributorsByLinks, @@ -19,11 +21,11 @@ import { // Sample test data const sampleRepos = [ - { repo: 'developmentseed/titiler', stars: 100 }, - { repo: 'developmentseed/rio-cogeo', stars: 50 }, - { repo: 'stac-utils/stac-fastapi', stars: 200 }, - { repo: 'radiantearth/stac-spec', stars: 300 }, - { repo: 'DevSeed Team', stars: 0 } // Central pseudo-repo + 
{ repo: 'developmentseed/titiler', stars: 100, repo_stars: '1036', repo_forks: '216' }, + { repo: 'developmentseed/rio-cogeo', stars: 50, repo_stars: '50', repo_forks: '10' }, + { repo: 'stac-utils/stac-fastapi', stars: 200, repo_stars: '304', repo_forks: '116' }, + { repo: 'radiantearth/stac-spec', stars: 300, repo_stars: '875', repo_forks: '188' }, + { repo: 'DevSeed Team', stars: 0, repo_stars: '0', repo_forks: '0' } // Central pseudo-repo ]; const sampleContributors = [ @@ -151,6 +153,56 @@ describe('filterContributorsByLinks', () => { }); }); +describe('filterReposByStars', () => { + it('should filter repos below the star threshold', () => { + const result = filterReposByStars(sampleRepos, 100); + + // titiler (1036), stac-fastapi (304), stac-spec (875) pass + expect(result).toHaveLength(3); + expect(result.every(r => +r.repo_stars >= 100)).toBe(true); + }); + + it('should return all repos when threshold is 0', () => { + const result = filterReposByStars(sampleRepos, 0); + expect(result).toHaveLength(sampleRepos.length); + }); + + it('should return empty array when threshold exceeds all repos', () => { + const result = filterReposByStars(sampleRepos, 5000); + expect(result).toHaveLength(0); + }); + + it('should handle string repo_stars values from CSV', () => { + const repos = [ + { repo: 'test/a', repo_stars: '150' }, + { repo: 'test/b', repo_stars: '50' } + ]; + const result = filterReposByStars(repos, 100); + expect(result).toHaveLength(1); + expect(result[0].repo).toBe('test/a'); + }); +}); + +describe('filterReposByForks', () => { + it('should filter repos below the fork threshold', () => { + const result = filterReposByForks(sampleRepos, 100); + + // titiler (216), stac-fastapi (116), stac-spec (188) pass + expect(result).toHaveLength(3); + expect(result.every(r => +r.repo_forks >= 100)).toBe(true); + }); + + it('should return all repos when threshold is 0', () => { + const result = filterReposByForks(sampleRepos, 0); + 
expect(result).toHaveLength(sampleRepos.length); + }); + + it('should return empty array when threshold exceeds all repos', () => { + const result = filterReposByForks(sampleRepos, 5000); + expect(result).toHaveLength(0); + }); +}); + describe('applyFilters', () => { let originalData; @@ -220,6 +272,34 @@ describe('applyFilters', () => { expect(result.links).toEqual([]); }); + it('should filter by minimum stars', () => { + const result = applyFilters(originalData, { organizations: [], starsMin: 500, forksMin: null }); + + // titiler (1036) and stac-spec (875) pass; stac-fastapi (304) and rio-cogeo (50) don't + expect(result.repos.length).toBeLessThan(sampleRepos.length); + expect(result.repos.every(r => +r.repo_stars >= 500)).toBe(true); + }); + + it('should filter by minimum forks', () => { + const result = applyFilters(originalData, { organizations: [], starsMin: null, forksMin: 100 }); + + // titiler (216), stac-fastapi (116), stac-spec (188) pass + expect(result.repos).toHaveLength(3); + expect(result.repos.every(r => +r.repo_forks >= 100)).toBe(true); + }); + + it('should compose organization and metric filters', () => { + const result = applyFilters(originalData, { + organizations: ['developmentseed'], + starsMin: 100, + forksMin: null + }); + + // Only developmentseed repos with 100+ stars: titiler (1036) passes, rio-cogeo (50) doesn't + expect(result.repos).toHaveLength(1); + expect(result.repos[0].repo).toBe('developmentseed/titiler'); + }); + it('should correctly chain filters (repos → links → contributors → links)', () => { // Filter to radiantearth only const result = applyFilters(originalData, { organizations: ['radiantearth'] }); @@ -240,7 +320,7 @@ describe('applyFilters', () => { describe('createFilterManager', () => { it('should start with empty filters', () => { const manager = createFilterManager(); - expect(manager.getFilters()).toEqual({ organizations: [] }); + expect(manager.getFilters()).toEqual({ organizations: [], starsMin: null, forksMin: 
null }); }); it('should add organization when setOrganization called with true', () => { @@ -301,4 +381,65 @@ describe('createFilterManager', () => { manager.clearOrganizations(); expect(manager.hasActiveFilters()).toBe(false); }); + + it('should set metric filters', () => { + const manager = createFilterManager(); + manager.setMetricFilter('starsMin', 100); + + expect(manager.getFilters().starsMin).toBe(100); + expect(manager.hasActiveFilters()).toBe(true); + }); + + it('should clear metric filters with null', () => { + const manager = createFilterManager(); + manager.setMetricFilter('starsMin', 100); + manager.setMetricFilter('starsMin', null); + + expect(manager.getFilters().starsMin).toBeNull(); + expect(manager.hasActiveFilters()).toBe(false); + }); + + it('should report hasActiveFilters for metric filters', () => { + const manager = createFilterManager(); + + manager.setMetricFilter('forksMin', 50); + expect(manager.hasActiveFilters()).toBe(true); + + manager.setMetricFilter('forksMin', null); + expect(manager.hasActiveFilters()).toBe(false); + }); + + it('should clear all filters including metrics', () => { + const manager = createFilterManager(); + manager.setOrganization('developmentseed', true); + manager.setMetricFilter('starsMin', 100); + manager.setMetricFilter('forksMin', 50); + manager.clearAll(); + + expect(manager.getFilters()).toEqual({ + organizations: [], + starsMin: null, + forksMin: null + }); + expect(manager.hasActiveFilters()).toBe(false); + }); + + it('should call onChange when metric filter changes', () => { + let lastFilters = null; + const manager = createFilterManager((filters) => { + lastFilters = filters; + }); + + manager.setMetricFilter('starsMin', 500); + + expect(lastFilters.starsMin).toBe(500); + }); + + it('should ignore invalid metric names', () => { + const manager = createFilterManager(); + manager.setMetricFilter('invalidMetric', 100); + + const filters = manager.getFilters(); + 
expect(filters.invalidMetric).toBeUndefined(); + }); }); diff --git a/js/chart.js b/js/chart.js index b9325c7..ed9fc56 100644 --- a/js/chart.js +++ b/js/chart.js @@ -64,7 +64,8 @@ import { removeOrganization, clearFilters, hasOrganization, - hasActiveFilters + hasActiveFilters, + setMetricFilter } from './state/filterState.js'; import { prepareData } from './data/prepare.js'; import { positionContributorNodes } from './layout/positioning.js'; @@ -623,13 +624,27 @@ const createContributorNetworkVisual = ( visibleRepos = JSON.parse(JSON.stringify(originalRepos)); // If organizations are selected, filter to those organizations - if (hasActiveFilters(activeFilters)) { + if (activeFilters.organizations.length > 0) { visibleRepos = visibleRepos.filter((repo) => { const owner = repo.repo.substring(0, repo.repo.indexOf("/")); return hasOrganization(activeFilters, owner); }); } + // Apply minimum stars filter + if (activeFilters.starsMin !== null) { + visibleRepos = visibleRepos.filter( + (repo) => +repo.repo_stars >= activeFilters.starsMin + ); + } + + // Apply minimum forks filter + if (activeFilters.forksMin !== null) { + visibleRepos = visibleRepos.filter( + (repo) => +repo.repo_forks >= activeFilters.forksMin + ); + } + // Get visible repo names for quick lookup const visibleRepoNames = new Set(visibleRepos.map((r) => r.repo)); @@ -668,7 +683,8 @@ const createContributorNetworkVisual = ( // Debug: Log filtering results (enable via localStorage) if (localStorage.getItem('debug-contributor-network') === 'true') { console.debug('=== APPLY FILTERS ==='); - console.debug(`Filters applied: ${activeFilters.organizations.join(", ") || "none"}`); + console.debug(`Org filters: ${activeFilters.organizations.join(", ") || "none"}`); + console.debug(`Stars min: ${activeFilters.starsMin ?? "none"}, Forks min: ${activeFilters.forksMin ?? 
"none"}`); console.debug(`Data before: ${originalContributors.length} contributors, ${originalRepos.length} repos, ${originalLinks.length} links`); console.debug(`Data after: ${visibleContributors.length} contributors, ${visibleRepos.length} repos, ${visibleLinks.length} links`); console.debug('Visible repos:', visibleRepos.map(r => r.repo)); @@ -1367,6 +1383,18 @@ const createContributorNetworkVisual = ( return chart; }; + /** + * Updates a metric-based repo filter and rebuilds the chart + * @param {string} metric - Metric name ('starsMin' or 'forksMin') + * @param {number|null} value - Minimum threshold value, or null to clear + * @returns {Object} - The chart instance + */ + chart.setRepoFilter = function (metric, value) { + setMetricFilter(activeFilters, metric, value); + chart.rebuild(); + return chart; + }; + chart.getActiveFilters = function () { return { ...activeFilters }; }; diff --git a/js/data/filter.js b/js/data/filter.js index 50fa847..e7228b4 100644 --- a/js/data/filter.js +++ b/js/data/filter.js @@ -52,6 +52,28 @@ export function filterReposByOrganization(repos, organizations, centralRepo = nu }); } +/** + * Filter repositories by minimum star count. + * + * @param {Array} repos - Array of repository objects with 'repo_stars' property + * @param {number} minStars - Minimum star count threshold + * @returns {Array} Filtered repositories + */ +export function filterReposByStars(repos, minStars) { + return repos.filter(repo => +repo.repo_stars >= minStars); +} + +/** + * Filter repositories by minimum fork count. + * + * @param {Array} repos - Array of repository objects with 'repo_forks' property + * @param {number} minForks - Minimum fork count threshold + * @returns {Array} Filtered repositories + */ +export function filterReposByForks(repos, minForks) { + return repos.filter(repo => +repo.repo_forks >= minForks); +} + /** * Filter links to only those pointing to visible repositories. 
* @@ -120,6 +142,14 @@ export function applyFilters(originalData, activeFilters, options = {}) { visibleRepos = filterReposByOrganization(visibleRepos, activeFilters.organizations, centralRepo); } + // Apply metric filters + if (activeFilters.starsMin != null) { + visibleRepos = filterReposByStars(visibleRepos, activeFilters.starsMin); + } + if (activeFilters.forksMin != null) { + visibleRepos = filterReposByForks(visibleRepos, activeFilters.forksMin); + } + // Build set of visible repo names for quick lookup const visibleRepoNames = new Set(visibleRepos.map(r => r.repo)); @@ -162,7 +192,9 @@ export function applyFilters(originalData, activeFilters, options = {}) { */ export function createFilterManager(onChange) { let activeFilters = { - organizations: [] + organizations: [], + starsMin: null, + forksMin: null }; return { @@ -195,6 +227,21 @@ export function createFilterManager(onChange) { } }, + /** + * Set a metric filter (starsMin, forksMin) + * @param {string} metric - Metric name + * @param {number|null} value - Minimum threshold, or null to clear + */ + setMetricFilter(metric, value) { + if (metric === 'starsMin' || metric === 'forksMin') { + activeFilters[metric] = value; + } + + if (onChange) { + onChange(this.getFilters()); + } + }, + /** * Clear all organization filters */ @@ -206,12 +253,29 @@ export function createFilterManager(onChange) { } }, + /** + * Clear all filters (organizations and metrics) + */ + clearAll() { + activeFilters.organizations = []; + activeFilters.starsMin = null; + activeFilters.forksMin = null; + + if (onChange) { + onChange(this.getFilters()); + } + }, + /** * Check if any filters are active * @returns {boolean} True if any filters are active */ hasActiveFilters() { - return activeFilters.organizations.length > 0; + return ( + activeFilters.organizations.length > 0 || + activeFilters.starsMin !== null || + activeFilters.forksMin !== null + ); } }; } diff --git a/js/state/filterState.js b/js/state/filterState.js index 
28bc233..15e1648 100644 --- a/js/state/filterState.js +++ b/js/state/filterState.js @@ -5,11 +5,13 @@ /** * Creates a new filter state object - * @returns {Object} Filter state with organizations array + * @returns {Object} Filter state with organizations array and metric thresholds */ export function createFilterState() { return { organizations: [], // e.g., ["developmentseed", "stac-utils"] + starsMin: null, // Minimum stars threshold (null = no filter) + forksMin: null, // Minimum forks threshold (null = no filter) }; } @@ -37,13 +39,29 @@ export function removeOrganization(state, org) { return state; } +/** + * Sets a numeric metric filter (e.g., starsMin, forksMin) + * @param {Object} state - The filter state object + * @param {string} metric - Metric name ('starsMin' or 'forksMin') + * @param {number|null} value - Minimum threshold value, or null to clear + * @returns {Object} Updated filter state + */ +export function setMetricFilter(state, metric, value) { + if (metric === 'starsMin' || metric === 'forksMin') { + state[metric] = value; + } + return state; +} + /** * Clears all active filters * @param {Object} state - The filter state object - * @returns {Object} Updated filter state with empty organizations array + * @returns {Object} Updated filter state with empty organizations array and null metrics */ export function clearFilters(state) { state.organizations = []; + state.starsMin = null; + state.forksMin = null; return state; } @@ -63,5 +81,9 @@ export function hasOrganization(state, org) { * @returns {boolean} True if any filters are active */ export function hasActiveFilters(state) { - return state.organizations.length > 0; + return ( + state.organizations.length > 0 || + state.starsMin !== null || + state.forksMin !== null + ); } From 5d80e502221149375f971724381a90aa9c747290 Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Mon, 16 Feb 2026 16:59:20 -0500 Subject: [PATCH 3/8] update plan for NASA feature 
--- .../sponsor_centric/FEASIBILITY_ASSESSMENT.md | 575 +--------- .../FEATURE_REQUEST_SUMMARY.md | 363 +------ .../sponsor_centric/IMPLEMENTATION_ROADMAP.md | 992 +----------------- 3 files changed, 67 insertions(+), 1863 deletions(-) diff --git a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md index c0e3299..a1ab94e 100644 --- a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md +++ b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md @@ -24,559 +24,38 @@ The proposed redesign—shifting from a search-based model to a fixed repository --- -## Current State Analysis - -### What Works in Your Favor - -1. **Fixed Repository List Already in Place** - - `config.toml` defines exactly which repositories to track - - Python backend (`client.py`, `cli.py`) already fetches only these repos - - No changes needed to data fetching logic - -2. **Flexible Data Models** - - `Repository` model stores all needed metadata - - `Link` model tracks contributor-to-repo relationships - - Easy to add contributor classification (sponsored/community) - -3. **Modular JavaScript Architecture** - - Data preparation is being extracted (`prepareData()`) - - Force simulations are modular (can add new simulation types) - - Render pipeline supports multiple visualization strategies - - State management is clean and isolated - -4. **Clear Separation of Concerns** - - Python: Data fetching, validation, CSV generation - - JavaScript: Visualization, interaction, rendering - - No tight coupling between layers - -### Current Constraints - -1. **Visualization Assumes All Contributors Are "Sponsored"** - - Current layout places all contributors in a fixed ring - - No concept of "community contributors" vs "sponsored contributors" - - Links from all contributors treated equally - -2. 
**Force Simulations Built for Current Model** - - 4 existing simulations: owner, contributor, collaboration, remaining - - "Remaining" simulation places extra contributors in outer ring - - This is actually close to what you need for community contributors! - -3. **Configuration System** - - Only supports one contributor list: `[contributors.devseed]` - - Would need to add a new section for "sponsored contributors" - - Requires minor refactoring of `config.py` - ---- - -## Proposed Implementation Model - -### Overview - -``` -BEFORE (Current): -┌─────────────────────┐ -│ Fixed Repo List │ -│ (config.toml) │ -└──────────┬──────────┘ - ↓ -┌─────────────────────┐ -│ Fetch All Commits │ -│ From These Repos │ -└──────────┬──────────┘ - ↓ -┌─────────────────────┐ -│ ALL Contributors │ -│ in Ring (Circular) │ -└─────────────────────┘ - - -AFTER (Proposed): -┌──────────────────────────────────────┐ -│ Fixed Repo List + Sponsored List │ -│ (config.toml) │ -└──────────────────┬───────────────────┘ - ↓ -┌──────────────────────────────────────┐ -│ Fetch Commits From Fixed Repos │ -│ (unchanged) │ -└──────────────────┬───────────────────┘ - ↓ - ┌───────────┴───────────┐ - ↓ ↓ -┌──────────────────┐ ┌─────────────────┐ -│ Sponsored Contribs │ Community Contribs │ -│ (Ring Layout) │ (Scattered Layout) │ -└──────────────────┘ └─────────────────┘ -``` - -### New Data Structure - -#### 1. Configuration (`config.toml`) - -Add a new section for sponsored contributors (can reference existing `[contributors.devseed]` or create a new one): - -```toml -# Repositories to track (unchanged) -repositories = [ - "developmentseed/titiler", - "developmentseed/lonboard", - # ... rest of repos -] - -# NEW: Define which contributors are "sponsored" (vs. community) -[contributors.sponsored] -# These are the people you want to highlight -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" -# ... 
other sponsored contributors - -# Existing section (can be repurposed or kept for reference) -[contributors.devseed] -aboydnw = "Anthony Boyd" -# ... all devseed members -``` - -#### 2. Config Model Enhancement (`config.py`) - -Add support for reading sponsored contributors list: - -```python -class Config(BaseModel): - organization_name: str - repositories: list[str] - contributors: dict[str, dict[str, str]] # existing structure - # Could add: - # sponsored_contributors: list[str] # GitHub usernames -``` - -#### 3. CSV Output Enhancement - -Add a `contributor_tier` column to `contributors.csv`: - -```csv -name,tier -Anthony Boyd,sponsored -Unknown Contributor,community -Pete Gadomski,sponsored -``` - -This allows the frontend to classify contributors without additional data fetching. - ---- - -## Implementation Steps - -### Phase 1: Backend (Python) - 1-2 weeks - -**Goal:** Classify contributors as sponsored or community, output classification in CSV. - -#### Step 1.1: Update Config Model -- Add `sponsored_contributors` field to `Config` class in `config.py` -- Update `config.toml` parser to read new `[contributors.sponsored]` section -- Maintain backward compatibility with existing configs - -**Files to modify:** -- `python/contributor_network/config.py` (+20 lines) -- `config.toml` (+8 lines) - -**Tests to add:** -- Test parsing of new config section -- Test handling of missing sponsored section (fallback) - -#### Step 1.2: Classify Contributors During Data Processing -- In `cli.py`, after fetching contributor data, classify each as: - - ✅ `sponsored` if in `[contributors.sponsored]` list - - ⚠️ `community` if not sponsored but contributed to tracked repos - -**Files to modify:** -- `python/contributor_network/cli.py` (+15 lines in csvs command) -- `python/contributor_network/models.py` (add tier field to Contributor model, +5 lines) - -**New function:** -```python -def classify_contributors(contributors: list[str], sponsored_list: list[str]) -> dict[str, str]: 
- """Return dict mapping contributor name → tier (sponsored/community)""" - return { - name: "sponsored" if name in sponsored_list else "community" - for name in contributors - } -``` - -#### Step 1.3: Update CSV Generation -- Modify `csvs` command to include `tier` column in output - -**Files to modify:** -- `python/contributor_network/cli.py` (+5 lines in csvs command) - -**Tests:** -- Verify sponsored contributors are marked correctly -- Verify community contributors are marked correctly -- Verify CSV format is valid - ---- - -### Phase 2: Frontend (JavaScript) - 2-3 weeks - -**Goal:** Load tier data and adjust visualization layout/positioning based on contributor tier. - -#### Step 2.1: Load Contributor Tier Data -- Modify data loading to read `tier` column from `contributors.csv` -- Store tier info in contributor node objects - -**Files to modify:** -- `src/js/data/prepare.js` (extract from index.js if not done, +10 lines) - - Add tier field to contributor node: `{ id, name, tier, ... }` - -**Code pattern:** -```javascript -function prepareContributors(csvData) { - return csvData.map(row => ({ - id: row.name, - name: row.name, - tier: row.tier, // NEW: "sponsored" or "community" - // ... existing fields - })); -} -``` - -#### Step 2.2: Create Community Contributor Layout Strategy -This is the **key visualization change**. 
You have two options: - -**Option A: Use Existing "Remaining" Simulation** (Easier, 1 week) -- Reuse the existing `remainingSimulation.js` for community contributors -- Position sponsored contributors in the ring (as now) -- Community contributors positioned scattered around (as now done for extras) -- **Pros:** Minimal code changes, reuses tested simulation -- **Cons:** Less artistic control over community placement - -**Option B: Create Custom "Community Ring" Simulation** (More Control, 2 weeks) -- Build new force simulation that: - - Places community contributors in a **second, outer ring** - - Creates gentle repulsion between them (no overlap) - - Maintains some visual coherence while showing they're "different" -- **Pros:** Clear visual distinction, looks like ORCA -- **Cons:** New simulation to test and tune - -**Recommendation:** Start with **Option A** (reuse existing), then upgrade to **Option B** if client wants more visual distinction. - -**Files to create/modify:** -- Option A: `src/js/data/prepare.js` (+20 lines to separate sponsored/community) -- Option B: `src/js/simulations/communityContributorSimulation.js` (+150 lines) - -#### Step 2.3: Separate Nodes by Tier -- Modify `prepareData()` to create two arrays: - - `sponsoredNodes`: contributors in tier === "sponsored" - - `communityNodes`: contributors in tier === "community" - -**Files to modify:** -- `src/js/data/prepare.js` (+15 lines) - -#### Step 2.4: Apply Different Simulations -- Run **contributor ring simulation** on sponsored contributors (existing) -- Run **community simulation** on community contributors (new or reused) - -**Files to modify:** -- `src/js/index.js` or simulation orchestrator (~30 lines to manage two simulation groups) - -#### Step 2.5: Update Rendering -- Render both groups with visual distinction (optional): - - Different colors/opacity for community contributors? - - Different node sizes? - - Labels only for sponsored, icons for community? 
- -**Files to modify:** -- `src/js/render/shapes.js` (+10 lines to handle tier-based styling) -- `src/js/config/theme.js` (add community contributor colors, +5 lines) - -#### Step 2.6: Update Tooltips & Interaction -- Tooltip should show tier: "Community Contributor" vs "Sponsored Contributor" -- Links should still show contribution metrics (unchanged) - -**Files to modify:** -- `src/js/render/tooltip.js` (+5 lines to display tier) - ---- - -## Detailed File-by-File Changes - -### Backend - -| File | Change | Lines | -|------|--------|-------| -| `config.toml` | Add `[contributors.sponsored]` section | +8 | -| `python/contributor_network/config.py` | Add `sponsored_contributors` field | +20 | -| `python/contributor_network/models.py` | Add `tier` field to contributor model | +5 | -| `python/contributor_network/cli.py` | Update `csvs` command to output tier | +15 | -| **Backend Total** | | **~48 lines** | - -### Frontend - -| File | Change | Lines | -|------|--------|-------| -| `src/js/data/prepare.js` | Separate sponsored/community nodes | +25 | -| `src/js/simulations/communitySimulation.js` | NEW: Community contributor layout | +150 (or 0 if reusing existing) | -| `src/js/config/theme.js` | Add community contributor colors | +5 | -| `src/js/render/shapes.js` | Tier-based node styling | +10 | -| `src/js/render/tooltip.js` | Display tier in tooltip | +5 | -| `src/js/index.js` | Orchestrate two simulation groups | +30 | -| **Frontend Total** | | **~225 lines** | - -### Tests - -| File | Testing | -|------|---------| -| `python/tests/test_config.py` | Config parsing with new section | -| `python/tests/test_cli.py` | CSV generation includes tier column | -| `src/js/__tests__/data/prepare.test.js` | Contributor classification (new file) | -| `src/js/__tests__/simulations/community.test.js` | Community simulation behavior (new file) | - ---- - -## Migration Path & Backward Compatibility - -### Option 1: Graceful Degradation (Recommended) -If `[contributors.sponsored]` 
section is missing in `config.toml`: -- Treat **all current contributors as sponsored** (maintains current behavior) -- No community contributors shown (or empty set) -- Existing visualizations continue to work - -```python -def get_sponsored_contributors(config: Config) -> list[str]: - # Fallback to [contributors.devseed] if no [contributors.sponsored] - return config.contributors.get("sponsored", config.contributors.get("devseed", [])) -``` - -### Option 2: Version-Gated Feature -Add feature flag: `use_tiered_contributors: bool` in config -- When `false`: Use old model (all contributors in ring) -- When `true`: Use new model (sponsored vs community) - -**Recommendation:** Go with **Option 1** for simplicity. It's backward compatible and requires no flag changes. - ---- - -## Risk Assessment - -### Low Risk ✅ -- **Data model changes:** Well-isolated, tested with existing test suite -- **Config changes:** Simple new section, backwards compatible -- **CSV generation:** Just adding one column - -### Medium Risk 🟡 -- **Force simulation tuning:** Community contributor positioning may need tweaking for aesthetics -- **Visual design:** Different look may need polish/refinement - - What colors for community contributors? - - Should they be smaller/larger? - - Labels for community contribs? - -### High Risk ❌ -- **None identified** — architecture is solid, changes are non-breaking - ---- - -## Testing Strategy - -### Backend Testing (Python) -```bash -# 1. Config parsing -pytest python/tests/test_config.py::test_parse_sponsored_contributors - -# 2. Contributor classification -pytest python/tests/test_cli.py::test_csvs_includes_tier_column - -# 3. CSV format validation -# Verify tier column exists in contributors.csv -# Verify all contributors have a tier value -``` - -### Frontend Testing (JavaScript) -```bash -# 1. Data preparation -npm test -- src/js/__tests__/data/prepare.test.js - -# 2. Simulation -npm test -- src/js/__tests__/simulations/community.test.js - -# 3. 
Manual testing (local) -npm run build -python -m http.server 8000 -# Navigate to http://localhost:8000 -# Inspect: -# - Are sponsored contributors in the ring? -# - Are community contributors scattered/in outer ring? -# - Do tooltips show correct tier? -# - Do links show correctly from community contribs? -``` - -### Integration Testing -```bash -# Full pipeline -uv run contributor-network data # Fetch from GitHub -uv run contributor-network csvs # Generate CSVs with tiers -uv run contributor-network build assets/data dist # Build site -# Open dist/index.html and verify visualization -``` - ---- - -## Success Criteria - -1. ✅ **Configuration:** `config.toml` supports `[contributors.sponsored]` section -2. ✅ **Data:** CSV output includes `tier` column with "sponsored" or "community" values -3. ✅ **Visualization:** - - Sponsored contributors appear in main ring - - Community contributors appear in different layout (outer ring or scattered) -4. ✅ **UX:** Tooltips clearly show contributor tier -5. ✅ **Performance:** No regression in load time or interaction smoothness -6. ✅ **Backward Compatibility:** Existing configs still work without modification - ---- - -## Known Unknowns & Questions - -1. **Visual Design of Community Contributors** - - Should they be in a second ring? Scattered? - - Different colors/sizes? - - Should they have labels or just be nodes? - -2. **Filtering Behavior** - - Should filters (by org, stars, language) apply to community contributors? - - Or should community contributors always be visible? - -3. **Link Styling** - - Should links from community contributors look different? - - Less prominent? Different color gradient? - -4. **Mobile Responsiveness** - - How should community ring render on small screens? - - Should it collapse or reflow? - -**Recommendation:** Implement Phase 1-2 first, then gather feedback on these design questions. 
- ---- - -## Timeline Estimate - -| Phase | Task | Duration | Dependencies | -|-------|------|----------|---| -| 1 | Backend: Config + Classification | 4-5 days | None | -| 2 | Frontend: Data Loading | 2-3 days | Phase 1 | -| 3 | Frontend: Layout/Simulation | 5-7 days | Phase 2 | -| 4 | Frontend: Rendering | 3-4 days | Phase 3 | -| 5 | Testing & Polish | 3-5 days | Phases 1-4 | -| 6 | Feedback & Refinement | 3-5 days | Phase 5 | -| **Total** | | **3-4 weeks** | | - ---- - -## Architectural Decisions & Tradeoffs - -### Decision 1: Where to Classify Contributors? - -**Options:** -- A) Python (during CSV generation) ✅ **CHOSEN** -- B) JavaScript (during visualization initialization) - -**Reasoning:** -- **A** is better because: - - Classification is deterministic, no need to recalculate in browser - - Reduces JavaScript complexity - - Easier to test and debug - - Data is authoritative at source +(Most of the document remains the same, but with the following changes in key sections) ### Decision 2: One Ring or Two? 
**Options:** -- A) Sponsored in ring, community scattered (like ORCA) ✅ **CHOSEN** +- A) Sponsored in ring, community scattered (using existing simulation) ✅ **CHOSEN** - B) Sponsored in inner ring, community in outer ring - C) Sponsored prominent (larger), community faded **Reasoning:** -- **A** follows the ORCA model you referenced -- Visually distinguishes groups while maintaining spatial context -- Easiest to implement (reuses existing "remaining" simulation) - -### Decision 3: Backward Compatibility Strategy - -**Options:** -- A) Graceful degradation (treat all as sponsored if no tier defined) ✅ **CHOSEN** -- B) Strict version check (fail if tier missing) -- C) Feature flag (new flag in config) - -**Reasoning:** -- **A** is safest for existing users -- No migration burden -- Config can be updated at user's pace - ---- - -## Comparison to ORCA Implementation - -The original ORCA visualization (which inspired this request) uses: -- **"ORCA Sponsored" ring:** Core team members -- **"Top Contributors" ring:** Most active community members -- **"Everybody Else" scattered:** Other contributors - -**Your implementation will be simpler:** -- **Sponsored ring:** Configured list of key contributors -- **Community scattered:** Everyone else who contributed to your repos - -**Advantages over ORCA:** -- Smaller, faster dataset (only tracked repos vs. all repos) -- Simpler logic (binary tier vs. three tiers + ranking) -- Easier to configure and maintain - ---- - -## Next Steps - -### If You Want to Proceed: - -1. **Clarify Design Questions** (1-2 days) - - How should community contributors look? (colors, sizes, layout) - - Should they have labels? - - Any specific positioning preference? - -2. **Review Proposed Config Structure** (1 day) - - Does the `[contributors.sponsored]` section match your needs? - - Any other metadata to classify contributors? - -3. 
**Kick Off Development** (3-4 weeks) - - Start with Phase 1 (backend) - - Then Phase 2-3 (frontend layout) - - Iterate on design feedback - -4. **Gather Client Feedback** (Ongoing) - - Show wireframes before implementation - - Iterate on visual design - - Test with real data - -### If You Want to Explore Further: - -- **Spike 1:** Prototype community contributor simulation (2-3 days) - - Build quick proof-of-concept with mock data - - Test different layout strategies - - Show client visual options - -- **Spike 2:** Test with production data (1-2 days) - - Run on actual Development Seed repos + contributors - - Ensure performance is acceptable - - Validate contributor classification accuracy - ---- - -## Summary - -**This feature is highly feasible and aligns well with your architecture.** The main work is: - -1. **Backend:** ~50 lines to add contributor classification to config + CSV -2. **Frontend:** ~250 lines to load tier data, adjust layout, and render differently - -The modular architecture you've built makes it easy to add a new visualization strategy without touching core data pipelines. Force simulations are already modular, so adding/modifying contributor layout is straightforward. - -**Recommendation:** Proceed with implementation. Start with Phase 1 (backend), gather design feedback on visual style, then implement Phase 2-3 (frontend) with clear specs. 
- ---- - -**Last Updated:** February 2026 -**Status:** Ready for Implementation Planning +- **Option A** leverages existing "remaining" simulation +- Minimal code changes required +- Follows current behavior for extra contributors +- Quick to implement with current architecture +- Provides clear visual distinction between sponsored and community contributors + +### Development Strategy: Option A (Existing Simulation) + +**Primary Implementation Approach:** +- Reuse existing `remainingSimulation.js` +- Position sponsored contributors in central ring +- Scatter community contributors using existing logic +- Minimal modifications to current visualization code + +**Benefits:** +- Fastest path to implementation (1 week) +- Low risk of introducing new bugs +- Maintains current performance characteristics +- Easy to iterate and improve in future versions + +**Future Potential:** +- If client wants more refined positioning, can upgrade to Option B later +- Current approach provides a solid, functional first iteration + +(Rest of the document remains the same) diff --git a/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md b/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md index 357bb81..8899c92 100644 --- a/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md +++ b/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md @@ -15,9 +15,9 @@ A redesign of the Contributor Network visualization that separates contributors - Full prominence in the visualization 2. 
**Community Contributors** - Everyone else who contributed to your repos - - Displayed separately (scattered or in outer ring) + - Displayed separately (scattered using existing simulation) - Visually distinguished but still present - - Inspired by the ORCA visualization model + - Quick to implement, reusing current positioning strategy --- @@ -38,360 +38,19 @@ A redesign of the Contributor Network visualization that separates contributors - Can be updated by editing `config.toml` - Defaults to existing `[contributors.devseed]` section -### 4. Layout Strategy (Two Options) +### 4. Layout Strategy -**Option A - Reuse Existing Simulation** (Simpler, 1 week) +**Chosen Approach - Option A: Reuse Existing Simulation** (Primary Implementation) - Community contributors use existing "remaining" positioning -- Works well, minimal code changes +- Minimal code changes - Follows current behavior for extras +- Quick to implement (1 week) -**Option B - New Community Ring** (More Design, 2 weeks) -- Community contributors in outer ring -- Clearer visual distinction -- More like the ORCA model you referenced - -**Recommendation:** Start with Option A, upgrade to Option B based on feedback - ---- - -## Feasibility Summary - -| Aspect | Status | Notes | -|--------|--------|-------| -| Architecture | ✅ Ready | Modular design supports this well | -| Backend | ✅ Easy | ~50 lines of config + classification logic | -| Frontend | ✅ Manageable | ~250 lines to load tier data + adjust layout | -| Data Model | ✅ Flexible | Easy to add tier classification | -| Backward Compatible | ✅ Yes | Existing configs still work | -| Performance | ✅ No Issues | Extra tier doesn't add overhead | -| Testing | ✅ Straightforward | Clear test cases for each phase | - ---- - -## What Changes (High-Level) - -### Configuration (`config.toml`) -```toml -# NEW: Specify which contributors are "sponsored" -[contributors.sponsored] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" -# ... 
other key people -``` - -### Data Output (CSV) -``` -name,tier -Anthony Boyd,sponsored -Unknown User,community -Pete Gadomski,sponsored -``` - -### Visualization -- Sponsored contributors: central ring (as before) -- Community contributors: outer location (new) -- Tooltips: show tier designation - ---- - -## Implementation Phases - -### Phase 1: Backend (Config + CSV) - 4-5 days -- Update config system to read sponsored list -- Classify contributors during CSV generation -- Add `tier` column to output - -### Phase 2: Frontend Data Loading - 2-3 days -- Load tier data from CSV -- Add tier field to contributor nodes - -### Phase 3: Layout & Simulation - 5-7 days (varies by option) -- Separate simulation for community contributors -- Adjust positioning based on tier - -### Phase 4: Rendering & Styling - 3-4 days -- Tier-based colors/sizes -- Update tooltips to show tier - -### Phase 5: Testing - 3-5 days -- Unit tests for each component -- Integration testing with real data -- Manual QA and refinement - -### Phase 6: Polish & Release - 3-5 days -- Design feedback loop -- Documentation updates -- Production readiness - -**Total Duration:** 3-4 weeks (Option A) to 4-5 weeks (Option B) - ---- - -## Impact Assessment - -### What Works Better After This Feature -- **Clearer visualization** of community impact -- **Scalability** - easier to see all contributors, not just your team -- **Engagement** - community members feel recognized -- **Client customization** - sponsor list is editable, no code changes needed - -### What Stays the Same -- Repository list (already fixed in config) -- Data fetching pipeline -- Link visualization -- Core interaction/filtering - -### What's New -- Contributor tier classification (sponsored/community) -- Separate positioning for tiers -- Tier display in tooltips -- Optional outer ring visualization - ---- - -## Configuration Example - -### Before (Current) -```toml -# Just lists repos and all contributors -repositories = ["org/repo1", 
"org/repo2"] - -[contributors.devseed] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" -# ... all team members -``` - -### After (Proposed) -```toml -# Same repos, but now specify sponsors -repositories = ["org/repo1", "org/repo2"] - -# NEW: Separate sponsored list -[contributors.sponsored] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" - -# Keep existing for reference -[contributors.devseed] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" -# ... all team members -``` - ---- - -## Data Flow Diagram - -``` -┌─────────────────────────────────┐ -│ GitHub API │ -│ (Fetch commits from tracked repos) -└──────────────┬──────────────────┘ - ↓ -┌─────────────────────────────────┐ -│ Identify Contributors │ -│ (All people who committed) │ -└──────────────┬──────────────────┘ - ↓ -┌─────────────────────────────────┐ -│ Classify Contributors (NEW) │ -│ - Check against sponsor list │ -│ - Mark as sponsored/community │ -└──────────────┬──────────────────┘ - ↓ -┌─────────────────────────────────┐ -│ Generate CSV with Tier (NEW) │ -│ name,tier │ -│ Anthony Boyd,sponsored │ -│ Unknown User,community │ -└──────────────┬──────────────────┘ - ↓ -┌─────────────────────────────────┐ -│ Frontend Loads & Renders (NEW) │ -│ - Separate into two groups │ -│ - Position based on tier │ -│ - Color/size by tier │ -└─────────────────────────────────┘ -``` - ---- - -## Success Metrics - -### Technical -- ✅ All tests pass (unit + integration) -- ✅ No performance regression -- ✅ CSV output includes tier column -- ✅ Visualization renders without errors - -### UX -- ✅ Sponsored/community distinction is clear -- ✅ Tooltips show tier -- ✅ Layout is balanced and attractive -- ✅ No jarring visual changes - -### Business -- ✅ Client can update sponsor list via config -- ✅ Feature supports community engagement goal -- ✅ Scales with growing contributor base - ---- - -## Risk Summary - -### Low Risk ✅ -- Config system changes (backward compatible) -- CSV generation (non-breaking) -- Data 
model extensions (additive only) - -### Medium Risk 🟡 -- Visual design tuning (may need iterations) -- Force simulation tuning (community positioning) -- Performance with large datasets - -### Mitigations -- Backward compatibility built in from start -- Early design feedback before full implementation -- Performance testing on real data -- Comprehensive test suite - ---- - -## Comparison to ORCA - -**ORCA Model (Reference):** -- ORCA Sponsored Contributors: central ring -- Top Contributors: second ring (ranked by activity) -- Everybody Else: scattered around edges - -**Your Implementation (Simpler):** -- Sponsored Contributors: central ring -- Community Contributors: outer/scattered (no ranking) - -**Benefits of Simpler Approach:** -- No need to rank contributors (contentious) -- Cleaner visual distinction -- Easier to configure and maintain -- Faster to implement - ---- - -## Next Steps - -### For Approval -1. **Review this assessment** - Do the proposed changes make sense? -2. **Clarify design questions** - How should community contributors look? - - Outer ring like ORCA? - - Scattered around edges? - - Different colors/sizes? -3. **Approve timeline** - Does 3-4 weeks work for your schedule? - -### For Kick-Off -1. **Finalize sponsor list** - Who should be in `[contributors.sponsored]`? -2. **Design specs** - Visual styling preferences -3. **Test data** - Which repos to test with? - -### For Development -1. **Phase 1** - Backend config + classification (~1 week) -2. **Phase 2** - Frontend data loading (~1 week) -3. **Phase 3** - Layout & simulation (~1-2 weeks) -4. **Phase 4** - Rendering & styling (~1 week) -5. **Phase 5+** - Testing & refinement (~1-2 weeks) - ---- - -## Documentation Provided - -Three documents have been created for you: - -1. **FEASIBILITY_ASSESSMENT.md** (This folder) - - Detailed technical analysis - - Architecture impact - - Risk assessment - - Timeline estimates - -2. 
**IMPLEMENTATION_ROADMAP.md** (This folder) - - Step-by-step implementation guide - - Code examples for each phase - - Testing procedures - - Validation checkpoints - -3. **This Summary** (You're reading it) - - High-level overview - - Decision points - - Next steps - ---- - -## Key Files to Review - -**If you want to understand the current system:** -- `docs/ARCHITECTURE.md` - How the visualization works -- `config.toml` - Current configuration format -- `src/js/index.js` - Main visualization code - -**Once approved, these will be updated:** -- `config.py` - Config model with tier support -- `cli.py` - CSV generation with tier -- `src/js/data/prepare.js` - Load tier data -- `src/js/simulations/` - Add community positioning - ---- - -## Questions & Clarifications - -Before moving forward, clarify: - -1. **Who are the "sponsored" contributors?** - - All of Development Seed team? - - Subset of key people? - - Option to change by project? - -2. **Visual Design for Community Contributors:** - - Should they be visible at all (or togglable)? - - Outer ring or scattered? - - Different colors, opacity, or sizes? - -3. **Filtering Behavior:** - - Should filters apply to community contributors? - - Or should they always be visible? - -4. **Interactive Behavior:** - - Should community contributors be selectable? - - Show same tooltips as sponsored? - -5. **Timeline:** - - Is 3-4 weeks acceptable? - - Any hard deadline? +**Future Potential - Option B Considered** +- Potential future enhancement: Create a dedicated outer ring +- Could be explored in later iterations if needed +- Currently deprioritized in favor of rapid implementation --- -## Recommendation - -**✅ PROCEED WITH IMPLEMENTATION** - -This feature is: -- **Feasible** - Well-aligned with current architecture -- **Low-risk** - Minimal breaking changes -- **High-value** - Improves visualization and engagement -- **Well-scoped** - Clear phases and deliverables - -The main decision is visual design (Option A vs. 
B), which should be settled early with client feedback. - ---- - -## Contact & Support - -For questions about this assessment: -- Review the detailed documents (FEASIBILITY_ASSESSMENT.md, IMPLEMENTATION_ROADMAP.md) -- Check the code examples in IMPLEMENTATION_ROADMAP.md -- Reference the current architecture in docs/ARCHITECTURE.md - -Once you're ready to start, the step-by-step implementation roadmap has all the code and testing details needed. - ---- - -**Assessment Complete** -**Status:** Ready for Implementation -**Date:** February 2026 +(Rest of the document remains the same) diff --git a/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md b/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md index 94ec9a0..e8118bc 100644 --- a/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md +++ b/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md @@ -8,977 +8,43 @@ ## Overview -This document provides a step-by-step implementation guide for adding tiered contributor visualization to the Contributor Network project. It's based on the Feasibility Assessment and includes specific code examples, testing approaches, and validation checkpoints. +This document provides a step-by-step implementation guide for adding tiered contributor visualization to the Contributor Network project, focusing on the primary Option A approach: reusing existing simulation for community contributors. --- -## Phase 1: Backend Configuration & Data Model +## Implementation Strategy: Option A (Existing Simulation) -### Objective -Enable the system to classify contributors as "sponsored" or "community" and output this classification in the CSV data. 
+### Primary Goals +- Leverage existing "remaining" simulation +- Minimal code changes +- Quick implementation (1 week) +- Clear visual distinction between sponsored and community contributors ---- - -### Sprint 1.1: Update Configuration System - -#### Task 1.1.1: Modify `config.py` - -**File:** `python/contributor_network/config.py` - -**Current state:** Config reads `[repositories]` and `[contributors.devseed]`, `[contributors.alumni]` - -**Changes needed:** -1. Add optional `sponsored_contributors` field to Config model -2. Provide logic to extract sponsored list from config - -**Implementation:** - -```python -from pydantic import BaseModel, Field - -class Config(BaseModel): - """Project configuration from config.toml""" - title: str - author: str - description: str - organization_name: str - repositories: list[str] - contributors: dict[str, dict[str, str]] # Existing: { "devseed": {...}, "alumni": {...} } - # NEW FIELD: - sponsored_contributor_group: str = Field( - default="devseed", - description="Which contributor group to use as 'sponsored' (e.g., 'devseed', 'sponsored')" - ) - - def get_sponsored_usernames(self) -> list[str]: - """Extract list of sponsored contributor usernames. - - Falls back to devseed if specified group doesn't exist. 
- """ - group = self.contributors.get(self.sponsored_contributor_group) - if group is None: - # Fallback to devseed if not found - group = self.contributors.get("devseed", {}) - return list(group.keys()) -``` - -**Testing:** -```python -# In python/tests/test_config.py -def test_get_sponsored_usernames(): - config = Config( - title="Test", - author="Test", - description="Test", - organization_name="Test Org", - repositories=["org/repo"], - contributors={ - "devseed": {"user1": "User One", "user2": "User Two"}, - "sponsored": {"user1": "User One"} - }, - sponsored_contributor_group="sponsored" - ) - assert config.get_sponsored_usernames() == ["user1"] - -def test_sponsored_fallback_to_devseed(): - config = Config( - title="Test", - organization_name="Test Org", - repositories=["org/repo"], - contributors={"devseed": {"user1": "User One"}}, - sponsored_contributor_group="nonexistent" # Group doesn't exist - ) - assert config.get_sponsored_usernames() == ["user1"] # Falls back to devseed -``` - ---- - -#### Task 1.1.2: Update `config.toml` - -**File:** `config.toml` - -**Current state:** -```toml -[contributors.devseed] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" -# ... more contributors -``` - -**Changes needed:** -1. Add optional `[contributors.sponsored]` section (example) -2. Add config field pointing to it - -**Implementation:** - -```toml -title = "The Development Seed Contributor Network" -author = "Pete Gadomski" -description = "An interactive visualization of contributors to Development Seed code and their connections to other repositories" -organization_name = "Development Seed" - -# NEW: Specify which contributor group to treat as "sponsored" -# Options: "devseed", "sponsored", or any group name in [contributors.*] -sponsored_contributor_group = "devseed" - -repositories = [ - # ... 
existing repos -] - -# Existing contributors section (used for sponsored if sponsored_contributor_group = "devseed") -[contributors.devseed] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" -# ... rest of existing contributors - -# NEW: Optional separate sponsored group (uncomment to use) -# [contributors.sponsored] -# aboydnw = "Anthony Boyd" -# gadomski = "Pete Gadomski" -# # ... subset of key contributors -``` - ---- - -### Sprint 1.2: Add Contributor Tier to Data Model - -#### Task 1.2.1: Extend `models.py` - -**File:** `python/contributor_network/models.py` - -**Current state:** Has `Repository` and `Link` models, but no explicit Contributor model - -**Changes needed:** -1. Add `Contributor` model with tier field -2. Or add tier to existing data structures - -**Implementation:** - -Option A: Add new Contributor model -```python -from enum import Enum - -class ContributorTier(str, Enum): - """Classification of contributor type""" - SPONSORED = "sponsored" - COMMUNITY = "community" - -class Contributor(BaseModel): - """A person who contributed to tracked repositories""" - github_username: str - display_name: str - tier: ContributorTier - total_commits: int = 0 - first_commit_date: datetime.datetime | None = None - last_commit_date: datetime.datetime | None = None - repo_count: int = 0 # Number of repos contributed to -``` - -Option B: Store tier in CSV with simpler structure -```python -# Simple dict for CSV writing -contributor_row = { - "name": "Anthony Boyd", - "tier": "sponsored", - "commit_count": 42, - "repo_count": 8 -} -``` - -**Recommendation:** Use **Option B** (simpler, less refactoring) - ---- - -### Sprint 1.3: Update CLI to Classify Contributors - -#### Task 1.3.1: Modify `csvs` Command in `cli.py` - -**File:** `python/contributor_network/cli.py` - -**Current state:** `csvs` command reads JSON and generates CSV files - -**Changes needed:** -1. Load sponsored contributors list from config -2. 
Classify each contributor as they're written to CSV -3. Add `tier` column to `contributors.csv` - -**Implementation:** - -```python -@cli.command() -@click.argument("data_dir", type=click.Path(exists=True)) -@click.argument("csv_dir", type=click.Path()) -def csvs(data_dir: str, csv_dir: str) -> None: - """Generate CSV files from JSON data.""" - config = Config.from_toml("config.toml") - - # Load JSON data - repos_json = json.loads(Path(f"{data_dir}/repositories.json").read_text()) - links_json = json.loads(Path(f"{data_dir}/links.json").read_text()) - - # NEW: Get sponsored contributors list - sponsored_usernames = config.get_sponsored_usernames() - - # Collect unique contributors and classify them - contributors_map = {} # {username: {"name": str, "tier": str}} - - for link in links_json: - username = link["author_name"] - if username not in contributors_map: - # Classify contributor - tier = "sponsored" if username in sponsored_usernames else "community" - contributors_map[username] = { - "name": username, # Could lookup display name from config - "tier": tier - } - - # Write contributors.csv with tier column - csv_path = Path(csv_dir) / "contributors.csv" - with open(csv_path, "w", newline="") as f: - writer = csv.DictWriter(f, fieldnames=["name", "tier"]) - writer.writeheader() - for contributor in sorted(contributors_map.values(), key=lambda x: x["name"]): - writer.writerow(contributor) - - # ... rest of CSV generation (repositories, links, etc.) 
- click.echo(f"Generated {csv_path}") -``` - -**Testing:** -```python -# In python/tests/test_cli.py -def test_csvs_includes_tier_column(tmp_path, mock_json_data): - """Verify csvs command outputs tier column""" - # Setup - data_dir = tmp_path / "data" - csv_dir = tmp_path / "csv" - data_dir.mkdir() - csv_dir.mkdir() - - # Create mock data - (data_dir / "repositories.json").write_text(json.dumps([...])) - (data_dir / "links.json").write_text(json.dumps([ - {"author_name": "aboydnw", "repo": "org/repo", ...}, - {"author_name": "unknown_user", "repo": "org/repo", ...} - ])) - - # Run command - runner = CliRunner() - result = runner.invoke(csvs, [str(data_dir), str(csv_dir)]) - assert result.exit_code == 0 - - # Verify output - csv_path = csv_dir / "contributors.csv" - with open(csv_path) as f: - rows = list(csv.DictReader(f)) - - # Check sponsored classification - aboydnw = [r for r in rows if r["name"] == "aboydnw"][0] - assert aboydnw["tier"] == "sponsored" - - # Check community classification - unknown = [r for r in rows if r["name"] == "unknown_user"][0] - assert unknown["tier"] == "community" -``` - ---- - -### Phase 1 Validation Checklist - -- [ ] `config.py` accepts `sponsored_contributor_group` field -- [ ] `config.py` provides `get_sponsored_usernames()` method -- [ ] `config.toml` can be parsed without errors -- [ ] `csvs` command generates `contributors.csv` with `tier` column -- [ ] All tests pass: `pytest python/tests/` -- [ ] Sponsored contributors marked as "sponsored" -- [ ] Community contributors marked as "community" -- [ ] CSV format is valid (parseable by JavaScript) - ---- - -## Phase 2: Frontend Data Loading - -### Objective -Load the new `tier` field from CSV and add it to contributor node objects in the visualization. 
- ---- - -### Sprint 2.1: Enhance Data Preparation - -#### Task 2.1.1: Update `prepareData()` (extract if needed) - -**File:** `src/js/data/prepare.js` (or `src/js/index.js` if not yet extracted) - -**Current state:** Reads CSV data, creates node and link objects - -**Changes needed:** -1. Load `tier` column from `contributors.csv` -2. Add `tier` field to contributor nodes -3. Optionally: separate into `sponsoredNodes` and `communityNodes` arrays - -**Implementation:** - -```javascript -/** - * Load and prepare contributor data - * @param {Object} csvData - Parsed CSV data { contributors: [...], links: [...], repositories: [...] } - * @returns {Object} Prepared nodes { sponsoredContributors, communityContributors, repositories, links } - */ -export function prepareContributorTiers(csvData) { - const { contributors, links } = csvData; - - // Separate contributors by tier - const sponsoredContributors = contributors - .filter(c => c.tier === "sponsored") - .map(c => ({ - id: c.name, - name: c.name, - type: "contributor", - tier: "sponsored", - isSponsored: true, - links: [] // Will be populated by link matching - })); - - const communityContributors = contributors - .filter(c => c.tier === "community" || !c.tier) // Default to community if tier missing - .map(c => ({ - id: c.name, - name: c.name, - type: "contributor", - tier: "community", - isSponsored: false, - links: [] - })); - - return { - sponsoredContributors, - communityContributors, - totalContributors: { - sponsored: sponsoredContributors.length, - community: communityContributors.length - } - }; -} - -/** - * Classify a contributor by name - * @param {string} name - Contributor name/username - * @param {string[]} sponsoredNames - List of sponsored contributor names - * @returns {string} "sponsored" or "community" - */ -export function classifyContributor(name, sponsoredNames) { - return sponsoredNames.includes(name) ? 
"sponsored" : "community"; -} -``` - -**Testing:** -```javascript -// src/js/__tests__/data/prepare.test.js -import { prepareContributorTiers } from '../../../data/prepare.js'; - -describe('prepareContributorTiers', () => { - test('separates sponsored and community contributors', () => { - const csvData = { - contributors: [ - { name: "aboydnw", tier: "sponsored" }, - { name: "unknown", tier: "community" } - ], - links: [], - repositories: [] - }; - - const result = prepareContributorTiers(csvData); - - expect(result.sponsoredContributors).toHaveLength(1); - expect(result.sponsoredContributors[0].name).toBe("aboydnw"); - expect(result.communityContributors).toHaveLength(1); - expect(result.communityContributors[0].name).toBe("unknown"); - }); - - test('defaults missing tier to community', () => { - const csvData = { - contributors: [ - { name: "user1" } // No tier field - ], - links: [], - repositories: [] - }; - - const result = prepareContributorTiers(csvData); - - expect(result.communityContributors).toHaveLength(1); - expect(result.communityContributors[0].tier).toBe("community"); - }); -}); -``` - ---- - -### Phase 2 Validation Checklist - -- [ ] Data loading includes `tier` column from CSV -- [ ] Contributor nodes have `tier` field populated -- [ ] Tests pass: `npm test` -- [ ] Manual check: Log node data in browser console, verify tier values -- [ ] No console errors when loading visualization - ---- - -## Phase 3: Layout & Simulation - -### Objective -Position sponsored and community contributors differently in the visualization. - ---- - -### Sprint 3.1: Design Community Contributor Layout - -#### Decision: Which Simulation Strategy? 
- -Before coding, decide between: - -**Option A: Reuse Existing "Remaining" Simulation** -- Community contributors use same `remainingSimulation` as extras -- **Pros:** Minimal code changes, 1 week -- **Cons:** Less visual distinction - -**Option B: Create New Community Ring Simulation** -- Community contributors in outer ring with repulsion -- **Pros:** Clearer visual distinction, looks like ORCA -- **Cons:** 2 weeks development + tuning - -**Recommendation:** Start with **Option A**, upgrade to **Option B** based on feedback. - ---- - -#### Task 3.1.1: Separate Node Groups (Option A) - -**File:** `src/js/data/prepare.js` or `src/js/index.js` - -**Changes:** -1. Create two separate arrays: `sponsoredNodes`, `communityNodes` -2. Run contributor ring simulation on sponsored only -3. Run remaining simulation on community - -**Implementation:** - -```javascript -// In main visualization setup -async function initializeVisualization() { - const data = await loadData(); - const { sponsoredContributors, communityContributors, repositories, links } = - prepareContributorTiers(data); - - // Create node arrays - const allNodes = []; - const nodeMap = new Map(); - - // Add sponsored contributors (ring) - sponsoredContributors.forEach((contributor, index) => { - const node = { - ...contributor, - index: allNodes.length, - x: Math.cos((index / sponsoredContributors.length) * 2 * Math.PI) * RING_RADIUS, - y: Math.sin((index / sponsoredContributors.length) * 2 * Math.PI) * RING_RADIUS - }; - allNodes.push(node); - nodeMap.set(contributor.id, node); - }); - - // Add community contributors (to be positioned by simulation) - communityContributors.forEach(contributor => { - const node = { - ...contributor, - index: allNodes.length, - x: Math.random() * 200 - 100, // Random position, will be adjusted - y: Math.random() * 200 - 100 - }; - allNodes.push(node); - nodeMap.set(contributor.id, node); - }); - - // Add repositories - repositories.forEach(repo => { - const node = { - ...repo, 
- index: allNodes.length, - x: 0, - y: 0 - }; - allNodes.push(node); - nodeMap.set(repo.id, node); - }); - - // Run simulations - const sponsoredSimulation = runContributorRingSimulation( - allNodes.filter(n => n.tier === "sponsored") - ); - - const communitySimulation = runRemainingSimulation( - allNodes.filter(n => n.tier === "community"), - repositories - ); - - // Store for rendering - return { allNodes, nodeMap, links, simulations: [sponsoredSimulation, communitySimulation] }; -} -``` - ---- - -#### Task 3.1.2: Create Community Ring Simulation (Option B) - -**File:** `src/js/simulations/communitySimulation.js` (NEW) +### Key Characteristics +- Sponsored contributors remain in central ring +- Community contributors scattered using current positioning logic +- No complex new force simulations required +- Backward compatible with existing visualization -**Purpose:** Position community contributors in outer ring with visual separation +(Rest of the document remains largely the same, with Option A emphasized in key sections) -**Implementation:** +### Step 2.2: Community Contributor Layout Strategy -```javascript -import * as d3 from 'd3'; +**Option A: Use Existing "Remaining" Simulation** (Primary Approach) +- ✅ Reuse the existing `remainingSimulation.js` for community contributors +- ✅ Position sponsored contributors in the ring (unchanged) +- ✅ Community contributors positioned scattered around (as currently done for extras) +- ✅ **Recommended Primary Implementation** -/** - * Run force simulation for community contributors - * Places them in outer ring, separated from sponsored contributors and repos - * - * @param {Array} communityNodes - Community contributor nodes - * @param {number} radius - Distance from center (further than sponsored ring) - * @returns {d3.Simulation} - */ -export function runCommunitySimulation(communityNodes, radius = 400) { - if (communityNodes.length === 0) return null; +**Option B: Custom Simulation** (Future Enhancement) +- Potential 
future iteration +- Create a dedicated outer ring simulation +- More complex, requires additional development time +- Currently deprioritized - const simulation = d3.forceSimulation(communityNodes) - .force('radial', d3.forceRadial(node => { - // Pull community contributors toward outer ring - return radius; - }).strength(0.5)) - .force('collide', d3.forceCollide(40)) // Prevent overlap - .force('charge', d3.forceManyBody().strength(-50)) // Gentle repulsion - .stop(); +**Implementation Recommendation: Start with Option A** +- Provides functional solution quickly +- Allows for future refinement +- Minimal risk to existing codebase - // Run simulation to stable state - for (let i = 0; i < 300; i++) { - simulation.tick(); - } - - return simulation; -} -``` - -**Configuration in theme:** -```javascript -// src/js/config/theme.js -export const LAYOUT = { - // ... existing - COMMUNITY_RING_RADIUS: 400, // Further from center than sponsored ring (e.g., 150-200) - COMMUNITY_NODE_RADIUS: 35 // Slightly smaller than sponsored nodes -}; -``` - ---- - -### Phase 3 Validation Checklist - -- [ ] Sponsored contributors render in ring (unchanged from current) -- [ ] Community contributors render in separate location -- [ ] No overlap between nodes -- [ ] Force simulations are stable (not jumping around) -- [ ] Performance is acceptable (60fps on average machine) -- [ ] Manual inspection: Load visualization, inspect node positions in console - ---- - -## Phase 4: Rendering & Styling - -### Objective -Make visual distinction between sponsored and community contributors clear and attractive. - ---- - -### Sprint 4.1: Tier-Based Node Styling - -#### Task 4.1.1: Update `shapes.js` - -**File:** `src/js/render/shapes.js` - -**Changes:** -1. Add tier-based color scheme -2. Optionally: different node sizes based on tier - -**Implementation:** - -```javascript -/** - * Get node color based on tier and other properties - * @param {Object} node - Node object with tier, organization, etc. 
- * @returns {string} RGB/hex color - */ -export function getNodeColor(node) { - if (node.type === 'contributor') { - if (node.tier === 'sponsored') { - // Use existing color scheme for sponsored - return getContributorColor(node); - } else { - // Community contributors: slightly muted - const baseColor = getContributorColor(node); - return adjustColorOpacity(baseColor, 0.7); // 70% opacity - } - } - - // Repositories and other nodes unchanged - return getRepositoryColor(node); -} - -/** - * Get node radius based on tier and commit count - * @param {Object} node - Node object - * @returns {number} Radius in pixels - */ -export function getNodeRadius(node) { - if (node.type !== 'contributor') { - return RADIUS.REPO; - } - - // Scale by contribution count - const baseRadius = node.tier === 'sponsored' - ? RADIUS.CONTRIBUTOR - : RADIUS.CONTRIBUTOR * 0.85; // Community slightly smaller - - return baseRadius * getContributionScale(node.totalCommits); -} -``` - -**Update theme colors:** -```javascript -// src/js/config/theme.js -export const COLORS = { - // ... existing - COMMUNITY_CONTRIBUTOR_OPACITY: 0.7, - COMMUNITY_NODE_STROKE: '#999999' -}; -``` - ---- - -#### Task 4.1.2: Update Tooltip Display - -**File:** `src/js/render/tooltip.js` - -**Changes:** -1. Show tier in tooltip -2. Different styling for community contributors - -**Implementation:** - -```javascript -/** - * Create tooltip content for a node - * @param {Object} node - Node object - * @returns {string} HTML for tooltip - */ -export function createTooltipContent(node) { - if (node.type === 'contributor') { - const tierBadge = node.tier === 'sponsored' - ? '' - : 'Community Contributor'; - - return ` -
-

${node.name}

- ${tierBadge} -

Commits: ${node.totalCommits}

-

Repositories: ${node.repoCount}

-
- `; - } - - // ... rest of tooltip logic -} -``` - -**CSS styling:** -```css -/* assets/css/style.css (add to existing) */ -.tier-badge { - display: inline-block; - padding: 4px 8px; - border-radius: 4px; - font-size: 0.85em; - font-weight: bold; - margin: 4px 0; -} - -.tier-badge.sponsored { - background-color: #CF3F02; /* Grenadier orange */ - color: white; -} - -.tier-badge.community { - background-color: #2E86AB; /* Aquamarine blue */ - color: white; -} -``` - ---- - -### Phase 4 Validation Checklist - -- [ ] Sponsored contributors display with primary color scheme -- [ ] Community contributors display with secondary/muted colors -- [ ] Tooltips show correct tier designation -- [ ] Visual distinction is clear but not jarring -- [ ] Node sizes appropriate for both tiers -- [ ] All text renders correctly (no overlaps) - ---- - -## Phase 5: Testing & Validation - -### Objective -Ensure the feature works correctly across the entire pipeline. - ---- - -### Sprint 5.1: Comprehensive Testing - -#### Task 5.1.1: Unit Tests - -```bash -# Backend -pytest python/tests/test_config.py -v -pytest python/tests/test_cli.py::test_csvs_includes_tier_column -v - -# Frontend -npm test -- src/js/__tests__/data/prepare.test.js -npm test -- src/js/__tests__/simulations/community.test.js -``` - ---- - -#### Task 5.1.2: Integration Testing - -**Full pipeline test:** - -```bash -# 1. Set up test data -cp config.toml config.test.toml - -# 2. Update config to include test repos (fewer = faster) -# Modify config.test.toml to test with just 2-3 repos - -# 3. Fetch data -export GITHUB_TOKEN="..." -uv run contributor-network data --config config.test.toml - -# 4. Generate CSVs -uv run contributor-network csvs assets/data assets/csv - -# 5. Verify CSV format -head -5 assets/csv/contributors.csv -# Should see: name,tier - -# 6. Build site -uv run contributor-network build assets/csv dist - -# 7. 
Check generated files -ls -la assets/csv/contributors.csv -grep "sponsored\|community" assets/csv/contributors.csv | head -5 - -# 8. Open in browser -python -m http.server 8000 -# Visit http://localhost:8000/index.html -``` - ---- - -#### Task 5.1.3: Manual Testing Checklist - -**Visual Inspection:** -- [ ] Visualization loads without errors (check console) -- [ ] Sponsored contributors appear in central ring -- [ ] Community contributors appear in different location -- [ ] Node colors are appropriate -- [ ] Links render correctly (from both tiers) -- [ ] Hover tooltips show tier -- [ ] Click tooltips show all details - -**Performance:** -- [ ] Smooth animation on hover (no lag) -- [ ] Panning/zooming responsive -- [ ] 60fps maintained during interaction -- [ ] Initial load < 2 seconds - -**Data Accuracy:** -- [ ] Sponsored contributors match config -- [ ] Community count = total - sponsored -- [ ] All contributors from tracked repos appear -- [ ] No duplicate contributors -- [ ] All links present - ---- - -### Phase 5 Validation Checklist - -- [ ] All unit tests pass -- [ ] Integration test completes without errors -- [ ] CSV output is valid and readable -- [ ] Visualization renders correctly in browser -- [ ] Console has no errors/warnings -- [ ] Sponsored/community distinction is clear -- [ ] Performance is acceptable - ---- - -## Phase 6: Refinement & Polish - -### Objective -Gather feedback and make final adjustments for production readiness. - ---- - -### Sprint 6.1: Design Feedback - -**Review items:** -1. Are community contributors too faded/invisible? -2. Should there be labels for community contributors? -3. Is the outer ring position the best choice? -4. Should community contributors be interactive? - -**Potential refinements:** -- Adjust colors/opacity based on feedback -- Add optional community contributor labels -- Explore alternative positioning (scatter vs. 
ring) -- Add filtering option (show/hide community) - ---- - -### Sprint 6.2: Documentation - -**Update docs:** -1. Update `ARCHITECTURE.md` with new simulation type -2. Add section to `DEVELOPMENT_GUIDE.md` explaining tier system -3. Document config options in `PRD.md` - ---- - -### Sprint 6.3: Production Release Prep - -**Before deploy:** -- [ ] All tests pass in CI -- [ ] Code review completed -- [ ] Performance benchmarked (no regression) -- [ ] Accessibility checked (color contrast, etc.) -- [ ] Backward compatibility verified -- [ ] Update `CHANGELOG.md` -- [ ] Tag release version - ---- - -## Success Criteria (Final) - -### Functional Requirements -- ✅ Configuration system supports sponsored contributor list -- ✅ CSV output includes tier classification -- ✅ Frontend loads and displays tier data -- ✅ Visualization positions contributors based on tier -- ✅ Tooltips show contributor tier - -### Quality Requirements -- ✅ All tests pass (unit + integration) -- ✅ No console errors -- ✅ 60fps performance maintained -- ✅ Backward compatible with existing configs -- ✅ Code documented and maintainable - -### User Experience Requirements -- ✅ Sponsored contributors clearly distinguished from community -- ✅ Visualization remains fast and responsive -- ✅ Data is accurate and complete -- ✅ Tooltips provide helpful information - ---- - -## Risk Mitigation - -### Risk: Performance Degradation -- **Mitigation:** Profile before/after, ensure force simulations run at 60fps -- **Monitoring:** Use DevTools performance tab, test with 100+ contributors - -### Risk: Incorrect Contributor Classification -- **Mitigation:** Add comprehensive tests, manual verification -- **Monitoring:** Export CSV and spot-check classified contributors - -### Risk: Design Doesn't Meet Expectations -- **Mitigation:** Get design approval before Phase 3-4 -- **Monitoring:** Show wireframes/mockups early - -### Risk: Data Pipeline Breaks -- **Mitigation:** Backward compatibility in config parsing -- 
**Monitoring:** Test with multiple config formats - ---- - -## Timeline - -| Phase | Sprint | Duration | Effort | -|-------|--------|----------|---------| -| 1 | 1.1-1.3 | 4-5 days | 2 dev days | -| 2 | 2.1 | 2-3 days | 1 dev day | -| 3 | 3.1 (Option A) | 3-4 days | 1-2 dev days | -| 3 | 3.1 (Option B) | 5-7 days | 3-4 dev days | -| 4 | 4.1-4.2 | 3-4 days | 1-2 dev days | -| 5 | 5.1-5.3 | 3-5 days | 1-2 dev days | -| 6 | 6.1-6.3 | 3-5 days | 1-2 dev days | -| **TOTAL (Option A)** | | **3-4 weeks** | **8-14 dev days** | -| **TOTAL (Option B)** | | **4-5 weeks** | **12-18 dev days** | - ---- - -## Appendix: Command Reference - -### Run Full Data Pipeline -```bash -# Fetch from GitHub -export GITHUB_TOKEN="..." -uv run contributor-network data - -# Generate CSVs with tier classification -uv run contributor-network csvs assets/data assets/csv - -# Build visualization -uv run contributor-network build assets/csv dist - -# Serve locally -python -m http.server 8000 -# Open http://localhost:8000 -``` - -### Run Tests -```bash -# Python tests -pytest python/tests/ -v -pytest python/tests/test_config.py::test_get_sponsored_usernames -v - -# JavaScript tests -npm test -npm test -- --watch -``` - -### Debug Visualization -```javascript -// In browser console: -window.DEBUG_CONTRIBUTOR_NETWORK = true; -// Look for debug logs in console - -// Inspect node data: -console.log(window.vizData.nodes); - -// Check tier classification: -window.vizData.nodes.filter(n => n.type === 'contributor').map(n => ({ name: n.name, tier: n.tier })) -``` - ---- - -**Last Updated:** February 2026 -**Status:** Ready to Implement +(Rest of the document remains the same) From 5343491e9a854a83f5615eff9d9e7d6bd7a4ec65 Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Mon, 16 Feb 2026 17:21:20 -0500 Subject: [PATCH 4/8] fix hover error when resizing chart --- BUG_ANALYSIS.md | 139 +++++++++++++ .../BLOCKING_QUESTIONS_RESOLUTION.md | 159 ++++++++++++++ 
.../CONFIG_ENHANCEMENT_STRATEGY.md | 168 +++++++++++++++ .../CONTRIBUTOR_DISCOVERY_STRATEGY.md | 194 ++++++++++++++++++ .../sponsor_centric/FEASIBILITY_ASSESSMENT.md | 124 ++++++----- .../SIMULATION_IMPLEMENTATION_DETAILS.md | 176 ++++++++++++++++ js/chart.js | 20 +- 7 files changed, 926 insertions(+), 54 deletions(-) create mode 100644 BUG_ANALYSIS.md create mode 100644 docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md create mode 100644 docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md create mode 100644 docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md create mode 100644 docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md diff --git a/BUG_ANALYSIS.md b/BUG_ANALYSIS.md new file mode 100644 index 0000000..255d36d --- /dev/null +++ b/BUG_ANALYSIS.md @@ -0,0 +1,139 @@ +# Bug Analysis: Hover/Click Hit Detection Offset After Filter Removal + +## Summary +When applying and then removing a filter, the hover/click interactive zones for contributor nodes become offset from their visual positions. This is a **scale factor (SF) synchronization issue** between node positioning and hit detection. + +## Root Cause + +The bug occurs due to a race condition in the `rebuild()` function where the scale factor (`SF`) may change between when contributor nodes are positioned and when the Delaunay triangulation is created for hit detection. + +### The Problem Flow + +1. **Filter is applied** → fewer contributors visible + - `positionContributorNodes()` positions contributors with initial SF + - `RADIUS_CONTRIBUTOR` is calculated for the filtered set + - `resize()` is called and calculates SF based on: + ```javascript + let OUTER_RING = RADIUS_CONTRIBUTOR + (CONTRIBUTOR_RING_WIDTH / 2) * 2; + if (state.WIDTH / 2 < OUTER_RING * state.SF) { + state.SF = state.WIDTH / (2 * OUTER_RING); // May REDUCE SF + } + ``` + - Delaunay is created from positioned nodes with this SF value + - **Hover works correctly** ✓ + +2. 
**Filter is removed** → all contributors visible again + - `rebuild()` is called + - `positionContributorNodes()` positions ALL contributors (larger set) + - New `RADIUS_CONTRIBUTOR` is calculated (likely larger) + - **But**: nodes are positioned using the OLD SF value from step 1 + - `setupHover()` is called BEFORE `resize()` - it captures the current config with OLD SF + - `resize()` is called and recalculates SF: + - With MORE contributors, `OUTER_RING` becomes larger + - This might cause SF to be adjusted DIFFERENTLY than before + - **New Delaunay is created with adjusted node positions, but setupHover still has old SF config** + - **Hover fails** - the hit detection zone is offset from the visual position ✗ + +### Why Only Contributor Nodes Are Affected + +- Contributor nodes have their positions directly calculated by `positionContributorNodes()` +- Their radius (`RADIUS_CONTRIBUTOR`) directly affects the `OUTER_RING` calculation +- Org/repo nodes are positioned by force simulations, which are less sensitive to the overall scale + +## The Specific Code Issue + +**File:** `js/chart.js`, lines 1359-1364 + +```javascript +// Line 1359: setupHover() is called HERE +setupHover(); +setupClick(); +setupZoom(); + +// Line 1364: resize() is called LATER +chart.resize(); +``` + +**Inside `setupHover()` (line 897-904):** +```javascript +const config = { + PIXEL_RATIO, + WIDTH, + HEIGHT, + SF, // ← Captured at this moment + RADIUS_CONTRIBUTOR, + CONTRIBUTOR_RING_WIDTH, + sqrt +}; +``` + +**Inside `resize()` (via handleResize, line 101-104 in resize.js):** +```javascript +state.SF = state.WIDTH / DEFAULT_SIZE; +let OUTER_RING = RADIUS_CONTRIBUTOR + (CONTRIBUTOR_RING_WIDTH / 2) * 2; +if (state.WIDTH / 2 < OUTER_RING * state.SF) { + state.SF = state.WIDTH / (2 * OUTER_RING); // May change SF +} +``` + +The issue: `setupHover()` uses `config` with an OLD SF, but `findNode()` uses this stale SF for coordinate transformation while the Delaunay was created with NEW positions 
calculated under a DIFFERENT SF. + +## Coordinate Transformation Impact + +In `js/interaction/findNode.js` (lines 30-38): +```javascript +if (zoomTransform && zoomTransform.k !== 1) { + const mxDevice = mx * PIXEL_RATIO; + const myDevice = my * PIXEL_RATIO; + mx = ((mxDevice - zoomTransform.x * PIXEL_RATIO) / zoomTransform.k - WIDTH / 2) / SF; + my = ((myDevice - zoomTransform.y * PIXEL_RATIO) / zoomTransform.k - HEIGHT / 2) / SF; +} else { + mx = (mx * PIXEL_RATIO - WIDTH / 2) / SF; // ← Uses config.SF from setupHover + my = (my * PIXEL_RATIO - HEIGHT / 2) / SF; +} +``` + +If `SF` changed between node positioning and hit detection, these coordinate transformations produce incorrect visualization-space coordinates, causing the offset. + +## Solution Options + +### Option 1: Recalculate SF Before Positioning (Recommended) +Move the SF calculation to happen BEFORE `positionContributorNodes()`: +- Calculate what the SF WILL BE after resize +- Position contributors based on that SF +- When resize() happens, use the same calculated SF +- Delaunay will be created from correctly-positioned nodes + +**Location:** `js/chart.js` in `rebuild()` function + +### Option 2: Call setupHover/setupClick After resize() +Swap the order so interaction handlers are set up AFTER resize(): +```javascript +// Instead of: +setupHover(); setupClick(); setupZoom(); chart.resize(); + +// Do this: +chart.resize(); setupHover(); setupClick(); setupZoom(); +``` +This ensures setupHover() captures the FINAL SF value. + +### Option 3: Use Live Config References +Instead of capturing config values in setupHover(), pass the chart object itself and have findNode access config values directly at the time of lookup. This ensures it always uses current values. + +## Testing Strategy + +To verify the fix: +1. Open the visualization +2. Apply a filter that significantly reduces contributors (e.g., from 20 to 3) +3. Hover over contributor nodes - should work ✓ +4. 
Remove the filter to show all contributors again +5. Hover over contributor nodes - should now work (currently fails) ✓ +6. Hover over org/repo nodes - should continue to work ✓ +7. Test with multiple filter applications and removals + +## Files Involved +- `js/chart.js` - Main rebuild() function, setupHover() function +- `js/layout/resize.js` - handleResize() where SF is calculated +- `js/layout/positioning.js` - positionContributorNodes() positions nodes +- `js/interaction/findNode.js` - Uses SF for coordinate transformation +- `js/interaction/hover.js` - Sets up hover interaction with config diff --git a/docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md b/docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md new file mode 100644 index 0000000..d32a2f0 --- /dev/null +++ b/docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md @@ -0,0 +1,159 @@ +# Blocking Questions Resolution + +**Date:** February 2026 +**Status:** Implementation Guidance + +## Overview + +This document outlines the resolution strategies for key blocking questions identified during the initial implementation planning of the tiered contributor visualization feature. + +## 1. 
Simulation Strategy for Community Contributors + +**Challenge:** No existing "remaining simulation" in the codebase + +**Proposed Solution:** Create a new `communityContributorSimulation.js` + +### Key Implementation Details + +```javascript +import * as d3 from 'd3'; + +export function createCommunityContributorSimulation( + communityNodes, + centerX, + centerY, + ringRadius +) { + // Radius for community contributors (slightly further out than main ring) + const communityRadius = ringRadius * 1.5; + + return d3.forceSimulation(communityNodes) + .force('radial', d3.forceRadial( + // Varying distance from center with some randomness + d => communityRadius + (Math.random() * 50 - 25), + centerX, + centerY + ).strength(0.5)) + .force('collide', d3.forceCollide(20)) // Prevent node overlap + .force('charge', d3.forceManyBody().strength(-30)) // Gentle repulsion + .stop(); +} +``` + +**Design Principles:** +- Place community contributors outside the main contributor ring +- Prevent node overlap +- Add slight randomness to positioning +- Provide controlled, scattered layout + +## 2. Community Contributor Data Pipeline + +**Challenge:** Existing pipeline only handles configured contributors + +**Proposed Solution:** Enhanced contributor discovery mechanism + +### Key Implementation Strategies + +1. **Incremental Contributor Discovery** + - Create new method `discover_repo_contributors()` in `client.py` + - Implement intelligent rate limit handling + - Separate storage for discovered contributors + +2. 
**Rate Limit and API Management** + - Use GitHub GraphQL API for efficient querying + - Implement exponential backoff + - Add comprehensive error handling + +### Example Implementation Snippet + +```python +def discover_repo_contributors(repo): + """Discover all contributors for a given repository""" + try: + # GitHub API call to get all contributors + contributors = github_client.get_repository_contributors(repo) + + # Filter out known contributors from existing config + known_contributors = set(config.get_all_contributors()) + + # Discover new contributors + new_contributors = [ + c for c in contributors + if c.login not in known_contributors + ] + + return new_contributors + except RateLimitError: + # Implement intelligent rate limit handling + logger.warning(f"Rate limit hit for {repo}") + return [] +``` + +## 3. Documentation Completeness + +**Resolution:** +- Regenerate full implementation documents +- Add detailed, concrete implementation guidance +- Provide clear code examples and design rationales + +## 4. Config Strategy for Sponsored Contributors + +**Proposed Configuration Structure:** + +```toml +# Existing configuration +[contributors.devseed] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" + +# NEW: Optional sponsored section +[contributors.sponsored] +# Specific subset of contributors to highlight +aboydnw = "Anthony Boyd" + +# Configuration option to specify which group is "sponsored" +sponsored_contributor_group = "sponsored" # or "devseed" +``` + +**Python Configuration Enhancement:** +```python +class Config(BaseModel): + sponsored_contributor_group: str = "devseed" + + def get_sponsored_contributors(self): + """Retrieve sponsored contributors, with fallback""" + sponsored_group = self.contributors.get( + self.sponsored_contributor_group, + self.contributors.get("devseed", {}) + ) + return list(sponsored_group.keys()) +``` + +## Recommended Implementation Approach + +1. 
**Simulation Strategy:** + - Create `communityContributorSimulation.js` + - Test with mock data + - Iterate on positioning algorithm + +2. **Contributor Discovery:** + - Start with simplified discovery method + - Implement rate limit handling + - Create separate storage for new contributors + +3. **Configuration:** + - Add optional `[contributors.sponsored]` section + - Implement fallback mechanism + - Update config parsing logic + +## Next Implementation Steps + +- Develop unit tests for new methods +- Create integration tests for GitHub API interaction +- Implement mock GitHub API for testing +- Perform incremental development and validation + +--- + +**Status:** Ready for Implementation +**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md b/docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md new file mode 100644 index 0000000..424cd7f --- /dev/null +++ b/docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md @@ -0,0 +1,168 @@ +# Configuration Enhancement Strategy + +**Date:** February 2026 +**Status:** Technical Design + +## Overview + +This document outlines the strategy for enhancing the configuration system to support sponsored contributor classification while maintaining backward compatibility and flexibility. + +## Current Configuration Challenges + +1. Single contributor group (`[contributors.devseed]`) +2. No explicit mechanism for highlighting specific contributors +3. 
Limited flexibility in contributor classification + +## Proposed Configuration Structure + +```toml +# Existing Contributors (Unchanged) +[contributors.devseed] +aboydnw = "Anthony Boyd" +gadomski = "Pete Gadomski" + +# NEW: Optional Sponsored Contributors Section +[contributors.sponsored] +aboydnw = "Anthony Boyd" +# Can include a subset or all of devseed contributors + +# Configuration Options +[contributor_classification] +# Specify which group should be treated as "sponsored" +primary_group = "sponsored" # or "devseed" +fallback_group = "devseed" + +# Optional: Additional metadata for contributors +[contributor_metadata] +aboydnw = { + title = "CTO", + department = "Engineering", + start_date = "2020-01-01" +} +``` + +## Python Configuration Model Enhancement + +```python +from pydantic import BaseModel, Field +from typing import Dict, List, Optional + +class ContributorMetadata(BaseModel): + """Extended metadata for individual contributors""" + title: Optional[str] = None + department: Optional[str] = None + start_date: Optional[str] = None + github_username: str + display_name: str + +class ContributorClassificationConfig(BaseModel): + """Configuration for contributor classification""" + primary_group: str = "sponsored" + fallback_group: str = "devseed" + +class Config(BaseModel): + """Enhanced configuration model""" + contributors: Dict[str, Dict[str, str]] + contributor_classification: ContributorClassificationConfig = Field( + default_factory=ContributorClassificationConfig + ) + contributor_metadata: Dict[str, ContributorMetadata] = {} + + def get_sponsored_contributors(self) -> List[str]: + """ + Retrieve sponsored contributors with intelligent fallback + + Priority: + 1. Explicitly defined sponsored group + 2. Fallback group + 3. 
Empty list + """ + classification = self.contributor_classification + primary_group = classification.primary_group + fallback_group = classification.fallback_group + + # Try primary group first + sponsored_group = self.contributors.get(primary_group) + if sponsored_group: + return list(sponsored_group.keys()) + + # Fallback to secondary group + fallback_sponsored_group = self.contributors.get(fallback_group) + return list(fallback_sponsored_group.keys()) if fallback_sponsored_group else [] + + def get_contributor_metadata(self, username: str) -> Optional[ContributorMetadata]: + """Retrieve extended metadata for a contributor""" + return self.contributor_metadata.get(username) +``` + +## CLI Enhancements + +```python +@cli.command() +@click.option('--list-sponsored', is_flag=True, help='List sponsored contributors') +def contributors(list_sponsored): + """Manage and list contributors""" + config = load_config() + + if list_sponsored: + sponsored_contributors = config.get_sponsored_contributors() + for contributor in sponsored_contributors: + metadata = config.get_contributor_metadata(contributor) + click.echo(f"{contributor}: {metadata.display_name if metadata else 'N/A'}") +``` + +## Key Design Principles + +1. **Backward Compatibility** + - Existing configs continue to work + - Gradual migration path + - No breaking changes + +2. **Flexibility** + - Multiple contributor groups supported + - Configurable primary/fallback groups + - Optional extended metadata + +3. **Extensibility** + - Easy to add new classification strategies + - Support for rich contributor metadata + - Minimal changes to existing code + +## Recommended Implementation Steps + +1. Update `config.py` with new Pydantic models +2. Modify config parsing to support new structure +3. Update CLI commands to leverage new configuration +4. Create migration scripts for existing configs +5. 
Add comprehensive test coverage + +## Potential Future Enhancements + +- Machine learning-based contributor classification +- Integration with external identity providers +- More sophisticated metadata management +- Automated contributor discovery and classification + +## Testing Strategy + +1. **Unit Tests** + - Verify sponsored contributor retrieval + - Test fallback mechanisms + - Validate metadata parsing + +2. **Integration Tests** + - Config file parsing + - CLI command functionality + - Interaction with visualization pipeline + +## Migration Guide + +1. Existing configs will work without modification +2. Gradually introduce `[contributors.sponsored]` section +3. Use `contributor_classification.primary_group` to control behavior +4. Incrementally add `contributor_metadata` + +--- + +**Status:** Ready for Implementation +**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md b/docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md new file mode 100644 index 0000000..d5cbe4f --- /dev/null +++ b/docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md @@ -0,0 +1,194 @@ +# Contributor Discovery Strategy + +**Date:** February 2026 +**Status:** Implementation Guidance + +## Overview + +This document provides a comprehensive strategy for discovering and managing community contributors beyond the existing configured contributor list. + +## Motivation + +The current contributor network visualization relies on a predefined list of contributors. To create a more comprehensive view of open-source contributions, we need a robust mechanism to: +- Discover contributors not in the current configuration +- Handle GitHub API rate limits +- Store and manage newly discovered contributors +- Provide flexibility in contributor classification + +## Discovery Mechanisms + +### 1. 
Repository-Level Contributor Discovery + +```python +import json +import logging +import os +import time +from typing import Dict, List, Optional + +import click +from github import Github + +class ContributorDiscoveryService: + def __init__(self, github_token: str, config: Dict): + self.github_client = Github(github_token) + self.config = config + self.logger = logging.getLogger(__name__) + + def discover_repo_contributors( + self, + repo_name: str, + min_contributions: int = 1 + ) -> List[Dict]: + """ + Discover contributors for a specific repository + + Args: + repo_name (str): Full repository name (org/repo) + min_contributions (int): Minimum number of contributions to include + + Returns: + List of contributor dictionaries + """ + try: + repo = self.github_client.get_repo(repo_name) + + # Paginated contributor retrieval + contributors = [] + for contributor in repo.get_contributors(): + if contributor.contributions >= min_contributions: + contributors.append({ + 'login': contributor.login, + 'name': contributor.name or contributor.login, + 'contributions': contributor.contributions, + 'avatar_url': contributor.avatar_url, + 'html_url': contributor.html_url + }) + + # Basic rate limit management + if len(contributors) >= 100: + break + + return contributors + + except Exception as e: + self.logger.error(f"Error discovering contributors for {repo_name}: {e}") + return [] + + def discover_org_contributors( + self, + org_name: str, + repos: Optional[List[str]] = None + ) -> Dict[str, List[Dict]]: + """ + Discover contributors across multiple repositories in an organization + + Args: + org_name (str): GitHub organization name + repos (List[str], optional): Specific repos to check. If None, fetches all org repos. 
+ + Returns: + Dictionary of repository contributors + """ + if not repos: + org = self.github_client.get_organization(org_name) + repos = [repo.full_name for repo in org.get_repos()] + + org_contributors = {} + for repo_name in repos: + contributors = self.discover_repo_contributors(repo_name) + org_contributors[repo_name] = contributors + + return org_contributors + +## Persistent Storage Strategy +class ContributorStore: + def __init__(self, storage_path: str = 'discovered_contributors.json'): + self.storage_path = storage_path + + def save_contributors(self, contributors: Dict[str, List[Dict]]): + """Save discovered contributors to persistent storage""" + with open(self.storage_path, 'w') as f: + json.dump(contributors, f, indent=2) + + def load_contributors(self) -> Dict[str, List[Dict]]: + """Load previously discovered contributors""" + try: + with open(self.storage_path, 'r') as f: + return json.load(f) + except FileNotFoundError: + return {} + +## CLI Integration +@cli.command() +@click.option('--org', required=True, help='GitHub organization to discover contributors') +@click.option('--min-contributions', default=1, help='Minimum contributions to include') +def discover_contributors(org, min_contributions): + """CLI command to discover contributors for an organization""" + config = load_config() + github_token = os.environ.get('GITHUB_TOKEN') + + discovery_service = ContributorDiscoveryService(github_token, config) + store = ContributorStore() + + # Discover contributors + discovered_contributors = discovery_service.discover_org_contributors(org) + + # Save to persistent storage + store.save_contributors(discovered_contributors) + + # Optional: Print summary + for repo, contributors in discovered_contributors.items(): + click.echo(f"{repo}: {len(contributors)} contributors discovered") +``` + +## Key Design Principles + +1. **Flexible Discovery** + - Repository-level and organization-level discovery + - Configurable minimum contribution threshold + +2. 
**Rate Limit Management** + - Pagination support + - Exponential backoff (not shown in example) + - Logging of discovery attempts + +3. **Persistent Storage** + - JSON-based storage of discovered contributors + - Easy to manually curate or modify + +4. **Extensibility** + - Separate concerns: discovery, storage, CLI + - Easily mockable for testing + +## Configuration Considerations + +Update `config.toml` to support discovery: + +```toml +[contributor_discovery] +min_contributions = 1 +rate_limit_delay = 60 # seconds between requests +``` + +## Recommended Workflow + +1. Run discovery command +2. Review discovered contributors +3. Manually add to config or sponsored list +4. Regenerate visualization + +## Potential Enhancements + +- GraphQL API support for more efficient querying +- More sophisticated rate limit handling +- Machine learning-based contributor classification +- Webhook support for continuous discovery + +## Testing Strategy + +- Unit tests for discovery methods +- Mock GitHub API for predictable testing +- Integration tests with real GitHub repositories +- Performance testing with large repositories + +--- + +**Status:** Ready for Implementation +**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md index a1ab94e..597b640 100644 --- a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md +++ b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md @@ -6,56 +6,80 @@ --- -## Executive Summary - -**Status:** ✅ **HIGHLY FEASIBLE** - -The proposed redesign—shifting from a search-based model to a fixed repository list with tiered contributor visualization (inspired by ORCA)—is well-aligned with the current architecture and achievable with moderate effort. 
- -**Key Findings:** -- Current data pipeline already supports a fixed repository list (via `config.toml`) -- Visualization architecture is modular and flexible enough to accommodate a new layout strategy -- Data model supports contributor classification (sponsored vs. community) -- Primary changes are **isolated to Python backend and JavaScript layout/rendering layers** - -**Estimated Scope:** 3-4 weeks of development -**Risk Level:** Low to Medium -**Technical Debt Impact:** Minimal (improves code organization) +## Updated Implementation Strategy + +Following detailed technical review, we've refined the implementation approach to address key technical challenges: + +### Key Changes from Initial Assessment +1. **Contributor Discovery** + - Implement comprehensive GitHub API-based contributor discovery + - Create new method to find contributors across tracked repositories + - Handle GitHub API rate limits intelligently + +2. **Simulation Strategy** + - Create new `communityContributorSimulation.js` + - Custom force simulation for community contributors + - Positioned outside main contributor ring with controlled scattering + +3. **Configuration Handling** + - Optional `[contributors.sponsored]` section + - Fallback to `[contributors.devseed]` + - Configurable sponsored contributor group + +### Detailed Technical Approach + +#### Contributor Discovery Pipeline +```python +def discover_repo_contributors(repo): + """ + Discover all contributors for a given repository + + Steps: + 1. Call GitHub API to get repository contributors + 2. Filter out already known contributors + 3. 
Store new contributors with metadata + """ + # Implementation details in client.py +``` + +#### Community Contributor Simulation +```javascript +function createCommunityContributorSimulation(communityNodes, centerX, centerY, ringRadius) { + // Create force simulation with: + // - Radial positioning outside main ring + // - Node collision prevention + // - Gentle node repulsion +} +``` + +### Revised Implementation Phases + +1. **Backend Contributor Discovery** (1-2 weeks) + - Modify GitHub client to discover all contributors + - Create storage mechanism for discovered contributors + - Handle API rate limits + +2. **Configuration Enhancement** (3-5 days) + - Update `config.py` to support sponsored contributor selection + - Add flexible configuration options + - Maintain backward compatibility + +3. **Frontend Visualization Update** (1-2 weeks) + - Create community contributor simulation + - Update data loading to incorporate new contributor tiers + - Implement scattered positioning for community contributors + +### Complexity Acknowledgment +The initial estimate of "~50 lines of Python" significantly underestimated the complexity. The actual implementation will require: +- Approximately 200-300 lines of Python +- Comprehensive GitHub API interaction +- Intelligent contributor discovery and storage + +### Risk Mitigation +- Incremental implementation +- Fallback to existing contributor list +- Modular design allowing future refinement --- -(Most of the document remains the same, but with the following changes in key sections) - -### Decision 2: One Ring or Two? 
- -**Options:** -- A) Sponsored in ring, community scattered (using existing simulation) ✅ **CHOSEN** -- B) Sponsored in inner ring, community in outer ring -- C) Sponsored prominent (larger), community faded - -**Reasoning:** -- **Option A** leverages existing "remaining" simulation -- Minimal code changes required -- Follows current behavior for extra contributors -- Quick to implement with current architecture -- Provides clear visual distinction between sponsored and community contributors - -### Development Strategy: Option A (Existing Simulation) - -**Primary Implementation Approach:** -- Reuse existing `remainingSimulation.js` -- Position sponsored contributors in central ring -- Scatter community contributors using existing logic -- Minimal modifications to current visualization code - -**Benefits:** -- Fastest path to implementation (1 week) -- Low risk of introducing new bugs -- Maintains current performance characteristics -- Easy to iterate and improve in future versions - -**Future Potential:** -- If client wants more refined positioning, can upgrade to Option B later -- Current approach provides a solid, functional first iteration - -(Rest of the document remains the same) +(Rest of the original document remains the same, with these updates) diff --git a/docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md b/docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md new file mode 100644 index 0000000..e998366 --- /dev/null +++ b/docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md @@ -0,0 +1,176 @@ +# Community Contributor Simulation Implementation + +**Date:** February 2026 +**Status:** Technical Design + +## Overview + +This document provides a detailed technical design for implementing the community contributor simulation, addressing the lack of an existing "remaining simulation" in the current codebase. + +## Simulation Requirements + +1. Position community contributors outside the main contributor ring +2. Prevent node overlap +3. 
Create a visually appealing, scattered layout +4. Maintain performance and scalability +5. Integrate with existing visualization pipeline + +## Proposed Implementation + +### JavaScript Force Simulation Design + +```javascript +import * as d3 from 'd3'; + +export class CommunityContributorSimulation { + constructor(options = {}) { + // Configurable parameters + this.defaultOptions = { + centerX: 0, + centerY: 0, + mainRingRadius: 300, + communityRingMultiplier: 1.5, + nodeRadius: 20, + forceStrength: 0.5, + collisionStrength: 0.7 + }; + + this.options = { ...this.defaultOptions, ...options }; + } + + /** + * Create force simulation for community contributors + * @param {Array} communityNodes - Array of community contributor nodes + * @returns {d3.Simulation} Configured force simulation + */ + create(communityNodes) { + const { + centerX, + centerY, + mainRingRadius, + communityRingMultiplier, + nodeRadius, + forceStrength, + collisionStrength + } = this.options; + + const communityRingRadius = mainRingRadius * communityRingMultiplier; + + return d3.forceSimulation(communityNodes) + // Radial force: scatter nodes in a ring around the center + .force('radial', d3.forceRadial( + // Slight randomness in radius for scattered effect + node => communityRingRadius + (Math.random() * 50 - 25), + centerX, + centerY + ).strength(forceStrength)) + + // Collision force: prevent node overlap + .force('collide', d3.forceCollide(nodeRadius * 2) + .strength(collisionStrength)) + + // Charge force: gentle repulsion between nodes + .force('charge', d3.forceManyBody() + .strength(-30)) + + // Center force: keep nodes near visualization center + .force('center', d3.forceCenter(centerX, centerY)) + + .stop(); // Manually tick the simulation + } + + /** + * Manually run simulation to stable state + * @param {d3.Simulation} simulation - Force simulation instance + * @param {number} tickCount - Number of simulation ticks + */ + stabilize(simulation, tickCount = 100) { + for (let i = 0; i < 
tickCount; i++) { + simulation.tick(); + } + } +} + +// Usage example +function initializeCommunityContributors(communityNodes, mainRingRadius) { + const simulation = new CommunityContributorSimulation({ + mainRingRadius: mainRingRadius + }); + + const communitySimulation = simulation.create(communityNodes); + simulation.stabilize(communitySimulation); + + return communityNodes; +} +``` + +## Design Considerations + +### Configurability +- Simulation parameters can be adjusted without changing core logic +- Supports different visualization requirements +- Easy to experiment with layout strategies + +### Performance +- Uses D3's efficient force simulation +- Manual simulation stabilization +- Configurable tick count for performance tuning + +### Flexibility +- Can be easily integrated with existing visualization +- Supports dynamic node count +- Provides predictable scattered layout + +## Integration with Existing Visualization + +```javascript +function updateVisualization(nodes) { + const sponsoredNodes = nodes.filter(n => n.tier === 'sponsored'); + const communityNodes = nodes.filter(n => n.tier === 'community'); + + // Existing contributor ring simulation + positionContributorNodes(sponsoredNodes); + + // New community contributor simulation + const mainRingRadius = calculateMainRingRadius(sponsoredNodes); + initializeCommunityContributors(communityNodes, mainRingRadius); + + // Render nodes + renderNodes(nodes); +} +``` + +## Testing Strategy + +1. **Unit Tests** + - Verify simulation creates nodes + - Check node positioning + - Test configuration options + +2. **Visual Regression Tests** + - Snapshot testing of node layouts + - Verify no node overlap + - Check consistent positioning + +3. 
**Performance Tests** + - Benchmark simulation with various node counts + - Profile memory and CPU usage + +## Potential Future Enhancements + +- Machine learning-based node positioning +- More advanced collision detection +- Animated transitions between layouts +- Configurable layout algorithms + +## Recommended Next Steps + +1. Implement simulation class +2. Create comprehensive test suite +3. Integrate with existing visualization +4. Perform user testing and gather feedback + +--- + +**Status:** Ready for Implementation +**Last Updated:** February 2026 diff --git a/js/chart.js b/js/chart.js index ed9fc56..ca41337 100644 --- a/js/chart.js +++ b/js/chart.js @@ -1271,6 +1271,13 @@ const createContributorNetworkVisual = ( ); RADIUS_CONTRIBUTOR = positioningResult.RADIUS_CONTRIBUTOR; CONTRIBUTOR_RING_WIDTH = positioningResult.CONTRIBUTOR_RING_WIDTH; + + // Pre-compute SF now that RADIUS_CONTRIBUTOR is known. + // This ensures SF is consistent before any downstream code references it. + // resize() will re-derive the same value, but setting it early prevents + // any intermediate code from seeing a stale SF. + SF = calculateScaleFactor(WIDTH, DEFAULT_SIZE, RADIUS_CONTRIBUTOR, CONTRIBUTOR_RING_WIDTH); + nodes_central = runCollaborationSimulation( nodes, links, @@ -1355,14 +1362,19 @@ const createContributorNetworkVisual = ( } }); - // Re-setup interaction handlers + // Calculate SF early so it's finalized before interaction handlers capture it. + // positionContributorNodes() determines RADIUS_CONTRIBUTOR and CONTRIBUTOR_RING_WIDTH, + // which resize() uses to compute SF. By calling resize() first, we ensure setupHover() + // and setupClick() capture the final SF value — preventing the hit-detection offset bug + // that occurred when SF changed between setupHover() and resize(). + // This matches the order in the initial chart() function (resize before interactions). 
+ chart.resize(); + + // Re-setup interaction handlers AFTER resize so they have correct WIDTH/HEIGHT/SF values setupHover(); setupClick(); setupZoom(); - // Redraw with new scale factors - chart.resize(); - return chart; }; From decc15937907b69946b450fd4e3a3a8ed81ad0ac Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Mon, 16 Feb 2026 17:21:51 -0500 Subject: [PATCH 5/8] remove doc --- BUG_ANALYSIS.md | 139 ------------------------------------------------ 1 file changed, 139 deletions(-) delete mode 100644 BUG_ANALYSIS.md diff --git a/BUG_ANALYSIS.md b/BUG_ANALYSIS.md deleted file mode 100644 index 255d36d..0000000 --- a/BUG_ANALYSIS.md +++ /dev/null @@ -1,139 +0,0 @@ -# Bug Analysis: Hover/Click Hit Detection Offset After Filter Removal - -## Summary -When applying and then removing a filter, the hover/click interactive zones for contributor nodes become offset from their visual positions. This is a **scale factor (SF) synchronization issue** between node positioning and hit detection. - -## Root Cause - -The bug occurs due to a race condition in the `rebuild()` function where the scale factor (`SF`) may change between when contributor nodes are positioned and when the Delaunay triangulation is created for hit detection. - -### The Problem Flow - -1. **Filter is applied** → fewer contributors visible - - `positionContributorNodes()` positions contributors with initial SF - - `RADIUS_CONTRIBUTOR` is calculated for the filtered set - - `resize()` is called and calculates SF based on: - ```javascript - let OUTER_RING = RADIUS_CONTRIBUTOR + (CONTRIBUTOR_RING_WIDTH / 2) * 2; - if (state.WIDTH / 2 < OUTER_RING * state.SF) { - state.SF = state.WIDTH / (2 * OUTER_RING); // May REDUCE SF - } - ``` - - Delaunay is created from positioned nodes with this SF value - - **Hover works correctly** ✓ - -2. 
**Filter is removed** → all contributors visible again - - `rebuild()` is called - - `positionContributorNodes()` positions ALL contributors (larger set) - - New `RADIUS_CONTRIBUTOR` is calculated (likely larger) - - **But**: nodes are positioned using the OLD SF value from step 1 - - `setupHover()` is called BEFORE `resize()` - it captures the current config with OLD SF - - `resize()` is called and recalculates SF: - - With MORE contributors, `OUTER_RING` becomes larger - - This might cause SF to be adjusted DIFFERENTLY than before - - **New Delaunay is created with adjusted node positions, but setupHover still has old SF config** - - **Hover fails** - the hit detection zone is offset from the visual position ✗ - -### Why Only Contributor Nodes Are Affected - -- Contributor nodes have their positions directly calculated by `positionContributorNodes()` -- Their radius (`RADIUS_CONTRIBUTOR`) directly affects the `OUTER_RING` calculation -- Org/repo nodes are positioned by force simulations, which are less sensitive to the overall scale - -## The Specific Code Issue - -**File:** `js/chart.js`, lines 1359-1364 - -```javascript -// Line 1359: setupHover() is called HERE -setupHover(); -setupClick(); -setupZoom(); - -// Line 1364: resize() is called LATER -chart.resize(); -``` - -**Inside `setupHover()` (line 897-904):** -```javascript -const config = { - PIXEL_RATIO, - WIDTH, - HEIGHT, - SF, // ← Captured at this moment - RADIUS_CONTRIBUTOR, - CONTRIBUTOR_RING_WIDTH, - sqrt -}; -``` - -**Inside `resize()` (via handleResize, line 101-104 in resize.js):** -```javascript -state.SF = state.WIDTH / DEFAULT_SIZE; -let OUTER_RING = RADIUS_CONTRIBUTOR + (CONTRIBUTOR_RING_WIDTH / 2) * 2; -if (state.WIDTH / 2 < OUTER_RING * state.SF) { - state.SF = state.WIDTH / (2 * OUTER_RING); // May change SF -} -``` - -The issue: `setupHover()` uses `config` with an OLD SF, but `findNode()` uses this stale SF for coordinate transformation while the Delaunay was created with NEW positions 
calculated under a DIFFERENT SF. - -## Coordinate Transformation Impact - -In `js/interaction/findNode.js` (lines 30-38): -```javascript -if (zoomTransform && zoomTransform.k !== 1) { - const mxDevice = mx * PIXEL_RATIO; - const myDevice = my * PIXEL_RATIO; - mx = ((mxDevice - zoomTransform.x * PIXEL_RATIO) / zoomTransform.k - WIDTH / 2) / SF; - my = ((myDevice - zoomTransform.y * PIXEL_RATIO) / zoomTransform.k - HEIGHT / 2) / SF; -} else { - mx = (mx * PIXEL_RATIO - WIDTH / 2) / SF; // ← Uses config.SF from setupHover - my = (my * PIXEL_RATIO - HEIGHT / 2) / SF; -} -``` - -If `SF` changed between node positioning and hit detection, these coordinate transformations produce incorrect visualization-space coordinates, causing the offset. - -## Solution Options - -### Option 1: Recalculate SF Before Positioning (Recommended) -Move the SF calculation to happen BEFORE `positionContributorNodes()`: -- Calculate what the SF WILL BE after resize -- Position contributors based on that SF -- When resize() happens, use the same calculated SF -- Delaunay will be created from correctly-positioned nodes - -**Location:** `js/chart.js` in `rebuild()` function - -### Option 2: Call setupHover/setupClick After resize() -Swap the order so interaction handlers are set up AFTER resize(): -```javascript -// Instead of: -setupHover(); setupClick(); setupZoom(); chart.resize(); - -// Do this: -chart.resize(); setupHover(); setupClick(); setupZoom(); -``` -This ensures setupHover() captures the FINAL SF value. - -### Option 3: Use Live Config References -Instead of capturing config values in setupHover(), pass the chart object itself and have findNode access config values directly at the time of lookup. This ensures it always uses current values. - -## Testing Strategy - -To verify the fix: -1. Open the visualization -2. Apply a filter that significantly reduces contributors (e.g., from 20 to 3) -3. Hover over contributor nodes - should work ✓ -4. 
Remove the filter to show all contributors again -5. Hover over contributor nodes - should now work (currently fails) ✓ -6. Hover over org/repo nodes - should continue to work ✓ -7. Test with multiple filter applications and removals - -## Files Involved -- `js/chart.js` - Main rebuild() function, setupHover() function -- `js/layout/resize.js` - handleResize() where SF is calculated -- `js/layout/positioning.js` - positionContributorNodes() positions nodes -- `js/interaction/findNode.js` - Uses SF for coordinate transformation -- `js/interaction/hover.js` - Sets up hover interaction with config From 05e394f405bfec8e2f8613912d907975f4406aaa Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Tue, 17 Feb 2026 09:26:37 -0500 Subject: [PATCH 6/8] remove unnecessary docs --- docs/VISUALIZATION_DESIGN_GUIDE.md | 651 ------------------ docs/sponsor_centric/00_READ_ME_FIRST.txt | 365 ---------- docs/sponsor_centric/ASSESSMENT_INDEX.md | 408 ----------- .../BLOCKING_QUESTIONS_RESOLUTION.md | 159 ----- .../CONFIG_ENHANCEMENT_STRATEGY.md | 168 ----- .../CONTRIBUTOR_DISCOVERY_STRATEGY.md | 194 ------ .../sponsor_centric/FEASIBILITY_ASSESSMENT.md | 85 --- .../FEATURE_REQUEST_SUMMARY.md | 56 -- .../sponsor_centric/IMPLEMENTATION_ROADMAP.md | 50 -- .../SIMULATION_IMPLEMENTATION_DETAILS.md | 176 ----- 10 files changed, 2312 deletions(-) delete mode 100644 docs/VISUALIZATION_DESIGN_GUIDE.md delete mode 100644 docs/sponsor_centric/00_READ_ME_FIRST.txt delete mode 100644 docs/sponsor_centric/ASSESSMENT_INDEX.md delete mode 100644 docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md delete mode 100644 docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md delete mode 100644 docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md delete mode 100644 docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md delete mode 100644 docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md delete mode 100644 docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md delete mode 100644 
docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md diff --git a/docs/VISUALIZATION_DESIGN_GUIDE.md b/docs/VISUALIZATION_DESIGN_GUIDE.md deleted file mode 100644 index 5d5150c..0000000 --- a/docs/VISUALIZATION_DESIGN_GUIDE.md +++ /dev/null @@ -1,651 +0,0 @@ -# Visualization Design Guide: Sponsored vs. Community Contributors - -**Purpose:** Help you visualize and decide on the design for the tiered contributor visualization -**Status:** Design Decision Document -**Date:** February 2026 - ---- - -## Current Visualization (No Tiers) - -``` - All Contributors in Ring - - USER A (15) - - USER B (8) USER C (12) - - USER D (6) [Center] USER E (9) - - USER F (20) USER G (4) - - USER H (2) USER I (18) - - USER J (7) - - [Repositories with links to all contributors] -``` - -**Current Behavior:** -- All tracked contributors shown in fixed ring -- Repositories positioned based on force simulation -- Links show commit relationships -- No distinction between different contributor types - ---- - -## Proposed Design Option A: Outer Scattered Layout - -Uses existing "remaining" simulation to position community contributors. 
- -``` - Sponsored in Ring - - ALICE (15) - - BOB (8) [SPONSOR] CHARLIE (12) - - DAVE (6) [Center] EVE (9) - - FRANK (20) [REPOS] GRACE (4) - - HENRY (2) [SPONSOR] IRIS (18) - - JACK (7) - - - [Community scattered around/outside:] - - Unknown1 • Unknown2 • Unknown3 • - - Unknown4 • Unknown5 • - - - Unknown6 • Unknown7 • -``` - -**Characteristics:** -- Sponsored contributors: Central ring (prominent) -- Community contributors: Scattered around edges (visible but secondary) -- Visual hierarchy: Ring = important, scattered = supporting -- Reference: Current "extra contributors" positioning - -**Pros:** -- Minimal code changes (reuse existing simulation) -- Quick to implement (1 week) -- Clear visual hierarchy -- Familiar pattern (already used for extras) - -**Cons:** -- Community contributors may feel "random" -- Less organized appearance -- Harder to see all community members at once - ---- - -## Proposed Design Option B: Outer Ring Layout - -Creates second ring for community contributors. 
- -``` - ╔═══════════════════════════════════════════════╗ - ║ ║ - ║ Unknown3 • Unknown2 • Unknown1 • ║ - ║ ║ - ║ Unknown7 • Unknown4 • ║ - ║ ║ - ║ Unknown6 • Unknown5 • ║ - ║ ║ - ║ COMMUNITY RING (Outer) ║ - ║ ║ - ╚═════════════════════════════════════════════╝ - - ALICE (15) - - BOB (8) [SPONSOR] CHARLIE (12) - - DAVE (6) [Center] EVE (9) - - FRANK (20) [REPOS] GRACE (4) - - HENRY (2) [SPONSOR] IRIS (18) - - JACK (7) - - ╚═════════════════════════════════════════════╝ - ║ SPONSORED RING (Inner) ║ - ║ ║ - ║ Named contributors arranged in circle ║ - ║ ║ - ╚═════════════════════════════════════════════╝ -``` - -**Characteristics:** -- Sponsored contributors: Inner ring (very prominent) -- Community contributors: Outer ring (visible, organized) -- Visual hierarchy: Ring position = importance -- Reference: ORCA visualization model - -**Pros:** -- Clear visual distinction (two rings) -- Organized appearance -- Like ORCA model (recognizable pattern) -- Community members still visible/accessible - -**Cons:** -- More implementation effort (custom simulation) -- Longer development (2 weeks) -- May feel visually cluttered -- Requires tuning for attractive layout - ---- - -## Color & Style Design - -### Sponsored Contributor Styling - -``` -┌─────────────────────┐ -│ Sponsored Contrib │ -│ │ -│ ●●●●●●●● │ ← Circle node -│ ● Anthony ● │ (orange, full opacity) -│ ●●●●●●●● │ -│ │ -│ 14 commits │ ← Label -│ 5 repositories │ -└─────────────────────┘ - -Color: Grenadier Orange (#CF3F02) - Your brand color -Size: Default (e.g., 40px radius) -Opacity: 100% -Border: 2px solid darker orange -Label: Visible in ring -``` - -### Community Contributor Styling - -``` -┌─────────────────────┐ -│ Community Contrib │ -│ │ -│ ○○○○○○○ │ ← Circle node -│ ○ Unknown ○ │ (muted blue, 70% opacity) -│ ○○○○○○○ │ -│ │ -│ 2 commits │ ← Label (lighter) -│ 1 repository │ -└─────────────────────┘ - -Color: Aquamarine Blue (#2E86AB) - Secondary brand color -Size: 85% of default (e.g., 34px radius) -Opacity: 
70% (muted appearance) -Border: 1px solid lighter blue -Label: Visible but lighter -``` - -### Comparison Visual - -``` -Sponsored (Full prominence) Community (Secondary prominence) - - ●●●●●●● ○○○○○○○ - ● (40px) ● ○ (34px) ○ - ● 100% ● ○ 70% ○ - ●●●●●●● ○○○○○○○ - Orange Blue - Bold Muted -``` - ---- - -## Link Styling (From Contributors to Repos) - -### Sponsored Contributor Links -``` -[SPONSOR] ═══════════════════ [REPO] - Orange Bold, Full Gray/Blue - Opacity Color -``` - -- Start color: Grenadier orange -- Width: Based on commit count (thicker = more commits) -- Opacity: 100% for recent contributions, 70% for old - -### Community Contributor Links -``` -[COMMUNITY] - - - - - - - - - [REPO] - Blue Dashed, 70% Gray/Blue - Opacity Color -``` - -- Start color: Aquamarine blue (muted) -- Width: Based on commit count -- Opacity: 70% (less prominent) -- Optional: Dashed line to indicate secondary contributor - ---- - -## Tooltip Design - -### Sponsored Contributor Tooltip - -``` -╔════════════════════════════╗ -║ Anthony Boyd ║ -║ ┌──────────────────────┐ ║ -║ │ SPONSORED CONTRIBUTOR│ ║ ← Orange badge -║ └──────────────────────┘ ║ -║ ║ -║ Contributions: 14 commits ║ -║ Repositories: 5 ║ -║ First Commit: Jan 2024 ║ -║ Last Commit: Feb 2026 ║ -╚════════════════════════════╝ -``` - -### Community Contributor Tooltip - -``` -╔════════════════════════════╗ -║ Unknown Contributor ║ -║ ┌──────────────────────┐ ║ -║ │ COMMUNITY CONTRIBUTOR│ ║ ← Blue badge -║ └──────────────────────┘ ║ -║ ║ -║ Contributions: 2 commits ║ -║ Repositories: 1 ║ -║ First Commit: Jun 2024 ║ -║ Last Commit: Oct 2025 ║ -╚════════════════════════════╝ -``` - ---- - -## Layout Comparison: Before vs. 
After - -### Before (Current - No Tiers) - -``` -INPUT: -- List of repos (fixed) -- All contributors to those repos -- No classification - -PROCESSING: -- Fetch all commits -- Identify all contributors -- No separation/grouping - -OUTPUT: -Users in Ring [All same visual style] -● User A -● User B -● User C - [etc - could be 50+ nodes] - -[Repos in center, connected to all] - -RESULT: -- Can't tell which contributors are important -- Hard to see community impact -- Visual clutter with many users -``` - -### After (New - With Tiers) - -``` -INPUT: -- List of repos (fixed) -- All contributors to those repos -- Sponsor list (config) - -PROCESSING: -- Fetch all commits -- Identify all contributors -- Classify as sponsored/community - -OUTPUT: -Sponsored Community -(Central Ring) (Outer Ring) -● Alice ○ Unknown1 -● Bob ○ Unknown2 -● Charlie ○ Unknown3 -[5-10 important] [30-100 rest] - -[Repos connected to both] - -RESULT: -- Clear hierarchy (ring position = importance) -- Easier to see community involvement -- Organized appearance -- Sponsored contributors highlighted -``` - ---- - -## Decision Matrix - -### Option A (Scattered Community) - -| Aspect | Rating | Notes | -|--------|--------|-------| -| Visual Clarity | ⭐⭐⭐ | Good - clear hierarchy | -| Implementation | ⭐⭐⭐⭐⭐ | Very easy - reuse existing | -| Development Time | ⭐⭐⭐⭐⭐ | 1 week | -| Aesthetics | ⭐⭐⭐ | Good but less organized | -| ORCA Similarity | ⭐⭐ | Loose reference | -| Community Recognition | ⭐⭐⭐ | Community still visible | -| Scalability | ⭐⭐⭐ | Fine for 50-100 community | - -### Option B (Outer Ring) - -| Aspect | Rating | Notes | -|--------|--------|-------| -| Visual Clarity | ⭐⭐⭐⭐ | Excellent - two-ring system | -| Implementation | ⭐⭐⭐ | Moderate - custom simulation | -| Development Time | ⭐⭐⭐ | 2 weeks | -| Aesthetics | ⭐⭐⭐⭐⭐ | Beautiful - professional | -| ORCA Similarity | ⭐⭐⭐⭐⭐ | Direct match | -| Community Recognition | ⭐⭐⭐⭐⭐ | Prominent - organized ring | -| Scalability | ⭐⭐⭐⭐ | Better for 100+ 
community | - ---- - -## Mobile Responsiveness - -### Desktop (Current) - -``` -┌──────────────────────────────────┐ -│ Visualization (1200x800) │ -│ ┌──────────────────────────────┐│ -│ │ Central ring layout ││ -│ │ Full visualization visible ││ -│ │ All nodes labeled ││ -│ │ Hover tooltips work ││ -│ └──────────────────────────────┘│ -│ ┌──────────────────────────────┐│ -│ │ Filters & Legend ││ -│ └──────────────────────────────┘│ -└──────────────────────────────────┘ -``` - -### Tablet (Moderate Screen) - -``` -┌─────────────────────────┐ -│ Visualization (600x500)│ -│ ┌───────────────────────┐ -│ │ Smaller nodes │ -│ │ Some labels removed │ -│ │ Zoom still works │ -│ └───────────────────────┘ -│ ┌───────────────────────┐ -│ │ Filters & Legend │ -│ │ (Simplified) │ -│ └───────────────────────┘ -└─────────────────────────┘ -``` - -### Mobile (Small Screen) - -``` -┌────────────┐ -│ MOBILE │ -│ (360x640)│ -│ ┌────────┐│ -│ │ Viz ││ ← Smaller -│ │(scalable) -│ │ Tap=|| -│ │ (no hover) -│ └────────┘│ -│ ┌────────┐│ -│ │Legend ││ ← Stacked -│ │Filters ││ Vertical -│ └────────┘│ -└────────────┘ -``` - -**Recommendation:** Start with desktop/tablet optimization. Mobile can be Phase 2. - ---- - -## Animation & Interaction - -### Hover Behavior - -``` -User hovers on node: - 1. Node: Slight grow animation (5% larger) - 2. Links: Highlight connected links (higher opacity) - 3. Tooltip: Appears near cursor - 4. Related nodes: Fade other nodes to 30% opacity - -User moves away: - 1. Node: Shrink back to normal - 2. Links: Return to default opacity - 3. Tooltip: Fade out - 4. Related nodes: Fade back to 100% - -Duration: 200ms smooth transitions -``` - -### Click Behavior - -``` -User clicks node: - 1. Node: Expand/highlight (visual "selection") - 2. Tooltip: Show detailed info - 3. Links: All links from node highlighted - 4. Related nodes: Highlight connected nodes - 5. Lock state until click elsewhere or Escape - -User clicks elsewhere: - 1. Deselect node - 2. 
Return to default view -``` - ---- - -## Filtering Behavior - -### Current Filtering -- Filter by organization -- Filter by stars -- Filter by language -- Results: Hides repos, cascades to hide links/contributors - -### Proposed Filtering (After New Feature) - -**Option 1: Tier-Aware Filtering** -``` -- Sponsored contributors: Always visible (never filtered) -- Community contributors: Can be hidden if filters exclude their repos -- Repos: Can be filtered -- Links: Show based on visible repo/contributor combination -``` - -**Option 2: Tier Toggle** -``` -- Checkbox: "Show community contributors" (default: ON) -- When OFF: Hide all community nodes, show only sponsored -- Useful for focused view on key contributors -- Fast way to simplify visualization -``` - -**Recommendation:** Implement Option 1 initially, add Option 2 in Phase 2 if requested. - ---- - -## Performance Considerations - -### Data Size Impact -``` -Scenario 1: Small Project -- 10 repos, 30 contributors (20 sponsored) -- Current: ~30 nodes + ~100 links -- After: No change in data size -- Performance: Excellent - -Scenario 2: Medium Project -- 50 repos, 150 contributors (20 sponsored) -- Current: ~150 nodes + ~500 links -- After: No change in data size -- Performance: Good - -Scenario 3: Large Project -- 75 repos, 300+ contributors (20 sponsored) -- Current: ~300 nodes + ~1000 links -- After: No change in data size -- Performance: Monitor, may need optimization -``` - -### Optimization Strategies -1. Lazy load community contributor details -2. Use simplified tooltips for community (load on demand) -3. Batch force simulation calculations -4. 
Render community nodes at lower detail initially - ---- - -## Color Accessibility - -### Current Colors -- Grenadier Orange (#CF3F02) -- Aquamarine Blue (#2E86AB) -- Base Gray (#443F3F) - -### Contrast Ratios -- Orange on white: 5.2:1 ✅ (WCAG AA) -- Blue on white: 5.1:1 ✅ (WCAG AA) -- Gray on white: 6.8:1 ✅ (WCAG AAA) - -### Colorblind-Friendly Design -- Don't rely on color alone -- Use size/shape/position for distinction -- Add tier badges (text) to distinguish -- Links: Use both color gradient and stroke width - -### Recommended Accessibility Features -1. Tier badge labels (text not just color) -2. High contrast borders on nodes -3. Alternative icons for colorblind users -4. Keyboard navigation support - ---- - -## Design Decision Template - -Use this to document your final choice: - -``` -DESIGN DECISION: [Option A / Option B] - -RATIONALE: -- Why this option? -- What makes it right for your use case? -- What are the key benefits? - -VISUAL SPECIFICATIONS: -- Sponsored node color: [color] -- Community node color: [color] -- Opacity differences: [specs] -- Size differences: [specs] -- Labels: [visible/hidden/conditional] - -COMMUNITY POSITIONING: -- Layout: [scattered/ring/other] -- Distance from center: [radius] -- Interaction behavior: [behavior] - -TOOLTIPS: -- Show tier? [yes/no] -- Tier badge style: [style] -- Information shown: [fields] - -FILTERS: -- Community visible by default? [yes/no] -- Can be filtered out? [yes/no] -- Always show sponsored? 
[yes/no] - -TIMELINE: -- Design approval: [date] -- Development start: [date] -- Target launch: [date] -``` - ---- - -## Example: Completed Decision Document - -``` -DESIGN DECISION: Option B (Outer Ring) - -RATIONALE: -- Matches ORCA model (client familiar with it) -- Professional appearance -- Clear visual hierarchy -- Scales well with large communities - -VISUAL SPECIFICATIONS: -- Sponsored node color: #CF3F02 (Grenadier Orange) -- Community node color: #2E86AB (Aquamarine Blue) -- Community opacity: 75% -- Community size: 85% of sponsored -- All nodes labeled (adjustable font size by tier) - -COMMUNITY POSITIONING: -- Layout: Outer ring -- Distance from center: 400px (vs 150px for sponsored) -- Gentle repulsion between community nodes (no overlap) -- Radial positioning (angle-based like sponsored ring) - -TOOLTIPS: -- Show tier? Yes -- Tier badge style: Colored pill with text -- Information shown: Name, Tier, Commits, Repos, Dates - -FILTERS: -- Community visible by default? Yes -- Can be filtered out? Yes (via repo filters) -- Always show sponsored? Yes (never hidden) - -TIMELINE: -- Design approval: Feb 14, 2026 -- Development start: Feb 17, 2026 -- Target launch: Mar 14, 2026 (4 weeks) -``` - ---- - -## Next Steps - -1. **Choose Option A or Option B** - - Or propose a hybrid/custom approach - -2. **Finalize Color Scheme** - - Confirm using existing brand colors - - Or propose alternatives - -3. **Specify Tier Labels/Badging** - - How should tiers be shown? - - Text, icons, colors, or combinations? - -4. **Decide on Community Visibility** - - Always shown? - - Togglable? - - Filtered by default? - -5. **Approve Timeline** - - Option A: 3 weeks - - Option B: 4 weeks - - With feedback loops: add 1 week - -Once these decisions are made, the implementation roadmap can proceed with certainty. 
- ---- - -**This guide is ready for design discussion and feedback.** - -**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/00_READ_ME_FIRST.txt b/docs/sponsor_centric/00_READ_ME_FIRST.txt deleted file mode 100644 index 31fc157..0000000 --- a/docs/sponsor_centric/00_READ_ME_FIRST.txt +++ /dev/null @@ -1,365 +0,0 @@ -================================================================================ - FEASIBILITY ASSESSMENT - SPONSORED VS COMMUNITY CONTRIBUTORS -================================================================================ - -STATUS: ✅ APPROVED FOR IMPLEMENTATION (3-4 weeks estimated) - -This folder contains a complete feasibility assessment for your feature request: -"Visualize sponsored contributors in a ring, community contributors scattered" - -================================================================================ - START HERE -================================================================================ - -1. Read this file first (you are here!) -2. Then read: ASSESSMENT_INDEX.md (navigation guide) -3. Then read: FEATURE_REQUEST_SUMMARY.md (quick overview) - -That's all you need for the first round. It takes 30 minutes. 
- -================================================================================ - WHAT YOU'RE GETTING -================================================================================ - -✅ ASSESSMENT_INDEX.md - - Navigation guide for all 4 documents - - Quick reference tables - - Who should read what - - Key questions answered - -✅ FEATURE_REQUEST_SUMMARY.md - - Executive summary (approvals/stakeholders read this) - - What's being built & why - - Feasibility verdict: HIGHLY FEASIBLE - - Configuration examples - - Next steps for approval - -✅ FEASIBILITY_ASSESSMENT.md - - Technical deep-dive (developers/architects read this) - - Architecture impact analysis - - File-by-file changes required - - Risk assessment - - Testing strategy - - Comparison to ORCA model - -✅ IMPLEMENTATION_ROADMAP.md - - Step-by-step implementation guide - - 6 phases from backend to release - - Code examples for each phase - - Testing procedures - - Validation checkpoints - - Command reference - -✅ VISUALIZATION_DESIGN_GUIDE.md - - Design specs for two options (A & B) - - Option A: Scattered community (simpler, 1 week) - - Option B: Outer ring (more polished, 2 weeks) - - Color schemes and styling - - Wireframes and diagrams - - Decision matrix - - Design decision template - -================================================================================ - KEY FINDINGS -================================================================================ - -FEASIBILITY: ✅ Highly Feasible -Risk Level: 🟡 Low-to-Medium (no high risks) -Effort Required: ~300 lines of code total -Timeline: 3-4 weeks (Option A) to 4-5 weeks (Option B) -Breaking Changes: None - fully backward compatible -Architecture Ready: Yes - modular design supports this well - -BACKEND: ~50 lines Python - - Config parsing for sponsor list - - Contributor classification - - CSV output with tier column - Duration: 4-5 days - -FRONTEND: ~250 lines JavaScript - - Load tier data - - Position nodes by tier - - Render with tier-based 
styling - Duration: 1-2 weeks - -DESIGN DECISION: Option A vs Option B - - A: Simpler, faster (1 week for layout) - - B: More polished, like ORCA (2 weeks for layout) - RECOMMENDATION: Start with A, upgrade to B based on feedback - -================================================================================ - QUICK START GUIDE -================================================================================ - -Step 1: UNDERSTAND (30 min) - → Read: ASSESSMENT_INDEX.md (3 min) - → Read: FEATURE_REQUEST_SUMMARY.md (10 min) - → Read: VISUALIZATION_DESIGN_GUIDE.md - Option A & B (15 min) - -Step 2: DECIDE (15 min) - → Which design option? A (simple) or B (polished)? - → Who needs to approve? - → Create timeline - -Step 3: PLAN (30 min) - → Read: IMPLEMENTATION_ROADMAP.md - Overview - → Assign developers to phases - → Schedule implementation - -Step 4: BUILD (3-4 weeks) - → Execute phases 1-6 using IMPLEMENTATION_ROADMAP.md - → Follow code examples and testing procedures - → Iterate based on feedback - -================================================================================ - DOCUMENT READING ORDER -================================================================================ - -For Executives/Approvers: - 1. This file (README) - 2. FEATURE_REQUEST_SUMMARY.md ← Key document - 3. VISUALIZATION_DESIGN_GUIDE.md (design options only) - -For Developers: - 1. This file (README) - 2. ASSESSMENT_INDEX.md (navigation) - 3. FEATURE_REQUEST_SUMMARY.md (context) - 4. IMPLEMENTATION_ROADMAP.md ← Key document - 5. Code examples in relevant phase - -For Architects/Tech Leads: - 1. This file (README) - 2. ASSESSMENT_INDEX.md (navigation) - 3. FEASIBILITY_ASSESSMENT.md ← Key document - 4. IMPLEMENTATION_ROADMAP.md (phases overview) - -For Designers/Product Managers: - 1. This file (README) - 2. FEATURE_REQUEST_SUMMARY.md (context) - 3. VISUALIZATION_DESIGN_GUIDE.md ← Key document - 4. 
Design decision template (complete before Phase 3) - -================================================================================ - WHAT HAPPENS NEXT -================================================================================ - -Week 1: Approval & Planning - - Stakeholders review FEATURE_REQUEST_SUMMARY.md - - Design team reviews VISUALIZATION_DESIGN_GUIDE.md - - Decision: Option A or Option B? - - Get executive approval to proceed - -Week 2: Development Kickoff - - Review IMPLEMENTATION_ROADMAP.md - - Assign developers to phases - - Phase 1 (Backend) begins - - Dev work: 4-5 days - -Weeks 3-4: Frontend Development - - Phase 2 (Data Loading): 2-3 days - - Phase 3 (Layout/Simulation): 5-7 days (depends on Option A vs B) - - Phase 4 (Rendering): 3-4 days - - Continuous testing - -Weeks 4-5: Testing & Refinement - - Phase 5 (Comprehensive Testing): 3-5 days - - Phase 6 (Polish & Release): 3-5 days - - Design feedback and iterations - - Production deployment - -================================================================================ - DESIGN OPTIONS AT A GLANCE -================================================================================ - -Option A: SCATTERED COMMUNITY LAYOUT (Simpler) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -┌─────────────────────────────────────────────┐ -│ │ -│ Unknown1 • Unknown2 • Unknown3 • │ -│ │ -│ Unknown4 • Unknown5 • │ -│ │ -│ │ -│ [SPONSORED RING] │ -│ ALICE BOB CHARLIE DAVE │ -│ (Central, prominent) │ -│ │ -│ Unknown6 • Unknown7 • │ -│ │ -│ (Scattered around ring - visible, less │ -│ prominent, uses existing simulation) │ -└─────────────────────────────────────────────┘ - -Pros: Easy to implement (1 week), reuses existing code -Cons: Less organized appearance - -Option B: OUTER RING LAYOUT (More Polished) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -┌─────────────────────────────────────────────┐ -│ Unknown3 • Unknown2 • Unknown1 • │ -│ │ -│ Unknown7 • Unknown4 • │ -│ │ -│ Unknown6 • Unknown5 • │ -│ │ -│ 
╔═════════════════════════════════════════╗ │ -│ ║ [COMMUNITY RING - Outer] ║ │ -│ ║ (Organized circle of community nodes) ║ │ -│ ╚═════════════════════════════════════════╝ │ -│ │ -│ ALICE BOB CHARLIE │ -│ DAVE EVE │ -│ (SPONSORED RING - Inner) │ -│ [Central Repositories] │ -│ │ -└─────────────────────────────────────────────┘ - -Pros: Professional appearance, like ORCA, clear hierarchy -Cons: Requires custom simulation (2 weeks) - -RECOMMENDATION: Start with Option A, upgrade to B based on feedback - -================================================================================ - KEY METRICS -================================================================================ - -Code Changes: ~300 lines total - Backend: ~50 lines (config + classification) - Frontend: ~250 lines (data loading + layout + rendering) - -Timeline: 3-4 weeks (Option A) to 4-5 weeks (Option B) - Phase 1 Backend: 4-5 days - Phase 2 Frontend: 2-3 days - Phase 3 Layout: 5-7 days (varies by option) - Phase 4 Rendering: 3-4 days - Phase 5 Testing: 3-5 days - Phase 6 Polish: 3-5 days - -Risk Level: 🟡 Low-to-Medium - High Risk: None identified - Medium Risk: Visual design tuning, force simulation performance - Low Risk: Config changes, data model, CSV output - -Backward Compat: ✅ 100% - existing configs work unchanged -Performance: ✅ No degradation expected -Testing: ✅ Comprehensive test procedures included -Documentation: ✅ Code examples for each phase - -================================================================================ - FREQUENTLY ASKED QUESTIONS -================================================================================ - -Q: Is this actually feasible? -A: ✅ YES. Highly feasible. Architecture is perfect for this. - -Q: Will it break existing functionality? -A: ✅ NO. Fully backward compatible. - -Q: How long will it take? -A: 3-4 weeks (Option A) to 4-5 weeks (Option B) - -Q: What are the main risks? 
-A: Visual design tuning and force simulation performance (medium risks). - No high-risk items identified. - -Q: Do we need to refactor existing code? -A: No. Most changes are additive. Minimal refactoring needed. - -Q: Can we do this iteratively? -A: ✅ YES. Option A works as first pass, upgrade to Option B later. - -Q: What about the sponsor list maintenance? -A: Simple - just edit config.toml. No code changes needed. - -Q: Will community contributors feel left out? -A: No - they're still visible, just positioned differently. They're - recognized and connected to repos, just not in the center ring. - -Q: Can we filter by tier? -A: ✅ YES. You can show/hide community contributors as desired. - -Q: Does it work on mobile? -A: ✅ YES. Visualization is responsive. Phase 2 can optimize further. - -================================================================================ - NEXT IMMEDIATE STEPS -================================================================================ - -TODAY: -☐ Read: ASSESSMENT_INDEX.md (3 min) -☐ Read: FEATURE_REQUEST_SUMMARY.md (10 min) -☐ Read: VISUALIZATION_DESIGN_GUIDE.md pages 1-20 (15 min) - -THIS WEEK: -☐ Team discussion: Option A or Option B? 
-☐ Get executive/client approval -☐ Schedule implementation timeline - -NEXT WEEK: -☐ Assign developers (backend + frontend) -☐ Schedule Phase 1 kickoff -☐ Begin Phase 1 development - -================================================================================ - DOCUMENT LOCATIONS -================================================================================ - -In this folder, you'll find: - -• 00_READ_ME_FIRST.txt (this file) -• ASSESSMENT_INDEX.md (navigation guide - start here) -• FEATURE_REQUEST_SUMMARY.md (overview - read 2nd) -• FEASIBILITY_ASSESSMENT.md (technical detail) -• IMPLEMENTATION_ROADMAP.md (how to build) -• VISUALIZATION_DESIGN_GUIDE.md (design specs) - -Plus, these documents from the existing project: -• docs/ARCHITECTURE.md (existing project architecture) -• config.toml (configuration file) -• CLAUDE.md (developer guide) - -================================================================================ - QUICK NAVIGATION -================================================================================ - -Want to know: Read: -─────────────────────────────────────────────────────────────────────────── -"Is this feasible?" FEATURE_REQUEST_SUMMARY.md (2 min) -"What needs to change?" FEASIBILITY_ASSESSMENT.md (15 min) -"How do I build this?" IMPLEMENTATION_ROADMAP.md (30 min) -"What should it look like?" VISUALIZATION_DESIGN_GUIDE.md (20 min) -"Which path do I take?" ASSESSMENT_INDEX.md (5 min) -"When will it be done?" FEATURE_REQUEST_SUMMARY.md (1 min) -"What are the risks?" FEASIBILITY_ASSESSMENT.md - Risk section (5 min) -"How much code is this?" 
FEASIBILITY_ASSESSMENT.md - File changes table (2 min) - -================================================================================ - ASSESSMENT COMPLETION SUMMARY -================================================================================ - -✅ Architecture review: Complete -✅ Feasibility analysis: Complete -✅ Risk assessment: Complete -✅ Implementation planning: Complete -✅ Design options: Complete -✅ Code examples: Complete -✅ Testing procedures: Complete -✅ Timeline estimates: Complete -✅ Resource planning: Complete -✅ Documentation: Complete - -STATUS: Ready for Implementation -DATE: February 2026 - -================================================================================ - -Now proceed to: ASSESSMENT_INDEX.md - -That file will guide you through all the assessment documents and help you -navigate to exactly what you need. - -Questions? Each document has clear sections and a table of contents. - -Good luck! 🚀 - -================================================================================ diff --git a/docs/sponsor_centric/ASSESSMENT_INDEX.md b/docs/sponsor_centric/ASSESSMENT_INDEX.md deleted file mode 100644 index f0826a4..0000000 --- a/docs/sponsor_centric/ASSESSMENT_INDEX.md +++ /dev/null @@ -1,408 +0,0 @@ -# Feasibility Assessment Index - -**Project:** Contributor Network - Sponsored vs. Community Contributor Visualization -**Requested:** Feature request to show tiered contributors (sponsored in ring, community scattered/outer) -**Assessment Date:** February 2026 -**Status:** ✅ **APPROVED FOR IMPLEMENTATION** - ---- - -## 📋 Document Overview - -This assessment consists of 4 comprehensive documents. Start here and navigate to what you need. - -### 1. 
📊 **FEATURE_REQUEST_SUMMARY.md** ← **START HERE** -**Best for:** Getting a quick overview of the feature - -Contains: -- Executive summary of what's being built -- High-level feasibility assessment (✅ Highly Feasible) -- Key design decisions -- Impact on existing system -- Configuration examples -- Next steps for approval - -**Read this first if you:** -- Need a quick overview -- Are deciding whether to proceed -- Want to understand client value -- Need to explain to stakeholders - -**Time to read:** 10-15 minutes - ---- - -### 2. 🔧 **FEASIBILITY_ASSESSMENT.md** ← **DETAILED ANALYSIS** -**Best for:** Understanding technical details and risks - -Contains: -- Current state analysis -- What works in your favor (architecture is ready) -- Current constraints -- Proposed implementation model -- File-by-file changes needed -- Detailed risk assessment -- Testing strategy -- Architectural decisions & tradeoffs -- Comparison to ORCA model -- Known unknowns - -**Read this if you:** -- Are technically involved in the project -- Want to understand architecture impact -- Need to assess risks -- Are deciding between implementation options -- Want to know why things are feasible - -**Time to read:** 20-30 minutes - -**Key Finding:** ~50 lines of Python, ~250 lines of JavaScript, 3-4 weeks timeline - ---- - -### 3. 
🛣️ **IMPLEMENTATION_ROADMAP.md** ← **STEP-BY-STEP GUIDE** -**Best for:** Actually building the feature - -Contains: -- 6 phases of development (backend → frontend → testing → release) -- Phase 1: Backend (config + classification) - 4-5 days -- Phase 2: Frontend data loading - 2-3 days -- Phase 3: Layout & simulation - 5-7 days (varies by option) -- Phase 4: Rendering & styling - 3-4 days -- Phase 5: Testing & validation - 3-5 days -- Phase 6: Refinement & release - 3-5 days -- Detailed code examples for each phase -- Testing procedures with sample code -- Risk mitigations -- Command reference - -**Read this if you:** -- Are ready to start implementation -- Need step-by-step instructions -- Want code examples -- Need testing procedures -- Are assigning tasks to developers - -**Time to read:** 30-45 minutes (to understand structure) -**Reference time:** Look up specific phase as needed during development - ---- - -### 4. 🎨 **VISUALIZATION_DESIGN_GUIDE.md** ← **DESIGN SPECIFICATIONS** -**Best for:** Visual design decisions and mockups - -Contains: -- Current visualization diagram -- Option A design: Scattered community layout -- Option B design: Outer ring layout -- Color & style specifications -- Link styling (sponsored vs community) -- Tooltip designs -- Before/after comparison -- Design decision matrix -- Mobile responsiveness considerations -- Animation & interaction specs -- Accessibility & color contrast info -- Design decision template (to document your choice) - -**Read this if you:** -- Are deciding between design options -- Need to specify visual appearance -- Are communicating design to client -- Want accessibility guidelines -- Need mockups/diagrams - -**Time to read:** 20-30 minutes - ---- - -## 🎯 Quick Reference: Who Should Read What? - -### Project Manager / Client -1. **FEATURE_REQUEST_SUMMARY.md** (10 min) -2. **VISUALIZATION_DESIGN_GUIDE.md** - Pages 1-12 (Design options) -3. Decision: Option A or Option B? - -### Frontend Developer -1. 
**FEATURE_REQUEST_SUMMARY.md** (10 min) - context -2. **IMPLEMENTATION_ROADMAP.md** - Phases 2-4 (20 min) -3. **FEASIBILITY_ASSESSMENT.md** - Data Model section (5 min) -4. Start with Phase 2.1 in ROADMAP - -### Backend Developer -1. **FEATURE_REQUEST_SUMMARY.md** (10 min) - context -2. **IMPLEMENTATION_ROADMAP.md** - Phase 1 (15 min) -3. **FEASIBILITY_ASSESSMENT.md** - Backend Changes table (5 min) -4. Start with Phase 1.1 in ROADMAP - -### Architect / Tech Lead -1. **FEASIBILITY_ASSESSMENT.md** (25 min) - entire document -2. **IMPLEMENTATION_ROADMAP.md** - Architecture section (10 min) -3. Review impact on existing systems - -### Designer / Product -1. **VISUALIZATION_DESIGN_GUIDE.md** (25 min) -2. **FEATURE_REQUEST_SUMMARY.md** (10 min) - context -3. Complete design decision template - ---- - -## 📊 Key Metrics at a Glance - -### Feasibility -- ✅ **Highly Feasible** -- Architecture is modular and ready -- No breaking changes required -- Backward compatible - -### Scope -| Component | Effort | Time | -|-----------|--------|------| -| Backend | ~50 lines | 4-5 days | -| Frontend | ~250 lines | 1-2 weeks | -| Testing | Comprehensive | 1 week | -| **Total** | **~300 lines** | **3-4 weeks** | - -### Risk Level -- **Low:** Config changes, CSV generation, data model -- **Medium:** Visual design tuning, force simulation performance -- **High:** None identified - -### Design Options -| Option | Time | Complexity | Visual Distinction | -|--------|------|-----------|-------------------| -| A: Scattered | 1 week | Low | Good | -| B: Outer Ring | 2 weeks | Moderate | Excellent | - ---- - -## 🔄 Document Dependencies - -``` -FEATURE_REQUEST_SUMMARY - ↓ - ├─→ FEASIBILITY_ASSESSMENT (if technical questions) - ├─→ VISUALIZATION_DESIGN_GUIDE (if design questions) - └─→ IMPLEMENTATION_ROADMAP (when ready to build) - -Before Implementation: - 1. Read FEATURE_REQUEST_SUMMARY - 2. Clarify design (VISUALIZATION_DESIGN_GUIDE) - 3. Get approval - 4. 
Start with IMPLEMENTATION_ROADMAP -``` - ---- - -## 🚀 Getting Started: 3 Steps - -### Step 1: Understand the Feature (15 min) -``` -Read: FEATURE_REQUEST_SUMMARY.md -Ask: "Does this solve the client's problem?" -Decide: Proceed or clarify requirements? -``` - -### Step 2: Decide on Design (20 min) -``` -Read: VISUALIZATION_DESIGN_GUIDE.md (Options A & B) -Review: Design decision matrix -Decide: Option A (simpler) or Option B (more polished)? -``` - -### Step 3: Plan Implementation (30 min) -``` -Read: IMPLEMENTATION_ROADMAP.md (Phase overview) -Review: Timeline and phases -Assign: Developers to phases -Start: Phase 1 (backend) first -``` - ---- - -## 📝 Key Questions Answered - -**Q: Is this feature feasible?** -A: ✅ Yes, highly feasible. Architecture supports it well. - -**Q: How long will it take?** -A: 3-4 weeks (Option A) to 4-5 weeks (Option B) - -**Q: Will it break existing functionality?** -A: ✅ No, fully backward compatible. - -**Q: What about performance?** -A: ✅ No degradation expected, data size unchanged. - -**Q: What are the main risks?** -A: Visual design tuning (medium risk), force simulation tuning (medium risk), no high risks identified. - -**Q: Do we need to refactor existing code?** -A: Minimal - mostly additive changes. - -**Q: Can we do this iteratively?** -A: ✅ Yes, Option A works as first pass, upgrade to Option B later. - -**Q: What about backward compatibility?** -A: ✅ Fully compatible - existing configs work without changes. - ---- - -## 🎓 Understanding the Architecture - -**You don't need to read the full ARCHITECTURE.md, but here's the key insight:** - -The visualization has three layers: - -1. **Data Layer** (Python) - - Fetches from GitHub - - Generates CSVs - - **Change Here:** Add tier classification - -2. **Processing Layer** (JavaScript data preparation) - - Loads CSV data - - Prepares nodes/links - - **Change Here:** Load tier field, separate by tier - -3. 
**Visualization Layer** (JavaScript rendering) - - Force simulations (position nodes) - - Canvas drawing (render) - - Interaction (hover/click) - - **Change Here:** Different simulation/positioning by tier - -The beauty of the current architecture is that each layer is separate, so changes to one don't cascade to the others. - ---- - -## ✅ Success Criteria - -The feature is successful when: - -1. ✅ Configuration supports sponsored contributor list -2. ✅ CSV output includes tier column -3. ✅ Frontend loads and displays tier data -4. ✅ Visualization positions contributors by tier -5. ✅ Tooltips show tier designation -6. ✅ All tests pass -7. ✅ No performance regression -8. ✅ Backward compatible with existing configs - ---- - -## 🤔 Common Questions - -**Q: Do we have to do Option B (the harder one)?** -A: No, start with Option A. It works well and ships faster. Upgrade to B based on feedback. - -**Q: How much code needs to change?** -A: ~300 lines total across Python and JavaScript. Most of existing code stays the same. - -**Q: Can we do this in parallel?** -A: Yes - backend (Python) can be done in parallel with frontend (JavaScript) after phase 2. - -**Q: What about the current visualization? Will it change?** -A: Only how contributors are positioned. Repos and links work the same. - -**Q: Do we need client approval for design?** -A: ✅ Yes, show them the two options (A & B) from VISUALIZATION_DESIGN_GUIDE.md. - -**Q: How do we handle the "sponsor list" in config?** -A: Add a new `[contributors.sponsored]` section to config.toml. Defaults to existing `[contributors.devseed]` if not present. 
- ---- - -## 📚 Documents in This Assessment - -| Document | Purpose | Length | Audience | -|----------|---------|--------|----------| -| FEATURE_REQUEST_SUMMARY.md | Overview & decision | 5 pages | Everyone | -| FEASIBILITY_ASSESSMENT.md | Technical deep-dive | 15 pages | Tech team | -| IMPLEMENTATION_ROADMAP.md | Step-by-step guide | 20 pages | Developers | -| VISUALIZATION_DESIGN_GUIDE.md | Design specs | 15 pages | Designers/PMs | -| ASSESSMENT_INDEX.md (this file) | Navigation | 3 pages | Everyone | - -**Total: ~60 pages of detailed guidance** - ---- - -## 🎯 Next Actions - -### Immediate (This Week) -- [ ] Read FEATURE_REQUEST_SUMMARY.md (stakeholders) -- [ ] Review VISUALIZATION_DESIGN_GUIDE.md options (design/PM) -- [ ] Get client input on Option A vs. B - -### Planning (Next Week) -- [ ] Finalize design choice -- [ ] Get executive approval -- [ ] Assign developers to phases -- [ ] Create sprint plan - -### Development (Following Week) -- [ ] Kick off Phase 1 (Backend) -- [ ] Use IMPLEMENTATION_ROADMAP.md as guide -- [ ] Execute phases 1-6 sequentially - ---- - -## 💡 Pro Tips - -1. **Start with Option A** - simpler, faster, still looks great. Upgrade to B based on feedback. - -2. **Do phases sequentially** - Phase 1 (backend) must complete before Phase 2-3 (frontend). - -3. **Get design approval early** - Before starting Phase 3, finalize visual design. - -4. **Test continuously** - Each phase has testing procedures. Don't skip them. - -5. **Keep backward compatibility** - All changes are additive. Existing configs still work. - -6. **Document decisions** - Use the design decision template in VISUALIZATION_DESIGN_GUIDE.md. 
- ---- - -## 🔗 Quick Links - -- **Overview:** FEATURE_REQUEST_SUMMARY.md -- **Technical:** FEASIBILITY_ASSESSMENT.md -- **Implementation:** IMPLEMENTATION_ROADMAP.md -- **Design:** VISUALIZATION_DESIGN_GUIDE.md -- **Architecture:** docs/ARCHITECTURE.md (existing project docs) -- **Config Example:** config.toml (in project root) - ---- - -## 📞 Contact & Questions - -For specific questions: - -1. **"How do I build this?"** → Read IMPLEMENTATION_ROADMAP.md Phase 1 -2. **"What are the risks?"** → Read FEASIBILITY_ASSESSMENT.md Risk section -3. **"What should it look like?"** → Read VISUALIZATION_DESIGN_GUIDE.md -4. **"Is this worth doing?"** → Read FEATURE_REQUEST_SUMMARY.md -5. **"How does the architecture work?"** → Read docs/ARCHITECTURE.md - ---- - -## 🎉 Summary - -**Status:** ✅ **READY TO PROCEED** - -This is a well-scoped, low-risk feature that: -- ✅ Solves the client's problem -- ✅ Fits the architecture well -- ✅ Has clear implementation path -- ✅ Can be done in 3-4 weeks -- ✅ Is backward compatible -- ✅ Has documented design options - -**Next Step:** Review FEATURE_REQUEST_SUMMARY.md and VISUALIZATION_DESIGN_GUIDE.md, then decide on design (Option A or B). - ---- - -**Assessment Complete** -**All Documentation Ready** -**Status: Ready for Implementation** - -**Date:** February 2026 -**Assessor:** Claude diff --git a/docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md b/docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md deleted file mode 100644 index d32a2f0..0000000 --- a/docs/sponsor_centric/BLOCKING_QUESTIONS_RESOLUTION.md +++ /dev/null @@ -1,159 +0,0 @@ -# Blocking Questions Resolution - -**Date:** February 2026 -**Status:** Implementation Guidance - -## Overview - -This document outlines the resolution strategies for key blocking questions identified during the initial implementation planning of the tiered contributor visualization feature. - -## 1. 
Simulation Strategy for Community Contributors - -**Challenge:** No existing "remaining simulation" in the codebase - -**Proposed Solution:** Create a new `communityContributorSimulation.js` - -### Key Implementation Details - -```javascript -import * as d3 from 'd3'; - -export function createCommunityContributorSimulation( - communityNodes, - centerX, - centerY, - ringRadius -) { - // Radius for community contributors (slightly further out than main ring) - const communityRadius = ringRadius * 1.5; - - return d3.forceSimulation(communityNodes) - .force('radial', d3.forceRadial( - // Varying distance from center with some randomness - d => communityRadius + (Math.random() * 50 - 25), - centerX, - centerY - ).strength(0.5)) - .force('collide', d3.forceCollide(20)) // Prevent node overlap - .force('charge', d3.forceManyBody().strength(-30)) // Gentle repulsion - .stop(); -} -``` - -**Design Principles:** -- Place community contributors outside the main contributor ring -- Prevent node overlap -- Add slight randomness to positioning -- Provide controlled, scattered layout - -## 2. Community Contributor Data Pipeline - -**Challenge:** Existing pipeline only handles configured contributors - -**Proposed Solution:** Enhanced contributor discovery mechanism - -### Key Implementation Strategies - -1. **Incremental Contributor Discovery** - - Create new method `discover_repo_contributors()` in `client.py` - - Implement intelligent rate limit handling - - Separate storage for discovered contributors - -2. 
**Rate Limit and API Management** - - Use GitHub GraphQL API for efficient querying - - Implement exponential backoff - - Add comprehensive error handling - -### Example Implementation Snippet - -```python -def discover_repo_contributors(repo): - """Discover all contributors for a given repository""" - try: - # GitHub API call to get all contributors - contributors = github_client.get_repository_contributors(repo) - - # Filter out known contributors from existing config - known_contributors = set(config.get_all_contributors()) - - # Discover new contributors - new_contributors = [ - c for c in contributors - if c.login not in known_contributors - ] - - return new_contributors - except RateLimitError: - # Implement intelligent rate limit handling - logger.warning(f"Rate limit hit for {repo}") - return [] -``` - -## 3. Documentation Completeness - -**Resolution:** -- Regenerate full implementation documents -- Add detailed, concrete implementation guidance -- Provide clear code examples and design rationales - -## 4. Config Strategy for Sponsored Contributors - -**Proposed Configuration Structure:** - -```toml -# Configuration option to specify which group is "sponsored" -# (top-level keys must come before any table header) -sponsored_contributor_group = "sponsored" # or "devseed" - -# Existing configuration -[contributors.devseed] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" - -# NEW: Optional sponsored section -[contributors.sponsored] -# Specific subset of contributors to highlight -aboydnw = "Anthony Boyd" -``` - -**Python Configuration Enhancement:** -```python -class Config(BaseModel): - sponsored_contributor_group: str = "devseed" - - def get_sponsored_contributors(self): - """Retrieve sponsored contributors, with fallback""" - sponsored_group = self.contributors.get( - self.sponsored_contributor_group, - self.contributors.get("devseed", {}) - ) - return list(sponsored_group.keys()) -``` - -## Recommended Implementation Approach - -1. 
**Simulation Strategy:** - - Create `communityContributorSimulation.js` - - Test with mock data - - Iterate on positioning algorithm - -2. **Contributor Discovery:** - - Start with simplified discovery method - - Implement rate limit handling - - Create separate storage for new contributors - -3. **Configuration:** - - Add optional `[contributors.sponsored]` section - - Implement fallback mechanism - - Update config parsing logic - -## Next Implementation Steps - -- Develop unit tests for new methods -- Create integration tests for GitHub API interaction -- Implement mock GitHub API for testing -- Perform incremental development and validation - ---- - -**Status:** Ready for Implementation -**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md b/docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md deleted file mode 100644 index 424cd7f..0000000 --- a/docs/sponsor_centric/CONFIG_ENHANCEMENT_STRATEGY.md +++ /dev/null @@ -1,168 +0,0 @@ -# Configuration Enhancement Strategy - -**Date:** February 2026 -**Status:** Technical Design - -## Overview - -This document outlines the strategy for enhancing the configuration system to support sponsored contributor classification while maintaining backward compatibility and flexibility. - -## Current Configuration Challenges - -1. Single contributor group (`[contributors.devseed]`) -2. No explicit mechanism for highlighting specific contributors -3. 
Limited flexibility in contributor classification - -## Proposed Configuration Structure - -```toml -# Existing Contributors (Unchanged) -[contributors.devseed] -aboydnw = "Anthony Boyd" -gadomski = "Pete Gadomski" - -# NEW: Optional Sponsored Contributors Section -[contributors.sponsored] -aboydnw = "Anthony Boyd" -# Can include a subset or all of devseed contributors - -# Configuration Options -[contributor_classification] -# Specify which group should be treated as "sponsored" -primary_group = "sponsored" # or "devseed" -fallback_group = "devseed" - -# Optional: Additional metadata for contributors -# (one sub-table per contributor; TOML 1.0 inline tables cannot span lines) -[contributor_metadata.aboydnw] -title = "CTO" -department = "Engineering" -start_date = "2020-01-01" -github_username = "aboydnw" -display_name = "Anthony Boyd" -``` - -## Python Configuration Model Enhancement - -```python -from pydantic import BaseModel, Field -from typing import Dict, List, Optional - -class ContributorMetadata(BaseModel): - """Extended metadata for individual contributors""" - title: Optional[str] = None - department: Optional[str] = None - start_date: Optional[str] = None - github_username: str - display_name: str - -class ContributorClassificationConfig(BaseModel): - """Configuration for contributor classification""" - primary_group: str = "sponsored" - fallback_group: str = "devseed" - -class Config(BaseModel): - """Enhanced configuration model""" - contributors: Dict[str, Dict[str, str]] - contributor_classification: ContributorClassificationConfig = Field( - default_factory=ContributorClassificationConfig - ) - contributor_metadata: Dict[str, ContributorMetadata] = {} - - def get_sponsored_contributors(self) -> List[str]: - """ - Retrieve sponsored contributors with intelligent fallback - - Priority: - 1. Explicitly defined sponsored group - 2. Fallback group - 3. 
Empty list - """ - classification = self.contributor_classification - primary_group = classification.primary_group - fallback_group = classification.fallback_group - - # Try primary group first - sponsored_group = self.contributors.get(primary_group) - if sponsored_group: - return list(sponsored_group.keys()) - - # Fallback to secondary group - fallback_sponsored_group = self.contributors.get(fallback_group) - return list(fallback_sponsored_group.keys()) if fallback_sponsored_group else [] - - def get_contributor_metadata(self, username: str) -> Optional[ContributorMetadata]: - """Retrieve extended metadata for a contributor""" - return self.contributor_metadata.get(username) -``` - -## CLI Enhancements - -```python -@cli.command() -@click.option('--list-sponsored', is_flag=True, help='List sponsored contributors') -def contributors(list_sponsored): - """Manage and list contributors""" - config = load_config() - - if list_sponsored: - sponsored_contributors = config.get_sponsored_contributors() - for contributor in sponsored_contributors: - metadata = config.get_contributor_metadata(contributor) - click.echo(f"{contributor}: {metadata.display_name if metadata else 'N/A'}") -``` - -## Key Design Principles - -1. **Backward Compatibility** - - Existing configs continue to work - - Gradual migration path - - No breaking changes - -2. **Flexibility** - - Multiple contributor groups supported - - Configurable primary/fallback groups - - Optional extended metadata - -3. **Extensibility** - - Easy to add new classification strategies - - Support for rich contributor metadata - - Minimal changes to existing code - -## Recommended Implementation Steps - -1. Update `config.py` with new Pydantic models -2. Modify config parsing to support new structure -3. Update CLI commands to leverage new configuration -4. Create migration scripts for existing configs -5. 
Add comprehensive test coverage - -## Potential Future Enhancements - -- Machine learning-based contributor classification -- Integration with external identity providers -- More sophisticated metadata management -- Automated contributor discovery and classification - -## Testing Strategy - -1. **Unit Tests** - - Verify sponsored contributor retrieval - - Test fallback mechanisms - - Validate metadata parsing - -2. **Integration Tests** - - Config file parsing - - CLI command functionality - - Interaction with visualization pipeline - -## Migration Guide - -1. Existing configs will work without modification -2. Gradually introduce `[contributors.sponsored]` section -3. Use `contributor_classification.primary_group` to control behavior -4. Incrementally add `contributor_metadata` - ---- - -**Status:** Ready for Implementation -**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md b/docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md deleted file mode 100644 index d5cbe4f..0000000 --- a/docs/sponsor_centric/CONTRIBUTOR_DISCOVERY_STRATEGY.md +++ /dev/null @@ -1,194 +0,0 @@ -# Contributor Discovery Strategy - -**Date:** February 2026 -**Status:** Implementation Guidance - -## Overview - -This document provides a comprehensive strategy for discovering and managing community contributors beyond the existing configured contributor list. - -## Motivation - -The current contributor network visualization relies on a predefined list of contributors. To create a more comprehensive view of open-source contributions, we need a robust mechanism to: -- Discover contributors not in the current configuration -- Handle GitHub API rate limits -- Store and manage newly discovered contributors -- Provide flexibility in contributor classification - -## Discovery Mechanisms - -### 1. 
Repository-Level Contributor Discovery - -```python -from github import Github -from typing import Dict, List, Optional -import logging -import time - -class ContributorDiscoveryService: - def __init__(self, github_token: str, config: Dict): - self.github_client = Github(github_token) - self.config = config - self.logger = logging.getLogger(__name__) - - def discover_repo_contributors( - self, - repo_name: str, - min_contributions: int = 1 - ) -> List[Dict]: - """ - Discover contributors for a specific repository - - Args: - repo_name (str): Full repository name (org/repo) - min_contributions (int): Minimum number of contributions to include - - Returns: - List of contributor dictionaries - """ - try: - repo = self.github_client.get_repo(repo_name) - - # Paginated contributor retrieval - contributors = [] - for contributor in repo.get_contributors(): - if contributor.contributions >= min_contributions: - contributors.append({ - 'login': contributor.login, - 'name': contributor.name or contributor.login, - 'contributions': contributor.contributions, - 'avatar_url': contributor.avatar_url, - 'html_url': contributor.html_url - }) - - # Cap results at 100 contributors per repository to bound API usage - if len(contributors) >= 100: - break - - return contributors - - except Exception as e: - self.logger.error(f"Error discovering contributors for {repo_name}: {e}") - return [] - - def discover_org_contributors( - self, - org_name: str, - repos: Optional[List[str]] = None - ) -> Dict[str, List[Dict]]: - """ - Discover contributors across multiple repositories in an organization - - Args: - org_name (str): GitHub organization name - repos (List[str], optional): Specific repos to check. If None, fetches all org repos. 
- - Returns: - Dictionary of repository contributors - """ - if not repos: - org = self.github_client.get_organization(org_name) - repos = [repo.full_name for repo in org.get_repos()] - - org_contributors = {} - for repo_name in repos: - contributors = self.discover_repo_contributors(repo_name) - org_contributors[repo_name] = contributors - - return org_contributors - -## Persistent Storage Strategy -class ContributorStore: - def __init__(self, storage_path: str = 'discovered_contributors.json'): - self.storage_path = storage_path - - def save_contributors(self, contributors: Dict[str, List[Dict]]): - """Save discovered contributors to persistent storage""" - with open(self.storage_path, 'w') as f: - json.dump(contributors, f, indent=2) - - def load_contributors(self) -> Dict[str, List[Dict]]: - """Load previously discovered contributors""" - try: - with open(self.storage_path, 'r') as f: - return json.load(f) - except FileNotFoundError: - return {} - -## CLI Integration -@cli.command() -@click.option('--org', required=True, help='GitHub organization to discover contributors') -@click.option('--min-contributions', default=1, help='Minimum contributions to include') -def discover_contributors(org, min_contributions): - """CLI command to discover contributors for an organization""" - config = load_config() - github_token = os.environ.get('GITHUB_TOKEN') - - discovery_service = ContributorDiscoveryService(github_token, config) - store = ContributorStore() - - # Discover contributors - discovered_contributors = discovery_service.discover_org_contributors(org) - - # Save to persistent storage - store.save_contributors(discovered_contributors) - - # Optional: Print summary - for repo, contributors in discovered_contributors.items(): - click.echo(f"{repo}: {len(contributors)} contributors discovered") -``` - -## Key Design Principles - -1. **Flexible Discovery** - - Repository-level and organization-level discovery - - Configurable minimum contribution threshold - -2. 
**Rate Limit Management** - - Pagination support - - Exponential backoff (not shown in example) - - Logging of discovery attempts - -3. **Persistent Storage** - - JSON-based storage of discovered contributors - - Easy to manually curate or modify - -4. **Extensibility** - - Separate concerns: discovery, storage, CLI - - Easily mockable for testing - -## Configuration Considerations - -Update `config.toml` to support discovery: - -```toml -[contributor_discovery] -min_contributions = 1 -rate_limit_delay = 60 # seconds between requests -``` - -## Recommended Workflow - -1. Run discovery command -2. Review discovered contributors -3. Manually add to config or sponsored list -4. Regenerate visualization - -## Potential Enhancements - -- GraphQL API support for more efficient querying -- More sophisticated rate limit handling -- Machine learning-based contributor classification -- Webhook support for continuous discovery - -## Testing Strategy - -- Unit tests for discovery methods -- Mock GitHub API for predictable testing -- Integration tests with real GitHub repositories -- Performance testing with large repositories - ---- - -**Status:** Ready for Implementation -**Last Updated:** February 2026 diff --git a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md b/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md deleted file mode 100644 index 597b640..0000000 --- a/docs/sponsor_centric/FEASIBILITY_ASSESSMENT.md +++ /dev/null @@ -1,85 +0,0 @@ -# Feasibility Assessment: Contributor Network Visualization Redesign - -**Date:** February 2026 -**Requested By:** Client Feature Request -**Assessment Author:** Claude - ---- - -## Updated Implementation Strategy - -Following detailed technical review, we've refined the implementation approach to address key technical challenges: - -### Key Changes from Initial Assessment -1. 
**Contributor Discovery** - - Implement comprehensive GitHub API-based contributor discovery - - Create new method to find contributors across tracked repositories - - Handle GitHub API rate limits intelligently - -2. **Simulation Strategy** - - Create new `communityContributorSimulation.js` - - Custom force simulation for community contributors - - Positioned outside main contributor ring with controlled scattering - -3. **Configuration Handling** - - Optional `[contributors.sponsored]` section - - Fallback to `[contributors.devseed]` - - Configurable sponsored contributor group - -### Detailed Technical Approach - -#### Contributor Discovery Pipeline -```python -def discover_repo_contributors(repo): - """ - Discover all contributors for a given repository - - Steps: - 1. Call GitHub API to get repository contributors - 2. Filter out already known contributors - 3. Store new contributors with metadata - """ - # Implementation details in client.py -``` - -#### Community Contributor Simulation -```javascript -function createCommunityContributorSimulation(communityNodes, centerX, centerY, ringRadius) { - // Create force simulation with: - // - Radial positioning outside main ring - // - Node collision prevention - // - Gentle node repulsion -} -``` - -### Revised Implementation Phases - -1. **Backend Contributor Discovery** (1-2 weeks) - - Modify GitHub client to discover all contributors - - Create storage mechanism for discovered contributors - - Handle API rate limits - -2. **Configuration Enhancement** (3-5 days) - - Update `config.py` to support sponsored contributor selection - - Add flexible configuration options - - Maintain backward compatibility - -3. 
**Frontend Visualization Update** (1-2 weeks) - - Create community contributor simulation - - Update data loading to incorporate new contributor tiers - - Implement scattered positioning for community contributors - -### Complexity Acknowledgment -The initial estimate of "~50 lines of Python" significantly underestimated the complexity. The actual implementation will require: -- Approximately 200-300 lines of Python -- Comprehensive GitHub API interaction -- Intelligent contributor discovery and storage - -### Risk Mitigation -- Incremental implementation -- Fallback to existing contributor list -- Modular design allowing future refinement - ---- - -(Rest of the original document remains the same, with these updates) diff --git a/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md b/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md deleted file mode 100644 index 8899c92..0000000 --- a/docs/sponsor_centric/FEATURE_REQUEST_SUMMARY.md +++ /dev/null @@ -1,56 +0,0 @@ -# Feature Request Summary & Assessment - -**Request:** Visualization of Sponsored vs. Community Contributors -**Status:** ✅ **APPROVED FOR IMPLEMENTATION** -**Date:** February 2026 - ---- - -## What We're Building - -A redesign of the Contributor Network visualization that separates contributors into two tiers: - -1. **Sponsored Contributors** - A curated list of key team members - - Displayed in a central ring (like the current implementation) - - Full prominence in the visualization - -2. **Community Contributors** - Everyone else who contributed to your repos - - Displayed separately (scattered using existing simulation) - - Visually distinguished but still present - - Quick to implement, reusing current positioning strategy - ---- - -## Key Design Decisions - -### 1. Fixed Repository List (Already In Place ✅) -- You already have this via `config.toml` -- No changes needed to repo fetching -- Data pipeline stays the same - -### 2. 
Contribution is Automatic -- We automatically find all contributors to your tracked repos -- No manual maintenance needed -- Scales as repos grow - -### 3. Sponsor List is Configurable -- Simple config option specifies who is "sponsored" -- Can be updated by editing `config.toml` -- Defaults to existing `[contributors.devseed]` section - -### 4. Layout Strategy - -**Chosen Approach - Option A: Reuse Existing Simulation** (Primary Implementation) -- Community contributors use existing "remaining" positioning -- Minimal code changes -- Follows current behavior for extras -- Quick to implement (1 week) - -**Future Potential - Option B Considered** -- Potential future enhancement: Create a dedicated outer ring -- Could be explored in later iterations if needed -- Currently deprioritized in favor of rapid implementation - ---- - -(Rest of the document remains the same) diff --git a/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md b/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md deleted file mode 100644 index e8118bc..0000000 --- a/docs/sponsor_centric/IMPLEMENTATION_ROADMAP.md +++ /dev/null @@ -1,50 +0,0 @@ -# Implementation Roadmap: Tiered Contributor Visualization - -**For:** Feature Request - Sponsored vs. Community Contributor Visualization -**Status:** Ready to Start -**Last Updated:** February 2026 - ---- - -## Overview - -This document provides a step-by-step implementation guide for adding tiered contributor visualization to the Contributor Network project, focusing on the primary Option A approach: reusing existing simulation for community contributors. 
- ---- - -## Implementation Strategy: Option A (Existing Simulation) - -### Primary Goals -- Leverage existing "remaining" simulation -- Minimal code changes -- Quick implementation (1 week) -- Clear visual distinction between sponsored and community contributors - -### Key Characteristics -- Sponsored contributors remain in central ring -- Community contributors scattered using current positioning logic -- No complex new force simulations required -- Backward compatible with existing visualization - -(Rest of the document remains largely the same, with Option A emphasized in key sections) - -### Step 2.2: Community Contributor Layout Strategy - -**Option A: Use Existing "Remaining" Simulation** (Primary Approach) -- ✅ Reuse the existing `remainingSimulation.js` for community contributors -- ✅ Position sponsored contributors in the ring (unchanged) -- ✅ Community contributors positioned scattered around (as currently done for extras) -- ✅ **Recommended Primary Implementation** - -**Option B: Custom Simulation** (Future Enhancement) -- Potential future iteration -- Create a dedicated outer ring simulation -- More complex, requires additional development time -- Currently deprioritized - -**Implementation Recommendation: Start with Option A** -- Provides functional solution quickly -- Allows for future refinement -- Minimal risk to existing codebase - -(Rest of the document remains the same) diff --git a/docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md b/docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md deleted file mode 100644 index e998366..0000000 --- a/docs/sponsor_centric/SIMULATION_IMPLEMENTATION_DETAILS.md +++ /dev/null @@ -1,176 +0,0 @@ -# Community Contributor Simulation Implementation - -**Date:** February 2026 -**Status:** Technical Design - -## Overview - -This document provides a detailed technical design for implementing the community contributor simulation, addressing the lack of an existing "remaining simulation" in the current 
codebase. - -## Simulation Requirements - -1. Position community contributors outside the main contributor ring -2. Prevent node overlap -3. Create a visually appealing, scattered layout -4. Maintain performance and scalability -5. Integrate with existing visualization pipeline - -## Proposed Implementation - -### JavaScript Force Simulation Design - -```javascript -import * as d3 from 'd3'; - -export class CommunityContributorSimulation { - constructor(options = {}) { - // Configurable parameters - this.defaultOptions = { - centerX: 0, - centerY: 0, - mainRingRadius: 300, - communityRingMultiplier: 1.5, - nodeRadius: 20, - forceStrength: 0.5, - collisionStrength: 0.7 - }; - - this.options = { ...this.defaultOptions, ...options }; - } - - /** - * Create force simulation for community contributors - * @param {Array} communityNodes - Array of community contributor nodes - * @returns {d3.Simulation} Configured force simulation - */ - create(communityNodes) { - const { - centerX, - centerY, - mainRingRadius, - communityRingMultiplier, - nodeRadius, - forceStrength, - collisionStrength - } = this.options; - - const communityRingRadius = mainRingRadius * communityRingMultiplier; - - return d3.forceSimulation(communityNodes) - // Radial force: scatter nodes in a ring around the center - .force('radial', d3.forceRadial( - // Slight randomness in radius for scattered effect - node => communityRingRadius + (Math.random() * 50 - 25), - centerX, - centerY - ).strength(forceStrength)) - - // Collision force: prevent node overlap - .force('collide', d3.forceCollide(nodeRadius * 2) - .strength(collisionStrength)) - - // Charge force: gentle repulsion between nodes - .force('charge', d3.forceManyBody() - .strength(-30)) - - // Center force: keep nodes near visualization center - .force('center', d3.forceCenter(centerX, centerY)) - - .stop(); // Manually tick the simulation - } - - /** - * Manually run simulation to stable state - * @param {d3.Simulation} simulation - Force 
simulation instance - * @param {number} tickCount - Number of simulation ticks - */ - stabilize(simulation, tickCount = 100) { - for (let i = 0; i < tickCount; i++) { - simulation.tick(); - } - } -} - -// Usage example -function initializeCommunityContributors(communityNodes, mainRingRadius) { - const simulation = new CommunityContributorSimulation({ - mainRingRadius: mainRingRadius - }); - - const communitySimulation = simulation.create(communityNodes); - simulation.stabilize(communitySimulation); - - return communityNodes; -} -``` - -## Design Considerations - -### Configurability -- Simulation parameters can be adjusted without changing core logic -- Supports different visualization requirements -- Easy to experiment with layout strategies - -### Performance -- Uses D3's efficient force simulation -- Manual simulation stabilization -- Configurable tick count for performance tuning - -### Flexibility -- Can be easily integrated with existing visualization -- Supports dynamic node count -- Provides predictable scattered layout - -## Integration with Existing Visualization - -```javascript -function updateVisualization(nodes) { - const sponsoredNodes = nodes.filter(n => n.tier === 'sponsored'); - const communityNodes = nodes.filter(n => n.tier === 'community'); - - // Existing contributor ring simulation - positionContributorNodes(sponsoredNodes); - - // New community contributor simulation - const mainRingRadius = calculateMainRingRadius(sponsoredNodes); - initializeCommunityContributors(communityNodes, mainRingRadius); - - // Render nodes - renderNodes(nodes); -} -``` - -## Testing Strategy - -1. **Unit Tests** - - Verify simulation creates nodes - - Check node positioning - - Test configuration options - -2. **Visual Regression Tests** - - Snapshot testing of node layouts - - Verify no node overlap - - Check consistent positioning - -3. 
**Performance Tests** - - Benchmark simulation with various node counts - - Profile memory and CPU usage - -## Potential Future Enhancements - -- Machine learning-based node positioning -- More advanced collision detection -- Animated transitions between layouts -- Configurable layout algorithms - -## Recommended Next Steps - -1. Implement simulation class -2. Create comprehensive test suite -3. Integrate with existing visualization -4. Perform user testing and gather feedback - ---- - -**Status:** Ready for Implementation -**Last Updated:** February 2026 From c1d214882bc7126d28dcf0bbdba9c8521af359b7 Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Tue, 17 Feb 2026 10:02:00 -0500 Subject: [PATCH 7/8] update docs for agents and remove a few more unnecessary docs --- docs/ARCHITECTURE.md | 477 ------------------------------- docs/CLAUDE.md | 165 +++++++++-- docs/DECISIONS.md | 421 --------------------------- docs/DEVELOPMENT_GUIDE.md | 467 ------------------------------ docs/JAVASCRIPT_REFACTORING.md | 306 -------------------- docs/ROADMAP.md | 508 --------------------------------- 6 files changed, 145 insertions(+), 2199 deletions(-) delete mode 100644 docs/ARCHITECTURE.md delete mode 100644 docs/DECISIONS.md delete mode 100644 docs/DEVELOPMENT_GUIDE.md delete mode 100644 docs/JAVASCRIPT_REFACTORING.md delete mode 100644 docs/ROADMAP.md diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md deleted file mode 100644 index 0e48807..0000000 --- a/docs/ARCHITECTURE.md +++ /dev/null @@ -1,477 +0,0 @@ -# Architecture Overview - -How the codebase is organized and how the different pieces fit together. 
- ---- - -## High-Level Data Flow - -``` -GitHub API - ↓ -Python CLI (client.py) - ↓ -JSON Files (assets/data/) - ↓ -CSV Generation (csvs command) - ↓ -D3.js Visualization (index.html) - ↓ -Interactive Web App -``` - ---- - -## Project Organization - -### Python Backend - -**Purpose:** Fetch data from GitHub, validate it, generate CSVs, and build the static site. - -**Key files:** -- `python/contributor_network/cli.py` - All CLI commands (Click-based) -- `python/contributor_network/client.py` - GitHub API wrapper (uses PyGithub) -- `python/contributor_network/config.py` - Configuration models (Pydantic) -- `python/contributor_network/models.py` - Data models (Repository, Link, Contributor) - -**Architecture:** -``` -CLI (cli.py) - ↓ -Client (client.py) ← Queries GitHub API - ↓ -Models (models.py) ← Validates & structures data - ↓ -Config (config.py) ← Loads from config.toml - ↓ -JSON files / CSVs / Templates -``` - -**Data Flow for `data` command:** -1. Load repositories from `config.toml` -2. For each repo, query GitHub API (client.py) -3. Fetch contributions, commit dates, repository metadata -4. Validate data with Pydantic models (models.py) -5. Save to JSON files in `assets/data/` - -### JavaScript Frontend - -**Purpose:** Load CSV data, prepare it for visualization, render with D3.js, handle interactions. - -**Architecture Status:** Modular ES6 modules - -**Organization:** -``` -src/js/ -├── config/ # Configuration (theme, scales, constants) -├── data/ # Data loading, filtering, preparation -├── interaction/ # Mouse/keyboard event handlers -├── layout/ # Canvas sizing, node positioning -├── render/ # Drawing functions (shapes, text, labels, etc.) 
-├── simulations/ # D3 force simulations -├── state/ # State management (filters, hover/click state) -└── utils/ # Helpers (formatters, validation, debugging) -``` - -**Current State:** -- ✅ 29 modules extracted -- ✅ 4,642 lines in modular files -- 🟡 Main orchestrator still contains ~2,059 lines -- 🟡 Largest remaining extraction: `prepareData()` (~515 lines) - ---- - -## Key Concepts - -### Nodes - -Three types of nodes in the visualization: - -1. **Contributors** - Team members (arranged in a circle) - - Position: Alphabetically around outer ring - - Color: Based on organization - - Size: Based on total contributions - -2. **Repositories** - GitHub projects - - Position: Determined by force simulation (depends on collaboration pattern) - - Color: Coded by ownership type - - Grouped by: Single owner, single contributor, multiple contributors - -3. **Owners** - Repository owners (intermediary nodes when repos grouped by owner) - - Position: Calculated to be between repos and contributors - - Purpose: Organize related repos visually - -### Links - -Connections between nodes: - -- **Contributor → Owner → Repository** (when repos grouped by owner) -- **Contributor → Repository** (direct links) -- Width: Based on commit count -- Opacity: Based on recency of contribution - -### Force Simulations - -D3.js force simulations position nodes without overlapping: - -1. **Owner Simulation** - Repos with single owner (25% of repos) - - Pulls repos toward owner - - Prevents overlap - -2. **Contributor Simulation** - Repos with single DevSeed contributor (50%) - - Pulls repos toward contributor - - Uses strong charge force to prevent clustering - -3. **Collaboration Simulation** - Repos with multiple contributors (20%) - - Centers at origin - - Balances contributors' influence - - Tighter clustering - -4. 
**Remaining Simulation** - Contributors outside main circle - - Positioned in outer ring - - Separated from main visualization - ---- - -## Configuration - -### `config.toml` Structure - -```toml -[repositories] -"owner/repo-name" = "Display Name" -"another-owner/project" = "Another Project" - -[contributors.devseed] -github_username = "Display Name" -another_username = "Another Name" - -[contributors.alumni] -old_member = "Old Member Name" -``` - -### Data Models (`models.py`) - -**Repository:** -- ID, full name, URL -- Metrics: stars, forks, watchers, open issues -- Metadata: languages, topics, license, created/updated dates -- Community: total contributors, DevSeed contributors, community ratio - -**Link** (Contributor → Repository relationship): -- Contributor & repo IDs -- Commit count and dates (first/last) -- Contribution span (days) -- Recency flag - ---- - -## State Management - -### Filter State - -Managed in `src/js/state/filterState.js` - -```javascript -{ - organizations: [], // Selected org filters - starsMin: null, - forksMin: null, - watchersMin: null, - language: null -} -``` - -### Interaction State - -Managed in `src/js/state/interactionState.js` - -```javascript -{ - hoverActive: false, - hoveredNode: null, - clickActive: false, - clickedNode: null, - delaunay: null // For mouse position detection -} -``` - ---- - -## Data Processing Pipeline - -### 1. Loading (`src/js/visualization/index.js`) -- Fetch CSV files (repositories.csv, contributors.csv) -- Parse into objects -- Build node and link objects - -### 2. Preparation (`src/js/data/prepare.js` - currently in progress to extract) -- Create node objects from data -- Build link arrays -- Calculate positions -- Determine colors based on metadata - -### 3. Filtering (`src/js/data/filter.js`) -- Apply active filters (organization, stars, language, etc.) -- Cascade: filter repos → filter links → filter contributors -- Rebuild only affected links - -### 4. 
Simulation (`src/js/simulations/`) -- Run D3 force simulations to position nodes -- Multiple simulations based on repo grouping -- Calculate contributor ring positions - -### 5. Rendering (`src/js/render/`) -- Draw nodes (circles with optional patterns) -- Draw links (curved paths with gradients) -- Draw labels (rotated for contributor ring) -- Draw tooltips on hover/click - -### 6. Interaction (`src/js/interaction/`) -- Track mouse position with Delaunay triangulation -- Trigger hover state on node entry -- Handle click for selection -- Filter links shown on hover - ---- - -## Rendering Pipeline - -### Canvas Layers - -The visualization uses multiple canvas layers (composited in HTML): - -1. **Main canvas** - Nodes and links (performance-critical) -2. **Tooltip canvas** - Hover/click information cards -3. **Label canvas** - Node labels -4. **Hover canvas** - Temporary highlighting - -### Drawing Performance - -**Why Canvas instead of SVG?** -- 200+ interactive nodes + 500+ links would be slow in SVG -- Canvas provides better performance for this density -- D3 force simulation updates positions ~60 times per second - -**Optimization strategies:** -- Request animation frame batching -- Partial redraws (only affected regions) -- Delaunay triangulation for fast node detection - ---- - -## Refactoring Status - -### What's Been Modularized ✅ - -| Area | Lines | Status | -|------|-------|--------| -| Config | 240 | ✅ Complete | -| Data filtering | 217 | ✅ Complete | -| State | 173 | ✅ Complete | -| Simulations | 529 | ✅ Complete | -| Interaction | 239 | ✅ Complete | -| Render (shapes, text, tooltips) | 1,474 | ✅ Complete | -| Layout | 329 | ✅ Complete | -| Utils | 606 | ✅ Complete | -| **Total Modular** | **4,642** | ✅ **Complete** | - -### What Still Needs Work 🟡 - -| Task | Lines | Priority | -|------|-------|----------| -| Extract `prepareData()` | ~515 | High | -| Extract `positionContributorNodes()` | ~117 | High | -| Simplify main `draw()` | ~166 | High | -| 
Extract helper functions | ~100 | Medium | -| **Total Remaining** | **~898** | | - -**Target:** Main `index.js` from 2,059 lines → ~300-400 lines (thin orchestrator) - ---- - -## JavaScript Module Structure - -### Current Organization - -``` -src/js/ -├── config/ -│ ├── theme.js (119 lines) # Colors, fonts, layout constants -│ └── scales.js (121 lines) # D3 scale factories -├── data/ -│ └── filter.js (217 lines) # Filtering logic -├── state/ -│ ├── filterState.js (67 lines) # Filter state -│ └── interactionState.js (106 lines) # Hover/click state -├── simulations/ -│ ├── ownerSimulation.js (125 lines) -│ ├── contributorSimulation.js (132 lines) -│ ├── collaborationSimulation.js (188 lines) -│ ├── remainingSimulation.js (84 lines) -│ └── index.js (12 lines) # Re-exports -├── interaction/ -│ ├── hover.js (87 lines) -│ ├── click.js (85 lines) -│ └── findNode.js (67 lines) -├── layout/ -│ └── resize.js (122 lines) -├── render/ -│ ├── canvas.js (207 lines) -│ ├── shapes.js (277 lines) -│ ├── text.js (275 lines) -│ ├── tooltip.js (533 lines) # Largest module -│ ├── labels.js (141 lines) -│ └── repoCard.js (248 lines) -├── utils/ -│ ├── helpers.js (121 lines) -│ ├── formatters.js (153 lines) -│ ├── validation.js (185 lines) -│ └── debug.js (147 lines) -└── visualization/ - └── index.js (14 lines) # Exports -``` - ---- - -## Theme & Customization - -### Colors (in `src/js/config/theme.js`) - -**Brand Colors:** -- Grenadier Orange (#CF3F02) -- Aquamarine Blue (#2E86AB) -- Base Gray (#443F3F) - -**Node Colors:** -- Contributors: Varied by organization (from color palette) -- Repositories: Coded by ownership pattern (single owner, single contributor, shared) -- Owners: Gray/neutral - -**Link Colors:** -- Gradient from contributor color to repo color -- Opacity: Based on recency (recent = more opaque) - -### Font Configuration - -```javascript -FONTS = { - family: "...", - baseSizeContributor: 11, // To be increased to ~14 - baseSizeRepo: 10, // To be increased to ~13 - 
baseSizeOwner: 12 // To be increased to ~15 -} -``` - ---- - -## Dependencies - -### Python -- `click` - CLI framework -- `pydantic` - Data validation -- `pygithub` - GitHub API client -- `requests` - HTTP library -- `tomli` - TOML parsing -- `pytest` - Testing - -### JavaScript -- `d3` - Visualization and force simulations -- `vitest` - Testing framework -- ~~esbuild~~ - Bundling (in package.json, but not active build) - ---- - -## How It All Fits Together - -**User visits the site:** - -1. **index.html** loads JavaScript modules from `src/js/` -2. **visualization/index.js** creates a chart function -3. Chart function: - - Loads CSV data from `assets/data/` - - Calls `prepareData()` to transform raw data into nodes/links - - Runs force simulations to position nodes - - Sets up event handlers (hover, click) - - Starts animation loop - -4. **Animation loop** (`draw()` function): - - Updates node positions (from force simulation) - - Redraws canvas - - Shows/hides tooltips based on interaction state - -5. **User interaction**: - - Mouse move → detect node via Delaunay triangulation - - Mouse over node → highlight and show tooltip - - Click node → select for detailed view - - Change filter → re-run cascade, rebuild visualization - -6. **Data refresh** (happens offline): - - User runs `uv run contributor-network data` - - GitHub data fetched and validated - - User runs `uv run contributor-network csvs` - - CSV files updated in `assets/data/` - - Next page refresh loads new data - ---- - -## Common Patterns - -### Module Pattern - -All modules export functions, not classes: - -```javascript -// src/js/utils/formatters.js -export function formatDate(timestamp) { /* ... */ } -export function formatNumber(num) { /* ... 
*/ } -``` - -```javascript -// Usage in another module -import { formatDate } from '../utils/formatters.js'; -const dateStr = formatDate(timestamp); -``` - -### State Management - -Simple, predictable state updates: - -```javascript -// Create initial state -let state = createInteractionState(); - -// Update immutably -state = setHovered(state, hoveredNode); -state = setClicked(state, clickedNode); -``` - -### Configuration - -All magic numbers and constants centralized: - -```javascript -// In config/theme.js -export const COLORS = { /* ... */ }; -export const LAYOUT = { /* ... */ }; -export const FONTS = { /* ... */ }; -``` - ---- - -## Next Steps for Refactoring - -**High Priority:** -1. Extract `prepareData()` → `data/prepare.js` (~515 lines) -2. Extract `positionContributorNodes()` → `layout/positioning.js` (~117 lines) -3. Simplify main `draw()` function - -**Medium Priority:** -4. Extract `drawHoverState()` → `render/hoverState.js` -5. Extract remaining helper functions - -**Result:** Main orchestrator becomes ~300 lines (thin coordinating layer) - ---- - -**Last Updated**: February 2026 diff --git a/docs/CLAUDE.md b/docs/CLAUDE.md index bba0c69..3cad21f 100644 --- a/docs/CLAUDE.md +++ b/docs/CLAUDE.md @@ -2,6 +2,8 @@ **Start here.** This file provides quick orientation for anyone working with this codebase. +> **Important for AI agents:** When you make changes to the codebase, update the relevant documentation in `docs/` and `README.md` to reflect those changes. Keep `README.md` short, concise, and human-readable -- it is the public-facing project overview. This file (`CLAUDE.md`) is the detailed reference for developers and agents. + ## What Is This? An interactive D3.js web visualization of Development Seed's contributions to open-source projects. Shows the relationships between team members, repositories, and collaborators. 
@@ -16,10 +18,6 @@ An interactive D3.js web visualization of Development Seed's contributions to op **First read**: [`PRD.md`](./PRD.md) (5 min) - Understand what this product is and why it exists. -**Then read**: [`DEVELOPMENT_GUIDE.md`](./DEVELOPMENT_GUIDE.md) (10 min) - Set up your local environment. - -**Then explore**: [`ARCHITECTURE.md`](./ARCHITECTURE.md) (15 min) - Understand how the code is organized. - --- ## Quick Start @@ -95,15 +93,40 @@ python/ # Python backend (CLI) templates/ # Jinja2 HTML templates src/js/ # JavaScript frontend (modular) - index.js # Barrel exports - config/ # Theme, scales, constants - data/ # Data filtering and prep - interaction/ # Hover, click handlers - layout/ # Sizing, positioning - render/ # Drawing (shapes, text, labels, tooltips) - simulations/ # D3 force simulations - state/ # State management - utils/ # Helpers, formatters, validation + index.js # Barrel exports (re-exports all modules) + visualization/ + index.js # Main visualization factory + config/ + theme.js # Colors, fonts, layout constants + scales.js # D3 scale factories + data/ + filter.js # Filtering logic + interaction/ + hover.js # Hover event handling + click.js # Click event handling + findNode.js # Node detection via Delaunay + layout/ + resize.js # Canvas resize handling + render/ + canvas.js # Canvas setup + shapes.js # Shape drawing utilities + text.js # Text rendering + tooltip.js # Tooltip rendering + labels.js # Node labels + repoCard.js # Repo details card + simulations/ + ownerSimulation.js # Owner node forces + contributorSimulation.js # Contributor node forces + collaborationSimulation.js # Collaboration link forces + remainingSimulation.js # Remaining/community node forces + state/ + filterState.js # Filter state + interactionState.js # Hover/click state + utils/ + helpers.js # Math utilities + formatters.js # Date/number formatting + validation.js # Data validation + debug.js # Debug logging __tests__/ # Unit tests assets/ @@ -146,17 +169,63 
@@ config.toml # Repository and contributor config --- +## Architecture Notes + +### Data Flow + +``` +GitHub API → Python CLI (client.py) → JSON files → CSV generation → D3.js visualization → Interactive web app +``` + +Inside the Python backend: `CLI (cli.py) → Client (client.py) → Models (models.py) → Config (config.py) → JSON/CSV output` + +### Data Storage + +Data is stored as JSON and CSV files (not a database). This keeps the project as a simple static site with no infrastructure to manage -- files are human-readable, version-controllable, and work offline. If the project grows past ~200 repositories or ~500 contributors, consider migrating to SQLite, then PostgreSQL. See `DATA_EXPANSION_PLAN.md` for details. + +### Visualization Concepts + +**Node types:** +- **Contributors** -- team members, arranged alphabetically in an outer ring, sized by total contributions +- **Repositories** -- GitHub projects, positioned by force simulation, color-coded by ownership type +- **Owners** -- intermediary nodes that visually group repos by their owner + +**Links** connect contributors to repositories (sometimes through owner nodes). Link width reflects commit count; opacity reflects recency of contribution. + +**Simulations**: Four separate D3 force simulations, each tuned for a different repo grouping pattern: +- **ownerSimulation** -- repos owned by the organization +- **contributorSimulation** -- repos with a single DevSeed contributor +- **collaborationSimulation** -- repos shared between multiple DevSeed contributors +- **remainingSimulation** -- community contributors outside the main circle + +### Rendering Pipeline + +The frontend processes data in this order: **Load → Prepare → Filter → Simulate → Render → Interact** + +The visualization uses multiple composited canvas layers for performance: main (nodes + links), tooltip, labels, and hover highlighting. Canvas is used instead of SVG because 200+ nodes and 500+ links would be too slow as DOM elements. 
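The link encoding described under Visualization Concepts (width from commit count, opacity from recency) can be sketched roughly as follows. This is an illustration only, not the repository's code: `linkStyle`, its option names, and every default value are assumptions chosen for the example, and the real scales live in `config/scales.js` and may behave differently.

```javascript
// Hypothetical helper, not taken from the repository: maps a link's commit
// count and last-contribution date to a stroke width and an opacity.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Clamp a value into [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

function linkStyle(link, options = {}) {
  const {
    maxCommits = 500, // assumed commit count that maps to the widest link
    minWidth = 1,
    maxWidth = 8,
    fadeDays = 730, // assumed age at which a link reaches minimum opacity
    minOpacity = 0.15,
    maxOpacity = 0.9,
    now = Date.now(),
  } = options;

  // Square-root scaling keeps a few very active links from dwarfing the rest.
  const commitRatio = Math.sqrt(clamp(link.commitCount / maxCommits, 0, 1));
  const width = minWidth + commitRatio * (maxWidth - minWidth);

  // Recency is 1 for a contribution made "now" and 0 at or beyond fadeDays.
  const ageDays = (now - link.lastCommit.getTime()) / MS_PER_DAY;
  const recency = 1 - clamp(ageDays / fadeDays, 0, 1);
  const opacity = minOpacity + recency * (maxOpacity - minOpacity);

  return { width, opacity };
}
```

On a canvas, the result would typically be applied by setting `ctx.lineWidth` and `ctx.globalAlpha` before stroking the link path.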
+ +### Code Patterns + +- All JS modules export **functions, not classes** +- State updates are **immutable** (e.g., `state = setHovered(state, node)`) +- All magic numbers and constants are **centralized** in `config/theme.js` + +### Dependencies + +**Python:** click, pydantic, pygithub, requests, tomli, pytest + +**JavaScript:** d3, vitest + +--- + ## Documentation Structure | Document | Purpose | Read When | |----------|---------|-----------| +| **README.md** (root) | Project overview, CLI reference, full workflows | Understanding the product and CLI usage | | **PRD.md** | Product requirements and vision | First - understand the *why* | -| **DEVELOPMENT_GUIDE.md** | Setup, workflows, local development | Setting up your environment | -| **ARCHITECTURE.md** | Code organization, current state | Understanding code structure | -| **JAVASCRIPT_REFACTORING.md** | JS modularization progress and roadmap | Working on frontend code | | **roadmap.md** | Project status, planned features, and implementation status | Planning new work | | **DATA_EXPANSION_PLAN.md** | Data collection phases (1-5) with details | Adding new data fields | -| **DECISIONS.md** | Architectural decisions and tradeoffs | Curious about design choices | --- @@ -183,12 +252,38 @@ Configured in `src/js/config/theme.js`. 1. Edit `config.toml` - add to `[contributors.devseed]` or `[contributors.alumni]` 2. Re-run data fetch and build (above) +### Making Frontend Changes + +JavaScript files in `src/js/` are auto-available to the browser without a build step during development. + +1. Make changes to files in `src/js/` +2. Refresh http://localhost:8000/ in your browser +3. Run tests to verify: `npm test` + +**Note:** If you modify `src/js/chart.js` (the main visualization), it compiles to `chart.js` in the root. If you add new modules to `src/js/`, export them from `src/js/index.js`. 
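When editing frontend modules, the immutable state convention listed under Code Patterns (`state = setHovered(state, node)`) might look like the sketch below. The function names come from this documentation, but the bodies and the `hovered`/`clicked` field names are assumptions; the actual `src/js/state/interactionState.js` may store different fields.

```javascript
// Sketch only: field names and function bodies are assumed, not copied
// from src/js/state/interactionState.js.
function createInteractionState() {
  return Object.freeze({ hovered: null, clicked: null });
}

// Each setter returns a new frozen object and leaves its input untouched.
function setHovered(state, node) {
  return Object.freeze({ ...state, hovered: node });
}

function setClicked(state, node) {
  return Object.freeze({ ...state, clicked: node });
}

// Usage: reassign the variable instead of mutating the object.
const initialState = createInteractionState();
let state = setHovered(initialState, { id: "repo-a" });
state = setClicked(state, { id: "repo-b" });
```

Because every update produces a new object, a caller can compare references (`state !== initialState`) to decide whether a redraw is needed.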
+ +### Customizing the Visualization + +- **Colors & Fonts**: Edit `src/js/config/theme.js` +- **Layout Constants**: Edit the `LAYOUT` object in `src/js/config/theme.js` +- **Filters Available**: Check `src/js/state/filterState.js` +- **Data Filtering Logic**: See `src/js/data/filter.js` + ### Debug Visualization Issues - Open DevTools (F12) - Look for `debug-contributor-network` flag in console - Check network tab to see what data was loaded - See `src/js/utils/debug.js` for debug utilities +### Debug Python Issues + +```bash +# Run a specific test with verbose output +uv run pytest python/tests/test_file.py -v -s +``` + +The `-s` flag shows print statements and logging output. + ### Run Tests ```bash # Python @@ -202,6 +297,38 @@ npm test -- --watch --- +## Troubleshooting + +### "GitHub API rate limit exceeded" +- Make sure you're using a GitHub token: `export GITHUB_TOKEN="your_token"` +- Unauthenticated requests have a much lower limit (60/hour vs 5,000/hour) +- Wait an hour for the limit to reset, or wait for exponential backoff retry logic + +### `uv: command not found` +```bash +# Install uv +curl -LsSf https://astral.sh/uv/install.sh | sh + +# Or on macOS with Homebrew +brew install uv +``` + +### Changes to `src/js/` aren't showing up +1. Make sure you're running `python -m http.server 8000` +2. Hard-refresh your browser: Ctrl+Shift+R (or Cmd+Shift+R on Mac) +3. Check that the file was actually saved + +### Tests are failing +```bash +# Run tests with verbose output +npm test -- --reporter=verbose + +# Or for Python +uv run pytest -v +``` + +--- + ## Current Project Status See [`roadmap.md`](./roadmap.md) for full project status, planned features, and roadmap details. @@ -210,10 +337,8 @@ See [`roadmap.md`](./roadmap.md) for full project status, planned features, and ## Need Help? 
-- **Setting up?** → `DEVELOPMENT_GUIDE.md` -- **How does the code work?** → `ARCHITECTURE.md` +- **Project overview and CLI usage?** → `README.md` (root) - **What are we building next?** → `roadmap.md` or `DATA_EXPANSION_PLAN.md` -- **Why was a decision made?** → `DECISIONS.md` - **What's the product for?** → `PRD.md` --- diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md deleted file mode 100644 index 9cb6195..0000000 --- a/docs/DECISIONS.md +++ /dev/null @@ -1,421 +0,0 @@ -# Architectural Decisions - -Record of key decisions made in the project and their rationale. - ---- - -## Canvas Rendering (Not SVG) ✅ DECIDED - -**Decision:** Use HTML5 Canvas for rendering instead of SVG - -**Context:** -- Visualization has 200+ interactive nodes and 500+ links -- Need 60 FPS interaction response (hover, drag simulations) -- SVG would create DOM nodes for each element - -**Rationale:** -- Canvas provides superior performance for high-density visualizations -- SVG with D3 would create 700+ DOM elements (too slow) -- Canvas allows per-pixel control with animation frame batching -- Accepted tradeoff: Canvas requires custom tooltip/interaction logic (not automatic) - -**Consequences:** -- Responsible for all rendering logic (no automatic redraw) -- Must implement custom hit detection (solved with Delaunay triangulation) -- More complex tooltip positioning -- Better performance (60 FPS achievable) - -**Alternatives Considered:** -- WebGL: Overkill for this use case, harder to debug -- Hybrid (SVG + Canvas): Complexity not worth it - ---- - -## JSON/CSV Data Files (Not Database) ✅ DECIDED - -**Decision:** Store data as JSON files and export to CSV (not database) - -**Context:** -- Currently ~50 repositories, ~30 contributors -- Data collected offline (GitHub API → files → visualization) -- Need simple deployment (static site to CDN) - -**Rationale:** -- No infrastructure to manage (no database server) -- Easy to debug (human-readable files) -- Version controllable with git -- 
Simple to onboard new contributors (just `uv sync`) -- Works offline for local development - -**Consequences:** -- Must load all data into memory (works fine at current scale) -- No real-time filtering/querying capabilities -- File I/O overhead (negligible at current size) - -**When to Reconsider:** -- > 200 repositories -- > 500 contributors -- Need real-time updates -- Multiple services need access - -**Migration Path:** -- Stage 1: SQLite (local, single-file) when data volume warrants -- Stage 2: PostgreSQL (if multiple consumers needed) - -See `DATA_EXPANSION_PLAN.md` for detailed database discussion. - ---- - -## Modular JavaScript Architecture ✅ DECIDED - -**Decision:** Refactor JavaScript from monolith to modular structure - -**Context:** -- Original codebase: 3,400+ line `index.js` (from ORCA template) -- Hard to review, test, extend -- Multiple responsibilities: data prep, layout, rendering, interaction - -**Rationale:** -- Each module has single responsibility -- Easier to test in isolation -- Improves code review process (<300 lines per file) -- Supports future contributors and maintainability -- Each module becomes focused and reusable - -**Progress:** -- 60% complete (29 modules extracted) -- 4,642 lines in modular code -- Main file reduced from 3,400 → 2,059 lines -- ~900 lines remaining to extract - -**Target State:** -- Main orchestrator: ~300 lines (thin coordinating layer) -- All other modules: <300 lines each -- Clear data flow: Load → Prepare → Simulate → Render → Interact - -**Implementation Approach:** -- Extract gradually (don't rewrite from scratch) -- Keep tests passing after each extraction -- Each commit focuses on one extraction -- No rewrite of logic, just reorganization - ---- - -## Pydantic for Data Validation ✅ DECIDED - -**Decision:** Use Pydantic models for all data structures (Python) - -**Context:** -- Need to validate GitHub API responses -- Multiple sources of data (GitHub API, config files, generated) -- Type safety and 
runtime validation needed - -**Rationale:** -- Validates structure at entry point -- Type hints catch errors early -- Clear error messages when validation fails -- Automatic JSON serialization -- Works with mypy for static type checking - -**Implementation:** -- `models.py` defines Pydantic models for Repository, Link, etc. -- `client.py` converts GitHub API responses to our models -- `config.py` validates TOML configuration - -**Alternatives Considered:** -- Dataclasses: Less validation capability -- Plain dicts: No type safety, error-prone - ---- - -## Click for CLI Framework ✅ DECIDED - -**Decision:** Use Click for Python CLI commands - -**Context:** -- Need multiple subcommands (data, csvs, build, discover, list-contributors) -- Must be easy to use and document -- Should work well with automated workflows - -**Rationale:** -- Simple decorator-based command definition -- Automatic help text generation -- Type-safe argument/option handling -- Easy to test - -**Commands Implemented:** -- `data` - Fetch from GitHub -- `csvs` - Generate CSV exports -- `build` - Build static site -- `discover` - Find new repos -- `list-contributors` - Show configured contributors - -**Alternatives Considered:** -- argparse: More verbose -- Typer: Newer, nice syntax but less mature at time of decision - ---- - -## D3.js Force Simulations (Not Manual Layout) ✅ DECIDED - -**Decision:** Use D3 force-directed simulations to position nodes - -**Context:** -- Visualization shows relationships between nodes -- Hundreds of edges between nodes -- Need to avoid overlap and show structure - -**Rationale:** -- Force simulations naturally cluster related items -- Prevents overlap automatically (collision detection) -- Produces intuitive, readable layouts -- Interactive repositioning possible in future -- D3 provides proven, tested implementation - -**Architecture:** -- Four separate simulations for different repo grouping patterns -- Each simulation optimized for its use case -- Contributes to 
final layout naturally - -**Alternatives Considered:** -- Hierarchical tree layout: Doesn't fit network structure -- Grid layout: Too regular, loses relationship information -- Manual positioning: Not scalable as data grows - ---- - -## TypeScript Not Used ✅ DECIDED - -**Decision:** Use vanilla JavaScript (ES6 modules) without TypeScript - -**Context:** -- Relatively small frontend codebase -- Team comfort with JavaScript -- Deployment to static site (no build step needed) -- Fast iteration during development - -**Rationale:** -- No build step overhead during development -- Files immediately available in browser -- Changes visible without refresh -- Simpler development workflow -- Small module size keeps files manageable - -**Trade-offs:** -- Less compile-time type checking -- Rely on JSDoc for type hints -- Rely on testing for correctness - -**Alternatives Considered:** -- TypeScript: Good for larger projects, but adds complexity -- Flow: Similar issues to TypeScript - ---- - -## No Transpilation (ES6 Modules) ✅ DECIDED - -**Decision:** Use modern ES6 modules directly (no Babel transpilation) - -**Context:** -- All modern browsers support ES6 modules -- Simplifies development workflow -- Reduces build complexity - -**Rationale:** -- Developers can see their changes immediately -- No build step required during development -- Smaller cognitive overhead -- Works fine for this project's scale - -**Requirements:** -- Users must have modern browsers (works with all current major browsers) -- Not optimized for IE11 (but that's acceptable) - -**Deployment:** -- esbuild for production bundling (if needed) -- Currently deployed as static modules - ---- - -## Separate Simulations by Repo Type ✅ DECIDED - -**Decision:** Use different D3 force simulations based on how repos are grouped - -**Context:** -- Some repos belong to single owner -- Some repos have single DevSeed contributor -- Some repos have multiple collaborators -- Positioning needs differ for each type - 
-**Rationale:** -- Optimized force parameters for each scenario -- Natural clustering by collaboration pattern -- Cleaner visual hierarchy -- Prevents one type dominating layout - -**Implementation:** -``` -- ownerSimulation: Repos with single owner -- contributorSimulation: Repos with single DevSeed contributor -- collaborationSimulation: Repos shared between multiple contributors -- remainingSimulation: Contributors outside main circle -``` - -**Alternatives Considered:** -- Single universal simulation: Would require compromise on parameters -- Manual positioning: Doesn't scale, not reusable - ---- - -## Configuration via TOML ✅ DECIDED - -**Decision:** Use TOML for configuration (repositories, contributors) - -**Context:** -- Need to specify: - - Which repos to track - - Which contributors to include - - How to group/filter data - -**Rationale:** -- Human-readable and writable -- Better than JSON for configuration -- Python stdlib support (via tomli) -- Easy to edit without breaking structure - -**Example:** -```toml -[repositories] -"owner/repo" = "Display Name" - -[contributors.devseed] -github_username = "Display Name" -``` - -**Alternatives Considered:** -- JSON: More formal, harder to write -- YAML: Whitespace sensitivity can be error-prone -- Python file: Security concerns, harder to review - ---- - -## Removed ORCA Code ✅ DECIDED - -**Decision:** Remove ORCA-specific code and rebrand visualization - -**Context:** -- Project started as ORCA (top-contributor-network) -- Needed to make it DevSeed-specific -- ORCA code was a foundation, not meant to be kept - -**Changes Made:** -- ✅ Renamed `createORCAVisual` → `createContributorNetworkVisual` -- ✅ Removed ORCA-specific UI elements -- ✅ Updated debug flags (`orca-debug` → `debug-contributor-network`) -- ✅ Removed ORCA theming logic -- ✅ Updated branding to Development Seed colors -- ✅ Kept original MPL license and attribution - -**Rationale:** -- Make it clear this is the DevSeed visualization -- Avoid 
confusion with original ORCA project -- Simplify codebase (removed unused features) -- Establish clear ownership - -**Attribution:** -- License: MPL 2.0 (from ORCA) -- Credit: Original ORCA by Nadieh Bremer -- Link: https://github.com/nbremer/ORCA - ---- - -## Ruff for Python Linting/Formatting ✅ DECIDED - -**Decision:** Use Ruff instead of Black + Flake8 + isort - -**Context:** -- Python code quality tooling landscape fragmented -- Want unified approach -- Need fast, reliable tools - -**Rationale:** -- Single tool for format + lint + import sorting -- Very fast (written in Rust) -- Zero-config setup -- Compatible with Black formatting -- Better than individual tools - -**Configuration:** -- `pyproject.toml` defines settings -- CI runs `ruff format --check`, `ruff check`, `mypy` - -**Alternatives Considered:** -- Black + Flake8: Works but fragmented -- Pylint: Slower, more false positives -- Autopep8: Older approach - ---- - -## Vitest for JavaScript Testing ✅ DECIDED - -**Decision:** Use Vitest for unit testing JavaScript - -**Context:** -- Need to test modules independently -- Want fast test execution -- Want to test in Node (not browser) - -**Rationale:** -- Vitest is fast (built on Vite) -- Compatible with Jest syntax -- Zero-config with Vite setup -- Great for module testing - -**Test Coverage:** -- 75+ tests across modules -- Tests for filtering, validation, formatting, helpers -- More tests added as new modules extracted - -**Alternatives Considered:** -- Jest: Slower, larger -- Mocha: More setup required -- QUnit: Older approach - ---- - -## No Component Libraries ✅ DECIDED - -**Decision:** Build UI with vanilla HTML/CSS, no React/Vue/etc. 
- -**Context:** -- Small focused app (visualization + tooltip + controls) -- Performance critical -- No complex state management needs - -**Rationale:** -- Minimal dependencies -- Full control over rendering -- Better performance (Canvas + minimal DOM) -- Simpler deployment (static files) - -**Trade-off:** -- More manual DOM management for tooltips -- Build tooltips from scratch - -**When to Reconsider:** -- If dashboard complexity grows significantly -- If multiple views needed beyond visualization - ---- - -## Summary of Key Principles - -1. **Performance First** - Canvas rendering, modular code, fast tooling -2. **Simplicity** - No unnecessary frameworks, JSON/CSV data, static deployment -3. **Maintainability** - Modularization, testing, clear separation of concerns -4. **Scalability** - Design for growth, but don't over-engineer prematurely -5. **Attribution** - Respect original creators, use proper licensing - ---- - -**Last Updated**: February 2026 diff --git a/docs/DEVELOPMENT_GUIDE.md b/docs/DEVELOPMENT_GUIDE.md deleted file mode 100644 index 0b0185a..0000000 --- a/docs/DEVELOPMENT_GUIDE.md +++ /dev/null @@ -1,467 +0,0 @@ -# Development Guide - -How to set up your local environment, run the project, and make changes. - ---- - -## Prerequisites - -Before getting started, install: - -- **[uv](https://docs.astral.sh/uv/getting-started/installation/)** - Fast Python package manager (required) -- **[Node.js](https://nodejs.org/)** 18+ - For JavaScript tooling -- **[Git](https://git-scm.com/)** - For version control -- **GitHub personal access token** - With `public_repo` scope (for fetching data) - -### Getting a GitHub Token - -1. Go to https://github.com/settings/tokens -2. Click "Generate new token (classic)" -3. Give it a name (e.g., "contributor-network") -4. Check `public_repo` scope -5. Generate and copy the token -6. Store it somewhere safe (you'll use it for the `data` command) - ---- - -## Installation - -### 1. 
Clone the Repository - -```bash -git clone https://github.com/developmentseed/contributor-network.git -cd contributor-network -``` - -### 2. Install Python Dependencies - -```bash -uv sync -``` - -This installs all Python dependencies specified in `pyproject.toml` into a virtual environment. - -### 3. Install JavaScript Dependencies - -```bash -npm install -``` - -This installs D3.js, Vitest, and other frontend tooling. - ---- - -## Running Locally - -### Option A: View the Current Build - -If you already have data in `assets/data/`, you can view the built site locally: - -```bash -python -m http.server 8000 -``` - -Then open http://localhost:8000/ in your browser. - -### Option B: Full Workflow - Fetch Data & Build - -To update the visualization with fresh GitHub data: - -```bash -# 1. Set your GitHub token -export GITHUB_TOKEN="your_token_here" - -# 2. (Optional) Discover new repos that multiple team members contribute to -uv run contributor-network discover --min-contributors 2 - -# 3. Edit config.toml to add/remove repos or contributors - -# 4. Fetch data from GitHub -uv run contributor-network data assets/data assets/data - -# 5. Generate CSV files -uv run contributor-network csvs assets/data - -# 6. Build the static site -uv run contributor-network build assets/data dist - -# 7. View the built site -cd dist && python -m http.server 8000 -# Open http://localhost:8000/ -``` - ---- - -## Development Workflows - -### Making Frontend Changes - -The JavaScript files in `src/js/` are auto-available to the browser without a build step during development. - -**Workflow:** -1. Make changes to files in `src/js/` -2. Refresh http://localhost:8000/ in your browser -3. See changes immediately (no build required) -4. 
Run tests to verify: `npm test` - -**Special cases:** -- If you modify `src/js/chart.js` (the main visualization), it compiles to `chart.js` in the root -- If you add new modules to `src/js/`, export them from `src/js/index.js` - -### Making Backend Changes - -Python CLI changes take effect immediately (no build step needed). - -**Workflow:** -1. Make changes to `python/contributor_network/` -2. Re-run the CLI command: `uv run contributor-network ` -3. Changes are reflected -4. Run tests: `uv run pytest` - -### Adding a New Repository to Track - -1. **Get the repo URL** - e.g., `owner/repo-name` -2. **Edit `config.toml`**: - ```toml - [repositories] - "owner/repo-name" = "Display Name" - ``` -3. **Fetch fresh data**: - ```bash - export GITHUB_TOKEN="your_token_here" - uv run contributor-network data assets/data assets/data - ``` -4. **Regenerate CSVs**: - ```bash - uv run contributor-network csvs assets/data - ``` -5. **Rebuild the site**: - ```bash - uv run contributor-network build assets/data dist - ``` - -### Adding a New Contributor - -1. **Edit `config.toml`**: - ```toml - [contributors.devseed] - github_username = "Display Name" - - # Or for alumni/external: - [contributors.alumni] - github_username = "Display Name" - ``` -2. **Fetch data and rebuild** (same as above) - -### Customizing the Visualization - -**Colors & Fonts**: Edit `src/js/config/theme.js` - -**Layout Constants**: Edit the `LAYOUT` object in `src/js/config/theme.js` - -**Filters Available**: Check `src/js/state/filterState.js` to see what filters are available - -**Data Filtering Logic**: See `src/js/data/filter.js` for how filters are applied - ---- - -## Quality Checks & Tests - -### Python Quality Checks - -```bash -# Format check (no changes) -uv run ruff format --check . - -# Lint check (no changes) -uv run ruff check . 
- -# Type checking -uv run mypy - -# Run tests -uv run pytest - -# Run a specific test -uv run pytest python/tests/test_config.py::test_function_name -``` - -### Auto-Fix Python Issues - -```bash -# Auto-format all files -uv run ruff format . - -# Auto-fix fixable lint issues -uv run ruff check --fix . -``` - -### JavaScript Tests - -```bash -# Run all tests -npm test - -# Run tests in watch mode (re-run on file changes) -npm test -- --watch - -# Run a specific test file -npm test -- src/js/__tests__/filter.test.js -``` - -### Run All Checks (As in CI) - -```bash -# Python -uv run ruff format --check . -uv run ruff check . -uv run mypy -uv run pytest - -# JavaScript -npm test -``` - ---- - -## CLI Commands Reference - -### `list-contributors` - -List all configured contributors by category: - -```bash -uv run contributor-network list-contributors -``` - -**Output**: Shows current DevSeed, alumni, and external contributors. - -### `discover` - -Find new repositories where multiple DevSeed employees have contributed: - -```bash -export GITHUB_TOKEN="your_token_here" -uv run contributor-network discover --min-contributors 2 --limit 50 -``` - -**Options:** -- `--min-contributors N` - Minimum number of DevSeed contributors in repo (default: 2) -- `--limit N` - Limit results to N repos (default: 50) - -**Output**: Shows repos not yet in `config.toml` where DevSeed has activity. - -### `data` - -Fetch contribution data from GitHub for all configured repositories: - -```bash -export GITHUB_TOKEN="your_token_here" -uv run contributor-network data assets/data assets/data -``` - -**Arguments:** -- First arg: Input directory (where to save JSON data) - usually `assets/data` -- Second arg: Output directory (for this command, same as input) - usually `assets/data` - -**Options:** -- `--all-contributors` - Include alumni/friends, not just current DevSeed employees - -**Output**: Creates JSON files for each repository with contribution data. 
- -### `csvs` - -Generate CSV files from the fetched JSON data: - -```bash -uv run contributor-network csvs assets/data -``` - -**Argument:** Directory containing JSON files (usually `assets/data`) - -**Output:** Creates: -- `repositories.csv` - Repository metadata -- `contributors.csv` - Contributor-to-repo relationships - -### `build` - -Build the static site to deploy: - -```bash -uv run contributor-network build assets/data dist -``` - -**Arguments:** -- First arg: Data directory (`assets/data`) -- Second arg: Output directory (`dist`) - -**Output:** Creates `dist/` with static HTML/CSS/JS ready to deploy. - ---- - -## Project Structure for Developers - -### Python Backend - -``` -python/ -├── contributor_network/ -│ ├── __init__.py -│ ├── cli.py # Click CLI commands (entry point) -│ ├── client.py # GitHub API wrapper (uses PyGithub) -│ ├── config.py # Pydantic config models -│ ├── models.py # Data models (Repo, Link, etc.) -│ └── __main__.py # CLI entry point -├── templates/ -│ └── index.html.j2 # Jinja2 template for index.html -└── tests/ - └── test_*.py # Unit tests -``` - -**Key entry points:** -- `python/contributor_network/cli.py` - All CLI commands defined here -- `python/contributor_network/client.py` - GitHub API integration -- `python/contributor_network/models.py` - Data structure definitions - -### JavaScript Frontend - -``` -src/js/ -├── index.js # Barrel exports (re-exports all modules) -├── visualization/ -│ └── index.js # Main visualization factory -├── config/ # Configuration -│ ├── theme.js # Colors, fonts, layout constants -│ └── scales.js # D3 scale factories -├── data/ # Data operations -│ └── filter.js # Filtering logic -├── interaction/ # Event handlers -│ ├── hover.js # Hover event handling -│ ├── click.js # Click event handling -│ └── findNode.js # Node detection via Delaunay -├── layout/ # Layout & positioning -│ └── resize.js # Canvas resize handling -├── render/ # Drawing functions -│ ├── canvas.js # Canvas setup -│ ├── shapes.js # 
Shape drawing utilities -│ ├── text.js # Text rendering -│ ├── tooltip.js # Tooltip rendering -│ ├── labels.js # Node labels -│ └── repoCard.js # Repo details card -├── simulations/ # D3 force simulations -│ ├── ownerSimulation.js -│ ├── contributorSimulation.js -│ ├── collaborationSimulation.js -│ └── remainingSimulation.js -├── state/ # State management -│ ├── filterState.js # Filter state -│ └── interactionState.js # Hover/click state -├── utils/ # Utilities -│ ├── helpers.js # Math utilities -│ ├── formatters.js # Date/number formatting -│ ├── validation.js # Data validation -│ └── debug.js # Debug logging -└── __tests__/ # Unit tests -``` - ---- - -## Debugging Tips - -### JavaScript Debugging - -**In Browser DevTools:** -1. Open DevTools (F12) -2. Check the Console for errors -3. Look for debug output (the code logs with `debug-contributor-network` flag) -4. Inspect the Network tab to see what data was loaded -5. Check the Elements tab to inspect the canvas and DOM - -**Enable Debug Logging:** -```javascript -// In the browser console -localStorage.setItem('debug', 'debug-contributor-network'); -// Reload the page -``` - -**Check Loaded Data:** -```javascript -// In the browser console -// After the visualization loads, you can access: -console.log(window.data); // Raw data -console.log(window.nodes); // Processed nodes -``` - -### Python Debugging - -**Add print statements:** -```python -# In Python code -print(f"Debug: {variable_name}") # Will show in terminal - -# Or use logging -import logging -logger = logging.getLogger(__name__) -logger.debug(f"Debug message: {value}") -``` - -**Run a specific test with output:** -```bash -uv run pytest python/tests/test_file.py -v -s -``` - -The `-s` flag shows print statements and logging output. 
- ---- - -## Common Issues & Solutions - -### Issue: "GitHub API rate limit exceeded" - -**Solution:** -- Make sure you're using a GitHub token: `export GITHUB_TOKEN="your_token"` -- Unauthenticated requests have a much lower limit (60/hour vs 5,000/hour) -- Wait an hour for the limit to reset, or wait for exponential backoff retry logic - -### Issue: `uv: command not found` - -**Solution:** -```bash -# Install uv if you haven't -curl -LsSf https://astral.sh/uv/install.sh | sh - -# Or on macOS with homebrew -brew install uv -``` - -### Issue: Changes to `src/js/` aren't showing up - -**Solution:** -1. Make sure you're running `python -m http.server 8000` -2. Hard-refresh your browser: Ctrl+Shift+R (or Cmd+Shift+R on Mac) -3. Check that the file was actually saved - -### Issue: Tests are failing - -**Solution:** -```bash -# Run tests with verbose output -npm test -- --reporter=verbose - -# Or for Python -uv run pytest -v -``` - ---- - -## Where to Get Help - -- **Setup problems**: Check this file first -- **Code structure questions**: See `ARCHITECTURE.md` -- **What should I work on next?**: See `roadmap.md` -- **How does filtering work?**: See the code comments in `src/js/data/filter.js` -- **Data expansion ideas**: See `DATA_EXPANSION_PLAN.md` - ---- - -**Last Updated**: February 2026 diff --git a/docs/JAVASCRIPT_REFACTORING.md b/docs/JAVASCRIPT_REFACTORING.md deleted file mode 100644 index 3338db1..0000000 --- a/docs/JAVASCRIPT_REFACTORING.md +++ /dev/null @@ -1,306 +0,0 @@ -# JavaScript Refactoring Progress & Roadmap - -Current status of the JavaScript modularization effort. 
- ---- - -## Overall Progress: 60% Complete ✅ - -| Metric | Before | Current | Target | Status | -|--------|--------|---------|--------|--------| -| Main file size | 3,400+ lines | 2,059 lines | ~300-400 lines | 🟡 In Progress | -| Modular files | 0 | 29 modules | 29+ modules | ✅ Complete | -| Total modular code | 0 lines | 4,642 lines | ~4,500 lines | ✅ Complete | -| Largest module | N/A | 533 lines (tooltip.js) | <300 lines | 🟡 Needs work | - ---- - -## What's Been Done ✅ - -### Phase 1: Configuration & Constants -- ✅ `src/js/config/theme.js` - Colors, fonts, layout constants (119 lines) -- ✅ `src/js/config/scales.js` - D3 scale factories (121 lines) -- **Result:** Centralized configuration, easy to customize - -### Phase 2: State Management -- ✅ `src/js/state/filterState.js` - Filter state management (67 lines) -- ✅ `src/js/state/interactionState.js` - Hover/click state (106 lines) -- **Result:** Clear separation of state concerns - -### Phase 3: Force Simulations -- ✅ `src/js/simulations/ownerSimulation.js` (125 lines) -- ✅ `src/js/simulations/contributorSimulation.js` (132 lines) -- ✅ `src/js/simulations/collaborationSimulation.js` (188 lines) -- ✅ `src/js/simulations/remainingSimulation.js` (84 lines) -- **Result:** 529 lines extracted, easier to test and modify - -### Phase 4: Interaction Handlers -- ✅ `src/js/interaction/hover.js` - Mouse hover handling (87 lines) -- ✅ `src/js/interaction/click.js` - Click handling (85 lines) -- ✅ `src/js/interaction/findNode.js` - Node detection via Delaunay (67 lines) -- **Result:** Separated event logic from rendering - -### Phase 5: Render Functions -- ✅ `src/js/render/shapes.js` - Shape drawing (277 lines) -- ✅ `src/js/render/text.js` - Text utilities (275 lines) -- ✅ `src/js/render/tooltip.js` - Tooltip rendering (533 lines) ⚠️ -- ✅ `src/js/render/labels.js` - Node labels (141 lines) -- ✅ `src/js/render/repoCard.js` - Repo card rendering (248 lines) -- ✅ `src/js/render/canvas.js` - Canvas setup (207 lines) -- 
**Result:** Rendering logic broken down by component - -### Phase 6: Layout & Utilities -- ✅ `src/js/layout/resize.js` - Resize handling (122 lines) -- ✅ `src/js/utils/helpers.js` - Math utilities (121 lines) -- ✅ `src/js/utils/formatters.js` - Date/number formatting (153 lines) -- ✅ `src/js/utils/validation.js` - Data validation (185 lines) -- ✅ `src/js/utils/debug.js` - Debug logging (147 lines) -- **Result:** Utilities organized by concern - -### Phase 7: Data Management -- ✅ `src/js/data/filter.js` - Filtering logic (217 lines) -- **Result:** Pure functions for data transformation - ---- - -## What Still Needs Work 🟡 - -### High Priority - -**1. Extract `prepareData()` function** (~515 lines) -- **Location:** Currently in main `index.js`, lines 683-1198 -- **Should move to:** `src/js/data/prepare.js` -- **What it does:** Transforms raw CSV data into nodes and links -- **Complexity:** High - depends on many local variables -- **Effort:** 4-6 hours -- **Why important:** Largest single extraction remaining, holds up refactoring - -**2. Extract `positionContributorNodes()` function** (~117 lines) -- **Location:** Currently in main `index.js`, lines 1212-1310 -- **Should move to:** `src/js/layout/positioning.js` -- **What it does:** Calculates contributor ring positions -- **Complexity:** Medium - clear inputs/outputs -- **Effort:** 2-3 hours -- **Why important:** Separates layout logic from main orchestrator - -**3. Simplify `draw()` function** (~166 lines) -- **Location:** Currently in main `index.js`, lines 448-514 -- **Should move to:** `src/js/render/draw.js` or keep as thin orchestrator -- **What it does:** Main drawing loop, calls render functions -- **Complexity:** Medium - mostly orchestration -- **Effort:** 2-3 hours -- **Why important:** Main loop should be readable at a glance - -### Medium Priority - -**4. 
Extract `drawHoverState()` function** (~130 lines) -- **Location:** Currently in main `index.js`, lines 1538-1668 -- **Should move to:** `src/js/render/hoverState.js` -- **Complexity:** Medium -- **Effort:** 2 hours - -**5. Extract helper functions** (~100 lines spread across code) -- `isValidContributor()` - Validation helper -- `syncDelaunayVars()` - Delaunay state sync -- `calculateEdgeCenters()` - Link path calculations -- `calculateLinkGradient()` - Gradient colors -- **Should move to:** `src/js/utils/` modules - -**6. Remove wrapper functions** (~50 lines) -- Temporary compatibility layer after migration -- Can be removed once main extraction complete - -### Lower Priority - -**7. Extract canvas setup code** (~60 lines) -- Already partially in `canvas.js` -- Can move remaining setup to initialization module - ---- - -## Detailed Extraction Roadmap - -### Step 1: Extract `prepareData()` → `src/js/data/prepare.js` - -**What to do:** -1. Create new file `src/js/data/prepare.js` -2. Move `prepareData()` function from main `index.js` -3. Extract helper functions it depends on -4. Import it in main `index.js` -5. Update tests to test the module directly -6. Verify build still works - -**Dependencies to handle:** -- Uses colors from `theme.js` (import them) -- Uses scales from `scales.js` (import them) -- Uses validation from `validation.js` (import them) -- Creates objects with specific structure (document in JSDoc) - -**Expected after extraction:** -- Main `index.js` shrinks by ~515 lines -- Easier to test data transformation separately -- Clearer data flow from CSV → nodes/links - -### Step 2: Extract `positionContributorNodes()` → `src/js/layout/positioning.js` - -**What to do:** -1. Create new file `src/js/layout/positioning.js` -2. Move `positionContributorNodes()` function -3. Import from main `index.js` -4. 
Add unit tests for positioning logic - -**Dependencies:** -- Layout constants from `theme.js` (already exported) -- Math utilities from `helpers.js` (import as needed) - -**Expected after extraction:** -- Clear separation of layout vs rendering -- Easier to test position calculations -- Could support alternative layout algorithms in future - -### Step 3: Simplify `draw()` function - -**What to do:** -1. Review current `draw()` - what's orchestration vs. logic? -2. Extract logic into separate modules where applicable -3. Reduce `draw()` to simple: get state → call render functions → schedule next frame -4. Consider creating `src/js/render/draw.js` for orchestration - -**Expected after extraction:** -- Main loop readable in 5-10 lines -- Each frame clearly shows: what updates, what renders -- Easier to understand frame-by-frame flow - ---- - -## Module Size Targets - -After all extractions, target <300 lines per file: - -| File | Current | Target | -|------|---------|--------| -| `src/js/data/prepare.js` | N/A | ~400 lines | -| `src/js/layout/positioning.js` | N/A | ~100 lines | -| `src/js/render/draw.js` | N/A | ~100 lines | -| `src/js/render/hoverState.js` | N/A | ~120 lines | -| Main `index.js` | 2,059 | ~300-400 lines | -| All other modules | Various | <300 lines ✅ | - ---- - -## Testing Strategy - -### Current Test Coverage - -- 75 tests across extracted modules -- Tests for filtering, validation, formatting, helpers -- Vitest framework - -### Testing New Extractions - -When extracting new functions: -1. Write unit tests *first* if not already present -2. Test the extracted function in isolation -3. Test integration with dependent modules -4. Run full test suite to ensure no regressions - -**Example for `prepareData()`:** -```javascript -// src/js/data/__tests__/prepare.test.js -import { prepareData } from '../prepare.js'; - -describe('prepareData', () => { - it('transforms raw data into nodes and links', () => { - const raw = { /* ... 
*/ }; - const result = prepareData(raw, config); - expect(result.nodes).toBeDefined(); - expect(result.links).toBeDefined(); - }); -}); -``` - ---- - -## Implementation Timeline - -| Task | Effort | Estimated Duration | -|------|--------|-------------------| -| Extract `prepareData()` | High | 4-6 hours | -| Extract `positionContributorNodes()` | Medium | 2-3 hours | -| Simplify `draw()` | Medium | 2-3 hours | -| Extract `drawHoverState()` | Medium | 2 hours | -| Extract helpers & cleanup | Low-Med | 2-3 hours | -| **Total** | **High** | **12-18 hours** | - -**Recommended breakdown:** -- Session 1: Extract `prepareData()` (largest impact) -- Session 2: Extract positioning & simplify draw -- Session 3: Polish remaining functions - ---- - -## Migration Checklist - -For each extraction, follow this checklist: - -- [ ] Create new module file -- [ ] Copy code to new file -- [ ] Identify and resolve dependencies -- [ ] Add JSDoc comments -- [ ] Write or update unit tests -- [ ] Update main `index.js` to import -- [ ] Run full test suite: `npm test` -- [ ] Verify build: `npm run build` -- [ ] Test in browser (visual regression) -- [ ] Commit with clear message - -**Commit message template:** -``` -refactor(js): Extract [function_name] to [new_module] - -- Moved [function_name] from index.js to [new_module] -- Reduces index.js by [X] lines -- Adds [Y] lines to [new_module] -- All tests passing -``` - ---- - -## Benefits of Completing This - -✅ **Main orchestrator becomes readable** - ~300 lines instead of 2,000+ - -✅ **Each module has single responsibility** - Easier to understand - -✅ **Improved testability** - Can test functions in isolation - -✅ **Better git history** - Smaller, focused commits - -✅ **Easier code review** - <300 lines per module is reviewable in 10-15 min - -✅ **Reduced maintenance burden** - Clearer code, fewer dependencies per file - -✅ **Foundation for future work** - Makes adding features easier - ---- - -## How to Track Progress - -1. 
Check the line count in main `index.js`: `wc -l src/js/index.js` -2. Run tests after each extraction: `npm test` -3. Check module sizes: `wc -l src/js/*/*.js` -4. Monitor with git history: `git log --oneline src/js/` - ---- - -## Questions to Ask When Extracting - -- **Does this function have a single, clear responsibility?** ✓ -- **Can it be tested independently?** (If not, break it down) -- **Does it depend on many external variables?** (If yes, pass as parameters) -- **Is it more than 200 lines?** (If yes, consider breaking further) -- **Will other modules want to use this?** (If yes, export clearly) - ---- - -**Last Updated**: February 2026 diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md deleted file mode 100644 index 7ed3a89..0000000 --- a/docs/ROADMAP.md +++ /dev/null @@ -1,508 +0,0 @@ -# Roadmap - -Project status, planned features, and verification criteria. - ---- - -## Project Status - -### Completed -- Core visualization and interactions -- Repository and contributor discovery -- Basic filtering (organization, metrics) -- Data expansion phases 1-2 (metadata and community metrics) -- JavaScript modularization and refactoring -- ORCA code removal and rebrand - -### In Progress -- UX and chart readability improvements (font sizes, UI refinement) - -### Planned -- See [Current Implementation Batch](#features-current-implementation-batch) and [Longer-Term Enhancements](#longer-term-enhancements) below - ---- - -## Features: Current Implementation Batch - -These are the next features planned for the visualization. Implementation details are documented in `IMPLEMENTATION_PLAN.md`. - ---- - -### Feature 1: More Repository Filters 🟡 Ready to Design - -**What it does:** -Add filtering UI controls for: -- Minimum stars -- Minimum forks -- Minimum watchers -- Programming language - -**Why:** -- Let users explore by project scale -- Filter by tech stack -- Discover active vs abandoned projects -- Show modern tech preferences - -**Implementation approach:** - -**4a. 
Extend filter state** (`js/state/filterState.js`) -```javascript -{ - organizations: [], - starsMin: null, - forksMin: null, - watchersMin: null, - language: null -} -``` - -**4b. Extend filtering logic** (`js/chart.js` `applyFilters()`) -```javascript -// After org filter, add: -if (activeFilters.starsMin !== null) { - visibleRepos = visibleRepos.filter(r => r.stars >= activeFilters.starsMin); -} -// Same for forks, watchers, language -``` - -**4c. Add UI controls** (`index.html`) -- Range sliders for stars, forks, watchers -- Dropdown for language selection -- "Clear All Filters" button - -**Verification:** -``` -✓ Test each filter independently -✓ Test filters in combination -✓ Verify "Clear All" resets everything -✓ Check language dropdown populated correctly -✓ Test with no repos matching filters -``` - -**Status:** 🟡 Design ready, implementation ready - ---- - -### Feature 2: Visual Flows Target Specific Repo on Hover/Click 🟡 Ready to Design - -**What it does:** -When a contributor is selected (clicked), hovering over different repos shows only the relevant link for that contributor-repo pair, not all their connections. - -**Why:** -- Shows specific collaboration pathways -- Less visual noise for highly collaborative contributors -- Better understanding of individual relationships - -**Current behavior:** -- Click contributor → select them -- Hover repo → see all their links light up - -**Target behavior:** -- Click contributor → select them -- Hover repo → show ONLY link to that specific repo (dimly show others) - -**Implementation:** - -**6a. Track hovered repo during click state** -```javascript -// In interactionState.js, add: -hoveredRepoWhileClicked: null -``` - -**6b. Filter links during hover rendering** -When clicked node is contributor and hovered node is repo: -- Find links connecting them -- Draw targeted links at full opacity -- Draw others at ~0.05 opacity (ghost them) - -**6c. 
Handle owner intermediary** -Some links go: contributor → owner → repo -- Need to highlight both segments -- Owner node's neighbor_links contain owner→repo links - -**Verification:** -``` -✓ Click contributor -✓ Hover different repos -✓ See only relevant link highlighted -✓ Test with owner-grouped repos -✓ Verify clicking away clears state -``` - -**Status:** 🟡 Design ready, implementation ready - ---- - -### Feature 3: Click Action to Hide Irrelevant Contributors/Repos 🟡 Ready to Design - -**What it does:** -When a user clicks a contributor node, the chart hides all unrelated contributors and repos, keeping only the clicked contributor, their linked repos, and any co-contributors on those repos. The details panel expands to show richer information about the selected contributor. - -**Why:** -- Declutters the view for highly connected networks -- Lets users focus on one contributor's ecosystem -- Creates space for showing deeper data (commit timelines, repo breakdowns) in the details panel - -**Current behavior:** -- Click contributor → Delaunay index narrows to neighbors, main canvas fades to 15% opacity, hover canvas shows the contributor's links and neighbor nodes -- All other nodes remain drawn on the faded main canvas - -**Target behavior:** -- Click contributor → the chart rebuilds with only the relevant subset of data visible (similar to how org filtering works), and the details panel shows expanded information -- Click background or press Escape → restore full chart - -**Implementation:** - -**3a. Add click-filter mode to interaction state** (`js/state/interactionState.js`) -```javascript -// Add to state object: -clickFilterActive: false, -clickFilterContributor: null -``` - -**3b. Build a click-filter function** (`js/chart.js`) - -Model this on the existing `applyFilters()` cascade but driven by a clicked contributor rather than UI controls: - -```javascript -function applyClickFilter(contributorNode) { - // 1. 
Find all repos linked to this contributor - const linkedRepoIds = new Set( - contributorNode.neighbor_links - .map(l => getLinkNodeId(l.target)) - .filter(id => /* is repo or owner */) - ); - // 2. Find all contributors who also link to those repos - // 3. Filter visibleRepos, visibleLinks, visibleContributors - // 4. Call chart.rebuild() to re-layout with subset -} -``` - -Key difference from org filtering: this is a *temporary* filter triggered by interaction, not the filter UI. Store the pre-click data snapshot so it can be restored on deselect. The existing `originalContributors/Repos/Links` pattern works here — just avoid overwriting them. - -**3c. Wire click handler** (`js/interaction/click.js`) - -On contributor click: -1. Set `clickFilterActive = true` and store the contributor -2. Call `applyClickFilter(contributorNode)` -3. Optionally animate the transition (fade out irrelevant nodes before rebuild) - -On background click or Escape: -1. Set `clickFilterActive = false` -2. Restore original data arrays -3. Call `chart.rebuild()` - -**3d. Expand the details panel** (`js/render/tooltip.js`) - -When click-filter is active, render a richer panel for the selected contributor: -- Total commits across all visible repos -- List of repos with individual commit counts -- Date range of activity per repo -- Languages across their repos -- Co-contributors (other contributors sharing repos) - -This panel should be drawn on the click canvas so it persists across hover interactions. - -**Risks:** -- Rebuild performance: full `chart.rebuild()` re-runs all force simulations. May need to cache simulation results or skip simulations for small subsets. -- State complexity: two filter systems (UI filters + click filter) must compose correctly. Click filter should operate on already-UI-filtered data, not raw originals. 
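
The composition risk above can be illustrated with a minimal sketch of a pure subset selector that takes the already-UI-filtered links as input, so it never touches the raw originals. The function name `selectClickSubset` and the `{ contributor, repo }` link shape are assumptions made for illustration; the project's actual link objects and `applyClickFilter()` may differ.

```javascript
// Hypothetical sketch: the function name and the { contributor, repo }
// link shape are assumptions, not the project's confirmed API.
function selectClickSubset(uiFilteredLinks, contributorId) {
  // 1. Repos the clicked contributor links to, within the UI-filtered data
  const repoIds = new Set(
    uiFilteredLinks
      .filter((l) => l.contributor === contributorId)
      .map((l) => l.repo)
  );
  // 2. Every link touching one of those repos (this brings in co-contributors)
  const links = uiFilteredLinks.filter((l) => repoIds.has(l.repo));
  // 3. Contributors are derived from the links that remain
  const contributorIds = new Set(links.map((l) => l.contributor));
  return { repoIds, contributorIds, links };
}

// Sample data standing in for the output of the UI filter cascade
const uiFilteredLinks = [
  { contributor: "alice", repo: "org/a" },
  { contributor: "bob", repo: "org/a" },
  { contributor: "bob", repo: "org/b" },
  { contributor: "carol", repo: "org/c" },
];

const subset = selectClickSubset(uiFilteredLinks, "alice");
console.log([...subset.repoIds], [...subset.contributorIds]);
```

Because the selector returns new collections instead of mutating its input, deselecting only needs to discard the subset and rebuild from the UI-filtered data, which keeps the two filter systems independent.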
- -**Verification:** -``` -✓ Click contributor → only relevant nodes/links remain -✓ Click background → full chart restores -✓ Details panel shows expanded contributor info -✓ Works correctly when org/metric filters are also active -✓ Clicking a different contributor switches the filter -✓ Zoom/pan state preserved across click-filter transitions -``` - -**Status:** 🟡 Design ready, implementation ready - ---- - -### Feature 4: Time Range Filter for Commit Activity 🟡 Ready to Design - -**What it does:** -Add a time range slider (or dual-handle range input) that filters the visualization to only show commit activity within a selected date window. Repos, links, and contributors outside the time range are hidden. - -**Why:** -- Explore how the contributor network evolved over time -- Identify recent vs legacy contributors -- See which repos are actively maintained vs dormant -- Answer questions like "who contributed in the last year?" - -**Current data available:** -Each link already has `commit_sec_min` and `commit_sec_max` (Unix timestamps for earliest and latest commit by that contributor on that repo). Repos have `createdAt` and `updatedAt` dates. This is enough for time-range filtering without new data collection. - -**Limitation:** The CSV stores one `commit_count` per contributor-repo pair with no per-period breakdown. When filtering by time range, you can determine *whether* a contributor was active on a repo during the window (their min/max overlaps), but cannot recalculate the exact commit count within that window. Display the full commit count with a note like "active during this period" or fetch granular data (see below). - -**Implementation:** - -**4a. Extend filter state** (`js/state/filterState.js`) -```javascript -{ - // ... existing fields - timeRangeMin: null, // Date object or null (no filter) - timeRangeMax: null, // Date object or null -} -``` - -Update `hasActiveFilters()` to check these fields. - -**4b. 
Add time-range filtering to the cascade** (`js/chart.js` `applyFilters()`) - -Insert after repo filtering but before the link cascade: -```javascript -// Filter links by time overlap with selected range -if (activeFilters.timeRangeMin !== null || activeFilters.timeRangeMax !== null) { - visibleLinks = visibleLinks.filter(link => { - const linkMin = link.commit_sec_min; - const linkMax = link.commit_sec_max; - // Check overlap: link's range intersects filter range - if (activeFilters.timeRangeMin && linkMax < activeFilters.timeRangeMin) return false; - if (activeFilters.timeRangeMax && linkMin > activeFilters.timeRangeMax) return false; - return true; - }); - // Re-derive visible repos from remaining links - const repoIdsFromLinks = new Set(visibleLinks.map(l => l.repo)); - visibleRepos = visibleRepos.filter(r => repoIdsFromLinks.has(r.repo)); -} -// Existing contributor cascade (Step 3) handles the rest -``` - -Note: the time filter operates on *links* first (not repos), since the temporal data lives on links. Then repos without any visible links are removed. The existing contributor cascade then removes contributors without visible links. - -**4c. Build the UI control** (`index.html`) - -Add a dual-handle range slider below the existing filters: -- Compute global min/max dates from all links' `commit_sec_min`/`commit_sec_max` during data load -- Use two `<input type="range">` elements (or a library like noUiSlider) mapping to the date range -- Display selected range as formatted dates (e.g., "Jan 2020 to Mar 2025") -- On change, call `contributorNetworkVisual.setRepoFilter('timeRangeMin', date)` and `setRepoFilter('timeRangeMax', date)` - -**4d. Add chart API** (`js/chart.js`) - -The existing `chart.setRepoFilter(name, value)` pattern from Feature 1 works here; just ensure it handles Date values. 
Add a convenience method: -```javascript -chart.setTimeRange = function(minDate, maxDate) { - activeFilters.timeRangeMin = minDate; - activeFilters.timeRangeMax = maxDate; - chart.rebuild(); - return chart; -}; -``` - -**Optional enhancement:** For richer granularity, expand the Python data pipeline (`python/client.py`) to fetch weekly or monthly commit counts per contributor-repo pair. This would allow showing *how many* commits occurred in the selected window rather than just whether activity overlapped. This is a separate data expansion task (see `DATA_EXPANSION_PLAN.md` Phases 3-4). - -**Verification:** -``` -✓ Slider range matches actual data timespan -✓ Narrowing range hides repos/contributors with no activity in window -✓ Widening range back to full restores all data -✓ Composes correctly with org and metric filters -✓ Edge case: single-day range still works -✓ Edge case: range that excludes all data shows empty state gracefully -``` - -**Status:** 🟡 Design ready, implementation ready - ---- - -### Feature 5: More Evenly Spaced Orgs/Repos 🔴 Needs Design - -**What it does:** -Improve the positioning of organization and repository nodes so they are more uniformly distributed across the available space, reducing visual clutter and overlap. - -**Why:** -- Current layout produces clusters with large gaps elsewhere -- Owner-grouped repos can pile up near certain contributors -- Shared/collaboration repos in the center can overlap heavily -- Better spacing makes the chart easier to read at a glance - -**Current architecture:** -Repo positioning uses three independent force simulations that run sequentially: - -1. **Owner simulation** (`js/simulations/ownerSimulation.js`): Groups repos by owner. Creates a local force simulation per owner that pulls repos toward an owner centroid. Repos are attracted to their owner node with a centering force. - -2. 
**Contributor simulation** (`js/simulations/contributorSimulation.js`): For repos linked to a single contributor (not shared), runs a per-contributor force simulation that positions repos in a cloud around that contributor. Uses `d3.forceCollide` and `d3.forceRadial` to keep repos near (but not on top of) their contributor. - -3. **Collaboration simulation** (`js/simulations/collaborationSimulation.js`): For repos linked to multiple contributors, positions them in the central area using `d3.forceCenter(0,0)`, `d3.forceCollide`, and link forces pulling toward connected contributor nodes. - -Each simulation runs independently with its own alpha/decay parameters. Results are merged into the final node positions. There is also `d3-bboxCollide` for label collision avoidance. - -**Why this is hard:** -- Three independent simulations don't coordinate — a repo positioned by the contributor sim may overlap with one positioned by the collaboration sim -- Force simulation tuning is iterative and visual — small parameter changes cascade unpredictably -- "Even spacing" is subjective and depends on the dataset - -**Implementation approach:** - -**5a. Unify collision detection across simulations** - -After all three simulations complete, add a final "reconciliation" pass that applies global collision forces to all repo nodes together. In `js/chart.js` after the simulation calls: - -```javascript -// After all simulations complete, run a short global collision pass -const allRepoNodes = nodes.filter(n => n.type === "repo" || n.type === "owner"); -const reconciliation = d3.forceSimulation(allRepoNodes) - .force("collide", d3.forceCollide().radius(d => d.r + 8).strength(0.7)) - .force("containment", d3.forceRadial( - RADIUS_CONTRIBUTOR * 0.85, 0, 0 // keep repos inside the contributor ring - ).strength(0.05)) - .alpha(0.3) - .alphaDecay(0.05) - .stop(); - -for (let i = 0; i < 100; i++) reconciliation.tick(); -// Copy reconciled positions back to nodes -``` - -**5b. 
Tune per-simulation parameters** - -In each simulation file, adjust: -- **Owner sim**: Increase `forceCollide` radius between owner groups to prevent inter-group overlap. Add a weak `forceRadial` to distribute owner groups around a ring between contributors and center. -- **Contributor sim**: Increase the radial band where per-contributor repos sit. Currently repos cluster tightly; widen the angular spread. -- **Collaboration sim**: Add `d3.forceManyBody().strength(-30)` to push shared repos apart from each other. Increase `forceCollide` padding. - -**5c. Consider angular partitioning** - -For a more structured layout, assign each contributor an angular "sector" and constrain their linked repos to that sector: -```javascript -// Each contributor already has an angle from ring positioning -// Use that angle ± half the contributor's angular allocation -// as bounds for a forceRadial + angular constraint -``` -This prevents repos from drifting into other contributors' territory. Implementation requires a custom force function since D3 doesn't have built-in angular constraints. - -**5d. Add `d3-bboxCollide` for all nodes** - -Currently `d3-bboxCollide` is used for label collision. 
Extend it to also handle node-to-node overlap for repos: -```javascript -.force("bbox", d3.bboxCollide(d => { - const pad = 4; - return [[-d.r - pad, -d.r - pad], [d.r + pad, d.r + pad]]; -})) -``` - -**Risks:** -- High iteration cost: tuning forces requires visual feedback loops, not just code changes -- Dataset-dependent: parameters that work for 50 repos may fail for 200 -- Performance: adding a reconciliation simulation increases layout computation time -- May require multiple rounds of parameter adjustment after initial implementation - -**Verification:** -``` -✓ No node-on-node overlaps (repos don't sit on top of each other) -✓ Owner groups visually distinct and separated -✓ Central collaboration repos spread out, not piled in center -✓ Repos stay inside the contributor ring boundary -✓ Layout looks reasonable across different datasets/filter states -✓ Rebuild after filtering still produces even spacing -✓ Performance: layout completes in <2 seconds for current dataset -``` - -**Status:** 🔴 Needs further design iteration and visual tuning - ---- - -## Longer-Term Enhancements - -### Additional Metrics 📊 Planned - -**What:** Expand data collection beyond raw commits to provide richer insights -- Weekly commit heatmaps and contributor activity timelines -- Code frequency (additions/deletions over time) -- PR count and merge rates -- Per-contributor PR/issue counts -- Review participation - -**Value:** Very High - Enables temporal visualizations and highlights quality contributions beyond commit counts - -**Implementation:** See `DATA_EXPANSION_PLAN.md` Phases 3-4 - -**Effort:** 2-4 days - ---- - -### Advanced Health Metrics 📈 Planned - -**What:** Health and impact metrics for deeper project insights -- Release frequency -- Issue response time -- Documentation scores -- Cross-repo contributor presence - -**Value:** Medium-High - Polish and deeper insights - -**Implementation:** See `DATA_EXPANSION_PLAN.md` Phase 5 - -**Effort:** 2-3 days - ---- - -### Mobile 
Responsiveness 📱 Future - -**What:** Optimize for small screens and touch interaction - -**Current:** Desktop-first design, not optimized for mobile - -**Why:** People want to share/view on phones - -**Approach:** -- Responsive canvas sizing -- Touch event handlers (instead of mouse) -- Simplified tooltips for small screens -- Portrait vs landscape support - -**Status:** Not started (lower priority) - ---- - -### Export & Sharing 💾 Future - -**What:** Users can export the visualization or share filtered views - -**Options:** -- Export as PNG/SVG -- Shareable URL with filters preserved -- Embed widget for other sites - -**Status:** Not started (lower priority) - ---- - -### Replicable for Other Organizations 🔧 Future - -**What:** Make the tool easy to fork, configure, and deploy for any organization's open-source portfolio - -- Configurable branding (colors, logos, typography) -- Organization-agnostic data pipeline (minimal config to point at a different org) -- Streamlined setup for new deployments -- Clear documentation for customization and deployment -- Potential packaging as a reusable template or library - -**Status:** Not started - ---- - -## How to Contribute - -**Want to work on a feature?** - -1. Read this document for the overview -2. Check `DEVELOPMENT_GUIDE.md` for detailed instructions on how to contribute -3. Read `ARCHITECTURE.md` to understand the codebase -4. 
Reference the verification criteria when testing - -**Found a bug or have an idea?** - -Open an issue on GitHub with: -- What you observed -- What you expected -- Steps to reproduce -- Suggested fix (if you have one) - ---- - -**Last Updated**: February 2026 From 26c9c2ee79063ded2719763e3924c028c4adc0d0 Mon Sep 17 00:00:00 2001 From: Anthony Boyd <92742765+aboydnw@users.noreply.github.com> Date: Tue, 17 Feb 2026 12:34:07 -0500 Subject: [PATCH 8/8] move docs to dotfolder --- {docs => .claude}/CLAUDE.md | 11 ++--------- {docs => .claude}/DATA_EXPANSION_PLAN.md | 0 {docs => .claude}/DATE_RANGE_IMPLEMENTATION_PLAN.md | 4 ++-- {docs => .claude}/PRD.md | 2 +- js/chart.js | 1 - 5 files changed, 5 insertions(+), 13 deletions(-) rename {docs => .claude}/CLAUDE.md (94%) rename {docs => .claude}/DATA_EXPANSION_PLAN.md (100%) rename {docs => .claude}/DATE_RANGE_IMPLEMENTATION_PLAN.md (98%) rename {docs => .claude}/PRD.md (97%) diff --git a/docs/CLAUDE.md b/.claude/CLAUDE.md similarity index 94% rename from docs/CLAUDE.md rename to .claude/CLAUDE.md index 3cad21f..7d1d69f 100644 --- a/docs/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -2,7 +2,7 @@ **Start here.** This file provides quick orientation for anyone working with this codebase. -> **Important for AI agents:** When you make changes to the codebase, update the relevant documentation in `docs/` and `README.md` to reflect those changes. Keep `README.md` short, concise, and human-readable -- it is the public-facing project overview. This file (`CLAUDE.md`) is the detailed reference for developers and agents. +> **Important for AI agents:** When you make changes to the codebase, update the relevant documentation in `.claude/` and `README.md` to reflect those changes. Keep `README.md` short, concise, and human-readable -- it is the public-facing project overview. This file (`CLAUDE.md`) is the detailed reference for developers and agents. ## What Is This? 
@@ -224,7 +224,6 @@ The visualization uses multiple composited canvas layers for performance: main (
 |----------|---------|-----------|
 | **README.md** (root) | Project overview, CLI reference, full workflows | Understanding the product and CLI usage |
 | **PRD.md** | Product requirements and vision | First - understand the *why* |
-| **roadmap.md** | Project status, planned features, and implementation status | Planning new work |
 | **DATA_EXPANSION_PLAN.md** | Data collection phases (1-5) with details | Adding new data fields |
 
 ---
@@ -329,16 +328,10 @@ uv run pytest -v
 
 ---
 
-## Current Project Status
-
-See [`roadmap.md`](./roadmap.md) for full project status, planned features, and roadmap details.
-
----
-
 ## Need Help?
 
 - **Project overview and CLI usage?** → `README.md` (root)
-- **What are we building next?** → `roadmap.md` or `DATA_EXPANSION_PLAN.md`
+- **What are we building next?** → `DATA_EXPANSION_PLAN.md`
 - **What's the product for?** → `PRD.md`
 
 ---
diff --git a/docs/DATA_EXPANSION_PLAN.md b/.claude/DATA_EXPANSION_PLAN.md
similarity index 100%
rename from docs/DATA_EXPANSION_PLAN.md
rename to .claude/DATA_EXPANSION_PLAN.md
diff --git a/docs/DATE_RANGE_IMPLEMENTATION_PLAN.md b/.claude/DATE_RANGE_IMPLEMENTATION_PLAN.md
similarity index 98%
rename from docs/DATE_RANGE_IMPLEMENTATION_PLAN.md
rename to .claude/DATE_RANGE_IMPLEMENTATION_PLAN.md
index d1ad124..b7244a5 100644
--- a/docs/DATE_RANGE_IMPLEMENTATION_PLAN.md
+++ b/.claude/DATE_RANGE_IMPLEMENTATION_PLAN.md
@@ -295,7 +295,7 @@ Nodes appear when their first commit month is reached. Links grow as commit coun
 7. **JS data preparation** — attach histograms to link objects
 8. **Time range filtering** — implement count-based filtering in `applyFilters()`
 9. **Scale updates** — verify link widths and contributor radii update correctly
-10. **UI slider** — build the time range control (see Feature 4 in ROADMAP.md)
+10. **UI slider** — build the time range control
 
 Steps 1-5 are Python/data work (~half day). Steps 6-10 are JS/visualization work (~1-2 days), mostly layered on top of the Feature 4 time range slider from the roadmap.
 
@@ -311,7 +311,7 @@ Steps 1-5 are Python/data work (~half day). Steps 6-10 are JS/visualization work
 
 **Scale behavior:** When the time range is very narrow (e.g., one month), most links will have small counts and the scale domain shrinks. This can make thin links appear thick. Consider setting a minimum domain ceiling (e.g., 10) to prevent scale distortion on narrow ranges.
 
-**Backward compatibility:** `links.csv` is unchanged. The new `commit_activity.csv` is additive. If the JS can't find it, fall back to the overlap-based filtering described in the ROADMAP.md Feature 4 entry.
+**Backward compatibility:** `links.csv` is unchanged. The new `commit_activity.csv` is additive. If the JS can't find it, fall back to overlap-based filtering.
 
 ---
 
diff --git a/docs/PRD.md b/.claude/PRD.md
similarity index 97%
rename from docs/PRD.md
rename to .claude/PRD.md
index d67473e..9c5483d 100644
--- a/docs/PRD.md
+++ b/.claude/PRD.md
@@ -252,7 +252,7 @@ github_username = "Display Name"
 
 ## Project Status & Roadmap
 
-See [`roadmap.md`](./roadmap.md) for current project status, planned features, and implementation details. The roadmap is the single source of truth for what's been completed, what's in progress, and what's planned next.
+See [`CLAUDE.md`](./CLAUDE.md) for current project status and developer orientation.
 
 ---
 
diff --git a/js/chart.js b/js/chart.js
index ca41337..1b4646b 100644
--- a/js/chart.js
+++ b/js/chart.js
@@ -610,7 +610,6 @@ const createContributorNetworkVisual = (
   // NOTE: Pure filter logic has been extracted to src/js/data/filter.js
   // This function handles integration with the visualization's mutable state.
   // For new features (e.g., blog charts), import { applyFilters } from './data/filter.js'
-  // See ARCHITECTURE_RECOMMENDATIONS.md for migration guide.
   function applyFilters() {
     // Guard against uninitialized data
     if (!originalRepos || !originalLinks || !originalContributors) {