
[FEATURE] Implement Token Efficiency Strategy for Copilot Instructions #477

@nanotaboada

Description

Problem

GitHub Copilot automatically loads .github/copilot-instructions.md on every chat interaction, consuming tokens regardless of whether the detailed information is needed. For repositories with comprehensive documentation, this can lead to:

  • Unnecessary token consumption (650+ tokens per interaction)
  • Slower response times due to larger context windows
  • Reduced effectiveness when detailed operational procedures aren't required

Proposed Solution

Implement a token efficiency strategy that separates instruction files by usage frequency:

  1. copilot-instructions.md (~600 tokens, auto-loaded): Core conventions, architecture, and quick commands
  2. AGENTS.md (~2,550 tokens, on-demand): Detailed workflows, troubleshooting, CI/CD procedures
  3. SKILLS/ (future, on-demand): Specialized knowledge modules

This strategy saves ~80% of tokens when detailed documentation isn't needed: of the ~3,150 combined tokens, the ~2,550 in AGENTS.md (≈81%) are only loaded on demand.

Suggested Approach

1. Token Budget Policy

Establish clear targets in .github/copilot-instructions.md:

> **Token Budget**: Target 600, limit 650 (auto-loaded)
> Details: `#file:AGENTS.md` (~2,550 tokens, on-demand)

Rules:

  • Target: 600 tokens (ideal)
  • Limit: 650 tokens (hard maximum)
  • Action: if the file exceeds 650 tokens, optimization is required
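
As a quick illustration of the policy, a one-off check along these lines reports where the auto-loaded file stands against the target and limit (a sketch; it assumes tiktoken is already installed via `pip3 install tiktoken`, whereas the full script in section 3 handles installation and fallbacks):

python3 - <<'PY'
import tiktoken

# cl100k_base is the GPT-4 encoding; the measurement script below uses the same one
enc = tiktoken.get_encoding("cl100k_base")
text = open(".github/copilot-instructions.md", encoding="utf-8").read()
n = len(enc.encode(text))
if n <= 600:
    print(f"{n} tokens - within target")
elif n <= 650:
    print(f"{n} tokens - over target, within limit")
else:
    print(f"{n} tokens - exceeds limit, optimization required")
PY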

2. Content Separation

copilot-instructions.md should contain:

  • Quick context (project type, main tech stack, purpose)
  • Core conventions (naming, code style, patterns)
  • Architecture overview (visual diagram + brief explanations)
  • Essential do's and don'ts (language/framework specific)
  • Quick commands (most frequently used: build, test, run)
  • Clear guidance on when to load AGENTS.md (an illustrative line follows these lists)

AGENTS.md should contain:

  • Detailed development workflows
  • Testing procedures with all flags/options
  • Container/deployment commands and troubleshooting
  • CI/CD pipeline details
  • Database/infrastructure management procedures
  • Comprehensive troubleshooting guide
  • API/interface documentation
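
For example, the "clear guidance" item in copilot-instructions.md could be a single illustrative line (the wording here is a suggestion, not part of the proposal):

> **Need more detail?** Load `#file:AGENTS.md` (~2,550 tokens) for workflows, troubleshooting, and CI/CD procedures.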

3. Token Measurement Tool

Create scripts/count-tokens.sh with the following code:

#!/bin/bash
# 📊 Token Counter for Copilot Instruction Files
# Uses tiktoken (OpenAI's tokenizer) for accurate counting
# Approximation: ~0.75 words per token (English text)

set -e

echo "📊 Token Analysis for Copilot Instructions"
echo "=========================================="
echo ""

# Check if tiktoken is available
if command -v python3 &> /dev/null; then
    # Try to use tiktoken for accurate counting
    if python3 -c "import tiktoken" 2>/dev/null; then
        echo "✅ Using tiktoken (accurate Claude/GPT tokenization)"
        echo ""
    else
        # tiktoken not found - offer to install
        echo "⚠️  tiktoken not installed"
        echo ""
        echo "tiktoken provides accurate token counting for Claude/GPT models."
        read -p "📦 Install tiktoken now? (y/n): " -n 1 -r
        echo ""
        if [[ $REPLY =~ ^[Yy]$ ]]; then
            echo "📥 Installing tiktoken..."
            if pip3 install tiktoken --quiet; then
                echo "✅ tiktoken installed successfully!"
                echo ""
                # Re-run the script after installation
                exec "$0" "$@"
            else
                echo "❌ Installation failed. Using word-based approximation instead."
                echo ""
                USE_APPROX=1
            fi
        else
            echo "📝 Using word-based approximation instead"
            echo "   (Install manually: pip3 install tiktoken)"
            echo ""
            USE_APPROX=1
        fi
    fi

    # Only run tiktoken if it's available and we didn't set USE_APPROX
    if [ -z "$USE_APPROX" ] && python3 -c "import tiktoken" 2>/dev/null; then

        # Create temporary Python script
        cat > /tmp/count_tokens.py << 'PYTHON'
import tiktoken
import sys

# cl100k_base is used by GPT-4, Claude uses similar tokenization
encoding = tiktoken.get_encoding("cl100k_base")

file_path = sys.argv[1]
with open(file_path, 'r', encoding='utf-8') as f:
    content = f.read()

tokens = encoding.encode(content)
print(len(tokens))
PYTHON

        # Count tokens for each file
        echo "📄 .github/copilot-instructions.md"
        COPILOT_TOKENS=$(python3 /tmp/count_tokens.py .github/copilot-instructions.md)
        echo "   Tokens: $COPILOT_TOKENS"
        echo ""

        echo "📄 AGENTS.md"
        AGENTS_TOKENS=$(python3 /tmp/count_tokens.py AGENTS.md)
        echo "   Tokens: $AGENTS_TOKENS"
        echo ""

        # Calculate total
        TOTAL=$((COPILOT_TOKENS + AGENTS_TOKENS))
        echo "📊 Summary"
        echo "   Base load (auto): $COPILOT_TOKENS tokens"
        echo "   On-demand load: $AGENTS_TOKENS tokens"
        echo "   Total (if both): $TOTAL tokens"
        echo ""

        # Check against target
        TARGET=600
        LIMIT=650
        if [ $COPILOT_TOKENS -le $TARGET ]; then
            echo "✅ copilot-instructions.md within target ($TARGET tokens)"
        elif [ $COPILOT_TOKENS -le $LIMIT ]; then
            echo "⚠️  copilot-instructions.md over target but within limit ($LIMIT tokens)"
        else
            echo "❌ copilot-instructions.md exceeds limit! Optimization required."
        fi

        SAVINGS=$((AGENTS_TOKENS * 100 / TOTAL))
        echo "💡 Savings: ${SAVINGS}% saved when AGENTS.md not needed"

        # Cleanup
        rm /tmp/count_tokens.py
    fi
else
    echo "⚠️  Python3 not found - falling back to word-based approximation"
    echo "   Install Python 3 for accurate counts: https://www.python.org/downloads/"
    echo ""
    USE_APPROX=1
fi

# Fallback: word-based approximation
if [ -n "$USE_APPROX" ]; then
    echo "📄 .github/copilot-instructions.md"
    WORDS=$(wc -w < .github/copilot-instructions.md | tr -d ' ')
    APPROX_TOKENS=$((WORDS * 4 / 3))
    echo "   Words: $WORDS"
    echo "   Approx tokens: $APPROX_TOKENS"
    echo ""

    echo "📄 AGENTS.md"
    WORDS=$(wc -w < AGENTS.md | tr -d ' ')
    APPROX_TOKENS=$((WORDS * 4 / 3))
    echo "   Words: $WORDS"
    echo "   Approx tokens: $APPROX_TOKENS"
    echo ""

    echo "💡 Note: Run script again to install tiktoken for accurate counts"
fi

echo ""
echo "=========================================="

Script features:

  • Auto-installs tiktoken if missing (with user confirmation)
  • Validates against target/limit thresholds
  • Shows token savings percentage
  • Fallback to word-based approximation if Python/tiktoken unavailable
  • Clear emoji-based feedback

Usage:

# Make executable
chmod +x scripts/count-tokens.sh

# Run from repository root
./scripts/count-tokens.sh

# Expected output:
# 📊 Token Analysis for Copilot Instructions
# ✅ Using tiktoken (accurate Claude/GPT tokenization)
# 📄 .github/copilot-instructions.md
#    Tokens: XXX
# 📄 AGENTS.md
#    Tokens: XXXX
# ⚠️  copilot-instructions.md over target but within limit (650 tokens)
# 💡 Savings: 80% saved when AGENTS.md not needed

4. Implementation Steps

  1. Create the script:

    mkdir -p scripts
    # Copy the script code above into scripts/count-tokens.sh
    chmod +x scripts/count-tokens.sh
  2. Establish baseline:

    # If copilot-instructions.md already exists
    ./scripts/count-tokens.sh
    # Note current token counts
  3. Optimize copilot-instructions.md:

    • Move detailed procedures to AGENTS.md
    • Keep only essential conventions and quick commands
    • Aim for target 600 tokens, limit 650 tokens
  4. Create AGENTS.md:

    • Add comprehensive operational details
    • Include all workflows, troubleshooting, CI/CD
    • Document when users should load this file
  5. Update headers:

    # In .github/copilot-instructions.md
    > **Token Budget**: Target 600, limit 650 (auto-loaded)
    > Details: `#file:AGENTS.md` (~2,550 tokens, on-demand)
    
    # In AGENTS.md
    > **Token Efficiency**: Complete operational instructions (~2,550 tokens).
    > **Auto-loaded**: NO (load explicitly with `#file:AGENTS.md` when needed)
    > **When to load**: Complex workflows, troubleshooting, CI/CD setup, detailed architecture
  6. Validate:

    ./scripts/count-tokens.sh
    # Verify copilot-instructions.md ≤ 650 tokens

5. Pre-commit Workflow

Add to development workflow:

  1. Before committing changes to instruction files, run ./scripts/count-tokens.sh (this can be automated with a hook; see the sketch after this list)
  2. Verify copilot-instructions.md stays within budget
  3. If over limit, optimize content or move to AGENTS.md
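
To automate step 1, a local Git hook along these lines could run before every commit. This is a sketch: it checks only the auto-loaded file, hard-codes the 650-token limit, and uses the word-based approximation directly so the hook stays non-interactive (the full script may prompt to install tiktoken).

#!/bin/bash
# .git/hooks/pre-commit (sketch): reject commits that push the auto-loaded
# instruction file over the 650-token limit, using the ~0.75 words-per-token
# approximation so no prompt or extra dependency is needed.
FILE=".github/copilot-instructions.md"
if git diff --cached --name-only | grep -q "copilot-instructions.md"; then
    WORDS=$(wc -w < "$FILE")
    TOKENS=$((WORDS * 4 / 3))
    if [ "$TOKENS" -gt 650 ]; then
        echo "❌ $FILE is ~$TOKENS tokens (limit 650). Move detail to AGENTS.md."
        exit 1
    fi
fi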

Acceptance Criteria

  • scripts/count-tokens.sh created and executable
  • Baseline token measurements documented
  • .github/copilot-instructions.md optimized to ≤650 tokens
  • AGENTS.md created with comprehensive operational details
  • Token counts documented in both file headers
  • Script tested successfully (run ./scripts/count-tokens.sh)
  • Token savings validated (should be ~80% when AGENTS.md not loaded)

Additional Notes

This strategy is based on the principle of lazy loading: only load context when it is needed. The script and approach are repository-agnostic and can be adapted to any programming language or framework.

Key Metrics to Track:

  • copilot-instructions.md: Target ≤600 tokens, Limit ≤650 tokens
  • AGENTS.md: ~2,000-3,000 tokens (detailed procedures)
  • Token savings: ~80% when AGENTS.md not needed
