microsoft · eavanvalkenburg · Jun 9, 2026
diff --git a/.github/workflows/python-code-quality.yml b/.github/workflows/python-code-quality.yml
@@ -109,8 +109,8 @@ jobs:
       - name: Run markdown code lint
         run: uv run poe markdown-code-lint
 
-  mypy:
-    name: Mypy Checks
+  test-typing:
+    name: Test Typing Checks
     if: "!cancelled()"
     strategy:
       fail-fast: false
@@ -135,7 +135,5 @@ jobs:
           os: ${{ runner.os }}
         env:
           UV_CACHE_DIR: /tmp/.uv-cache
-      - name: Run Mypy
-        env:
-          GITHUB_BASE_REF: ${{ github.event.pull_request.base.ref || github.base_ref || 'main' }}
-        run: uv run python scripts/workspace_poe_tasks.py ci-mypy
+      - name: Run tests/samples type checkers (mypy, pyrefly, ty)
+        run: uv run python scripts/workspace_poe_tasks.py ci-test-typing
diff --git a/python/.github/skills/python-code-quality/SKILL.md b/python/.github/skills/python-code-quality/SKILL.md
@@ -21,13 +21,22 @@ uv run poe syntax -C    # Check only
 uv run poe syntax -S    # Samples only
 
 # Type checking
-uv run poe pyright       # Pyright fan-out across packages
+#
+# Division of labor (see "Type checking architecture" below):
+#   - Pyright (strict) is the source-code type checker.
+#   - Pyright (relaxed `basic`), mypy, pyrefly, ty, zuban all check the TESTS;
+#     pyright/pyrefly/ty also check the SAMPLES (mypy/zuban skip script-style samples).
+uv run poe pyright       # Pyright (strict) over SOURCE, fan-out across packages
 uv run poe pyright -P core
 uv run poe pyright -A
-uv run poe mypy          # MyPy fan-out across packages
+uv run poe test-typing   # mypy + pyrefly + ty + zuban + pyright over each package's TESTS
+uv run poe test-typing -P core
+uv run poe test-typing -S                       # samples (pyrefly + ty + pyright)
+uv run poe test-typing -P core --checker mypy   # narrow to one checker (repeatable)
+uv run poe test-typing -P core --checker pyright # relaxed pyright over the tests
+uv run poe mypy          # alias: MyPy over the tests only
 uv run poe mypy -P core
-uv run poe mypy -A
-uv run poe typing        # Both pyright and mypy
+uv run poe typing        # Pyright (source) + the tests/samples checkers
 uv run poe typing -P core
 uv run poe typing -A
 
@@ -67,6 +76,33 @@ when markdown files change, and sample syntax lint/pyright only when files
 under `samples/` change.
 They intentionally do not run workspace `pyright` or `mypy` by default.
 
+## Type checking architecture
+
+Following the "too many type checkers" approach, type checkers are split by target:
+
+| Target | Checker(s) | Mode | Config |
+|--------|-----------|------|--------|
+| Source (`agent_framework*`) | **pyright** | strict | `[tool.pyright]` in `pyproject.toml` |
+| Tests | pyright, mypy, pyrefly, ty, zuban | relaxed/basic | `pyrightconfig.tests.json`, `[tool.mypy]`, `pyrefly.toml`, `ty` rules |
+| Samples | pyright, pyrefly, ty | basic | `pyrightconfig.samples.json`, `pyrefly.samples.toml`, `ty.samples.toml` |
+
+- **Pyright is the only *strict* source-code checker**, and it ALSO runs in a relaxed
+  `basic` profile over the tests and samples (so the surfaces customers copy from are
+  validated by every checker, including pyright). MyPy was removed from source; its
+  `[tool.mypy]` block is now a *relaxed* profile used only for tests.
+- The extra checkers run over tests/samples because those exercise the public API the way
+  users do. The profile is intentionally relaxed (private access allowed, untyped test
+  bodies allowed) so authors aren't forced into ugly over-annotation.
+- **Gating checkers** are `pyright`, `mypy`, `pyrefly`, `ty`, and `zuban` — all five run by
+  default and gate CI. `zuban` is the stricter of the mypy-compatible pair, so the same
+  `[tool.mypy]` config yields more findings; suppress zuban-only friction with shared
+  `# type: ignore[code]`. Suppress relaxed-pyright friction with `# pyright: ignore[rule]`.
+- **Samples** add `pyright` to `pyrefly` + `ty` — mypy/zuban can't resolve script-style
+  sample layouts (numeric-prefixed dirs, duplicate `main.py`), but pyright handles them.
+- The strict source-pyright (`[tool.pyright]`) enforces `reportUnnecessaryTypeIgnoreComment`
+  and excludes tests/samples; the relaxed test/sample pyright configs do not flag unnecessary
+  ignores.
+
 ## Ruff Configuration
 
 - Line length: 120
@@ -77,8 +113,12 @@ They intentionally do not run workspace `pyright` or `mypy` by default.
 
 ## Pyright Configuration
 
-- Strict mode enabled
-- Excludes: tests, .venv, packages/devui/frontend
+- **Source**: strict mode (`[tool.pyright]`), `reportUnnecessaryTypeIgnoreComment = "error"`,
+  excludes tests, samples, .venv, packages/devui/frontend.
+- **Tests**: relaxed `basic` profile (`pyrightconfig.tests.json`) — private import/usage and
+  not-required TypedDict access allowed; runs as the `pyright` checker in `test-typing`.
+- **Samples**: relaxed `basic` profile (`pyrightconfig.samples.json`, with a py310 variant) —
+  runs as the `pyright` checker in `test-typing -S`.
 
 ## Parallel Execution
 
@@ -90,6 +130,6 @@ in-process with streaming output.
 
 CI splits into 4 parallel jobs:
 1. **Pre-commit hooks** — lightweight hooks (SKIP=poe-check)
-2. **Package checks** — syntax/pyright via check-packages
+2. **Package checks** — syntax/pyright (source) via check-packages
 3. **Samples & markdown** — `check -S` plus `markdown-code-lint`
-4. **Mypy** — change-detected mypy checks
+4. **Test Typing** — change-detected mypy/pyrefly/ty over tests (`ci-test-typing`)
diff --git a/python/CODING_STANDARD.md b/python/CODING_STANDARD.md
@@ -92,11 +92,18 @@ Use typing as a helper first and suppressions as a last resort:
 - **Prefer explicit typing before suppression**: Start with clearer type annotations, helper types, overloads,
   protocols, or refactoring dynamic code into typed helpers. Prioritize performance over completeness of typing, but make a good-faith effort to reduce uncertainty with typing before ignoring. Prefer to use a cast over a typeguard function since that does add overhead.
 - **Avoid redundant casts**: Do not add `cast(...)` if the type already matches; casts should be reserved for
-  unavoidable narrowing where the runtime contract is known, we will use mypy's check on redundant casts to enforce this.
+  unavoidable narrowing where the runtime contract is known.
 - **Avoid multiple assignments**: Avoid assigning multiple variables just to get typing to pass, that has performance impact while typing should not have that.
-- **Line-level pyright ignores only**: If suppression is still required, use a line-level rule-specific ignore
+- **Source vs tests/samples**: Source code (`agent_framework*`) is checked **by pyright in strict mode** — use
+  `# pyright: ignore[...]` there, never `# type: ignore` (strict pyright flags unnecessary ignores as errors). Tests
+  and samples are checked by pyright, pyrefly, and ty (plus mypy and zuban on tests) in a relaxed/basic
+  profile; prefer real fixes (`isinstance`, `cast`, annotations, asserts for Optional access) over per-line ignores,
+  and keep test/sample bodies readable rather than over-annotated. When a relaxed-pyright suppression is genuinely
+  needed in tests/samples, use `# pyright: ignore[rule]`; the relaxed test/sample configs do not flag unnecessary
+  ignores, so combine with a mypy/zuban `# type: ignore[code]` on the same line only where both are required.
+- **Line-level pyright ignores only**: If suppression is still required in source, use a line-level rule-specific ignore
   (`# pyright: ignore[reportGeneralTypeIssues]`), file-level is allowed if there is a compelling reason for it, that should be documented right beneath the ignore.
-  Never change the global suppression flags for mypy and pyright unless the dev team okays it.
+  Never change the global suppression flags unless the dev team okays it.
 - **Private usage boundary**: Accessing private members across `agent_framework*` packages can be acceptable for this
   codebase, but private member usage for non-Agent Framework dependencies should remain flagged.
 

diff --git a/python/DEV_SETUP.md b/python/DEV_SETUP.md
@@ -289,23 +289,36 @@ uv run poe <command> -A           # aggregate sweep where supported
 ```
 
 #### `pyright`
-Run Pyright type checking:
+Run Pyright type checking. Pyright is the **strict source-code type checker**, and also runs
+in a relaxed `basic` profile over the tests + samples (as one of the `test-typing` checkers):
 ```bash
 uv run poe pyright
 uv run poe pyright -P core
 uv run poe pyright -A
 ```
 
+#### `test-typing`
+Run the **tests + samples** type checkers. Source code is owned by strict Pyright; the tests
+and samples are checked by `pyright` (relaxed), `mypy`, `pyrefly`, `ty`, and `zuban` in a
+deliberately relaxed/basic profile so real public-API type errors surface without forcing
+test/sample authors to fully annotate their code. All five gate CI:
+```bash
+uv run poe test-typing            # all checkers over every package's tests
+uv run poe test-typing -P core    # one package
+uv run poe test-typing -S         # samples (pyright + pyrefly + ty; mypy/zuban skip script-style samples)
+uv run poe test-typing -P core --checker mypy     # narrow to one checker (repeatable)
+uv run poe test-typing -P core --checker pyright  # relaxed pyright over the tests
+```
+
 #### `mypy`
-Run MyPy type checking:
+Convenience alias that runs MyPy over the test suite (MyPy no longer runs on source):
 ```bash
 uv run poe mypy
 uv run poe mypy -P core
-uv run poe mypy -A
 ```
 
 #### `typing`
-Run both Pyright and MyPy:
+Run Pyright over source **and** the tests/samples checkers:
 ```bash
 uv run poe typing
 uv run poe typing -P core

diff --git a/python/packages/a2a/agent_framework_a2a/_agent.py b/python/packages/a2a/agent_framework_a2a/_agent.py
@@ -344,7 +344,7 @@ def run(
         **kwargs: Any,
     ) -> ResponseStream[AgentResponseUpdate, AgentResponse[Any]]: ...
 
-    def run(  # pyright: ignore[reportIncompatibleMethodOverride]
+    def run(
         self,
         messages: AgentRunInputs | None = None,
         *,
@@ -464,7 +464,7 @@ async def _map_a2a_stream(
             if session is None:
                 raise RuntimeError("Provider session must be available when context providers are configured.")
             await provider.before_run(
-                agent=self,  # type: ignore[arg-type]
+                agent=self,
                 session=session,
                 context=session_context,
                 state=session.state.setdefault(provider.source_id, {}),