Add screenshot coordinate GameObject picker #1199

Open
Alex-Ma0 wants to merge 2 commits into CoplayDev:beta from Alex-Ma0:codex/pick-gameobject-from-image

Alex-Ma0 commented Jun 14, 2026

Summary

  • add pick_gameobject_from_image MCP tool and Unity handler for reverse-picking GameObjects from supported Unity screenshot coordinates
  • attach pickView metadata to explicit camera, positioned, and Scene View screenshots
  • add unity-mcp camera pick CLI command and tests

Details

  • supports 3D Physics raycasts, 2D Collider2D ray intersections, and Scene View editor/mesh picking without Colliders for visible MeshRenderer/SkinnedMeshRenderer objects
  • documents preconditions and unsupported screenshot modes directly in the tool description
  • returns ray/image/viewport metadata plus hit GameObject summaries and hit details where available

Tests

  • python -m pytest tests/test_pick_gameobject_from_image.py tests/test_cli.py::TestCameraCommands -v
  • uv run --extra dev pytest tests/ -q (1242 passed)
  • Tuanjie EditMode MCPForUnityTests.Editor.Tools.PickGameObjectFromImageTests (15 total, 14 passed, 0 failed, 1 skipped: Scene View screenshot skipped in batchmode)

Summary by CodeRabbit

  • New Features
    • Added screenshot-based GameObject picking supporting both 2D and 3D physics with Scene View compatibility.
    • Introduced unity-mcp camera pick to select GameObjects from screenshot pixel coordinates.
    • Scene View and positioned/explicit screenshots can now return optional pickView metadata (multiview returns shots instead of a single pickView) for accurate coordinate mapping.
    • Picking supports layer masking, max-distance filtering, and configurable trigger interaction.
  • Tests
    • Added CLI and Unity EditMode test coverage for picking and pickView behavior.

coderabbitai Bot commented Jun 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4523f40a-0f5d-4910-b704-da5998622512

📥 Commits

Reviewing files that changed from the base of the PR and between 4f1aa97 and 81e74d4.

📒 Files selected for processing (7)
  • MCPForUnity/Editor/Tools/ManageScene.cs
  • MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs
  • Server/src/cli/commands/camera.py
  • Server/src/services/tools/pick_gameobject_from_image.py
  • Server/tests/test_cli.py
  • Server/tests/test_pick_gameobject_from_image.py
  • TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/PickGameObjectFromImageTests.cs
🚧 Files skipped from review as they are similar to previous changes (6)
  • Server/tests/test_cli.py
  • Server/tests/test_pick_gameobject_from_image.py
  • MCPForUnity/Editor/Tools/ManageScene.cs
  • Server/src/services/tools/pick_gameobject_from_image.py
  • TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/PickGameObjectFromImageTests.cs
  • MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs

📝 Walkthrough

Adds a pick_gameobject_from_image feature enabling clients to pick a Unity GameObject at a screenshot pixel coordinate. ManageScene screenshots now emit pickView camera metadata; a new PickGameObjectFromImage Unity editor tool uses that metadata to dispatch 2D/3D physics or Scene View ray picks; a Python MCP tool and unity-mcp camera pick CLI command expose the feature with full validation.
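
The "normalized viewport UV from image coordinates" step can be sketched as below. This is a minimal illustration, not the PR's code: the function name `image_to_viewport_uv` is hypothetical, and it assumes image coordinates use a top-left origin while Unity viewport coordinates use a bottom-left origin, so the vertical axis is flipped.

```python
def image_to_viewport_uv(image_x: float, image_y: float,
                         image_width: float, image_height: float) -> tuple[float, float]:
    """Map a screenshot pixel to normalized viewport coordinates in [0, 1].

    Assumption: the image origin is top-left and the viewport origin is
    bottom-left, so v is flipped relative to image_y.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image_width and image_height must be positive")
    if not (0 <= image_x <= image_width and 0 <= image_y <= image_height):
        raise ValueError("image coordinate lies outside the image")
    u = image_x / image_width
    v = 1.0 - image_y / image_height  # flip the vertical axis
    return u, v


# A point in the horizontal center, one quarter down from the top of a 1920x1080 image.
u, v = image_to_viewport_uv(960, 270, 1920, 1080)
```

The resulting pair is the kind of value a camera's viewport-to-ray conversion would consume on the Unity side.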

Changes

pick_gameobject_from_image: screenshot-to-object picking

| Layer | File(s) | Summary |
| --- | --- | --- |
| pickView metadata emitted by ManageScene screenshots | MCPForUnity/Editor/Tools/ManageScene.cs, Server/src/services/tools/manage_camera.py | BuildScreenshotResponseData gains an optional camera and includePickView flag that conditionally injects a pickView dictionary with camera transform, projection, and viewport metadata. Camera-based, Scene View, and positioned screenshot paths each invoke the updated method with appropriate capture source labels and camera references. manage_camera.py description is updated to document that single-view screenshots return pickView for downstream picking. |
| PickGameObjectFromImage: class structure, test hooks, and camera utilities | MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs, MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs.meta | Defines SceneViewPickResult struct and test-override hook. Implements camera construction from pickView JSON (position/rotation, orthographic/perspective projection, clip planes, culling mask), camera resolution by instance id/name/path, JSON vector parsing, layerMask resolution, finite-number guards, pickView normalization, Scene View viewport rectangle calculation, and the reflected IntersectRayMesh helper for mesh-ray intersection. |
| PickGameObjectFromImage: main handler and picking dispatch | MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs | Implements HandleCommand entry point: validates required parameters, computes normalized viewport UV from image coordinates, resolves or constructs a camera from pickView metadata, determines whether Scene View or physics picking applies, and constructs base payload containing shared ray/image/viewport metadata. |
| PickGameObjectFromImage: 3D physics and Scene View picking backends | MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs | Implements Physics.RaycastAll-based 3D picking. Implements Scene View picking orchestration: optionally uses HandleUtility.PickGameObject in editor mode and falls back to renderer mesh-intersection for non-collider targets. Includes mesh intersection search enumerating visible renderers, baking SkinnedMeshRenderer, filtering by layer/visibility/persistence, and invoking IntersectRayMesh via reflection. |
| PickGameObjectFromImage: 2D physics picking backend | MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs | Implements Physics2D.GetRayIntersectionAll-based 2D picking with hit sorting by distance. |
| PickGameObjectFromImage: hit and no-hit payload builders | MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs | Constructs hit payloads with success=true, hit=true, pickMode, hit point/normal/distance, collider type, and GameObject instance ID/name. Constructs no-hit payloads with success=true, hit=false, pickMode, and requiresCollider flag indicating the backend used. |
| Python MCP tool: validation and dispatch | Server/src/services/tools/pick_gameobject_from_image.py | Defines pick_gameobject_from_image as an async MCP tool with DESCRIPTION constant, validation helpers for numeric bounds and positivity, comprehensive parameter validation (coordinates, dimensions, scale, viewport, dimension mode, pick_view JSON parsing, camera prerequisite), query_trigger_interaction mapping, Unity params payload construction, and retry-based command dispatch. |
| CLI command and argument reordering | Server/src/cli/commands/camera.py | Updates all manage_camera subcommands to use new run_command argument order (command_type, params, config). Adds unity-mcp camera pick Click command with options for image coordinates, viewport dimensions, scale factors, dimension mode, optional camera reference, optional --view-json payload, optional layer mask, max distance, and trigger interaction. Builds params, parses optional view-json, and calls run_command("pick_gameobject_from_image", ...). |
| Tests: Python tool, CLI, and Unity EditMode | Server/tests/test_pick_gameobject_from_image.py, Server/tests/test_cli.py, TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/PickGameObjectFromImageTests.cs, TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/PickGameObjectFromImageTests.cs.meta | Python tests cover full 3D pick parameter forwarding, missing prerequisites, invalid parameter validation, JSON pick_view parsing, and DESCRIPTION content. CLI test asserts camera pick routes to pick_gameobject_from_image. Unity EditMode tests cover 2D/3D camera picking, pickView priority, coordinate scaling, Scene View renderer/mesh-intersection picking, layer mask filtering, fallback behavior, no-hit returns, and ManageScene screenshot pickView shape across explicit camera, positioned capture, Scene View, and multiview modes. |
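
As an illustration of the "validation helpers for numeric bounds and positivity" mentioned for the Python tool, the sketch below shows one plausible shape for such helpers. The names `require_finite`, `require_positive`, and `require_in_range` are hypothetical and are not taken from the PR; the error messages are likewise assumptions.

```python
import math


def require_finite(name: str, value) -> float:
    """Return value as a float, rejecting non-numbers, booleans, NaN and infinity."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{name} must be a number")
    number = float(value)
    if not math.isfinite(number):
        raise ValueError(f"{name} must be finite")
    return number


def require_positive(name: str, value) -> float:
    """Return value as a float, requiring it to be strictly greater than zero."""
    number = require_finite(name, value)
    if number <= 0:
        raise ValueError(f"{name} must be greater than 0")
    return number


def require_in_range(name: str, value, low: float, high: float) -> float:
    """Return value as a float, requiring low <= value <= high."""
    number = require_finite(name, value)
    if not (low <= number <= high):
        raise ValueError(f"{name} must be between {low} and {high}")
    return number
```

Rejecting NaN and infinity before dispatch keeps malformed coordinates from reaching the Unity handler, which the walkthrough says carries its own finite-number guards.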

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    Note over Client,ManageScene: Screenshot + pickView capture
    Client->>ManageScene: capture_screenshot(camera / scene_view / positioned)
    ManageScene->>BuildPickView: camera, viewportW, viewportH
    BuildPickView-->>ManageScene: pickView (projection, transform, captureSource)
    ManageScene-->>Client: screenshot + pickView metadata
  end
  rect rgba(144, 238, 144, 0.5)
    Note over Client,PickGameObjectFromImage: Pick GameObject from image coordinate
    Client->>pick_gameobject_from_image: image_x, image_y, image_w, image_h, pick_view
    pick_gameobject_from_image->>PickGameObjectFromImage: validated params
    PickGameObjectFromImage->>CameraResolver: resolve or construct from pickView
    CameraResolver-->>PickGameObjectFromImage: Camera instance
    alt captureSource == scene_view
      PickGameObjectFromImage->>HandleUtility.PickGameObject: ray from viewport UV
      HandleUtility.PickGameObject-->>PickGameObjectFromImage: GameObject or null
      PickGameObjectFromImage->>MeshIntersection: fallback if not found
    else dimension == 3d
      PickGameObjectFromImage->>Physics.RaycastAll: ray, layerMask, trigger interaction
    else dimension == 2d
      PickGameObjectFromImage->>Physics2D.GetRayIntersectionAll: ray, layerMask
    end
    PickGameObjectFromImage-->>pick_gameobject_from_image: hit/no-hit payload
    pick_gameobject_from_image-->>Client: result dict
  end
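
The `alt` branches in the diagram above amount to a three-way dispatch followed by ordering the hits. A rough Python sketch of that control flow follows; it is not the C# handler. Only the `scene_view`, `2d`, and `3d` labels come from the diagram, while the `Hit` type, the function names, and the returned backend labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str        # GameObject name (illustrative field)
    distance: float  # distance along the pick ray


def choose_backend(capture_source: str, dimension: str) -> str:
    """Mirror the diagram: Scene View captures win, then 3D or 2D physics."""
    if capture_source == "scene_view":
        return "scene_view"
    if dimension == "2d":
        return "physics2d"
    return "physics3d"


def nearest_first(hits: list[Hit]) -> list[Hit]:
    """Order hits so the object closest to the camera comes first."""
    return sorted(hits, key=lambda hit: hit.distance)


hits = nearest_first([Hit("Wall", 12.0), Hit("Crate", 3.5)])
```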

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • CoplayDev/unity-mcp#927: Both PRs modify the captureSource="scene_view" screenshot path in ManageScene.CaptureSceneViewScreenshot, which this PR extends to emit pickView metadata from the same camera data.
  • CoplayDev/unity-mcp#1040: Both PRs modify ManageScene.BuildScreenshotResponseData to extend the screenshot response payload; #1040 adds includeImage for game-view compositor capture while this PR adds pickView metadata injection.

Poem

🐰 Hop, hop, a pixel I see,
Which GameObject could it be?
I cast a ray through pickView's eye,
Through colliders and meshes I fly—
The rabbit clicks, the scene replies:
"BoxCollider hit! No need to guess!"
A perfect pick, pure happiness. 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 24.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main feature: adding a screenshot coordinate-based GameObject picker, which is the primary objective of the PR. |
| Description check | ✅ Passed | The PR description is well-structured with clear Summary, Details, and Tests sections covering all changes, but the author did not complete the template sections for Type of Change, Compatibility, and Documentation Updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@MCPForUnity/Editor/Tools/ManageScene.cs`:
- Around line 777-802: The BuildPickView method needs to include the camera's
culling mask in the returned dictionary so the downstream picker respects which
layers were actually rendered. Add a new dictionary entry in the BuildPickView
method that captures camera.cullingMask (use a key like "cullingMask") alongside
the existing camera properties like projection, fieldOfView, and clipping
planes. Then ensure the picker implementation uses this cullingMask value as the
default layer mask when performing raycast hit tests, preventing colliders on
invisible layers from being selected.

In `@MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs`:
- Around line 275-307: The issue is that HandleUtility.PickGameObject uses the
current active SceneView camera instead of the serialized pickView camera that
was used to capture the screenshot. If the user pans or resizes the Scene View
after capture, it may pick an incorrect object. Before calling
HandleUtility.PickGameObject, verify that the live Scene View camera position,
rotation, and viewport dimensions still match the pickView camera and viewport
from when the screenshot was taken. If they don't match, skip the HandleUtility
path and use only the ray-based or mesh-based picking fallback to ensure correct
object selection.
- Around line 362-365: When a SkinnedMeshRenderer is detected in the renderer
type check, the current code uses sharedMesh which returns the undeformed
bind-pose geometry. To account for animation deformation before ray-testing,
call BakeMesh() on the skinnedRenderer to capture the current deformed state and
assign the baked mesh result to the mesh variable instead of using sharedMesh
directly. This ensures TryIntersectRayMesh will test against the actual animated
pose rather than the static bind-pose geometry.

In `@Server/src/cli/commands/camera.py`:
- Line 645: The run_command function expects arguments in the order:
command_type, params, config. Line 645 correctly uses this order with
run_command("pick_gameobject_from_image", params, config). However, most other
run_command calls throughout camera.py have reversed the argument order, passing
config first followed by the command_type string and params. Search for all
other run_command calls in the file (the comment identifies 19+ affected calls
including those related to manage_camera operations) and reorder their arguments
to match the correct signature: move the command_type string to the first
position, params to the second, and config to the third, aligning them with the
correct call at line 645 and matching test assertions.
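
To make the argument-order problem concrete, here is a small stand-in. It is not the project's real `run_command`: only the positional order (command_type, params, config) comes from the review comment. The type checks, the return value, and the plain-dict `config` are assumptions for illustration.

```python
def run_command(command_type: str, params: dict, config: dict) -> dict:
    """Stand-in with the (command_type, params, config) order described in the review."""
    if not isinstance(command_type, str):
        raise TypeError("command_type must be a string passed first")
    if not isinstance(params, dict):
        raise TypeError("params must be a dict passed second")
    return {"command_type": command_type, "params": params}


config = {"host": "127.0.0.1"}            # hypothetical config value
params = {"image_x": 960, "image_y": 270}

# Correct positional order.
ok = run_command("pick_gameobject_from_image", params, config)

# Keyword arguments make the call immune to ordering mistakes.
ok_kw = run_command(command_type="manage_camera", params=params, config=config)

# The reversed order (config first) fails fast with this stand-in.
try:
    run_command(config, "manage_camera", params)
    swapped_detected = False
except TypeError:
    swapped_detected = True
```

Calling with keyword arguments is one way to keep the many call sites in camera.py robust if the signature changes again.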

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ea40709c-93e6-4473-833c-b8e612dde8f3

📥 Commits

Reviewing files that changed from the base of the PR and between c0908b8 and 4f1aa97.

📒 Files selected for processing (10)
  • MCPForUnity/Editor/Tools/ManageScene.cs
  • MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs
  • MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs.meta
  • Server/src/cli/commands/camera.py
  • Server/src/services/tools/manage_camera.py
  • Server/src/services/tools/pick_gameobject_from_image.py
  • Server/tests/test_cli.py
  • Server/tests/test_pick_gameobject_from_image.py
  • TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/PickGameObjectFromImageTests.cs
  • TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/PickGameObjectFromImageTests.cs.meta

Comment on lines +777 to +802
private static Dictionary<string, object> BuildPickView(
    Camera camera,
    string captureSource,
    int viewportWidth,
    int viewportHeight)
{
    if (camera == null)
        return null;

    var euler = camera.transform.eulerAngles;
    var position = camera.transform.position;
    return new Dictionary<string, object>
    {
        { "captureSource", captureSource },
        { "position", new[] { position.x, position.y, position.z } },
        { "rotation", new[] { euler.x, euler.y, euler.z } },
        { "projection", camera.orthographic ? "orthographic" : "perspective" },
        { "orthographic", camera.orthographic },
        { "fieldOfView", camera.fieldOfView },
        { "orthographicSize", camera.orthographicSize },
        { "nearClipPlane", camera.nearClipPlane },
        { "farClipPlane", camera.farClipPlane },
        { "aspect", camera.aspect },
        { "viewportWidth", viewportWidth },
        { "viewportHeight", viewportHeight },
    };

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Serialize the capture camera’s culling mask in pickView.

pickView currently preserves pose/projection only. Downstream picking falls back to an all-layers raycast when layer_mask is omitted, so colliders on layers the screenshot camera never rendered can still win the hit test. Include camera.cullingMask here and have the picker use it as its default mask to keep picks constrained to what was actually visible in the captured image.
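
The suggested defaulting rule can be sketched in a few lines of Python. This is an illustration of the bitmask logic only, not the picker's C# code: the helper names are hypothetical, `cullingMask` is the key proposed in this comment, and the all-layers value of -1 reflects Unity's convention of a signed 32-bit mask with every bit set.

```python
ALL_LAYERS = -1  # every bit set, i.e. an "Everything" mask


def effective_layer_mask(layer_mask, pick_view) -> int:
    """Prefer an explicit layer_mask, then the screenshot camera's culling mask."""
    if layer_mask is not None:
        return int(layer_mask)
    culling_mask = (pick_view or {}).get("cullingMask")
    if culling_mask is not None:
        return int(culling_mask)
    return ALL_LAYERS  # behavior before the fix: raycast against all layers


def layer_in_mask(layer: int, mask: int) -> bool:
    """True when the bit for the given layer index (0-31) is set in mask."""
    return (mask & (1 << layer)) != 0


# A camera that rendered only layers 0 and 5.
mask = effective_layer_mask(None, {"cullingMask": (1 << 0) | (1 << 5)})
```

With this rule, a collider on layer 3 would be ignored for that screenshot unless the caller passes a `layer_mask` that includes it.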

Comment thread MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs Outdated
Comment thread MCPForUnity/Editor/Tools/PickGameObjectFromImage.cs Outdated
Comment thread Server/src/cli/commands/camera.py

Alex-Ma0 commented Jun 14, 2026

Addressed the CodeRabbit review feedback in 81e74d4:

  • Added pickView.cullingMask and made omitted layer_mask default to the screenshot camera culling mask.
  • Guarded HandleUtility.PickGameObject so it only runs when the live Scene View still matches the screenshot pickView; otherwise the picker uses the ray/mesh fallback based on serialized screenshot metadata.
  • Switched Scene View SkinnedMeshRenderer mesh fallback to BakeMesh() before intersection.
  • Fixed all run_command argument ordering in Server/src/cli/commands/camera.py.
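
For readers following the Scene View guard, the comparison it implies can be sketched as below. This is a hedged illustration rather than the code in 81e74d4: the function names and tolerances are hypothetical, the dictionary keys follow the pickView fields shown in BuildPickView, and comparing Euler angles component by component is a simplification.

```python
import math


def angle_delta(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def scene_view_matches(live: dict, pick_view: dict,
                       pos_tol: float = 1e-3, angle_tol: float = 0.05) -> bool:
    """True when the live Scene View still matches the screenshot's pickView."""
    live_size = (live["viewportWidth"], live["viewportHeight"])
    shot_size = (pick_view["viewportWidth"], pick_view["viewportHeight"])
    if live_size != shot_size:
        return False  # the view was resized after capture
    if math.dist(live["position"], pick_view["position"]) > pos_tol:
        return False  # the camera was panned or moved
    return all(angle_delta(a, b) <= angle_tol
               for a, b in zip(live["rotation"], pick_view["rotation"]))


shot = {"position": [0.0, 1.0, -10.0], "rotation": [10.0, 359.99, 0.0],
        "viewportWidth": 1280, "viewportHeight": 720}
unchanged = dict(shot, rotation=[10.0, 0.01, 0.0])  # same pose across the 360 wrap
panned = dict(shot, position=[0.5, 1.0, -10.0])
```

When the check fails, the picker would skip the live-view path and rely on the ray/mesh fallback built from the serialized metadata, as described in the list above.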

Validation:

  • uv run --extra dev pytest tests/test_pick_gameobject_from_image.py tests/test_cli.py::TestCameraCommands -v - 16 passed
  • uv run --extra dev pytest tests/ -q - 1242 passed
  • EditMode MCPForUnityTests.Editor.Tools.PickGameObjectFromImageTests - 17 total, 16 passed, 0 failed, 1 skipped (Scene View screenshot skipped in batchmode)
