Skip to content

fix(api): blob endpoint returns 400/404 instead of 500 on bad paths#37

Open
kevincodex1 wants to merge 1 commit into
mainfrom
fix/blob-path-400
Open

fix(api): blob endpoint returns 400/404 instead of 500 on bad paths#37
kevincodex1 wants to merge 1 commit into
mainfrom
fix/blob-path-400

Conversation

@kevincodex1

@kevincodex1 kevincodex1 commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Unnormalized "../.." paths from crawler-exploded relative links made git show fail and surfaced as 500s. Reject traversal segments with 400 up front, and map paths absent from the tree to 404 so only real git failures count as server errors.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced blob file retrieval with stricter path validation and normalization, rejecting invalid path segments.
    • Improved error handling to provide clearer distinction between missing files and other errors during file retrieval.

Unnormalized "../.." paths from crawler-exploded relative links made
`git show` fail and surfaced as 500s. Reject traversal segments with
400 up front, and map paths absent from the tree to 404 so only real
git failures count as server errors.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The get_blob endpoint now validates and sanitizes file paths by rejecting traversal attempts and empty segments, and improves error handling by distinguishing file-not-found errors from other git failures.

Changes

Blob retrieval robustness

Layer / File(s) Summary
Input validation and path sanitization
crates/gitlawb-node/src/api/repos.rs
The get_blob function trims leading and trailing slashes from file_path, then validates that each segment is non-empty and not . or .., returning BadRequest("invalid file path") for violations.
Error handling with git error interpretation
crates/gitlawb-node/src/api/repos.rs
The get_blob function parses specific git error strings from store::read_file failures and maps them to AppError::NotFound("file not found: {file_path}"), while other errors remain as AppError::Git.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit hops through paths so sly, 🐰
Trimming slashes, saying nay to ..
When files vanish in the git—not there!
Clear errors bloom, no more despair,
Safe paths and truth, a robustness prayer! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: the blob endpoint now returns 400 for bad/traversal paths and 404 for missing files instead of 500, which directly matches the implementation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/blob-path-400

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 252-264: The current substring-match on read_file stderr in
get_blob wrongly treats "invalid object name" as a path-missing 404; update the
flow so store::read_file returns a structured error (e.g., enum like
ReadFileError::PathNotFound and ReadFileError::Other) or perform a preflight
existence check (use git cat-file/tree lookup via the existing store helpers) to
distinguish "path not present in tree" from "invalid ref/object"
(store::resolve_head fallback failure). Then change the mapping in get_blob to
convert only the explicit PathNotFound variant to AppError::NotFound and
propagate other errors as AppError::Git (preserving stderr) so invalid
refs/internal git errors are not mapped to HTTP 404.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: df5a9e82-42d9-428a-bcd5-866854f17b35

📥 Commits

Reviewing files that changed from the base of the PR and between 6abaf1d and aa18286.

📒 Files selected for processing (1)
  • crates/gitlawb-node/src/api/repos.rs

Comment on lines +252 to +264
let content = store::read_file(&disk_path, &head_ref, file_path).map_err(|e| {
let msg = e.to_string();
// `git show ref:path` on a path absent from the tree is a 404,
// not a server error
if msg.contains("does not exist in")
|| msg.contains("invalid object name")
|| msg.contains("exists on disk, but not in")
{
AppError::NotFound(format!("file not found: {file_path}"))
} else {
AppError::Git(msg)
}
})?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "fn read_file|read_file\(" -S crates/gitlawb-node/src || true
rg -n "struct store|mod store|pub mod store|store::read_file" -S crates/gitlawb-node/src || true
ls -R crates/gitlawb-node/src/api | head

Repository: Gitlawb/node

Length of output: 607


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "fn read_file|read_file\(" -S crates/gitlawb-node/src || true
rg -n "store::read_file" -S crates/gitlawb-node/src || true
rg -n "mod store|pub mod store|crate::store|use .*store" -S crates/gitlawb-node/src || true
ls -R crates/gitlawb-node/src/api | head

Repository: Gitlawb/node

Length of output: 1012


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '180,280p' crates/gitlawb-node/src/git/store.rs
sed -n '1,120p' crates/gitlawb-node/src/api/repos.rs

rg -n "enum AppError|struct AppError|AppError::NotFound|AppError::Git" crates/gitlawb-node/src -S
rg -n "NotFound\\(|Git\\(" crates/gitlawb-node/src/api/repos.rs -S

Repository: Gitlawb/node

Length of output: 14533


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '200,310p' crates/gitlawb-node/src/api/repos.rs
sed -n '1,120p' crates/gitlawb-node/src/error.rs
sed -n '120,220p' crates/gitlawb-node/src/error.rs
sed -n '200,280p' crates/gitlawb-node/src/git/store.rs
rg -n "pub fn read_file|pub fn read_object|cat-file|show failed" crates/gitlawb-node/src/git/store.rs
sed -n '280,380p' crates/gitlawb-node/src/git/store.rs

Repository: Gitlawb/node

Length of output: 12434


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "fn resolve_head" crates/gitlawb-node/src/git/store.rs crates/gitlawb-node/src/git -S
sed -n '1,220p' crates/gitlawb-node/src/git/store.rs

Repository: Gitlawb/node

Length of output: 7288


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "store::read_file\\(" crates/gitlawb-node/src -S
rg -n "does not exist in|invalid object name|exists on disk, but not in" crates/gitlawb-node/src -S
rg -n "empty repo|no commits|HEAD.*invalid" crates/gitlawb-node/src -S || true

Repository: Gitlawb/node

Length of output: 738


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "store::read_file\\(" crates/gitlawb-node/src -S
rg -n "does not exist in|invalid object name|exists on disk, but not in" crates/gitlawb-node/src -S
rg -n "empty repo|no commits|HEAD.*invalid|rev-parse --verify\" crates/gitlawb-node/src -S || true

Repository: Gitlawb/node

Length of output: 647


🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '220,290p' crates/gitlawb-node/src/api/repos.rs
rg -n "init_bare\\(|create.*bare|bare repo|refs/heads/main" crates/gitlawb-node/src/git -S
rg -n "resolve_head\\(" crates/gitlawb-node/src -S
sed -n '1,220p' crates/gitlawb-node/src/git/repo_store.rs

Repository: Gitlawb/node

Length of output: 14695


🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '220,290p' crates/gitlawb-node/src/api/repos.rs
rg -n "init_bare\\(|create.*bare|bare repo|refs/heads/main" crates/gitlawb-node/src/git -S
rg -n "resolve_head\\(" crates/gitlawb-node/src -S
sed -n '1,260p' crates/gitlawb-node/src/git/repo_store.rs

Repository: Gitlawb/node

Length of output: 16517


Avoid mapping HTTP 404 from substring-matching raw git show stderr in get_blob.
In crates/gitlawb-node/src/api/repos.rs (~252-264), store::read_file embeds raw git show stderr via bail!("git show failed: {stderr}"), and get_blob then substring-matches (including "invalid object name") to return AppError::NotFound. Since store::resolve_head can fall back to "HEAD" even when it doesn’t resolve (e.g., empty repo), "invalid object name" can reflect a missing/invalid ref—not just a missing path in the tree—leading to 404s where a git/internal error (or a different error type) is more accurate. Move this to a structured store-level result (e.g., preflight ref/object existence using cat-file/tree lookup) and only map true tree-path misses to AppError::NotFound.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 252 - 264, The current
substring-match on read_file stderr in get_blob wrongly treats "invalid object
name" as a path-missing 404; update the flow so store::read_file returns a
structured error (e.g., enum like ReadFileError::PathNotFound and
ReadFileError::Other) or perform a preflight existence check (use git
cat-file/tree lookup via the existing store helpers) to distinguish "path not
present in tree" from "invalid ref/object" (store::resolve_head fallback
failure). Then change the mapping in get_blob to convert only the explicit
PathNotFound variant to AppError::NotFound and propagate other errors as
AppError::Git (preserving stderr) so invalid refs/internal git errors are not
mapped to HTTP 404.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant