[DJM] Add init script troubleshooting and permissions guidance for Databricks #37479

Draft
charlesmyu wants to merge 2 commits into master from charles.yu/djm-databricks-init-script-troubleshooting

charlesmyu commented Jun 12, 2026

What does this PR do? What is the motivation?

Updates the Data Observability: Jobs Monitoring for Databricks documentation with two improvements:

  1. Init script permissions guidance: Adds a note to both the "Manually configure a cluster policy" and "Manually install on a specific cluster" setup tabs clarifying that Unity Catalog Volume permissions are evaluated against the cluster owner, not the principal running the cluster. The "Manually install on a specific cluster" tab was also missing the permissions steps entirely; these have been added to match the cluster policy tab.

  2. Troubleshooting section: Restructures and expands the Troubleshooting section into two subsections:

    • Init script not running or failing: Guides users through confirming that the init script ran and succeeded via the cluster event log (`INIT_SCRIPTS_STARTED`, `INIT_SCRIPTS_FINISHED`), and through investigating failures using cluster log delivery.
    • Init script succeeded but no data in DJM: Reuses the existing API key and Agent validation steps.
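
The event-log check described in the first subsection could be sketched roughly as follows. This is only an illustration: it assumes the cluster events are already available as a list of dictionaries with a `type` field (approximately the shape returned by the Databricks cluster events API), and the function name `init_script_status` is hypothetical.

```python
from typing import Iterable, Mapping


def init_script_status(events: Iterable[Mapping[str, object]]) -> str:
    """Classify init script progress from cluster event types.

    Assumption: each event is a mapping with a "type" key; all other
    fields are ignored.
    """
    types = {event.get("type") for event in events}
    if "INIT_SCRIPTS_FINISHED" in types:
        return "finished"
    if "INIT_SCRIPTS_STARTED" in types:
        return "started"
    return "not_started"


# Hypothetical sample data, for illustration only.
sample_events = [
    {"type": "INIT_SCRIPTS_STARTED"},
    {"type": "INIT_SCRIPTS_FINISHED"},
]
print(init_script_status(sample_events))
```

A `finished` result only means the event was logged. Whether the script actually succeeded still has to be confirmed, and failures investigated through cluster log delivery.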

Merge instructions

Merge readiness:

  • Ready for merge

AI assistance

Drafted with Claude Code.

Additional notes

The permissions note addresses a common support issue where users grant volume permissions only to themselves, which works for clusters they own but fails silently for clusters owned by other principals.
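
To make the failure mode concrete, here is a minimal sketch of the coverage check implied above: every cluster owner needs read access to the volume. The function name and the sample principals are hypothetical, and the sketch compares names directly, so it does not resolve group membership such as "account users".

```python
from typing import Iterable, List


def owners_missing_read_access(
    cluster_owners: Iterable[str],
    principals_with_read: Iterable[str],
) -> List[str]:
    """Return cluster owners that lack read access, sorted for stable output.

    Assumption: both inputs are plain principal names; groups are not expanded.
    """
    return sorted(set(cluster_owners) - set(principals_with_read))


# Hypothetical example: only one of two cluster owners was granted access.
missing = owners_missing_read_access(
    ["alice@example.com", "bob@example.com"],
    ["alice@example.com"],
)
print(missing)
```

Any owner in the returned list would own clusters on which the init script cannot be read.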

@github-actions

Preview links (active after the build_preview check completes)

Modified Files

@github-actions github-actions Bot added the Images Images are added/removed with this PR label Jun 12, 2026
@charlesmyu charlesmyu force-pushed the charles.yu/djm-databricks-init-script-troubleshooting branch from d39490c to 9c51c5e Compare June 12, 2026 16:36
…ection

- Add Unity Catalog Volume permissions steps to the "Manually install on
  a specific cluster" tab, which was missing them entirely
- Add a note to both manual install tabs that UC Volume permissions are
  evaluated against the cluster owner, not the principal running the cluster
- Add a Troubleshooting section with two subsections:
  - "Init script not running or failing": steps to confirm the init script
    ran and succeeded via the cluster event log, and how to investigate
    failures using cluster log delivery
  - "Data not appearing after a successful init script run": API key and
    Agent validation steps
@charlesmyu charlesmyu force-pushed the charles.yu/djm-databricks-init-script-troubleshooting branch from 3f8f745 to 6825ea8 Compare June 12, 2026 16:42
The script above downloads and runs the latest init script for Data Observability: Jobs Monitoring in Databricks. If you want to pin your script to a specific version, you can replace the filename in the URL (for example, `install-databricks-0.14.0.sh` to use version `0.14.0`). You can find the source code used to generate this script, and the changes between script versions, on the [Datadog Agent repository][3].
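
The filename substitution described above can be sketched as a small helper. This is an illustration under stated assumptions: the base URL and the `install-databricks-latest.sh` filename below are placeholders rather than the real download location, and `pin_script_url` is a hypothetical name. Only the `install-databricks-<version>.sh` pattern comes from the text.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def pin_script_url(url: str, version: str) -> str:
    """Swap the filename in `url` for the versioned init script filename."""
    parts = urlsplit(url)
    directory = posixpath.dirname(parts.path)
    pinned_path = posixpath.join(directory, f"install-databricks-{version}.sh")
    return urlunsplit(parts._replace(path=pinned_path))


# Placeholder URL, for illustration only.
print(pin_script_url("https://example.com/scripts/install-databricks-latest.sh", "0.14.0"))
```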

1. Grant read-only permissions to the init script:
   1. At the volume level, grant the `READ VOLUME` permission to all account users.

Contributor

@charlesmyu account users or workspace users?

Contributor Author

This is at the Unity Catalog level, so there's only an "account users" group generated by default. Customers might have their own groups set up, either for the workspace specifically or for users who are allowed to create jobs or clusters, but otherwise "account users" is the correct default here.

That's also why I added the disclaimer below: you can restrict access, as long as you cover all users who own clusters (either all-purpose or job).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>