Add Lakebase Autoscaling (Postgres project) example #151
RCoff wants to merge 7 commits into databricks:main from RCoff:main (+105 −0)
Commits (all by RCoff):
- fcc5560 Add example for creating a Lakebase Autoscaling Postgres project
- 1403585 Update comment and documentation for clarity
- 295f805 Add clarification as to what "databricks_postgres" referred to here
- 80ac357 Move workspace host setting to targets, add a prod workspace target
- e08df2e Update comments to add clarification
- b395b46 Clarify note regarding default_endpoint_settings
- acb1522 Update README to include instructions on the 'prod' bundle target
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| # Lakebase Autoscaling (Postgres) project with branching | ||
|
|
||
| This example demonstrates how to define a Lakebase Autoscaling project with a non-default branch in a Declarative Automation Bundle. | ||
|
|
||
| It includes and deploys: | ||
| - A Lakebase Autoscaling project with configurable min and max compute units (CUs) | ||
| - A non-default `development` branch for isolated dev/test workflows | ||
| - A read-only replica endpoint on the default production branch | ||
|
|
||
| Lakebase Autoscaling is Databricks' managed PostgreSQL service with autoscaling compute, Git-like branching, scale-to-zero, and instant point-in-time restore. | ||
|
|
||
| For more information about Lakebase Autoscaling, see the [documentation](https://docs.databricks.com/aws/en/oltp/projects/). | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| * Databricks CLI v0.287.0 or above | ||
| * `psql` client version 14 or above (optional; only needed to run the demo queries) | ||
|
|
||
| ## Usage | ||
|
|
||
| Modify `databricks.yml`: | ||
| * Update the `host` field under each target's `workspace` block (`dev` and `prod`) to point at the Databricks workspace you want to deploy to | ||
| * Adjust `autoscaling_limit_min_cu` and `autoscaling_limit_max_cu` to fit your workload (valid range: 0.5 to 32 CU; the difference between max and min cannot exceed 16 CU) | ||
| * Adjust `suspend_timeout_duration` to control scale-to-zero behavior (minimum: 60 seconds) | ||
|
|
||
| The bundle defines two targets: `dev` (the default) and `prod`. Run `databricks bundle deploy` to deploy to `dev`, or `databricks bundle deploy --target prod` to deploy to `prod`. | ||
|
|
||
| Note that once the bundle is deployed, the project and its compute endpoints start running and incur cost. Endpoints with scale-to-zero enabled suspend after the configured `suspend_timeout_duration`. | ||
|
|
||
| Run the following queries against the **production** branch to populate your database with sample data: | ||
|
|
||
| ```bash | ||
| # Create a demo table: | ||
| databricks psql --project my-autoscaling-project -- -d databricks_postgres -c "CREATE TABLE IF NOT EXISTS hello_world (id SERIAL PRIMARY KEY, message TEXT, number INTEGER, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);" | ||
|
|
||
| # Insert 100 rows of demo data: | ||
| databricks psql --project my-autoscaling-project -- -d databricks_postgres -c "INSERT INTO hello_world (message, number) SELECT 'Hello World #' || generate_series, generate_series FROM generate_series(1, 100);" | ||
|
|
||
| # Show generated rows: | ||
| databricks psql --project my-autoscaling-project -- -d databricks_postgres -c "SELECT * FROM hello_world;" | ||
| ``` | ||
|
|
||
| Run the following queries against the **development** branch to verify the branched data: | ||
|
|
||
| **Important:** Before running the following command, you must reset the `development` branch via the web UI. The branch was created at deploy time, before the sample rows were inserted, and resetting re-syncs it with its parent. For details, see [Reset branch from parent](https://docs.databricks.com/aws/en/oltp/projects/manage-branches#reset-branch-from-parent). | ||
|
|
||
| ```bash | ||
| # Query the development branch: | ||
| databricks psql --project my-autoscaling-project --branch development -- -d databricks_postgres -c "SELECT count(*) FROM hello_world;" | ||
| ``` | ||
|
|
||
| ## Clean up | ||
| To remove the provisioned project, branches, and endpoints, run `databricks bundle destroy` (add `--target prod` to remove the `prod` deployment). | ||
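
The limits listed in the README's Usage section can be checked before deploying. The following is a minimal sketch, assuming the 0.5 to 32 CU range is inclusive and that min must not exceed max (neither is spelled out in the README). `check_cu_limits` is a hypothetical helper for illustration, not a Databricks CLI command.

```bash
# Sketch: pre-flight check for the autoscaling limits described in the README.
# check_cu_limits is a hypothetical helper, not part of the Databricks CLI.
#
# Usage: check_cu_limits MIN_CU MAX_CU SUSPEND_SECONDS
# Pass the suspend timeout as a plain number of seconds (300, not "300s").
# Returns 0 when every stated constraint holds, 1 otherwise.
check_cu_limits() {
  awk -v min="$1" -v max="$2" -v suspend="$3" 'BEGIN {
    ok = 1
    # Assumption: the 0.5-32 CU range is inclusive at both ends.
    if (min < 0.5 || max > 32) { print "CU limits must be within 0.5-32"; ok = 0 }
    # Assumption: min above max is rejected (implied, not stated, by the README).
    if (min > max)             { print "min CU must not exceed max CU"; ok = 0 }
    if (max - min > 16)        { print "max minus min cannot exceed 16 CU"; ok = 0 }
    if (suspend < 60)          { print "suspend timeout must be at least 60 seconds"; ok = 0 }
    exit (ok ? 0 : 1)
  }' >&2
}

# Values taken from the bundle in this PR:
check_cu_limits 1 4 300   && echo "project defaults: OK"
check_cu_limits 0.5 2 300 && echo "read replica overrides: OK"
```

The comparison is delegated to `awk` because plain shell arithmetic cannot handle fractional CU values such as 0.5.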
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| bundle: | ||
|   name: postgres-autoscaling-example | ||
|
|
||
| resources: | ||
|   # Top-level project container for Lakebase Autoscaling. | ||
|   # Creating a project automatically provisions: | ||
|   # - A default branch named "production" | ||
|   # - A primary READ_WRITE endpoint on the default branch | ||
|   # - A database with the name "databricks_postgres" | ||
|   postgres_projects: | ||
|     my_project: | ||
|       project_id: my-autoscaling-project | ||
|       display_name: "My Autoscaling Project" | ||
|       pg_version: 17 | ||
|       default_endpoint_settings: | ||
|         # Newly created endpoints inherit these settings unless overridden | ||
|         # at the endpoint level (see "prod_read_replica" below). | ||
|         # Changing these will not affect existing endpoints. | ||
|         autoscaling_limit_min_cu: 1 | ||
|         autoscaling_limit_max_cu: 4 | ||
|         suspend_timeout_duration: "300s" # Scale-to-zero after 5 minutes | ||
|
|
||
|   # (Optional) Read replica on the default (production) branch. | ||
|   postgres_endpoints: | ||
|     prod_read_replica: | ||
|       parent: ${resources.postgres_projects.my_project.id}/branches/production | ||
|       endpoint_id: ep-production-read-replica | ||
|       endpoint_type: ENDPOINT_TYPE_READ_ONLY | ||
|       # Note: The below two options override the ones set in | ||
|       # "default_endpoint_settings" above | ||
|       autoscaling_limit_min_cu: 0.5 | ||
|       autoscaling_limit_max_cu: 2 | ||
|
|
||
|   # (Optional) Non-default branch for development/testing. | ||
|   # Note: A READ_WRITE endpoint is currently created by default | ||
|   # when provisioning a new branch | ||
|   postgres_branches: | ||
|     dev: | ||
|       parent: ${resources.postgres_projects.my_project.id} | ||
|       branch_id: development # The name for the branch | ||
|       no_expiry: true | ||
|
|
||
| # Targets allow you to deploy the same bundle to different Databricks workspaces. | ||
| targets: | ||
|   dev: | ||
|     default: true | ||
|     mode: development | ||
|     workspace: | ||
|       host: https://my-dev-workspace.cloud.databricks.com | ||
|   prod: | ||
|     workspace: | ||
|       host: https://my-prod-workspace.cloud.databricks.com | ||
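
Because the bundle defines two targets, each lifecycle command applies to `dev` unless a target is named. As a rough illustration, the sketch below only prints the commands for a given target rather than running them. `bundle_plan` is a hypothetical helper, and including `databricks bundle validate` as a first step is a suggestion, not something this PR's README prescribes.

```bash
# Sketch: print (do not run) the bundle lifecycle commands for one target.
# bundle_plan is a hypothetical helper, not part of the Databricks CLI.
#
# Usage: bundle_plan [TARGET]
# With no TARGET, the commands rely on the bundle's default target (dev).
bundle_plan() {
  local target="${1:-}"
  local action
  for action in validate deploy destroy; do
    if [ -n "$target" ]; then
      echo "databricks bundle $action --target $target"
    else
      echo "databricks bundle $action"
    fi
  done
}

# Show the commands for the prod target defined in the bundle above:
bundle_plan prod
```

Printing the commands first makes it easy to confirm which workspace a `destroy` would touch before anything is removed.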