Conversation

@suddendust
Contributor

@suddendust suddendust commented Jan 12, 2026

Description

This PR implements Collection#create(Key key, Document document) for FlatPostgresCollection.

Testing

  • Added integration tests.

Checklist:

  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules

@suddendust suddendust changed the title [Draft] Pg write create FlatPostgresCollection Create Doc Impl Jan 12, 2026
suresh-prakash
suresh-prakash previously approved these changes Jan 13, 2026
private final DataType canonicalType;
@Getter private final PostgresDataType postgresType;
private final boolean nullable;
private final boolean array;
Contributor

Nit: isArray, since array alone is slightly confusing without looking at this line.

Contributor Author

Changed

@codecov

codecov bot commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 81.77083% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.52%. Comparing base (4853757) to head (900c87f).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
...documentstore/postgres/FlatPostgresCollection.java 81.81% 20 Missing and 8 partials ⚠️
...rg/hypertrace/core/documentstore/CreateResult.java 65.00% 3 Missing and 4 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #266      +/-   ##
============================================
+ Coverage     80.51%   80.52%   +0.01%     
- Complexity     1385     1392       +7     
============================================
  Files           234      237       +3     
  Lines          6194     6379     +185     
  Branches        554      584      +30     
============================================
+ Hits           4987     5137     +150     
- Misses          831      854      +23     
- Partials        376      388      +12     
Flag Coverage Δ
integration 80.52% <81.77%> (+0.01%) ⬆️
unit 57.15% <7.29%> (-1.50%) ⬇️

Flags with carried forward coverage won't be shown.


@suddendust
Contributor Author

suddendust commented Jan 14, 2026

@suresh-prakash Can you take a look at the following:

  1. If the code isn't able to find the columnMetadata of a particular column even after retries, this column is skipped (rather than failing). Is this behaviour correct? I went with this design to write on a best-effort basis.
  2. Similarly, if for some reason we're unable to parse the value of a column, we add it to skippedFields list rather than failing the query.
  3. If any columns are skipped this way, we return the list to the client in the CreateResult. For this, CreateResult has been enhanced.
  4. We retry once by default in case we get the following SQL state in the first attempt: UNDEFINED_COLUMN("42703") or DATATYPE_MISMATCH("42804"). The logic is that maybe the schema has changed, so it is better to refresh it and retry.
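The retry condition in point 4 could be sketched roughly as follows. This is an illustrative sketch only: `RetrySketch`, `RETRYABLE_SQL_STATES`, and `shouldRetry` are hypothetical names, not the actual FlatPostgresCollection code.

```java
import java.sql.SQLException;
import java.util.Set;

class RetrySketch {
  // SQL states that suggest the cached schema may be stale:
  // 42703 = undefined_column, 42804 = datatype_mismatch.
  static final Set<String> RETRYABLE_SQL_STATES = Set.of("42703", "42804");

  // Retry once at most, and only for the two schema-related SQL states;
  // the caller would refresh the cached schema before the second attempt.
  static boolean shouldRetry(SQLException e, int attempt) {
    return attempt == 0 && RETRYABLE_SQL_STATES.contains(e.getSQLState());
  }
}
```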

@suresh-prakash
Contributor

  1. If the code isn't able to find the columnMetadata of a particular column even after retries, this column is skipped (rather than failing). Is this behaviour correct? I went with this design to write on a best-effort basis.

Best-effort makes sense. But most of the time, such log messages are ignored even though we may be setting them in the CreateResult. I'd rather we fail it so that the client can take the necessary action (they may choose to retry with the column ignored, if we throw a custom exception with meaningful info).

  2. Similarly, if for some reason we're unable to parse the value of a column, we add it to the skippedFields list rather than failing the query.

This is fine, I guess, because that's the way the Mongo impl. works today. If I give an invalid selection, it doesn't throw; rather, it omits the field from the JSON, which in turn results in a null (or missing) value on the caller/client side.

  4. We retry once by default in case we get the following SQL state in the first attempt: UNDEFINED_COLUMN("42703") or DATATYPE_MISMATCH("42804"). The logic is that maybe the schema has changed, so it is better to refresh it and retry.

100% makes sense. 🙂

@puneet-traceable

puneet-traceable commented Jan 14, 2026

  4. We retry once by default in case we get the following SQL state in the first attempt: UNDEFINED_COLUMN("42703") or DATATYPE_MISMATCH("42804"). The logic is that maybe the schema has changed, so it is better to refresh it and retry.

Why retry on DATATYPE_MISMATCH? This can be a bad situation to be in. If the client's data type mismatches the DB column type, then retries won't help. A DB column data type change should never be the case in Postgres.

@suddendust
Contributor Author

DB Column data type change should never be the case in postgres.

Well, theoretically speaking, the rationale behind the retry is: maybe the data type changed and the schema is stale, so refresh it and try again.

@suddendust
Contributor Author

@suresh-prakash I am not throwing an exception because Mongo doesn't throw one either. So, to keep the interface behaviour consistent, shall we stick to that?

@suresh-prakash
Contributor

@suresh-prakash I am not throwing an exception because Mongo doesn't throw one either. So, to keep the interface behaviour consistent, shall we stick to that?

True. But silently neglecting it seems dangerous; most clients would be unaware of it. While it may give immediate compatibility, it could create far more issues later. Since this write path anyway requires code changes in the clients, I still think it would be better to let the clients handle it rather than the library doing it behind the scenes.

@kotharironak Thoughts here?

@suddendust
Contributor Author

suddendust commented Jan 16, 2026

@suresh-prakash How about a config to control this (in customParameters)? Maybe something like bestEffortWrites: true/false. If true, it would write on a best-effort basis. If false, it would do a strict match; that is, all fields passed in the doc should be present in the schema, along with the right value types.

@suddendust
Contributor Author

suddendust commented Jan 16, 2026

@suresh-prakash I have added a bestEffortWrites custom param that controls the data flow as follows:

  1. If true, then PG would skip any fields passed in the document that are not present in the schema, or whose passed values' types don't conform to what is present in the schema.
  2. If false, it does a strict match. All fields present in the doc should be present in the schema, and the passed values' types should conform to the defined schema.

Wdyt?
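As a rough sketch of how such a flag could gate the two modes (all names here are hypothetical, and the type-conformance check is omitted for brevity; only field presence is validated):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class WriteModeSketch {
  // Returns the document fields that will actually be written.
  static List<String> planColumns(
      Map<String, String> schema, Map<String, Object> document, boolean bestEffortWrites) {
    List<String> accepted = new ArrayList<>();
    for (Map.Entry<String, Object> field : document.entrySet()) {
      if (schema.containsKey(field.getKey())) {
        accepted.add(field.getKey());
      } else if (!bestEffortWrites) {
        // Strict mode: any field missing from the schema fails the whole write.
        throw new IllegalArgumentException("Field not in schema: " + field.getKey());
      }
      // Best-effort mode: unknown fields are skipped here and would be
      // reported back to the client via CreateResult's skipped-fields list.
    }
    return accepted;
  }
}
```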

@suresh-prakash
Contributor

@suresh-prakash I have added a bestEffortWrites custom param that controls the data flow as follows:

  1. If true, then PG would skip any fields passed in the document that are not present in the schema, or whose passed values' types don't conform to what is present in the schema.
  2. If false, it does a strict match. All fields present in the doc should be present in the schema, and the passed values' types should conform to the defined schema.

Wdyt?

This is a nice middle-ground. 🙂 Just a small suggestion though: instead of a boolean, can we make it an enum (say, MissingColumnStrategy) please? That way, if a third strategy arises in the future, it's straightforward to extend. E.g. values: SKIP, THROW, IGNORE_DOCUMENT, MENTION_IN_RESPONSE, etc. (Out of these, we can implement just the necessary ones today.)
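The suggested enum could look roughly like this; the value set and javadoc below are illustrative, not necessarily what was merged:

```java
// Illustrative sketch of the suggested MissingColumnStrategy enum, with
// only the two values needed today.
enum MissingColumnStrategy {
  /** Skip the non-conforming field and continue the write (best-effort). */
  SKIP,
  /** Fail the whole write if any field doesn't match the schema. */
  THROW
}
```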

this.succeed = succeed;
private final CreateStatus status;
private final boolean onRetry;
private final List<String> skippedFields;
Contributor

Nit: Can be a set.

* a field doesn't match the schema. The write operation will fail.
*/
THROW,
IGNORE_DOCUMENT
Contributor

Nit: Can also come in later when the need arises.

*
* <p>This enum defines how the system should behave when encountering fields that either don't
* exist in the schema or have incompatible types.
*/
Contributor

Can we also mention what is the default (in case not specified)? Or, is it a mandatory field?

Contributor

The default can go in an @implSpec-annotated documentation comment so that every implementor agrees to this specification.

Contributor

Even better would be to expose a static method called MissingColumnStrategy.default() so that it is self-documented (and gives an opportunity to later change the default across all implementations in a single go).
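One wrinkle: `default` is a reserved word in Java, so the static method would need a slightly different name, e.g. `defaultStrategy()`. A minimal sketch (the values and the chosen default are assumptions, not the merged implementation):

```java
// Sketch of a self-documenting default; `default` is a Java keyword, so
// the factory method is named defaultStrategy() instead.
enum MissingColumnStrategy {
  SKIP,
  THROW;

  // Single place to change the library-wide default across all
  // implementations in one go.
  static MissingColumnStrategy defaultStrategy() {
    return SKIP;
  }
}
```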

@suresh-prakash suresh-prakash merged commit a45e01d into hypertrace:main Jan 19, 2026
6 checks passed
