
codegen: Support declaration deduplication #5344

Open
hughsimpson wants to merge 13 commits into softwaremill:master from hughsimpson:support_inheritance_weirdness

hughsimpson commented Jun 22, 2026

Would address #4719

This was prompted by the somewhat idiosyncratic way I version APIs: each API version is fully declared as its own OpenAPI document. I don't want the compilation overhead of regenerating schemas and serdes for every version when a particular schema hasn't changed, and I do want to be able to reuse security logic without fiddling about with conversions between otherwise-identical types in different namespaces.

I genuinely think this could also be useful for other use-cases, such as sharing common types (e.g. errors) between different APIs.
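
To make the reuse decision concrete, here is a minimal sketch. It assumes a drastically simplified schema model; `SchemaDef`, `DedupSketch` and `partition` are hypothetical names for illustration and are not taken from this PR or from the tapir codegen.

```scala
// Hypothetical, simplified model: a schema is a name plus its field types.
final case class SchemaDef(name: String, fields: Map[String, String])

object DedupSketch {
  // Splits the schemas of a derived spec into those that are structurally
  // identical to the base spec (candidates for reuse) and those that are new
  // or changed (which would still need to be generated).
  def partition(
      base: Map[String, SchemaDef],
      derived: Map[String, SchemaDef]
  ): (Set[String], Set[String]) = {
    val (reused, regenerated) = derived.partition { case (name, schema) =>
      base.get(name).contains(schema)
    }
    (reused.keySet, regenerated.keySet)
  }
}
```

With a v1 spec declaring `Error` and `User`, and a v2 spec that keeps `Error` unchanged, adds a field to `User` and introduces `Team`, only `User` and `Team` would be regenerated under this sketch.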

Currently done:

  • model dedup
  • tapir schema dedup
  • security schema dedup
  • annotation dedup
  • validator dedup
  • endpoint dedup
  • serde dedup
  • lotsa tests
  • initial impl of parsing openapi dir structures

Leaving for now:

  • more than one 'inherited' OpenAPI document (this could be added backwards-compatibly in the future by accepting either a string or a seq[string]; for now it's a bit too much in one go)
  • 'proper' OpenAPI dir parsing (we should be able to inline components referred to by file location, and not just compose the way we do in this PR)
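
As an illustration of the "a string or a seq[string]" idea in the first item above, one possible shape is sketched below. `InheritedSpecs` and its members are hypothetical; how the inherited document is actually configured in this PR is not shown here, so treat this only as a sketch of the backwards-compatible widening.

```scala
// Hypothetical setting type: accepts one inherited spec today and could
// accept several later without breaking existing single-string usage.
sealed trait InheritedSpecs {
  def paths: Seq[String]
}

object InheritedSpecs {
  final case class One(path: String) extends InheritedSpecs {
    def paths: Seq[String] = Seq(path)
  }
  final case class Many(paths: Seq[String]) extends InheritedSpecs

  // Overloaded constructors: callers pass either a string or a sequence.
  def apply(path: String): InheritedSpecs = One(path)
  def apply(paths: Seq[String]): InheritedSpecs = Many(paths)
}
```

Downstream code would only ever look at `paths`, so the single-spec case is just a one-element sequence.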

Beyond the existing test cases, I've also tested against the project that I usually test these PRs on. As a result, I was able to drop a whole bunch of OpenAPI pre-parsing and output munging. Even though those hacks were already doing a lot of the model/schema/serde dedup, this still reduces compilation times noticeably for me (by ~50%) because of the endpoint dedup, which I found basically impossible to do downstream.

hughsimpson force-pushed the support_inheritance_weirdness branch from 8e96f59 to 63669cc June 23, 2026 15:23

implicit lazy val anEnumJsonCodec: com.github.plokhotnyuk.jsoniter_scala.core.JsonValueCodec[AnEnum] = com.github.plokhotnyuk.jsoniter_scala.macros.JsonCodecMaker.make(com.github.plokhotnyuk.jsoniter_scala.macros.CodecMakerConfig.withAllowRecursiveTypes(true).withTransientEmpty(false).withTransientDefault(false).withRequireCollectionFields(true).withDiscriminatorFieldName(scala.None))
-implicit lazy val aDTWithDiscriminatorCodec: com.github.plokhotnyuk.jsoniter_scala.core.JsonValueCodec[ADTWithDiscriminator] = com.github.plokhotnyuk.jsoniter_scala.macros.JsonCodecMaker.make(com.github.plokhotnyuk.jsoniter_scala.macros.CodecMakerConfig.withAllowRecursiveTypes(true).withTransientEmpty(false).withTransientDefault(false).withRequireCollectionFields(true).withRequireDiscriminatorFirst(false).withDiscriminatorFieldName(Some("type")).withAdtLeafClassNameMapper(x => com.github.plokhotnyuk.jsoniter_scala.macros.JsonCodecMaker.simpleClassName(x) match {
+implicit lazy val aDTWithDiscriminatorJsonCodec: com.github.plokhotnyuk.jsoniter_scala.core.JsonValueCodec[ADTWithDiscriminator] = com.github.plokhotnyuk.jsoniter_scala.macros.JsonCodecMaker.make(com.github.plokhotnyuk.jsoniter_scala.macros.CodecMakerConfig.withAllowRecursiveTypes(true).withTransientEmpty(false).withTransientDefault(false).withRequireCollectionFields(true).withRequireDiscriminatorFirst(false).withDiscriminatorFieldName(Some("type")).withAdtLeafClassNameMapper(x => com.github.plokhotnyuk.jsoniter_scala.macros.JsonCodecMaker.simpleClassName(x) match {

These were renamed to be consistent with the naming convention used for all the other codec names. The discrepancy was annoying.
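
Judging only from the two names visible in the diff (`anEnumJsonCodec` and `aDTWithDiscriminatorJsonCodec`), the convention appears to be "lower-case the first character of the type name and append `JsonCodec`". A minimal sketch of that rule follows; `CodecNaming.jsonCodecName` is a hypothetical helper, and the real generator may derive names differently.

```scala
object CodecNaming {
  // Assumed rule, inferred from the names in the diff above:
  // lower-case the first character and append "JsonCodec".
  def jsonCodecName(typeName: String): String = {
    require(typeName.nonEmpty, "type name must not be empty")
    s"${typeName.head.toLower}${typeName.tail}JsonCodec"
  }
}
```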

@@ -0,0 +1,121 @@
lazy val root = (project in file("."))

hughsimpson commented Jun 23, 2026

I'm gonna be honest, I had an LLM write a lot of these test cases for me. The generated cases actually caught a completely orthogonal bug in case-handling. I don't feel too guilty about it.

_: OpenapiSchemaObject | _: OpenapiSchemaArray | _: OpenapiSchemaMap | _: OpenapiSchemaEnum | _: OpenapiSchemaOneOf |
_: OpenapiSchemaAny,
_
) =>

This fixes a completely unrelated bug: we weren't matching the non-JSON OpenapiSchemaArray case before, so we were throwing instead.
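
The class of bug described here can be sketched with a toy hierarchy. Everything below (`Schema`, `ArraySchema`, `NonJsonBody`, the content types and the returned strings) is hypothetical and much simpler than the real `OpenapiSchema*` types; it only illustrates how a missing alternative in a pattern falls through to a throwing branch.

```scala
sealed trait Schema
case object StringSchema extends Schema
final case class ArraySchema(items: Schema) extends Schema
final case class ObjectSchema(fields: Map[String, Schema]) extends Schema

object NonJsonBody {
  // Before: array schemas with a non-JSON content type are not listed in the
  // alternatives, so they fall through to the throwing case.
  def before(contentType: String, schema: Schema): String =
    (contentType, schema) match {
      case ("application/json", _)             => "json"
      case (_, StringSchema | _: ObjectSchema) => "raw"
      case (ct, other) =>
        throw new IllegalArgumentException(s"Unsupported: $ct with $other")
    }

  // After: the array case is added to the alternatives and handled like the rest.
  def after(contentType: String, schema: Schema): String =
    (contentType, schema) match {
      case ("application/json", _)                              => "json"
      case (_, StringSchema | _: ObjectSchema | _: ArraySchema) => "raw"
    }
}
```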

hughsimpson changed the title from "codegen: Support declaration deduplication (WIP)" to "codegen: Support declaration deduplication" Jun 23, 2026
hughsimpson marked this pull request as ready for review June 23, 2026 18:19

hughsimpson commented Jun 23, 2026

@adamw I apologise if this whole thing is just completely mad and a 'won't ever merge' scenario, but I was hoping that, if that were the case, you would've commented on the issue I opened at some point, so I've just gone for it 😅. It does, I think, go some way towards addressing the 'all the schemas are in one place' issue, although you'd really need multiple inheritance to take full advantage of that. Chaining sucks. I do have follow-ups planned, but this one was important because those follow-ups would break the entire code-munging flow that let me get away without this until now.

hughsimpson commented Jun 23, 2026

Actually, it strikes me that all you'd need is a little extra config param to set 'uncommitted' dependency versions, and you'd have a mechanism for defining the models in one module and then reusing them in another. It might be nice to be able to use some generated classes as genuine domain objects before combining them with the API proper at a higher layer. All you'd have to do is give them the same package name and root OpenAPI document in both modules and inherit the model schemas into your full endpoint definitions, and you should be able to do that (with some caveats: maybe there should be a way to force certain 'positional-only' generations on enums if you're going to use it that way, but I've not explored that space yet).

Obviously if you're writing tapir as God intended, you're free to define your model classes anywhere, and this could help restore some parity.
