[FEATURE] Add compact binary model format (.namb) for fast embedded loading #222
jfsantos wants to merge 6 commits into sdatkinson:main
…oading

Introduces a binary serialization format that eliminates the need for JSON parsing (no nlohmann/json dependency in the loader), achieving 80-89% file size reduction. Supports all architectures including WaveNet with nested condition_dsp.

New files:
- NAM/namb_format.h: format constants, CRC32, binary reader/writer
- NAM/get_dsp_namb.h/.cpp: binary loader (zero JSON dependency)
- tools/nam2namb.cpp: JSON-to-binary converter CLI tool
- tools/test/test_namb.cpp: round-trip, validation, and size tests
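For readers unfamiliar with the two building blocks named above (CRC32 and a binary reader), here is a minimal sketch of what they could look like. It is not the code from NAM/namb_format.h: the names `crc32_sketch` and `ByteReader` are hypothetical, and the little-endian byte order and the choice of the standard reflected CRC-32 polynomial are assumptions, since the actual .namb layout is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical stand-in for a CRC helper. Standard reflected CRC-32
// (polynomial 0xEDB88320), computed bit by bit with no lookup table.
// The loop never dereferences `data` when len == 0, so empty input
// (even with a null pointer) yields 0.
inline uint32_t crc32_sketch(const uint8_t* data, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i)
  {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical bounds-checked reader over an in-memory buffer.
// ASSUMPTION: multi-byte values are stored little-endian.
class ByteReader
{
public:
  ByteReader(const uint8_t* data, size_t size)
  : _data(data)
  , _size(size)
  {
  }

  // Assemble a 32-bit value byte by byte so the result does not depend
  // on host endianness or on the alignment of the buffer.
  uint32_t read_u32()
  {
    require(4);
    const uint32_t v = uint32_t(_data[_pos]) | (uint32_t(_data[_pos + 1]) << 8)
                       | (uint32_t(_data[_pos + 2]) << 16) | (uint32_t(_data[_pos + 3]) << 24);
    _pos += 4;
    return v;
  }

  // Reinterpret the next 4 bytes as an IEEE-754 float via memcpy.
  float read_f32()
  {
    const uint32_t bits = read_u32();
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
  }

  size_t remaining() const { return _size - _pos; }

private:
  // Throw instead of reading past the end of a truncated file.
  void require(size_t n) const
  {
    if (_size - _pos < n)
      throw std::out_of_range("truncated buffer");
  }

  const uint8_t* _data;
  size_t _size;
  size_t _pos = 0;
};
```

A reader like this needs no heap allocation and no intermediate parse tree, which is the property that makes a binary loader attractive on low-memory targets.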
Is this intended as a new public distribution format? Or just as a tool for situations where using JSON is impractical? A huge benefit of the JSON format is that it makes the .nam format very open and easy to integrate with. A custom binary format is much less so.
tools/test/test_namb.cpp
Outdated
uint32_t crc_empty = nam::namb::crc32(nullptr, 0);
assert(crc_empty == 0x00000000u);

std::cout << " PASS" << std::endl;
Can you make the tests silent on the happy path?
oops, I did it again. I always mean to delete those after making sure everything works, and always forget.
<< std::fixed << std::setprecision(1) << reduction << "% reduction)" << std::endl;

// .namb should always be smaller than .nam
assert(namb_size < nam_size);
Hmm...while it's true, it feels wrong to me to be asserting this.
The assertions were to make sure the binary conversion wasn't doing anything silly... but we can remove them.
assert(namb_size < nam_size);

// Should be at least 50% reduction (typically ~85%)
assert(reduction > 50.0);
A few high-level questions to ponder @jfsantos:
Thoughts appreciated.
@mikeoliphant the latter: the current JSON-subset ".nam" format will still be the recommended "lingua franca" for models. I agree with you; it's very handy to be able to look at the files as a human :) and this feels like a nice "add-on". In fact, this might be entirely separate from this "core" library. Arguments for or against doing that are welcome.
I'm thinking about it from a support perspective for NAM implementations that don't use NAM Core (I've got mine, Dimehead has theirs, etc.). If this new format is going to be used as a new distribution format (i.e., on Tone3000), then we'll need to add support for it.
@mikeoliphant this is intended for low-memory platforms where parsing JSON is not feasible because of the high memory costs. It will never be the main distribution system; that will still be JSON-based.
@sdatkinson thanks for your questions!
Introduce typed config structs per architecture (LinearConfig, LSTMConfig, ConvNetConfig, WaveNetConfig) and a single create_dsp() function that both the JSON and .namb binary loaders feed into. This eliminates duplicated construction logic: adding a new architecture now only requires a format-specific parser, not separate factory code in each loader.

- Add config structs and parse_config_json() to each architecture header/impl
- Add NAM/model_config.h with ModelConfig variant, ModelMetadata, create_dsp()
- Refactor get_dsp(dspData&) to use parse_model_config_json() → create_dsp()
- Refactor get_dsp_namb.cpp load_*() to return typed configs → create_dsp()
- Register missing Linear factory in FactoryRegistry
- Silence test_namb.cpp output on success to match other tests
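To illustrate the pattern this commit describes (format-specific parsers produce a typed config, and one shared function builds the model from it), here is a rough C++17 sketch. Everything in it is a simplified assumption: the struct fields, the `DSP` interface, and the `create_dsp` signature inside the `sketch` namespace are hypothetical and are not taken from NAM/model_config.h. Only two of the four architectures are shown.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

namespace sketch
{
// Hypothetical, heavily simplified config structs. The real fields may differ.
struct LinearConfig
{
  int receptive_field;
  bool bias;
};

struct LSTMConfig
{
  int num_layers;
  int input_size;
  int hidden_size;
};

// One variant type that every loader (JSON or binary) can produce.
using ModelConfig = std::variant<LinearConfig, LSTMConfig>;

// Minimal stand-in for a model base class.
struct DSP
{
  virtual ~DSP() = default;
  virtual std::string name() const = 0;
};

struct Linear : DSP
{
  Linear(const LinearConfig&, std::vector<float>) {}
  std::string name() const override { return "Linear"; }
};

struct LSTM : DSP
{
  LSTM(const LSTMConfig&, std::vector<float>) {}
  std::string name() const override { return "LSTM"; }
};

// One overload per architecture. Adding an architecture means adding a
// config struct, a variant alternative, and one overload here.
inline std::unique_ptr<DSP> build(const LinearConfig& c, std::vector<float> w)
{
  return std::make_unique<Linear>(c, std::move(w));
}

inline std::unique_ptr<DSP> build(const LSTMConfig& c, std::vector<float> w)
{
  return std::make_unique<LSTM>(c, std::move(w));
}

// Single construction path shared by all loaders: std::visit dispatches
// to the overload matching whichever alternative the variant holds.
inline std::unique_ptr<DSP> create_dsp(const ModelConfig& config, std::vector<float> weights)
{
  return std::visit(
    [&weights](const auto& c) -> std::unique_ptr<DSP> { return build(c, std::move(weights)); }, config);
}
} // namespace sketch
```

With this shape, a JSON parser and a binary parser would each only need to return a `ModelConfig` plus a weight vector; neither would contain model construction code of its own.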
Closed in favour of #227
Developed with support and sponsorship from TONE3000