
ata-validator

Ultra-fast JSON Schema validator powered by simdjson. Multi-core parallel validation, RE2 regex, codegen bytecode engine. Standard Schema V1 compatible.

ata-validator.com | API Docs | Contributing

Performance

Simple Schema (7 properties, type + format + range + nested object)

| Scenario | ata | ajv | |
|---|---|---|---|
| validate(obj) valid | 22ns | 102ns | ata 4.6x faster |
| validate(obj) invalid | 87ns | 182ns | ata 2.1x faster |
| isValidObject(obj) | 21ns | 100ns | ata 4.7x faster |
| Schema compilation | 695ns | 1.30ms | ata 1,867x faster |
| First validation | 2.07μs | 1.11ms | ata 534x faster |

Complex Schema (patternProperties + dependentSchemas + propertyNames + additionalProperties)

| Scenario | ata | ajv | |
|---|---|---|---|
| validate(obj) valid | 17ns | 115ns | ata 6.8x faster |
| validate(obj) invalid | 59ns | 194ns | ata 3.3x faster |
| isValidObject(obj) | 19ns | 124ns | ata 6.6x faster |

Cross-Schema $ref (multi-schema with $id registry)

| Scenario | ata | ajv | |
|---|---|---|---|
| validate(obj) valid | 17ns | 25ns | ata 1.5x faster |
| validate(obj) invalid | 34ns | 54ns | ata 1.6x faster |

Measured with mitata on Apple M4 Pro (process-isolated). Benchmark code

unevaluatedProperties / unevaluatedItems

| Scenario | ata | ajv | |
|---|---|---|---|
| Tier 1 (properties only) valid | 3.3ns | 8.7ns | ata 2.6x faster |
| Tier 1 invalid | 3.7ns | 19.1ns | ata 5.2x faster |
| Tier 2 (allOf) valid | 3.3ns | 9.9ns | ata 3.0x faster |
| Tier 3 (anyOf) valid | 6.7ns | 23.2ns | ata 3.5x faster |
| Tier 3 invalid | 7.1ns | 42.4ns | ata 6.0x faster |
| unevaluatedItems valid | 1.0ns | 5.5ns | ata 5.4x faster |
| unevaluatedItems invalid | 0.96ns | 14.2ns | ata 14.8x faster |
| Compilation | 375ns | 2.59ms | ata 6,904x faster |

Three-tier hybrid codegen: static schemas compile to zero-overhead key checks, dynamic schemas (anyOf/oneOf) use bitmask tracking with V8-inlined branch functions. Benchmark code
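The bitmask tracking mentioned above can be illustrated with a small hand-written validator. This is only a sketch of the general idea, not ata's generated code: the schema (a top-level `id` property, an `anyOf` with two branches, and `unevaluatedProperties: false`), the bit assignments, and every function name are assumptions made for the example.

```javascript
// Hypothetical illustration of bitmask tracking; not ata's generated code.
// Assumed schema: properties { id }, anyOf [ { a: string }, { b: number } ]
// (each branch requires its property), unevaluatedProperties: false.

// One bit per property name known at compile time.
const BIT = Object.assign(Object.create(null), { id: 1, a: 2, b: 4 });

// Each anyOf branch returns the bits it evaluated, or -1 if it does not match.
const branchA = (obj) => (typeof obj.a === 'string' ? BIT.a : -1);
const branchB = (obj) => (typeof obj.b === 'number' ? BIT.b : -1);
const BRANCHES = [branchA, branchB];

function validateTracked(obj) {
  let evaluated = BIT.id; // covered by the top-level `properties`
  let matched = false;
  for (const branch of BRANCHES) {
    const bits = branch(obj);
    if (bits !== -1) {
      matched = true;
      evaluated |= bits; // every matching branch contributes its bits
    }
  }
  if (!matched) return false; // anyOf needs at least one matching branch

  // unevaluatedProperties: false -> every own key must have its bit set.
  for (const key of Object.keys(obj)) {
    const bit = BIT[key];
    if (bit === undefined || (evaluated & bit) === 0) return false;
  }
  return true;
}
```

In this sketch a key that belongs to a branch that did not match (for example `b` holding a string) is rejected, because a failed branch contributes no bits.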

vs Ecosystem (Zod, Valibot, TypeBox)

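| Scenario | ata | ajv | typebox | zod | valibot |
|---|---|---|---|---|---|
| validate (valid) | 9ns | 38ns | 50ns | 334ns | 326ns |
| validate (invalid) | 37ns | 103ns | 4ns | 11.8μs | 842ns |
| compilation | 584ns | 1.20ms | 52μs | - | - |
| first validation | 2.1μs | 1.11ms | 54μs | - | - |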

Different categories: ata/ajv/typebox are JSON Schema validators, zod/valibot are schema-builder DSLs. Benchmark code

Large Data - JS Object Validation

| Size | ata | ajv | |
|---|---|---|---|
| 10 users (2KB) | 6.2M ops/sec | 2.5M ops/sec | ata 2.5x faster |
| 100 users (20KB) | 658K ops/sec | 243K ops/sec | ata 2.7x faster |
| 1,000 users (205KB) | 64K ops/sec | 23.5K ops/sec | ata 2.7x faster |

Real-World Scenarios

| Scenario | ata | ajv | |
|---|---|---|---|
| Serverless cold start (50 schemas) | 0.1ms | 23ms | ata 230x faster |
| ReDoS protection (^(a+)+$) | 0.3ms | 765ms | ata immune (RE2) |
| Batch NDJSON (10K items, multi-core) | 13.4M/sec | 5.1M/sec | ata 2.6x faster |
| Fastify startup (5 routes) | 0.5ms | 6.0ms | ata 12x faster |

Isolated single-schema benchmarks. Results vary by workload and hardware.

How it works

Combined single-pass validator: ata compiles each schema into a single function that validates and collects errors in one pass. Valid data returns VALID_RESULT with zero allocation. Invalid data collects errors inline using pre-allocated frozen error objects, so there is no double validation and no try/catch (a 3.3x V8 deoptimization cost). Lazy compilation defers all work to first use, so the constructor is near-zero cost.
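As a rough illustration of the single-pass idea, the sketch below hand-writes what such a combined function could look like for a schema with a required `name` and an optional non-negative integer `age`. The error object shape, the constant names other than VALID_RESULT, and the function name are assumptions for the example, not ata's actual output.

```javascript
// Hypothetical sketch of a combined validate-and-collect function.
// The error shape and all names except VALID_RESULT are assumptions.
const VALID_RESULT = Object.freeze({ valid: true, errors: Object.freeze([]) });

// Error objects are created once, up front, and reused on every failure.
const ERR_NAME_REQUIRED = Object.freeze({ path: '/name', keyword: 'required' });
const ERR_AGE_TYPE = Object.freeze({ path: '/age', keyword: 'type' });
const ERR_AGE_MINIMUM = Object.freeze({ path: '/age', keyword: 'minimum' });

// Appends to the error list, allocating the array only on the first failure.
function addError(errors, err) {
  if (errors === null) return [err];
  errors.push(err);
  return errors;
}

function validateUser(obj) {
  let errors = null; // stays null on the valid path: nothing is allocated
  const { name, age } = obj;
  if (name === undefined) errors = addError(errors, ERR_NAME_REQUIRED);
  if (age !== undefined) {
    if (!Number.isInteger(age)) errors = addError(errors, ERR_AGE_TYPE);
    else if (age < 0) errors = addError(errors, ERR_AGE_MINIMUM);
  }
  return errors === null ? VALID_RESULT : { valid: false, errors };
}
```

Every valid input receives the same frozen result object, and the invalid path never re-runs the checks to build its error list.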

JS codegen: Schemas are compiled to monolithic JS functions (like ajv). Full keyword support including patternProperties, dependentSchemas, propertyNames, unevaluatedProperties, unevaluatedItems, cross-schema $ref with $id registry, and Draft 7 auto-detection. Three-tier hybrid approach for unevaluated keywords: compile-time resolution for static schemas, bitmask tracking for dynamic ones. charCodeAt prefix matching replaces regex for simple patterns (4x faster). Merged key iteration loops (patternProperties + propertyNames + additionalProperties in a single for..in).
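The charCodeAt prefix matching described above can be sketched as a tiny code generator. This is an illustration under assumptions: `simplePrefix` and `compilePrefixCheck` are hypothetical names, and the rule used here to decide that a pattern is "simple" (a `^` anchor followed only by letters, digits, `_` or `-`) is a guess at one reasonable criterion, not ata's actual rule.

```javascript
// Hypothetical helpers; not part of the ata API.

// Returns the literal prefix when the pattern is "^" followed only by
// characters with no regex meaning, otherwise null.
function simplePrefix(pattern) {
  const m = /^\^([A-Za-z0-9_-]+)$/.exec(pattern);
  return m === null ? null : m[1];
}

// Generates a function equivalent to new RegExp('^' + prefix).test(k),
// built from a length check plus one charCodeAt comparison per character.
function compilePrefixCheck(prefix) {
  const checks = [`k.length >= ${prefix.length}`];
  for (let i = 0; i < prefix.length; i++) {
    checks.push(`k.charCodeAt(${i}) === ${prefix.charCodeAt(i)}`);
  }
  return new Function('k', `return ${checks.join(' && ')};`);
}

// Example: a patternProperties key such as "^x-".
const prefix = simplePrefix('^x-');
const isExtensionKey = compilePrefixCheck(prefix);
```

Patterns that fail the `simplePrefix` test would fall back to a real regex engine in a design like this.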

V8 TurboFan optimizations: Destructuring batch reads, undefined checks instead of in operator, context-aware type guard elimination, property hoisting to local variables, tiered uniqueItems (nested loop for small arrays), inline key comparison for small property sets (no Set.has overhead).
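The tiered uniqueItems strategy can be sketched as follows. The threshold of 8 and the function name are assumptions, and the sketch only handles arrays of JSON primitives (strings, numbers, booleans, null); full uniqueItems semantics also require deep equality for nested objects and arrays.

```javascript
// Hypothetical sketch; the cutoff value is an assumption, not ata's.
const SMALL_ARRAY_LIMIT = 8;

// Checks uniqueness for arrays of JSON primitives only.
function hasUniquePrimitives(items) {
  const n = items.length;
  if (n <= SMALL_ARRAY_LIMIT) {
    // Small arrays: a nested loop avoids allocating a Set.
    for (let i = 1; i < n; i++) {
      for (let j = 0; j < i; j++) {
        if (items[i] === items[j]) return false;
      }
    }
    return true;
  }
  // Larger arrays: a Set keeps the check roughly linear.
  return new Set(items).size === n;
}
```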

Adaptive simdjson: For large documents (>8KB) with selective schemas, simdjson On Demand seeks only the needed fields - skipping irrelevant data at GB/s speeds.

JSON Schema Test Suite

96.9% pass rate (1109/1144) on official JSON Schema Test Suite (Draft 2020-12).

When to use ata

  • High-throughput validate(obj) - 6.8x faster than ajv on complex schemas, 38x faster than zod
  • Complex schemas - patternProperties, dependentSchemas, propertyNames, unevaluatedProperties all inline JS codegen (6.8x faster than ajv)
  • Multi-schema projects - cross-schema $ref with $id registry, addSchema() API
  • Draft 7 migration - auto-detects $schema, normalizes Draft 7 keywords transparently
  • Serverless / cold starts - up to 6,904x faster compilation (unevaluatedProperties benchmark) and 534x faster first validation (simple schema benchmark)
  • Security-sensitive apps - RE2 regex, immune to ReDoS attacks
  • Batch/streaming validation - NDJSON log processing, data pipelines (2.6x faster)
  • Standard Schema V1 - native support for Fastify v5, tRPC, TanStack
  • C/C++ embedding - native library, no JS runtime needed

When to use ajv

  • 100% spec compliance needed - ajv covers more edge cases (ata: 96.9%)
  • $dynamicRef - not yet supported in ata

Features

  • Hybrid validator: 6.8x faster than ajv on valid data and 3.3x faster on invalid data for complex schemas - a jsFn boolean guard handles the valid path (zero allocation), and combined codegen with pre-allocated errors handles the invalid path. A schema compilation cache covers repeated schemas
  • Cross-schema $ref: schemas option and addSchema() API. Compile-time resolution with $id registry, zero runtime overhead
  • Draft 7 support: Auto-detects $schema field, normalizes dependencies/additionalItems/definitions transparently
  • Multi-core: Parallel validation across all CPU cores - 13.4M validations/sec
  • simdjson: SIMD-accelerated JSON parsing at GB/s speeds, adaptive On Demand for large docs
  • RE2 regex: Linear-time guarantees, immune to ReDoS attacks (2391x faster on pathological input)
  • V8-optimized codegen: Destructuring batch reads, type guard elimination, property hoisting
  • Standard Schema V1: Compatible with Fastify, tRPC, TanStack, Drizzle
  • Zero-copy paths: Buffer and pre-padded input support - no unnecessary copies
  • Defaults + coercion: default values, coerceTypes, removeAdditional support
  • C/C++ library: Native API for non-Node.js environments
  • 96.9% spec compliant: Draft 2020-12
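The Draft 7 normalization listed above (dependencies, additionalItems, definitions) can be pictured with a shallow rewrite into Draft 2020-12 keywords. This is a sketch under assumptions: `normalizeDraft7` is a hypothetical name, it only touches the top level of the schema, and it does not rewrite `$ref` pointers such as `#/definitions/...`, which a complete normalizer would also need to handle.

```javascript
// Hypothetical, shallow Draft 7 -> Draft 2020-12 keyword rewrite.
// Not ata's implementation; nested subschemas and $ref pointers are untouched.
function normalizeDraft7(schema) {
  const out = { ...schema };

  // definitions -> $defs
  if (out.definitions !== undefined && out.$defs === undefined) {
    out.$defs = out.definitions;
    delete out.definitions;
  }

  // items: [..] + additionalItems -> prefixItems + items
  if (Array.isArray(out.items)) {
    out.prefixItems = out.items;
    if (out.additionalItems !== undefined) out.items = out.additionalItems;
    else delete out.items;
  }
  delete out.additionalItems; // ignored in Draft 7 when items is not an array

  // dependencies -> dependentRequired (arrays) / dependentSchemas (schemas)
  if (out.dependencies !== undefined) {
    for (const [key, dep] of Object.entries(out.dependencies)) {
      if (Array.isArray(dep)) {
        if (out.dependentRequired === undefined) out.dependentRequired = {};
        out.dependentRequired[key] = dep;
      } else {
        if (out.dependentSchemas === undefined) out.dependentSchemas = {};
        out.dependentSchemas[key] = dep;
      }
    }
    delete out.dependencies;
  }
  return out;
}
```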

Installation

npm install ata-validator

Usage

Node.js

const { Validator } = require('ata-validator');

const v = new Validator({
  type: 'object',
  properties: {
    name: { type: 'string', minLength: 1 },
    email: { type: 'string', format: 'email' },
    age: { type: 'integer', minimum: 0 },
    role: { type: 'string', default: 'user' }
  },
  required: ['name', 'email']
});

// Fast boolean check - JS codegen, 15.3M ops/sec
v.isValidObject({ name: 'Mert', email: 'mert@example.com', age: 26 }); // true

// Full validation with error details + defaults applied
const result = v.validate({ name: 'Mert', email: 'mert@example.com' });
// result.valid === true, data.role === 'user' (default applied)

// JSON string validation (simdjson fast path)
v.validateJSON('{"name": "Mert", "email": "mert@example.com"}');
v.isValidJSON('{"name": "Mert", "email": "mert@example.com"}'); // true

// Buffer input (zero-copy, raw NAPI)
v.isValid(Buffer.from('{"name": "Mert", "email": "mert@example.com"}'));

// Parallel batch - multi-core, NDJSON, 13.4M items/sec
const ndjson = Buffer.from(lines.join('\n'));
v.isValidParallel(ndjson);  // bool[]
v.countValid(ndjson);        // number
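The parallel batch snippet above uses a `lines` array that is not defined. One way to produce the NDJSON buffer from plain objects is sketched below; `toNdjsonBuffer` and the sample records are hypothetical and only use Node.js built-ins.

```javascript
// Hypothetical helper: builds an NDJSON buffer (one JSON document per line).
function toNdjsonBuffer(records) {
  // JSON.stringify without an indent argument never emits a raw newline,
  // so "\n" is safe as the record delimiter.
  const lines = records.map((record) => JSON.stringify(record));
  return Buffer.from(lines.join('\n'));
}

const batch = toNdjsonBuffer([
  { name: 'Mert', email: 'mert@example.com' },
  { name: 'Multi\nline', email: 'x@example.com' },
]);
// `batch` can then be passed where the snippet above passes `ndjson`.
```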

Cross-Schema $ref

const addressSchema = {
  $id: 'https://example.com/address',
  type: 'object',
  properties: { street: { type: 'string' }, city: { type: 'string' } },
  required: ['street', 'city']
};

const v = new Validator({
  type: 'object',
  properties: {
    name: { type: 'string' },
    address: { $ref: 'https://example.com/address' }
  }
}, { schemas: [addressSchema] });

// Or use addSchema()
const v2 = new Validator(mainSchema);
v2.addSchema(addressSchema);
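To show what resolving `$ref` against an `$id` registry ahead of time can mean, here is a simplified sketch. It is not ata's resolver: `buildRegistry` and `resolveRefs` are hypothetical names, only absolute `$ref` values that exactly match a registered `$id` are handled, keywords placed next to `$ref` are dropped, and values under `enum`, `const` or `default` are not treated specially.

```javascript
// Hypothetical sketch of $id-based reference resolution.

// Index every schema that declares a string $id.
function buildRegistry(schemas) {
  const registry = new Map();
  for (const schema of schemas) {
    if (typeof schema.$id === 'string') registry.set(schema.$id, schema);
  }
  return registry;
}

// Replace { $ref: <registered $id> } nodes with the registered schema object.
// The target is returned as-is (not walked), so cyclic schemas cannot loop.
function resolveRefs(node, registry) {
  if (Array.isArray(node)) return node.map((item) => resolveRefs(item, registry));
  if (node === null || typeof node !== 'object') return node;
  if (typeof node.$ref === 'string' && registry.has(node.$ref)) {
    return registry.get(node.$ref);
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    out[key] = resolveRefs(value, registry);
  }
  return out;
}

const address = {
  $id: 'https://example.com/address',
  type: 'object',
  required: ['street', 'city'],
};
const resolved = resolveRefs(
  { type: 'object', properties: { address: { $ref: 'https://example.com/address' } } },
  buildRegistry([address])
);
```

After this step the validator no longer needs a lookup at validation time, because `resolved.properties.address` already points at the address schema.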

Options

const v = new Validator(schema, {
  coerceTypes: true,       // "42" → 42 for integer fields
  removeAdditional: true,  // strip properties not in schema
  schemas: [otherSchema],  // cross-schema $ref registry
});

Standalone Pre-compilation

Pre-compile schemas to JS files for near-zero startup. No native addon needed at runtime.

const fs = require('fs');

// Build phase (once)
const v = new Validator(schema);
fs.writeFileSync('./compiled.js', v.toStandalone());

// Read phase (every startup) - 0.6μs per schema, pure JS
const v2 = Validator.fromStandalone(require('./compiled.js'), schema);

// Bundle multiple schemas - deduplicated, single file
fs.writeFileSync('./bundle.js', Validator.bundleCompact(schemas));
const validators = Validator.loadBundle(require('./bundle.js'), schemas);

Fastify startup (5 routes): ajv 6.0ms → ata 0.5ms (12x faster, no build step needed)

Standard Schema V1

const v = new Validator(schema);

// Works with Fastify, tRPC, TanStack, etc.
const result = v['~standard'].validate(data);
// { value: data } on success
// { issues: [{ message, path }] } on failure
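For readers unfamiliar with the Standard Schema V1 interface, the sketch below shows how a consumer might use the `~standard` property. A stub object stands in for an ata Validator so the example is self-contained; `stubSchema`, `formatPath` and `parseWith` are hypothetical names, and only the synchronous case is handled even though the spec allows `validate` to return a Promise.

```javascript
// Stub standing in for a validator; only the '~standard' shape matters here.
const stubSchema = {
  '~standard': {
    version: 1,
    vendor: 'example',
    validate(value) {
      if (typeof value === 'object' && value !== null && typeof value.name === 'string') {
        return { value };
      }
      return { issues: [{ message: 'name must be a string', path: ['name'] }] };
    },
  },
};

// Path segments may be plain keys or { key } objects per the spec.
function formatPath(path) {
  return (path ?? [])
    .map((seg) => String(typeof seg === 'object' && seg !== null ? seg.key : seg))
    .join('.');
}

// Hypothetical consumer: returns the value or throws with all issues joined.
function parseWith(schema, data) {
  const result = schema['~standard'].validate(data);
  if (result instanceof Promise) {
    throw new TypeError('async validation is not handled in this sketch');
  }
  if (result.issues) {
    const text = result.issues.map((i) => `${formatPath(i.path)}: ${i.message}`).join('; ');
    throw new Error(text);
  }
  return result.value;
}
```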

Fastify Plugin

npm install fastify-ata

const fastify = require('fastify')();
fastify.register(require('fastify-ata'), {
  coerceTypes: true,
  removeAdditional: true,
});

// All existing JSON Schema route definitions work as-is

C++

#include "ata.h"

auto schema = ata::compile(R"({
  "type": "object",
  "properties": { "name": {"type": "string"} },
  "required": ["name"]
})");

auto result = ata::validate(schema, R"({"name": "Mert"})");
// result.valid == true

Supported Keywords

| Category | Keywords |
|---|---|
| Type | type |
| Numeric | minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf |
| String | minLength, maxLength, pattern, format |
| Array | items, prefixItems, minItems, maxItems, uniqueItems, contains, minContains, maxContains, unevaluatedItems |
| Object | properties, required, additionalProperties, patternProperties, minProperties, maxProperties, propertyNames, dependentRequired, dependentSchemas, unevaluatedProperties |
| Enum/Const | enum, const |
| Composition | allOf, anyOf, oneOf, not |
| Conditional | if, then, else |
| References | $ref, $defs, definitions, $id |
| Boolean | true, false |

Format Validators (hand-written, no regex)

email, date, date-time, time, uri, uri-reference, ipv4, ipv6, uuid, hostname

Building from Source

Development prerequisites

Native builds require C/C++ toolchain support and the following libraries:

  • re2
  • abseil
  • mimalloc

Install them before running npm install / npm run build:

# macOS (Homebrew)
brew install re2 abseil mimalloc

# Ubuntu/Debian (apt)
sudo apt-get update
sudo apt-get install -y libre2-dev libabsl-dev libmimalloc-dev

# C++ library + tests
cmake -B build
cmake --build build
./build/ata_tests

# Node.js addon
npm install
npm run build
npm test

# JSON Schema Test Suite
npm run test:suite

License

MIT

Authors

Mert Can Altin, Daniel Lemire
