@origints/core

The core package provides the planning system, execution runtime, lineage tracking, schema derivation, and all format-independent extraction primitives.

Installation

npm install @origints/core

Features

Two-phase architecture: plan then execute
Immutable execution plans that can be inspected and serialized
First-class provenance tracking with full lineage graphs
Structured failure types (missing, type, format, constraint, runtime, panic, validation)
Fail-fast execution with no silent coercions
Schema validation via Standard Schema (Zod, Valibot, etc.)
Transform registry for decoupled execution logic
JSON Schema derivation from plans or specs
Output transforms (groupBy, aggregate, sort, filter, etc.)
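
To make the first few features concrete, here is a minimal, self-contained sketch of the plan-then-execute idea with structured failures and fail-fast behaviour. It does not use @origints/core: `Step`, `Failure`, `Outcome`, `buildPlan`, and `execute` are hypothetical names invented for illustration, and only two of the failure kinds listed above are modelled.

```typescript
// Hypothetical sketch: none of these names come from @origints/core.
type Step = { key: string; expect: 'string' | 'number' }

type Failure = { kind: 'missing' | 'type'; key: string; message: string }

type Outcome =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; failure: Failure }

// Phase 1: describe the work as frozen, plain data that can be
// inspected or passed through JSON.stringify.
function buildPlan(steps: Step[]): readonly Step[] {
  return Object.freeze(steps.map(s => Object.freeze({ ...s })))
}

// Phase 2: run the plan against an input, stopping at the first failure.
function execute(plan: readonly Step[], input: Record<string, unknown>): Outcome {
  const value: Record<string, unknown> = {}
  for (const step of plan) {
    if (!(step.key in input)) {
      const message = `no value at "${step.key}"`
      return { ok: false, failure: { kind: 'missing', key: step.key, message } }
    }
    const raw = input[step.key]
    // No silent coercion: the string '30' is rejected where a number is expected.
    if (typeof raw !== step.expect) {
      const message = `expected ${step.expect}, got ${typeof raw}`
      return { ok: false, failure: { kind: 'type', key: step.key, message } }
    }
    value[step.key] = raw
  }
  return { ok: true, value }
}

const plan = buildPlan([
  { key: 'name', expect: 'string' },
  { key: 'age', expect: 'number' },
])

const good = execute(plan, { name: 'Alice', age: 30 })
// good: { ok: true, value: { name: 'Alice', age: 30 } }

const bad = execute(plan, { name: 'Alice', age: '30' })
// bad: { ok: false, failure: { kind: 'type', key: 'age', ... } }
```

Because the plan is plain data, it can be logged or serialized before anything runs, and the same plan can be executed against many inputs.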

Inline data extraction

import { Planner, load, run } from '@origints/core'

const plan = new Planner()
  .in(load({ name: 'Alice', age: 30, role: 'admin' }))
  .emit((out, $) => out
    .add('name', $.get('name').string())
    .add('age', $.get('age').number())
    .add('role', $.get('role').string())
  )
  .compile()

const result = await run(plan)
// result.value: { name: 'Alice', age: 30, role: 'admin' }

File extraction with transforms

import * as fs from 'node:fs'
import { Planner, loadFile, run, parseJson } from '@origints/core'

const plan = new Planner()
  .in(loadFile('data.json'))
  .mapIn(parseJson())
  .emit((out, $) => out
    .add('id', $.get('id').number())
    .add('name', $.get('name').string())
  )
  .compile()

const result = await run(plan, {
  readFile: (path) => fs.promises.readFile(path),
})

Optional extraction

import { Planner, load, run, optional } from '@origints/core'

const plan = new Planner()
  .in(load({ name: 'Alice' }))
  .emit((out, $) => out
    .add('name', $.get('name').string())
    .add('nickname', optional($.get('nickname').string()))
    .add('score', optional($.get('score').number(), 0))
  )
  .compile()

const result = await run(plan)
// result.value: { name: 'Alice', nickname: undefined, score: 0 }

Fallback chains

import { Planner, load, run, tryExtract, mapSpec, literal } from '@origints/core'

const plan = new Planner()
  .in(load({ price: '42.50' }))
  .emit((out, $) => out
    .add('price', tryExtract(
      $.get('price').number(),
      mapSpec($.get('price').string(), v => parseFloat(v as string), 'parseFloat'),
      literal(null)
    ))
  )
  .compile()

const result = await run(plan)
// result.value: { price: 42.5 }
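
The evaluation order behind a fallback chain can be sketched without the library: attempts run left to right and the first one that does not fail supplies the value. The helper names below (`firstSuccess`, `strictNumber`, `parsedNumber`) are hypothetical and are not part of @origints/core. The sketch also assumes that an unparseable string counts as a failure; whether the library treats a NaN result that way is not stated here.

```typescript
// Hypothetical helpers, not part of @origints/core.
type Attempt<T> = () => T

// Try each attempt in order and return the first result that does not throw.
function firstSuccess<T>(...attempts: Attempt<T>[]): T {
  const messages: string[] = []
  for (const attempt of attempts) {
    try {
      return attempt()
    } catch (err) {
      messages.push(err instanceof Error ? err.message : String(err))
    }
  }
  throw new Error(`all attempts failed: ${messages.join('; ')}`)
}

// Accepts only real numbers, with no coercion from strings.
function strictNumber(v: unknown): number {
  if (typeof v !== 'number') throw new TypeError(`expected number, got ${typeof v}`)
  return v
}

// Accepts numeric strings; treats a NaN result as a failure (an assumption).
function parsedNumber(v: unknown): number {
  if (typeof v !== 'string') throw new TypeError(`expected string, got ${typeof v}`)
  const n = parseFloat(v)
  if (Number.isNaN(n)) throw new TypeError(`cannot parse "${v}" as a number`)
  return n
}

const input: Record<string, unknown> = { price: '42.50' }

const price = firstSuccess<number | null>(
  () => strictNumber(input.price), // fails: the value is a string
  () => parsedNumber(input.price), // succeeds with 42.5
  () => null,                      // last resort, never reached here
)
// price: 42.5
```

Ending the chain with a constant, as the `literal(null)` call does above, means the chain as a whole always produces a value.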

Guarded extraction

import { Planner, load, run, guard, tryExtract, literal } from '@origints/core'

const plan = new Planner()
  .in(load({ age: -5 }))
  .emit((out, $) => out
    .add('age', tryExtract(
      guard($.get('age').number(), v => (v as number) >= 0, 'Age must be non-negative'),
      literal(0)
    ))
  )
  .compile()

const result = await run(plan)
// result.value: { age: 0 }

Merging plans

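import { Planner } from '@origints/core'

// configPlan and usersPlan are plans built elsewhere

// Flat merge: combines outputs
const combined = Planner.merge(configPlan, usersPlan)

// Named merge: nests each plan's output under its key
const nested = Planner.mergeAs({ config: configPlan, users: usersPlan })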
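
The difference between the two merge styles is easiest to see in the shape of the result. The sketch below uses plain objects as hypothetical stand-ins for what the two plans would emit. It illustrates the output shapes only and does not call the library.

```typescript
// Hypothetical stand-ins for the values two plans would produce.
const configOut = { env: 'prod', retries: 3 }
const usersOut = { users: ['Alice', 'Bob'] }

// Flat: every key ends up at the top level of one object.
const flatShape = { ...configOut, ...usersOut }
// flatShape: { env: 'prod', retries: 3, users: ['Alice', 'Bob'] }

// Named: each output sits under the key it was registered with.
const nestedShape = { config: configOut, users: usersOut }
// nestedShape: { config: { env: 'prod', retries: 3 }, users: { users: ['Alice', 'Bob'] } }
```

With object spread, a key present in both outputs is silently overwritten by the later one. How Planner.merge handles such a collision is not covered here, so a named merge is the safer choice when the two plans may share keys.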

Inspecting lineage

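import { run, formatLineage, formatLineageAsString } from '@origints/core'

// `plan` is any compiled plan, such as the ones above
const result = await run(plan)

// Lineage rendered as a string
console.log(formatLineageAsString(result.lineage, plan.ast))

// Lineage as a value that can be serialized to JSON
const trace = formatLineage(result.lineage, plan.ast)
console.log(JSON.stringify(trace, null, 2))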

Schema derivation

import { JsonSchema } from '@origints/core'

// `plan` is any compiled plan, such as the ones above
const schema = JsonSchema.output(plan, {
  draft: '2020-12',
  title: 'User',
  deduplicate: true,
})

Non-goals

Not a general-purpose ETL framework
Not optimized for streaming large datasets
Not a schema definition language

License

MIT