Skip to content

ForEach Extraction

ForEachSpec enables data-driven extraction patterns — iterate over values from one extraction and use each to parameterize another. Combined with VariableRefSpec and dynamic predicates, this allows runtime values to control navigation and matching.

Sometimes the set of values to extract isn’t known at plan-build time. For example:

  • Extract revenue for each company name found in a separate data source
  • Look up cells in an XLSX workbook using values from a JSON config
  • Navigate to different columns based on a dynamic list of field names

ForEachSpec solves this by binding each source value to a named variable, then executing a body spec with that variable in scope.

import type { ForEachSpec, VariableRefSpec, SpecScope } from '@origints/core'
interface SpecScope {
readonly bindings: ReadonlyMap<string, unknown>
}
interface ForEachSpec<Source extends Spec = Spec, Body extends Spec = Spec> {
readonly kind: 'forEach'
readonly source: Source // must produce an array
readonly binding: string // variable name
readonly body: Body // executed once per source item
}
interface VariableRefSpec {
readonly kind: 'variableRef'
readonly name: string // variable to look up
readonly path?: readonly (string | number)[] // navigate into the value
readonly extract?: string // type check: 'string' | 'number' | 'boolean' | 'null'
}
import { forEach, variableRef, object, literal } from '@origints/core'
// Iterate over a source array, bind each item to 'company', execute body
forEach(source, 'company', body)
// Reference a bound variable
variableRef('company')
// Navigate into a bound value
variableRef('item', ['address', 'city'])
// Reference with type checking
variableRef('name', { extract: 'string' })
// Build an ObjectSpec from already-constructed specs (no SpecBuilder needed)
object({
name: variableRef('name', { extract: 'string' }),
items: someArraySpec,
})
import { forEach, variableRef, literal, executeSpec } from '@origints/core'
const spec = forEach(
literal(['Alice', 'Bob', 'Charlie']),
'name',
variableRef('name', { extract: 'string' })
)
const result = executeSpec(spec, {})
// result.value: ['Alice', 'Bob', 'Charlie']

Use object() to construct an ObjectSpec from already-built specs. Unlike extract() (which provides a JSON SpecBuilder for navigation), object() takes a plain record — ideal for forEach bodies that compose specs from multiple sources.

import { forEach, variableRef, object } from '@origints/core'
const spec = forEach(
$.get('ids').array(id => id.number()),
'userId',
object({
id: variableRef('userId', { extract: 'number' }),
profile: $.get('profiles').get(variableRef('userId')).string(),
})
)

extract() is still available when you need the JSON SpecBuilder ($):

// extract() provides a SpecBuilder root for JSON navigation
extract($ => ({
name: $.get('name').string(),
age: $.get('age').number(),
}))
// object() takes already-constructed specs directly
object({
name: variableRef('name', { extract: 'string' }),
items: someXlsxArraySpec,
})

ForEach specs nest freely. Inner iterations can reference both inner and outer bindings:

const spec = forEach(
literal(['math', 'science']),
'subject',
forEach(
literal(['Alice', 'Bob']),
'student',
object({
subject: variableRef('subject', { extract: 'string' }),
student: variableRef('student', { extract: 'string' }),
})
)
)
// Produces: [
// [{ subject: 'math', student: 'Alice' }, { subject: 'math', student: 'Bob' }],
// [{ subject: 'science', student: 'Alice' }, { subject: 'science', student: 'Bob' }],
// ]

The XLSX package provides scope-aware predicates that resolve variable values at runtime:

import { cell } from '@origints/xlsx'
// Match cells whose value equals a bound variable
cell.equalsRef('company')
cell.equalsRef('item', ['name']) // navigate into the bound value first
// Match cells whose string value contains a bound variable
cell.containsRef('keyword')
cell.containsRef('search', undefined, true) // case-sensitive
// Match cells whose string value starts with a bound variable
cell.startsWithRef('company')
cell.startsWithRef('item', ['name'], true) // case-sensitive
import { forEach, literal } from '@origints/core'
import { XlsxSpecBuilder, cell } from '@origints/xlsx'
const $ = XlsxSpecBuilder.root()
const spec = forEach(
literal(['Acme Corp', 'Globex Inc']),
'company',
$.firstSheet().find(cell.equalsRef('company')).string()
)
// For each company name, finds the cell containing that value
const header = $.firstSheet().find(cell.equals('Company'))
const hasData = rowCol(0, cell.isNotEmpty())
const spec = forEach(
literal(['Revenue', 'Employees']),
'colName',
header
.down()
.eachSlice('down', hasData, row =>
row.colWhere(header, cell.equalsRef('colName')).value()
)
)
// First iteration: extracts all values from the Revenue column
// Second iteration: extracts all values from the Employees column

Note: colWhere/rowWhere caches are automatically bypassed when the predicate contains equalsRef, containsRef, or startsWithRef, since the resolved value changes per forEach iteration.

The CSV package provides the same scope-aware predicates:

import { col } from '@origints/csv'
col.equalsRef('company') // exact match against bound variable
col.containsRef('keyword') // substring match (case-insensitive)
col.startsWithRef('prefix', ['path']) // prefix match with path navigation
import { forEach, literal } from '@origints/core'
import { col, CsvSpecBuilder } from '@origints/csv'
const $ = CsvSpecBuilder.root()
const spec = forEach(
literal(['Revenue', 'Employees']),
'colName',
$.rows(row => row.colWhere(col.equalsRef('colName')).string())
)
// First iteration: extracts all values from the Revenue column
// Second iteration: extracts all values from the Employees column

Note: colWhere caches are automatically bypassed for scope-dependent predicates.

import { JsonSchema } from '@origints/core'
// ForEachSpec → { type: 'array', items: <schema of body> }
// VariableRefSpec with extract: 'string' → { type: 'string' }
// VariableRefSpec without extract → {}
  • Static: ForEach uses [*] wildcard for array positions
  • Runtime: Concrete indices from actual iteration count
  • VariableRef: Yields $varName in the input path
  • Source failure: If the source spec fails, the forEach fails immediately
  • Non-array source: If the source produces a non-array value, fails with 'type' kind
  • Body failure: If the body fails for any iteration, the forEach fails with the iteration index in the error path
  • Missing variable: Referencing an unbound variable fails with 'missing' kind
  • No scope: Using variableRef outside a forEach fails with 'missing' kind