
@origints/mammoth

DOCX to semantic HTML or plain text conversion with custom style mapping and configurable image handling.

npm install @origints/mammoth @origints/core
  • Convert DOCX to semantic HTML
  • Convert DOCX to plain text
  • Custom style mapping for headings, lists, and more
  • Configurable image handling
  • Conversion warnings and messages
// Basic usage: load a DOCX file, convert it to HTML, and emit the HTML string
import { Planner, loadFile, run } from '@origints/core'
import { docxToHtml } from '@origints/mammoth'

const plan = new Planner()
  .in(loadFile('document.docx'))
  .mapIn(docxToHtml())
  .emit((out, $) => out
    .add('html', $.get('html').string())
  )
  .compile()

// readFile and registry are supplied by the caller; they are not defined in this snippet
const result = await run(plan, { readFile, registry })
// result.value: { html: '<h1>Title</h1><p>Content...</p>' }
// Custom style mapping: map Word paragraph styles to HTML elements and prefix generated IDs
const plan = new Planner()
  .in(loadFile('report.docx'))
  .mapIn(docxToHtml({
    styleMap: [
      "p[style-name='Title'] => h1.document-title",
      "p[style-name='Heading 1'] => h1",
      "p[style-name='Heading 2'] => h2",
      "p[style-name='Quote'] => blockquote",
    ],
    idPrefix: 'doc-',
  }))
  .emit((out, $) => out
    .add('content', $.get('html').string())
  )
  .compile()
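The style map rules above all follow the same `p[style-name='...'] => target` pattern, so they can be generated from a lookup table instead of being written by hand. The sketch below is one way to do that; `buildParagraphStyleMap` is a hypothetical helper, not an export of `@origints/mammoth`, and it assumes paragraph rules are the only ones needed. It rejects style names that contain a single quote rather than guessing at an escaping rule.

```typescript
// Hypothetical helper (not part of @origints/mammoth): builds paragraph
// style-map rules in the "p[style-name='Name'] => target" form shown above.
type ParagraphStyleTargets = Record<string, string>

function buildParagraphStyleMap(targets: ParagraphStyleTargets): string[] {
  return Object.entries(targets).map(([styleName, target]) => {
    // Escaping rules are not covered here, so refuse names that would break the quoting.
    if (styleName.includes("'")) {
      throw new Error(`Style name must not contain a single quote: ${styleName}`)
    }
    return `p[style-name='${styleName}'] => ${target}`
  })
}

// Same four rules as the hand-written example, in insertion order.
const styleMap = buildParagraphStyleMap({
  Title: 'h1.document-title',
  'Heading 1': 'h1',
  'Heading 2': 'h2',
  Quote: 'blockquote',
})
```

The resulting array could then be passed as `docxToHtml({ styleMap })`, assuming the option accepts a plain string array as in the example above.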
// Plain text conversion: emit the document text without markup
import { docxToText } from '@origints/mammoth'

const plan = new Planner()
  .in(loadFile('document.docx'))
  .mapIn(docxToText())
  .emit((out, $) => out
    .add('text', $.get('text').string())
  )
  .compile()
// Image handling: omit images from the generated HTML
const plan = new Planner()
  .in(loadFile('document.docx'))
  .mapIn(docxToHtml({ imageHandling: 'omit' }))
  .emit((out, $) => out.add('html', $.get('html').string()))
  .compile()
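// Standalone usage: call the conversion implementations directly, without a Planner
import * as fs from 'fs'
import { docxToHtmlImpl, docxToTextImpl } from '@origints/mammoth'

const buffer = fs.readFileSync('document.docx')

// HTML conversion returns the markup plus any conversion messages
const htmlResult = await docxToHtmlImpl.execute(buffer)
console.log(htmlResult.html)
for (const msg of htmlResult.messages) {
  console.warn(msg.message)
}

// Text conversion returns the plain text
const textResult = await docxToTextImpl.execute(buffer)
console.log(textResult.text)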
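When a document produces many conversion messages, it can help to collapse identical ones before logging. The sketch below counts occurrences of each message text. It relies only on the `message` field used in the loop above; `ConversionMessage`, `countMessages`, and the sample data are hypothetical and not part of the package.

```typescript
// Assumed minimal shape: only the `message` field is taken from the example above.
interface ConversionMessage {
  message: string
}

// Hypothetical helper: counts how often each distinct message text occurs.
function countMessages(messages: ConversionMessage[]): Map<string, number> {
  const counts = new Map<string, number>()
  for (const { message } of messages) {
    counts.set(message, (counts.get(message) ?? 0) + 1)
  }
  return counts
}

// Made-up sample data for illustration, not real converter output.
const sampleMessages: ConversionMessage[] = [
  { message: 'sample warning A' },
  { message: 'sample warning B' },
  { message: 'sample warning A' },
]

const messageCounts = countMessages(sampleMessages)
messageCounts.forEach((count, text) => {
  console.warn(`${count}x ${text}`)
})
```

In the standalone example, `htmlResult.messages` would take the place of `sampleMessages`, assuming it is an array of objects with a string `message` field.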
Export                               Description
docxToHtml(options?)                 Transform AST for HTML conversion
docxToText(options?)                 Transform AST for text conversion
docxToHtmlImpl                       Async HTML conversion implementation
docxToTextImpl                       Async text conversion implementation
registerMammothTransforms(registry)  Register mammoth transforms

License: MIT