
TypeScript · Functional · MIT License

corpus

The versioned data store for TypeScript. Track lineage, deduplicate content, extract structured observations.

4 Backends
Zod Schemas
Zero Exceptions

Core Features

Type-Safe Stores

Define stores with Zod schemas. Full TypeScript inference with validation on encode and decode.

Content Deduplication

Automatic SHA-256 hashing. Store the same data twice, pay for one copy.
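
As a rough illustration of content-addressed deduplication, the sketch below hashes a serialized payload with SHA-256 and stores it only when the hash is new. The `blobs` map and the `content_hash` and `put_blob` helpers are hypothetical names for this example, not part of the corpus API, and the sketch assumes a Node-compatible runtime that provides `node:crypto`.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical in-memory blob store keyed by content hash (not the corpus API).
const blobs = new Map<string, string>()

// Hex-encoded SHA-256 digest of a serialized payload.
function content_hash(serialized: string): string {
  return createHash('sha256').update(serialized).digest('hex')
}

// Stores the payload only if its hash is new; returns the hash either way.
function put_blob(value: unknown): string {
  const serialized = JSON.stringify(value)
  const hash = content_hash(serialized)
  if (!blobs.has(hash)) blobs.set(hash, serialized)
  return hash
}

// Writing identical data twice yields one hash and one stored copy.
const first = put_blob({ title: 'Hello World' })
const second = put_blob({ title: 'Hello World' })
```

Note that `JSON.stringify` is sensitive to key order, so two objects with the same fields in a different order would hash differently in this sketch.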

Lineage Tracking

Link snapshots with parent references. Build complete data provenance graphs.
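
To make the idea of a provenance graph concrete, here is a minimal sketch that walks parent references to collect every ancestor of a snapshot. The `Snapshot` shape, the `parents` field, and the `ancestors` helper are assumptions made for illustration; the actual corpus metadata may be structured differently.

```typescript
// Hypothetical snapshot metadata; field names are illustrative, not the corpus types.
type Snapshot = {
  version: string
  parents: string[] // versions this snapshot was derived from
}

// Walks parent references breadth-first and returns each ancestor version once.
function ancestors(index: Map<string, Snapshot>, version: string): string[] {
  const seen = new Set<string>()
  const queue: string[] = [...(index.get(version)?.parents ?? [])]
  while (queue.length > 0) {
    const current = queue.shift()!
    if (seen.has(current)) continue
    seen.add(current)
    queue.push(...(index.get(current)?.parents ?? []))
  }
  return Array.from(seen)
}

const index = new Map<string, Snapshot>([
  ['raw-1', { version: 'raw-1', parents: [] }],
  ['clean-1', { version: 'clean-1', parents: ['raw-1'] }],
  ['summary-1', { version: 'summary-1', parents: ['clean-1'] }],
])

// Ancestors of the summary: the cleaned snapshot, then the raw one.
const provenance = ancestors(index, 'summary-1')
```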

Observations

Extract structured facts from documents with automatic staleness detection and provenance tracking.
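
One plausible way to think about staleness detection, sketched under assumptions: each observation remembers the snapshot version it was extracted from, and it counts as stale once the source store has moved on to a different version. The `Observation` shape and the `is_stale` helper are hypothetical and are not taken from the corpus API.

```typescript
// Hypothetical shapes; the actual observation model in corpus may differ.
type Observation<T> = {
  fact: T
  source_store: string
  source_version: string // snapshot version the fact was extracted from
}

// Treats an observation as stale when its source store's latest version
// no longer matches the version the fact was extracted from.
function is_stale<T>(obs: Observation<T>, latest: Map<string, string>): boolean {
  return latest.get(obs.source_store) !== obs.source_version
}

const latest_versions = new Map<string, string>([['articles', 'v2']])

const old_fact: Observation<string> = {
  fact: 'mentions TypeScript',
  source_store: 'articles',
  source_version: 'v1',
}
const fresh_fact: Observation<string> = { ...old_fact, source_version: 'v2' }
```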

Quick Start

Install the package
bun add @f0rbit/corpus zod
Define a schema and create a corpus
import { z } from 'zod'
import { create_corpus, create_memory_backend, define_store, json_codec } from '@f0rbit/corpus'

const ArticleSchema = z.object({
  title: z.string(),
  body: z.string(),
  tags: z.array(z.string()),
})

const corpus = create_corpus()
  .with_backend(create_memory_backend())
  .with_store(define_store('articles', json_codec(ArticleSchema)))
  .build()
Store versioned data
const result = await corpus.stores.articles.put({
  title: 'Hello World',
  body: 'My first article.',
  tags: ['intro', 'getting-started'],
})

if (result.ok) {
  console.log(`Version: ${result.value.version}`)
  console.log(`Hash: ${result.value.content_hash}`)
}

Why Corpus?

Errors as Values

No thrown exceptions. Every operation returns a Result<T, CorpusError> that you can pattern match on. Predictable, type-safe error handling.
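
The errors-as-values style can be sketched with a small discriminated union. The `ok` and `value` fields match the Quick Start above; the `error` field, the `SketchError` variants, and the `parse_tags` and `describe` helpers are stand-ins invented for this example, so the real `Result` and `CorpusError` definitions may differ.

```typescript
// Minimal stand-ins for illustration; the real Result and CorpusError types may differ.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E }
type SketchError =
  | { kind: 'not_found'; key: string }
  | { kind: 'decode_error'; message: string }

// Parses a JSON array of strings, reporting failure as a value instead of throwing.
function parse_tags(raw: string): Result<string[], SketchError> {
  try {
    const parsed: unknown = JSON.parse(raw)
    if (Array.isArray(parsed) && parsed.every((t) => typeof t === 'string')) {
      return { ok: true, value: parsed as string[] }
    }
    return { ok: false, error: { kind: 'decode_error', message: 'expected an array of strings' } }
  } catch (e) {
    return { ok: false, error: { kind: 'decode_error', message: String(e) } }
  }
}

// Callers branch on `ok`, then on the error kind; the compiler checks every case.
function describe(result: Result<string[], SketchError>): string {
  if (result.ok) return `ok: ${result.value.length} tags`
  const error = result.error
  switch (error.kind) {
    case 'not_found':
      return `missing: ${error.key}`
    case 'decode_error':
      return `invalid: ${error.message}`
  }
}
```

Because the failure path is part of the return type, a caller cannot read `value` without first checking `ok`.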

Functional Composition

Builder pattern for setup, pure functions for operations. No hidden state, no surprises. Compose backends, stores, and observations.

Local-First Ready

Start with the memory backend for development, move to the file backend for persistence, and deploy on Cloudflare for production. The API is the same everywhere.

Pluggable Backends

Memory for testing, filesystem for local dev, Cloudflare for production. Combine them with the layered backend for caching and replication.
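
The general idea behind layering can be sketched as a read-through cache: reads try each layer in order and backfill the faster layers on a hit, while writes go to every layer. The `BlobBackend` interface and the `memory_backend` and `layered` functions below are hypothetical simplifications for illustration and do not reflect the actual corpus backend interface or its layered backend.

```typescript
// Hypothetical minimal backend interface; the real corpus backends expose a richer API.
interface BlobBackend {
  get(key: string): Promise<string | undefined>
  put(key: string, value: string): Promise<void>
}

// Simple Map-backed layer used here as both the "fast" and the "slow" backend.
function memory_backend(): BlobBackend {
  const data = new Map<string, string>()
  return {
    get: async (key) => data.get(key),
    put: async (key, value) => {
      data.set(key, value)
    },
  }
}

// Reads fall through the layers in order and copy a hit into earlier layers;
// writes are replicated to all layers.
function layered(layers: BlobBackend[]): BlobBackend {
  return {
    async get(key) {
      for (let i = 0; i < layers.length; i++) {
        const value = await layers[i].get(key)
        if (value !== undefined) {
          await Promise.all(layers.slice(0, i).map((l) => l.put(key, value)))
          return value
        }
      }
      return undefined
    },
    async put(key, value) {
      await Promise.all(layers.map((l) => l.put(key, value)))
    },
  }
}
```

Ordering the layers from fastest to slowest gives caching; writing to all of them gives simple replication.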

Backend Comparison

Choose the right backend for your use case

Memory

Storage: In-memory
Best for: Testing, prototyping

File

Storage: Local disk
Best for: Development, desktop apps

Cloudflare

Storage: D1 + R2
Best for: Production, global distribution

Layered

Storage: Composite
Best for: Caching, replication, migration

Next Steps