Type-Safe Stores
Define stores with Zod schemas. Full TypeScript inference with validation on encode and decode.
Content Deduplication
Automatic SHA-256 hashing. Store the same data twice, pay for one copy.
Lineage Tracking
Link snapshots with parent references. Build complete data provenance graphs.
Observations
Extract structured facts from documents with automatic staleness detection.
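To make the deduplication and lineage ideas concrete, here is a minimal sketch of a content-addressed store. It is not the library's implementation: `DedupStore`, `SnapshotMeta`, `blob_count`, and `lineage` are hypothetical names invented for illustration, and it assumes JSON encoding with integer version numbers. The only real API used is `createHash` from Node's `node:crypto` module.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical shapes for illustration only; not the @f0rbit/corpus internals.
type SnapshotMeta = { version: number; content_hash: string; parents: number[] }

class DedupStore {
  private blobs = new Map<string, string>() // content_hash -> encoded data
  private snapshots: SnapshotMeta[] = []

  put(value: unknown, parents: number[] = []): SnapshotMeta {
    // Note: JSON.stringify is key-order sensitive, so equal objects with
    // different key order would hash differently in this sketch.
    const encoded = JSON.stringify(value)
    const content_hash = createHash('sha256').update(encoded).digest('hex')
    // Only store the bytes the first time this hash is seen.
    if (!this.blobs.has(content_hash)) this.blobs.set(content_hash, encoded)
    const meta: SnapshotMeta = {
      version: this.snapshots.length + 1,
      content_hash,
      parents,
    }
    this.snapshots.push(meta)
    return meta
  }

  blob_count(): number {
    return this.blobs.size
  }

  // Follow parent references transitively; returns the version itself plus all ancestors.
  lineage(version: number): number[] {
    const seen = new Set<number>()
    const stack = [version]
    while (stack.length > 0) {
      const current = stack.pop()!
      if (seen.has(current)) continue
      seen.add(current)
      const meta = this.snapshots[current - 1]
      if (meta) stack.push(...meta.parents)
    }
    return Array.from(seen).sort((x, y) => x - y)
  }
}

const store = new DedupStore()
const a = store.put({ title: 'Hello' })
const b = store.put({ title: 'Hello' }, [a.version]) // same content, new snapshot
const c = store.put({ title: 'Hello, again' }, [b.version])
```

In this sketch `a` and `b` are distinct snapshots that share one stored blob, and `store.lineage(c.version)` walks the parent links back to the first snapshot.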

    import { z } from 'zod'
    import { create_corpus, create_memory_backend, define_store, json_codec } from '@f0rbit/corpus'

    const ArticleSchema = z.object({
      title: z.string(),
      body: z.string(),
      tags: z.array(z.string()),
    })

    const corpus = create_corpus()
      .with_backend(create_memory_backend())
      .with_store(define_store('articles', json_codec(ArticleSchema)))
      .build()

    // Store a versioned snapshot
    const result = await corpus.stores.articles.put({
      title: 'Hello World',
      body: 'My first article.',
      tags: ['intro', 'getting-started'],
    })

    if (result.ok) {
      console.log(`Version: ${result.value.version}`)
      console.log(`Hash: ${result.value.content_hash}`)
    }

No thrown exceptions. Every operation returns a Result<T, CorpusError> that you can pattern match on. Predictable, type-safe error handling.
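As a rough illustration of what pattern matching on a result can look like, the sketch below defines its own stand-in types. Only the `ok` and `value` fields appear in the example above; the `error` field name and the `kind` variants (`not_found`, `validation_error`) are assumptions made for this sketch, so check the library's exported types for the real shape.

```typescript
// Stand-in types for illustration; the real Result and CorpusError come from @f0rbit/corpus.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E }

// Assumed error variants, not the library's actual list.
type CorpusError =
  | { kind: 'not_found'; store_id: string; version: string }
  | { kind: 'validation_error'; message: string }

type PutResult = Result<{ version: string }, CorpusError>

function summarize(result: PutResult): string {
  // Checking `ok` narrows the union, so `value` is only reachable on success.
  if (result.ok) return `stored version ${result.value.version}`

  // The switch covers every assumed variant, so the compiler can flag a missing case.
  switch (result.error.kind) {
    case 'not_found':
      return `no snapshot ${result.error.version} in store ${result.error.store_id}`
    case 'validation_error':
      return `invalid data: ${result.error.message}`
  }
}

const okMessage = summarize({ ok: true, value: { version: 'v1' } })
const errMessage = summarize({
  ok: false,
  error: { kind: 'validation_error', message: 'title is required' },
})
```

Because failures are ordinary values, the caller decides at each call site whether to recover, log, or propagate, without a surrounding try/catch.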
Builder pattern for setup, pure functions for operations. No hidden state, no surprises. Compose backends, stores, and observations.
Start with the memory backend for development, file backend for persistence, then deploy to Cloudflare D1+R2 for production. Same API everywhere.
Memory for testing, filesystem for local dev, Cloudflare for production. Combine them with the layered backend for caching and replication.
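The idea behind layering can be sketched with a simplified, synchronous key-value interface. Everything here is hypothetical (`KeyValueBackend`, `MapBackend`, and `layered` are not the library's API, which has its own layered backend): reads try each layer in order and copy a hit into the faster layers that missed, while writes go to every layer.

```typescript
// Hypothetical minimal interface for illustration; not the @f0rbit/corpus backend contract.
interface KeyValueBackend {
  get(key: string): string | undefined
  put(key: string, value: string): void
}

class MapBackend implements KeyValueBackend {
  private data = new Map<string, string>()
  get(key: string): string | undefined {
    return this.data.get(key)
  }
  put(key: string, value: string): void {
    this.data.set(key, value)
  }
}

function layered(layers: KeyValueBackend[]): KeyValueBackend {
  return {
    get(key: string): string | undefined {
      const missed: KeyValueBackend[] = []
      for (const layer of layers) {
        const value = layer.get(key)
        if (value !== undefined) {
          // Backfill the earlier (faster) layers so the next read hits them.
          for (const earlier of missed) earlier.put(key, value)
          return value
        }
        missed.push(layer)
      }
      return undefined
    },
    put(key: string, value: string): void {
      // Write-through: every layer receives the value (replication).
      for (const layer of layers) layer.put(key, value)
    },
  }
}

const cache = new MapBackend()
const durable = new MapBackend()
durable.put('articles/1', '{"title":"Hello World"}')

const backend = layered([cache, durable])
const first = backend.get('articles/1') // found in `durable`, then copied into `cache`
const cached = cache.get('articles/1')
```

Ordering the layers from fastest to slowest gives read-through caching; the same write-through behavior is what makes a layered setup usable for replication or for migrating between backends.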
When you need to extract structured data from your documents - sentiment analysis, named entities, key facts - Observations provide first-class support with automatic provenance tracking.

    import { define_observation_type } from '@f0rbit/corpus'

    const SentimentObservation = define_observation_type('sentiment', z.object({
      subject: z.string(),
      score: z.number().min(-1).max(1),
    }))

    const corpus = create_corpus()
      .with_backend(backend)
      .with_store(define_store('articles', json_codec(ArticleSchema)))
      .with_observations([SentimentObservation])
      .build()

    // Store observation pointing back to source document
    await corpus.observations.put(SentimentObservation, {
      source: { store_id: 'articles', version: article.version },
      content: { subject: 'product launch', score: 0.8 },
    })

    // Query observations - stale ones automatically filtered
    for await (const obs of corpus.observations.query({ type: 'sentiment' })) {
      console.log(obs.content.subject, obs.content.score)
    }

| Backend | Storage | Best For |
|---|---|---|
| Memory | In-memory | Testing, prototyping |
| File | Local disk | Development, desktop apps |
| Cloudflare | D1 + R2 | Production, global distribution |
| Layered | Composite | Caching, replication, migration |