File System
    function create_file_backend(config: FileBackendConfig): Backend

The file backend provides durable local storage using the file system. Metadata is stored as JSON, and binary data is stored in separate files with content-addressable naming for automatic deduplication.
Why Use File Backend?
- Persistence: Data survives process restarts
- Inspectable: Human-readable JSON metadata, standard binary files
- No dependencies: No database server or external services needed
- Portable: Just copy the directory to move your data
Basic Usage
    import { create_corpus, create_file_backend, define_store, json_codec } from '@f0rbit/corpus'
    import { z } from 'zod'

    const DocumentSchema = z.object({
      title: z.string(),
      content: z.string(),
      updatedAt: z.string(),
    })

    const corpus = create_corpus()
      .with_backend(create_file_backend({ base_path: './data' }))
      .with_store(define_store('documents', json_codec(DocumentSchema)))
      .build()

    // Data persists across restarts
    await corpus.stores.documents.put({
      title: 'My Document',
      content: 'Hello, world!',
      updatedAt: new Date().toISOString(),
    })

Configuration
    type FileBackendConfig = {
      base_path: string
      on_event?: EventHandler
    }

| Option | Type | Description |
|---|---|---|
| base_path | string | Root directory for all storage (created if it doesn't exist) |
| on_event | (event: CorpusEvent) => void | Optional callback for storage events |
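
Because on_event is a plain callback, a handler can be built and exercised on its own before it is wired into the backend. The sketch below tallies events by their type. The StorageEvent type and the createEventCounter helper are hypothetical stand-ins, not part of the library; the only field assumed on an event is `type`, as used in the Development Server Example on this page.

```typescript
// Stand-in for the library's CorpusEvent (an assumption): only `type` is relied on.
type StorageEvent = { type: string; [key: string]: unknown }

// Hypothetical helper: returns a handler plus the counts it maintains.
function createEventCounter() {
  const counts = new Map<string, number>()
  const handler = (event: StorageEvent): void => {
    counts.set(event.type, (counts.get(event.type) ?? 0) + 1)
  }
  return { handler, counts }
}

// Usage: feed the handler two illustrative events and read the tally.
const counter = createEventCounter()
counter.handler({ type: 'snapshot_put', store_id: 'documents', version: 'AZx5kQ' })
counter.handler({ type: 'snapshot_put', store_id: 'documents', version: 'AZx5kQ' })
console.log(counter.counts.get('snapshot_put'))
```

If the real event type is structurally compatible, a handler like this could presumably be passed as `on_event: counter.handler`; check the exported CorpusEvent type before relying on that.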
Directory Structure
The file backend organizes data predictably:
    base_path/
    ├── documents/
    │   └── _meta.json              # All metadata for 'documents' store
    ├── images/
    │   └── _meta.json              # All metadata for 'images' store
    └── _data/
        ├── documents_a1b2c3.bin    # Binary data (named by hash)
        ├── documents_d4e5f6.bin
        └── images_789abc.bin

Metadata Format
The _meta.json file contains an array of [version, metadata] pairs:
    [
      ["AZx5kQ", {
        "store_id": "documents",
        "version": "AZx5kQ",
        "content_hash": "a1b2c3...",
        "created_at": "2024-01-15T10:30:00Z",
        "size_bytes": 1234,
        "parents": []
      }]
    ]

Data Files
Binary data is stored in the _data directory with content-addressable names. If two versions have identical content, they share the same data file (deduplication).
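
To make the deduplication idea concrete, here is a minimal sketch of content-addressable naming, not the library's actual implementation. The helper names (dataFileName, writeDeduplicated), the choice of SHA-256, and the 12-character hash prefix are all assumptions made for illustration; only the `<store>_<hash>.bin` naming pattern comes from the directory listing above.

```typescript
import { createHash } from 'node:crypto'
import { existsSync, mkdirSync, mkdtempSync, readdirSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Hypothetical helper: derive a file name from the store id and the content.
// SHA-256 and the 12-character prefix are assumptions, not the library's scheme.
function dataFileName(storeId: string, bytes: Uint8Array): string {
  const hash = createHash('sha256').update(bytes).digest('hex')
  return `${storeId}_${hash.slice(0, 12)}.bin`
}

// Write the bytes only if no file with the same content-derived name exists.
// Returns the file name so a metadata entry could reference it.
function writeDeduplicated(dataDir: string, storeId: string, bytes: Uint8Array): string {
  mkdirSync(dataDir, { recursive: true })
  const name = dataFileName(storeId, bytes)
  const path = join(dataDir, name)
  if (!existsSync(path)) {
    writeFileSync(path, bytes)
  }
  return name
}

// Usage: identical content maps to one file, different content to another.
const dataDir = join(mkdtempSync(join(tmpdir(), 'corpus-demo-')), '_data')
const encoder = new TextEncoder()
const first = writeDeduplicated(dataDir, 'documents', encoder.encode('Hello, world!'))
const second = writeDeduplicated(dataDir, 'documents', encoder.encode('Hello, world!'))
const third = writeDeduplicated(dataDir, 'documents', encoder.encode('Something else'))
const fileCount = readdirSync(dataDir).length
console.log(first === second, first === third, fileCount) // true false 2
```

Because the name is a function of the content, a second write of the same bytes is a no-op, which is where the storage saving comes from.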
CLI Tool Example
The file backend suits local CLI tools that need persistent configuration or data:
    #!/usr/bin/env bun

    import { create_corpus, create_file_backend, define_store, json_codec } from '@f0rbit/corpus'
    import { z } from 'zod'

    const ConfigSchema = z.object({
      theme: z.enum(['light', 'dark']),
      recentFiles: z.array(z.string()),
      lastOpened: z.string().optional(),
    })

    // Store config in user's home directory
    const configPath = `${process.env.HOME}/.myapp`

    const corpus = create_corpus()
      .with_backend(create_file_backend({ base_path: configPath }))
      .with_store(define_store('config', json_codec(ConfigSchema)))
      .build()

    // Load existing config or use defaults
    async function loadConfig() {
      const result = await corpus.stores.config.get_latest()
      if (result.ok) {
        return result.value.data
      }
      return { theme: 'dark' as const, recentFiles: [] }
    }

    // Save updated config
    async function saveConfig(config: z.infer<typeof ConfigSchema>) {
      await corpus.stores.config.put({
        ...config,
        lastOpened: new Date().toISOString(),
      })
    }

    // Usage
    const config = await loadConfig()
    console.log(`Theme: ${config.theme}`)

Development Server Example
Use file backend during development for data that persists across restarts:
    import { create_corpus, create_file_backend, define_store, json_codec } from '@f0rbit/corpus'

    // SessionSchema and CacheSchema are assumed to be zod schemas defined elsewhere in your app

    const corpus = create_corpus()
      .with_backend(create_file_backend({
        base_path: './dev-data',
        on_event: (e) => {
          if (e.type === 'snapshot_put') {
            console.log(`[corpus] Saved ${e.store_id}@${e.version}`)
          }
        }
      }))
      .with_store(define_store('sessions', json_codec(SessionSchema)))
      .with_store(define_store('cache', json_codec(CacheSchema)))
      .build()

    // Data survives `bun --watch` restarts

Error Handling
File operations can fail due to permissions, disk space, or I/O errors:
    const result = await corpus.stores.documents.put(data)

    if (!result.ok) {
      switch (result.error.kind) {
        case 'storage_error':
          console.error(`I/O error during ${result.error.operation}:`, result.error.cause)
          break
        case 'encode_error':
          console.error('Failed to serialize data:', result.error.cause)
          break
      }
    }

When to Use
| Scenario | Recommended |
|---|---|
| Local development | ✅ Yes |
| CLI tools | ✅ Yes |
| Desktop applications | ✅ Yes |
| Single-server deployment | ✅ Yes |
| Multi-server deployment | ❌ No (no replication) |
| Serverless/Edge | ❌ No (use Cloudflare) |
| High-write workloads | ⚠️ Use with caution (I/O bound) |
Requirements
Performance Considerations
- Metadata: All metadata for a store is in one JSON file. Very large stores (100K+ versions) may see slower list operations.
- Data: Binary data is stored in separate files, so retrieval is fast regardless of total data size.
- Writes: Each write updates the metadata JSON file, which requires reading and rewriting the entire file.
For high-throughput scenarios, consider the Layered backend with memory caching.
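
The write cost described in the list above can be illustrated with a small sketch of a read-modify-write cycle on a `_meta.json` file. This is not the library's code: appendMeta is a hypothetical helper, and the SnapshotMeta type simply mirrors the fields shown in the Metadata Format example, with `parents` typed loosely because its element type is not shown there.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Field names follow the metadata example on this page; `parents` is left
// as unknown[] because its element type is not shown there.
type SnapshotMeta = {
  store_id: string
  version: string
  content_hash: string
  created_at: string
  size_bytes: number
  parents: unknown[]
}
type MetaEntry = [string, SnapshotMeta]

// Hypothetical helper: add one entry by reading, parsing, and rewriting the
// whole file, so the work per write grows with the number of stored versions.
function appendMeta(metaPath: string, meta: SnapshotMeta): number {
  const entries: MetaEntry[] = existsSync(metaPath)
    ? JSON.parse(readFileSync(metaPath, 'utf8'))
    : []
  entries.push([meta.version, meta])
  writeFileSync(metaPath, JSON.stringify(entries, null, 2))
  return entries.length
}

// Usage: three writes against a throwaway directory.
const metaPath = join(mkdtempSync(join(tmpdir(), 'corpus-meta-')), '_meta.json')
let entryCount = 0
for (const version of ['v1', 'v2', 'v3']) {
  entryCount = appendMeta(metaPath, {
    store_id: 'documents',
    version,
    content_hash: `hash-of-${version}`, // placeholder value, not a real hash
    created_at: new Date().toISOString(),
    size_bytes: 1234,
    parents: [],
  })
}

// The [version, metadata] pairs load directly into a Map for lookups.
const index = new Map<string, SnapshotMeta>(
  JSON.parse(readFileSync(metaPath, 'utf8')) as MetaEntry[]
)
console.log(entryCount, index.get('v2')?.store_id) // 3 documents
```

Each call parses and serializes every existing entry, which is why stores with very many versions see slower writes and list operations.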
See Also
- Memory - For testing
- Cloudflare - For production deployment
- Layered - Add caching to file backend