Getting Started with nori-sstable¶
Build your first SSTable in 5 minutes.
Installation¶
Add nori-sstable to your Cargo.toml:
{: .note }
nori-sstable requires
tokiofor async I/O. Make sure to include thefullfeature set or at minimumfs,io-util, andrt.
Your First SSTable¶
Step 1: Create a Builder¶
use nori_sstable::{SSTableBuilder, SSTableConfig};
use std::path::PathBuf;
#[tokio::main]
async fn main() -> nori_sstable::Result<()> {
// Configure the SSTable
let config = SSTableConfig {
path: PathBuf::from("my_first.sst"),
estimated_entries: 1000,
..Default::default()
};
// Create the builder
let mut builder = SSTableBuilder::new(config).await?;
Ok(())
}
Step 2: Add Entries¶
Entries must be added in sorted order by key:
use nori_sstable::Entry;
// Add key-value pairs
builder.add(&Entry::put("apple", "red fruit")).await?;
builder.add(&Entry::put("banana", "yellow fruit")).await?;
builder.add(&Entry::put("cherry", "red fruit")).await?;
// Add a tombstone (deletion marker)
builder.add(&Entry::delete("durian")).await?;
{: .warning }
Important: Keys must be added in sorted order. If you try to add keys out of order, you'll get a
KeysNotSortederror.
Step 3: Finish Building¶
// Finalize the SSTable (writes index, bloom filter, footer)
let metadata = builder.finish().await?;
println!(" Created SSTable:");
println!(" Path: {:?}", metadata.path);
println!(" Entries: {}", metadata.entry_count);
println!(" Size: {} bytes", metadata.file_size);
println!(" Blocks: {}", metadata.block_count);
Complete Example¶
use nori_sstable::{Entry, SSTableBuilder, SSTableConfig};
use std::path::PathBuf;
#[tokio::main]
async fn main() -> nori_sstable::Result<()> {
// Create builder
let config = SSTableConfig {
path: PathBuf::from("my_first.sst"),
estimated_entries: 1000,
..Default::default()
};
let mut builder = SSTableBuilder::new(config).await?;
// Add entries (sorted order!)
builder.add(&Entry::put("apple", "red fruit")).await?;
builder.add(&Entry::put("banana", "yellow fruit")).await?;
builder.add(&Entry::put("cherry", "red fruit")).await?;
// Finish
let metadata = builder.finish().await?;
println!(" Created SSTable with {} entries", metadata.entry_count);
Ok(())
}
Reading from an SSTable¶
Step 1: Open a Reader¶
use nori_sstable::SSTableReader;
use std::sync::Arc;
#[tokio::main]
async fn main() -> nori_sstable::Result<()> {
// Open the SSTable file
let reader = Arc::new(
SSTableReader::open(PathBuf::from("my_first.sst")).await?
);
Ok(())
}
{: .note }
We wrap the reader in
Arcbecause it's designed to be shared across threads. The reader is immutable and thread-safe.
Step 2: Point Lookups¶
// Look up a single key
if let Some(entry) = reader.get(b"banana").await? {
println!("Found: key={:?}, value={:?}",
String::from_utf8_lossy(&entry.key),
String::from_utf8_lossy(&entry.value)
);
} else {
println!("Key not found");
}
What happens: 1. Bloom filter check (~67ns) - might the key exist? 2. Index lookup (O(log B)) - which block contains the key? 3. Block read (cache or disk) - read the 4KB block 4. Binary search within block - find the exact entry
Step 3: Range Scans¶
// Iterate over all entries
let mut iter = reader.clone().iter();
while let Some(entry) = iter.try_next().await? {
println!("{:?} = {:?}",
String::from_utf8_lossy(&entry.key),
String::from_utf8_lossy(&entry.value)
);
}
Step 4: Range with Bounds¶
// Scan from "apple" to "cherry" (exclusive end)
let mut range_iter = reader.iter_range(
Bytes::from("apple"),
Bytes::from("cherry")
);
while let Some(entry) = range_iter.try_next().await? {
// Only entries with apple <= key < cherry
println!("{:?}", String::from_utf8_lossy(&entry.key));
}
Complete Read Example¶
use nori_sstable::SSTableReader;
use std::sync::Arc;
use std::path::PathBuf;
#[tokio::main]
async fn main() -> nori_sstable::Result<()> {
// Open reader
let reader = Arc::new(SSTableReader::open(PathBuf::from("my_first.sst")).await?);
// Point lookup
if let Some(entry) = reader.get(b"banana").await? {
println!("Found banana: {:?}", String::from_utf8_lossy(&entry.value));
}
// Full scan
let mut iter = reader.iter();
while let Some(entry) = iter.try_next().await? {
println!("{:?} = {:?}",
String::from_utf8_lossy(&entry.key),
String::from_utf8_lossy(&entry.value)
);
}
Ok(())
}
Enabling Compression¶
Why Compress?¶
- Save storage: 2-3x size reduction (LZ4) or 3-5x (Zstd)
- Fast decompression: LZ4 decompresses at 3.9 GB/s
- Cache-friendly: Decompressed blocks are cached
Using LZ4 Compression (Recommended)¶
use nori_sstable::{SSTableConfig, Compression};
let config = SSTableConfig {
path: PathBuf::from("compressed.sst"),
estimated_entries: 1000,
compression: Compression::Lz4, // ✨ Enable compression
..Default::default()
};
let mut builder = SSTableBuilder::new(config).await?;
Reading is automatic:
// Reader automatically detects compression from footer
let reader = Arc::new(SSTableReader::open("compressed.sst".into()).await?);
// Reads work exactly the same - compression is transparent!
let entry = reader.get(b"key").await?;
{: .highlight }
Pro tip: Combine compression with caching for best results. The default 64MB cache will store decompressed blocks, giving you storage savings + fast reads!
Using Zstd Compression (Higher Ratio)¶
For cold storage or archival data:
let config = SSTableConfig {
path: PathBuf::from("archive.sst"),
estimated_entries: 10_000,
compression: Compression::Zstd, // Higher compression ratio
block_cache_mb: 0, // Disable cache for cold data
..Default::default()
};
Learn more about compression →
Configuring the Cache¶
Default Cache (64MB)¶
The default configuration includes a 64MB LRU cache:
// This automatically includes 64MB cache
let reader = SSTableReader::open("data.sst".into()).await?;
Custom Cache Size¶
For hot workloads, increase the cache:
use nori_observe::NoopMeter;
// 256MB cache for hot data
let reader = SSTableReader::open_with_config(
"hot_data.sst".into(),
Arc::new(NoopMeter),
256 // MB
).await?;
Disable Cache (Cold Storage)¶
For infrequently accessed data:
Error Handling¶
Common Errors¶
use nori_sstable::{SSTableError, Result};
match builder.add(&Entry::put("key", "value")).await {
Ok(_) => println!("Added successfully"),
Err(SSTableError::KeysNotSorted(prev, current)) => {
eprintln!("Error: Keys must be sorted!");
eprintln!("Previous: {:?}, Current: {:?}", prev, current);
}
Err(SSTableError::Io(e)) => {
eprintln!("I/O error: {}", e);
}
Err(e) => {
eprintln!("Other error: {}", e);
}
}
Using the ? Operator¶
Most examples use ? for error propagation:
async fn build_sstable() -> Result<()> {
let mut builder = SSTableBuilder::new(config).await?;
builder.add(&Entry::put("key", "value")).await?;
builder.finish().await?;
Ok(())
}
Working with Entries¶
PUT Entries¶
// From string slices
let entry = Entry::put("key", "value");
// From byte slices
let entry = Entry::put(b"key", b"value");
// From byte vectors
let entry = Entry::put(vec![1, 2, 3], vec![4, 5, 6]);
// From Bytes (zero-copy)
use bytes::Bytes;
let entry = Entry::put(Bytes::from("key"), Bytes::from("value"));
DELETE Entries (Tombstones)¶
// Mark a key as deleted
let entry = Entry::delete("key_to_delete");
// When reading, tombstones are still returned
if let Some(entry) = reader.get(b"key_to_delete").await? {
if entry.tombstone {
println!("Key was deleted");
}
}
Checking for Tombstones¶
let entry = reader.get(b"key").await?;
match entry {
Some(e) if e.tombstone => println!("Key deleted"),
Some(e) => println!("Value: {:?}", e.value),
None => println!("Key not found"),
}
Configuration Options¶
SSTableConfig¶
use nori_sstable::{SSTableConfig, Compression};
let config = SSTableConfig {
// Required
path: PathBuf::from("data.sst"), // Output file path
estimated_entries: 10_000, // For bloom filter sizing
// Optional (with defaults)
block_size: 4096, // 4KB blocks (default)
restart_interval: 16, // Prefix compression restarts
compression: Compression::Lz4, // None, Lz4, or Zstd
bloom_bits_per_key: 10, // ~0.9% false positive rate
block_cache_mb: 64, // LRU cache size in MB
};
Common Configurations¶
Development (no compression, small cache):
let config = SSTableConfig {
path: "dev.sst".into(),
estimated_entries: 100,
compression: Compression::None,
block_cache_mb: 16,
..Default::default()
};
Production (LZ4 + cache):
let config = SSTableConfig {
path: "prod.sst".into(),
estimated_entries: 100_000,
compression: Compression::Lz4,
block_cache_mb: 256,
..Default::default()
};
Archival (Zstd, no cache):
let config = SSTableConfig {
path: "archive.sst".into(),
estimated_entries: 1_000_000,
compression: Compression::Zstd,
block_cache_mb: 0,
..Default::default()
};
Thread Safety¶
Sharing Readers¶
Readers are designed to be shared via Arc:
let reader = Arc::new(SSTableReader::open("data.sst".into()).await?);
// Spawn multiple tasks sharing the same reader
for i in 0..4 {
let reader_clone = reader.clone();
tokio::spawn(async move {
let key = format!("key{}", i);
reader_clone.get(key.as_bytes()).await
});
}
Benefits: - Cache is shared across all tasks - No duplication of index/bloom filter in memory - Thread-safe concurrent reads
Builders are NOT Thread-Safe¶
// Wrong - don't share builders
let builder = Arc::new(Mutex::new(builder));
// Correct - one builder per SSTable
let mut builder = SSTableBuilder::new(config).await?;
Best Practices¶
Do¶
- Sort keys before adding - use a BTreeMap or sort your data first
- Estimate entries accurately - improves bloom filter sizing
- Use compression - LZ4 is nearly free with caching
- Enable caching - 64MB default is reasonable
- Wrap readers in Arc - share across threads
- Use
?for error handling - simplifies code
Don't¶
- Don't add keys out of order (will error)
- Don't forget to call
finish()on builder - Don't share builders across threads
- Don't open multiple readers for the same file (share one)
- Don't disable cache unless memory is very limited
Next Steps¶
Learn More¶
- Compression Guide - Deep dive into LZ4/Zstd
- Caching Guide - Optimize for hot workloads
- How It Works - Understand the file format
- API Reference - Complete API documentation
Examples¶
- Basic Usage - More complete examples
- Hot Workloads - Cache tuning
Performance¶
- Benchmarks - See performance numbers
- Tuning Guide - Optimize for your workload
Complete Example: Build and Read¶
Here's a complete program that builds and reads an SSTable:
use nori_sstable::{
Entry, SSTableBuilder, SSTableReader, SSTableConfig, Compression
};
use std::path::PathBuf;
use std::sync::Arc;
#[tokio::main]
async fn main() -> nori_sstable::Result<()> {
// Build SSTable
let config = SSTableConfig {
path: PathBuf::from("example.sst"),
estimated_entries: 100,
compression: Compression::Lz4,
block_cache_mb: 64,
..Default::default()
};
let mut builder = SSTableBuilder::new(config).await?;
// Add sorted entries
for i in 0..100 {
let key = format!("key_{:04}", i);
let value = format!("value_{:04}", i);
builder.add(&Entry::put(key, value)).await?;
}
let metadata = builder.finish().await?;
println!(" Built SSTable: {} entries, {} bytes",
metadata.entry_count, metadata.file_size);
// Read SSTable
let reader = Arc::new(SSTableReader::open(PathBuf::from("example.sst")).await?);
// Point lookup
if let Some(entry) = reader.get(b"key_0042").await? {
println!("Found key_0042: {:?}", String::from_utf8_lossy(&entry.value));
}
// Range scan (first 10 keys)
let mut iter = reader.iter();
let mut count = 0;
while let Some(entry) = iter.try_next().await? {
println!("{:?} = {:?}",
String::from_utf8_lossy(&entry.key),
String::from_utf8_lossy(&entry.value)
);
count += 1;
if count >= 10 {
break;
}
}
Ok(())
}
Run this with:
Summary¶
You've learned how to:
- Install nori-sstable
- Build an SSTable with sorted entries
- Read data with point lookups and range scans
- Enable LZ4/Zstd compression
- Configure the LRU cache
- Handle errors properly
- Share readers across threads
Next: Explore compression and caching to optimize performance, or dive into the API reference for complete documentation.
Happy coding!