Deployment Guide

Production deployment patterns for NoriKV on bare metal, Docker, and Kubernetes.



Overview

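This guide covers deploying NoriKV in several environments: a single node, a 3-node bare-metal cluster, Docker (single-node and 3-node), and Kubernetes, with storage notes for AWS EKS, GCP GKE, and Azure AKS.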


Prerequisites

Hardware Requirements

Minimum (Development):

  • CPU: 2 cores
  • RAM: 4 GB
  • Disk: 20 GB SSD

Recommended (Production):

  • CPU: 8 cores (Intel Xeon or AMD EPYC)
  • RAM: 32 GB
  • Disk: 500 GB NVMe SSD (3000+ IOPS)
  • Network: 1 Gbps

Scaling guidelines:

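| Cluster Size           | vCPU/node | RAM/node  | Disk/node   | Network  |
|------------------------|-----------|-----------|-------------|----------|
| 1 node (dev)           | 2-4       | 8 GB      | 100 GB SSD  | 100 Mbps |
| 3 nodes (small prod)   | 4-8       | 16-32 GB  | 500 GB NVMe | 1 Gbps   |
| 9 nodes (medium prod)  | 8-16      | 32-64 GB  | 1 TB NVMe   | 10 Gbps  |
| 27 nodes (large prod)  | 16-32     | 64-128 GB | 2 TB NVMe   | 10 Gbps  |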

Software Requirements

  • Operating System: Linux (Ubuntu 22.04, RHEL 8+, Debian 11+)
  • Kernel: 5.10+ (for modern async I/O)
  • systemd: For service management
  • Docker: 20.10+ (if using containers)
  • Kubernetes: 1.25+ (if using K8s)

Network Requirements

Ports:

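| Port | Protocol | Purpose               | Firewall Rule               |
|------|----------|-----------------------|-----------------------------|
| 7447 | TCP      | gRPC (client + Raft)  | Open to clients and cluster |
| 8080 | TCP      | HTTP (health/metrics) | Open to monitoring only     |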

Bandwidth:

  • Intra-cluster: 1 Gbps minimum (10 Gbps recommended)
  • Client-to-cluster: Depends on workload (1-10 Gbps)

Latency:

  • Same datacenter: <1ms RTT
  • Multi-AZ: <5ms RTT
  • Multi-region: <50ms RTT (adjust Raft timeouts)
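
The multi-region latency target comes with a note to adjust Raft timeouts. As a rough illustration, the Python sketch below scales a heartbeat interval and an election timeout range from a measured RTT. It follows the general Raft guidance that the election timeout should be roughly an order of magnitude larger than the time needed to reach the other replicas. The function name, the multipliers, and the returned field names are assumptions for illustration and are not NoriKV configuration keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaftTimeouts:
    heartbeat_ms: int
    election_min_ms: int
    election_max_ms: int


def suggest_raft_timeouts(rtt_ms: float, floor_ms: int = 50) -> RaftTimeouts:
    """Derive illustrative Raft timeouts from a round-trip time.

    Assumptions (not taken from NoriKV): heartbeat = max(floor, 2 x RTT),
    election timeout randomized between 10x and 20x the heartbeat.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    heartbeat = max(floor_ms, int(round(2 * rtt_ms)))
    return RaftTimeouts(heartbeat, 10 * heartbeat, 20 * heartbeat)


if __name__ == "__main__":
    for label, rtt in [("same datacenter", 1), ("multi-AZ", 5), ("multi-region", 50)]:
        print(label, suggest_raft_timeouts(rtt))
```

With these assumed multipliers, the same-datacenter and multi-AZ cases stay at the floor value, and only the multi-region RTT pushes the timeouts up.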


Single-Node Deployment

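Suitable for development, testing, and small datasets (under 100 GB). With replication_factor: 1 there is no redundancy, so the data is unavailable whenever the node is down.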

Installation

Download binary:

# Download latest release
wget https://github.com/norikv/norikv/releases/download/v0.1.0/norikv-server-linux-amd64

# Make executable
chmod +x norikv-server-linux-amd64
sudo mv norikv-server-linux-amd64 /usr/local/bin/norikv-server

Or build from source:

git clone https://github.com/norikv/norikv.git
cd norikv
cargo build --release -p norikv-server
sudo cp target/release/norikv-server /usr/local/bin/

Configuration

Create config file:

sudo mkdir -p /etc/norikv
sudo vim /etc/norikv/config.yaml

/etc/norikv/config.yaml:

node_id: "node0"
rpc_addr: "0.0.0.0:7447"
http_addr: "0.0.0.0:8080"
data_dir: "/var/lib/norikv"

cluster:
  seed_nodes: []  # Single-node mode
  total_shards: 128
  replication_factor: 1

telemetry:
  prometheus:
    enabled: true

Create System User

# Create norikv user
sudo useradd -r -s /bin/false norikv

# Create data directory
sudo mkdir -p /var/lib/norikv
sudo chown norikv:norikv /var/lib/norikv
sudo chmod 750 /var/lib/norikv

systemd Service

Create service file:

sudo vim /etc/systemd/system/norikv.service

/etc/systemd/system/norikv.service:

[Unit]
Description=NoriKV Server
Documentation=https://docs.norikv.io
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=norikv
Group=norikv
ExecStart=/usr/local/bin/norikv-server --config /etc/norikv/config.yaml
Restart=on-failure
RestartSec=5s
LimitNOFILE=65536
StandardOutput=journal
StandardError=journal

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/norikv

[Install]
WantedBy=multi-user.target

Start Service

# Reload systemd
sudo systemctl daemon-reload

# Enable service (start on boot)
sudo systemctl enable norikv

# Start service
sudo systemctl start norikv

# Check status
sudo systemctl status norikv

Verify Deployment

Check health:

curl http://localhost:8080/health/quick
# Expected: OK

curl http://localhost:8080/health | jq
# Expected: {"status": "healthy", ...}
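
In deployment scripts it can help to wait for the health endpoint instead of checking it once. The following Python sketch polls a URL until it returns HTTP 200 or a deadline passes. The helper names (http_ok, wait_until) and the default timings are hypothetical; only the /health/quick path and port 8080 come from this guide.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Not listening yet, timed out, or answered with an error status.
        return False


def wait_until(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example, using the endpoint from this guide:
# wait_until(lambda: http_ok("http://localhost:8080/health/quick"))
```

Separating the probe from the retry loop keeps the loop usable for other checks, such as waiting for the full /health response to report a healthy status.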

Check metrics:

curl http://localhost:8080/metrics | grep kv_requests_total

Test with client:

# Install grpcurl
brew install grpcurl  # or apt-get install grpcurl

# Test PUT
# grpcurl flags such as -d must come before the address
grpcurl -plaintext -d '{"key":"dGVzdA==","value":"dmFsdWU="}' \
  localhost:7447 norikv.Kv/Put

# Test GET
grpcurl -plaintext -d '{"key":"dGVzdA=="}' \
  localhost:7447 norikv.Kv/Get
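
The key and value in these requests are base64 text: "dGVzdA==" decodes to "test" and "dmFsdWU=" decodes to "value". This matches how protobuf bytes fields are written in JSON, so the sketch below assumes both fields are bytes. The helper names are hypothetical; they only build and decode the JSON passed to grpcurl.

```python
import base64
import json
from typing import Optional


def kv_request(key: str, value: Optional[str] = None) -> str:
    """Build a JSON request body, base64-encoding the key and optional value."""
    body = {"key": base64.b64encode(key.encode("utf-8")).decode("ascii")}
    if value is not None:
        body["value"] = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return json.dumps(body, separators=(",", ":"))


def decode_field(b64_text: str) -> str:
    """Decode a base64 field from a response back into text."""
    return base64.b64decode(b64_text).decode("utf-8")


if __name__ == "__main__":
    print(kv_request("test", "value"))  # {"key":"dGVzdA==","value":"dmFsdWU="}
    print(decode_field("dmFsdWU="))     # value
```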

Logs

View logs:

# Tail logs
sudo journalctl -u norikv -f

# Last 100 lines
sudo journalctl -u norikv -n 100

# Since yesterday
sudo journalctl -u norikv --since yesterday

# Filter by ERROR
sudo journalctl -u norikv | grep ERROR

Log format (JSON):

{
  "timestamp": "2024-01-15T10:30:00.123Z",
  "level": "INFO",
  "target": "norikv_server::node",
  "message": "Starting node",
  "fields": {
    "node_id": "node0"
  }
}
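
Because each log entry is a JSON object, filtering on the level field is more precise than grepping for the string ERROR, which also matches messages that merely contain that word. The Python sketch below assumes one JSON object per line and a level ordering of TRACE < DEBUG < INFO < WARN < ERROR. Only INFO and ERROR appear in this guide; the other level names are assumptions.

```python
import json
from typing import Dict, Iterable, Iterator

# Assumed severity ordering; adjust to the levels the server actually emits.
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4}


def filter_logs(lines: Iterable[str], min_level: str = "WARN") -> Iterator[Dict]:
    """Yield parsed log records whose level is at least `min_level`."""
    threshold = LEVELS[min_level]
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip lines that are not JSON (for example, journal headers).
        if not isinstance(record, dict):
            continue
        if LEVELS.get(record.get("level", ""), -1) >= threshold:
            yield record


# Example: feed it sys.stdin and pipe in `journalctl -u norikv -o cat`,
# which prints only the message part of each journal entry.
```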

3-Node Cluster (Bare Metal)

Production deployment with fault tolerance and high availability.
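
The fault tolerance of a Raft group follows from majority quorums: a write commits once a majority of replicas acknowledge it. The short Python sketch below works through the arithmetic, assuming replication_factor is the size of each shard's Raft group.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas`."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be down while a majority is still reachable."""
    return replicas - quorum(replicas)


if __name__ == "__main__":
    for rf in (1, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, tolerates {tolerated_failures(rf)} failure(s)")
```

Under that assumption, a replication factor of 3 keeps a shard available with one node down, while a replication factor of 1 tolerates no failures.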

Architecture

┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│   Node 0        │   │   Node 1        │   │   Node 2        │
│   10.0.1.10     │   │   10.0.1.11     │   │   10.0.1.12     │
│                 │   │                 │   │                 │
│   RF=3          │   │   RF=3          │   │   RF=3          │
│   Shards:       │   │   Shards:       │   │   Shards:       │
│   0-1023        │   │   0-1023        │   │   0-1023        │
│   (leader:      │   │   (leader:      │   │   (leader:      │
│    0-341)       │   │    342-682)     │   │    683-1023)    │
└─────────────────┘   └─────────────────┘   └─────────────────┘
         │                     │                     │
         └─────────────────────┴─────────────────────┘
                       Raft + SWIM gossip
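
The leader ranges in the diagram correspond to an even, contiguous split of 1024 shards across three nodes. The Python sketch below reproduces that split. It is only an illustration of the arithmetic; how NoriKV actually assigns and rebalances shard leaders is not specified here, and the function name is hypothetical.

```python
from typing import List, Tuple


def leader_ranges(total_shards: int, nodes: int) -> List[Tuple[int, int]]:
    """Split shard IDs 0..total_shards-1 into contiguous, near-equal ranges."""
    base, extra = divmod(total_shards, nodes)
    ranges = []
    start = 0
    for i in range(nodes):
        # The first `extra` nodes take one additional shard each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for node, (lo, hi) in enumerate(leader_ranges(1024, 3)):
        print(f"node{node}: leader for shards {lo}-{hi}")
```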

Node 0 Configuration

/etc/norikv/config.yaml:

node_id: "node0"
rpc_addr: "10.0.1.10:7447"
http_addr: "10.0.1.10:8080"
data_dir: "/var/lib/norikv"

cluster:
  seed_nodes:
    - "10.0.1.10:7447"  # self
    - "10.0.1.11:7447"  # node1
    - "10.0.1.12:7447"  # node2
  total_shards: 1024
  replication_factor: 3

telemetry:
  prometheus:
    enabled: true

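Node 1 and Node 2 use the same configuration. Change only node_id ("node1", "node2") and the IP address in rpc_addr and http_addr (10.0.1.11 and 10.0.1.12). The seed_nodes list is identical on all three nodes.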
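
Since the three files differ only in node_id and the bind addresses, they can be generated from one template. The Python sketch below builds the per-node settings as dictionaries that mirror the YAML keys shown for node 0. The function name is hypothetical, and writing the result out as YAML is left to whichever YAML library you use.

```python
import json
from typing import Dict, List


def node_config(index: int, ips: List[str], rpc_port: int = 7447, http_port: int = 8080) -> Dict:
    """Return the config for node `index`; every node shares the same seed list."""
    seeds = [f"{ip}:{rpc_port}" for ip in ips]
    return {
        "node_id": f"node{index}",
        "rpc_addr": f"{ips[index]}:{rpc_port}",
        "http_addr": f"{ips[index]}:{http_port}",
        "data_dir": "/var/lib/norikv",
        "cluster": {
            "seed_nodes": seeds,
            "total_shards": 1024,
            "replication_factor": 3,
        },
        "telemetry": {"prometheus": {"enabled": True}},
    }


if __name__ == "__main__":
    ips = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
    for i in range(len(ips)):
        print(json.dumps(node_config(i, ips), indent=2))
```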


Bootstrap Procedure

1. Start Node 0 (first node):

# On node0 (10.0.1.10)
sudo systemctl start norikv
sudo journalctl -u norikv -f
# Wait for "Node started successfully"

2. Start Node 1:

# On node1 (10.0.1.11)
sudo systemctl start norikv
# Logs should show: "Joining cluster via seed: 10.0.1.10:7447"
# Logs should show: "Member joined: node0 at 10.0.1.10:7447"

3. Start Node 2:

# On node2 (10.0.1.12)
sudo systemctl start norikv
# Cluster now has 3 nodes

4. Verify cluster formation:

# On any node
curl http://10.0.1.10:8080/health | jq '.nodes | length'
# Expected: 3

# Check SWIM cluster size
curl http://10.0.1.10:8080/metrics | grep swim_cluster_size
# Expected: swim_cluster_size 3
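
For automated checks, the metric can be read from the Prometheus text output rather than matched with grep. The Python sketch below extracts the first sample of a named metric from exposition text. The metric names come from this guide; the function name and the op="put" label in the usage example are hypothetical.

```python
from typing import Optional


def metric_value(exposition: str, name: str) -> Optional[float]:
    """Return the first sample value for `name` in Prometheus text format."""
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and HELP/TYPE comments.
        if "{" in line and "}" in line:
            # Metric with labels: name{label="value"} 42
            base = line[: line.index("{")]
            rest = line[line.rindex("}") + 1 :]
        else:
            base, _, rest = line.partition(" ")
        fields = rest.split()
        if base == name and fields:
            return float(fields[0])
    return None


if __name__ == "__main__":
    sample = (
        "# TYPE swim_cluster_size gauge\n"
        "swim_cluster_size 3\n"
        'kv_requests_total{op="put"} 42\n'
    )
    print(metric_value(sample, "swim_cluster_size"))  # 3.0
```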

Verify Replication

Test write on node0, read from node1:

# Write to node0
grpcurl -plaintext -d '{"key":"dGVzdC1yZXBs","value":"cmVwbGljYXRlZA=="}' \
  10.0.1.10:7447 norikv.Kv/Put

# Wait for replication (1-2 seconds)
sleep 2

# Read from node1 (should succeed)
grpcurl -plaintext -d '{"key":"dGVzdC1yZXBs"}' \
  10.0.1.11:7447 norikv.Kv/Get
# Expected: {"value":"cmVwbGljYXRlZA=="}

Load Balancer Setup

HAProxy configuration:

# /etc/haproxy/haproxy.cfg

global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

frontend norikv_grpc
    bind *:7447
    default_backend norikv_servers

backend norikv_servers
    balance roundrobin
    option httpchk GET /health/quick
    http-check expect status 200
    server node0 10.0.1.10:7447 check port 8080 inter 5s fall 3 rise 2
    server node1 10.0.1.11:7447 check port 8080 inter 5s fall 3 rise 2
    server node2 10.0.1.12:7447 check port 8080 inter 5s fall 3 rise 2

frontend norikv_http
    bind *:8080
    default_backend norikv_http_servers

backend norikv_http_servers
    balance roundrobin
    option httpchk GET /health/quick
    http-check expect status 200
    server node0 10.0.1.10:8080 check inter 5s
    server node1 10.0.1.11:8080 check inter 5s
    server node2 10.0.1.12:8080 check inter 5s
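
The inter, fall, and rise settings determine how quickly HAProxy reacts to a node failure or recovery. The Python sketch below works through the arithmetic for the values used in the gRPC backend. It is an approximation that ignores check timeouts and HAProxy's optional fastinter and downinter settings; the function name is hypothetical.

```python
from typing import Dict


def check_windows(inter_s: float, fall: int, rise: int) -> Dict[str, float]:
    """Approximate time for HAProxy to mark a server down or back up."""
    return {
        "mark_down_s": inter_s * fall,  # consecutive failed checks needed
        "mark_up_s": inter_s * rise,    # consecutive passed checks needed
    }


if __name__ == "__main__":
    # Values from the norikv_servers backend: inter 5s fall 3 rise 2
    print(check_windows(5, 3, 2))  # {'mark_down_s': 15, 'mark_up_s': 10}
```

The HTTP backend omits fall and rise, so HAProxy applies its defaults (fall 3, rise 2), which gives the same windows.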

Start HAProxy:

sudo systemctl restart haproxy
sudo systemctl status haproxy

Test via load balancer:

grpcurl -plaintext -d '{"key":"bGI=","value":"dGVzdA=="}' \
  localhost:7447 norikv.Kv/Put

Docker Deployment

Single-Node Docker

Create docker-compose.yml:

version: '3.8'

services:
  norikv:
    image: norikv/norikv-server:latest
    container_name: norikv-server
    ports:
      - "7447:7447"  # gRPC
      - "8080:8080"  # HTTP
    volumes:
      - norikv-data:/data
    environment:
      NORIKV_NODE_ID: "docker-node0"
      NORIKV_RPC_ADDR: "0.0.0.0:7447"
      NORIKV_HTTP_ADDR: "0.0.0.0:8080"
      NORIKV_DATA_DIR: "/data"
      NORIKV_SEED_NODES: ""  # Single-node mode
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health/quick"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 30s

volumes:
  norikv-data:
    driver: local

Start:

docker-compose up -d
docker-compose logs -f norikv

3-Node Docker Cluster

docker-compose.yml:

version: '3.8'

services:
  norikv-node0:
    image: norikv/norikv-server:latest
    container_name: norikv-node0
    hostname: norikv-node0
    ports:
      - "7447:7447"
      - "8080:8080"
    volumes:
      - node0-data:/data
    environment:
      NORIKV_NODE_ID: "node0"
      NORIKV_RPC_ADDR: "0.0.0.0:7447"
      NORIKV_HTTP_ADDR: "0.0.0.0:8080"
      NORIKV_DATA_DIR: "/data"
      NORIKV_SEED_NODES: "norikv-node0:7447,norikv-node1:7447,norikv-node2:7447"
    networks:
      - norikv-cluster
    restart: unless-stopped

  norikv-node1:
    image: norikv/norikv-server:latest
    container_name: norikv-node1
    hostname: norikv-node1
    ports:
      - "7448:7447"
      - "8081:8080"
    volumes:
      - node1-data:/data
    environment:
      NORIKV_NODE_ID: "node1"
      NORIKV_RPC_ADDR: "0.0.0.0:7447"
      NORIKV_HTTP_ADDR: "0.0.0.0:8080"
      NORIKV_DATA_DIR: "/data"
      NORIKV_SEED_NODES: "norikv-node0:7447,norikv-node1:7447,norikv-node2:7447"
    networks:
      - norikv-cluster
    restart: unless-stopped
    depends_on:
      - norikv-node0

  norikv-node2:
    image: norikv/norikv-server:latest
    container_name: norikv-node2
    hostname: norikv-node2
    ports:
      - "7449:7447"
      - "8082:8080"
    volumes:
      - node2-data:/data
    environment:
      NORIKV_NODE_ID: "node2"
      NORIKV_RPC_ADDR: "0.0.0.0:7447"
      NORIKV_HTTP_ADDR: "0.0.0.0:8080"
      NORIKV_DATA_DIR: "/data"
      NORIKV_SEED_NODES: "norikv-node0:7447,norikv-node1:7447,norikv-node2:7447"
    networks:
      - norikv-cluster
    restart: unless-stopped
    depends_on:
      - norikv-node0

networks:
  norikv-cluster:
    driver: bridge

volumes:
  node0-data:
  node1-data:
  node2-data:

Start cluster:

docker-compose up -d
docker-compose ps
docker-compose logs -f

Kubernetes Deployment

Architecture

┌─────────────────────────────────────────────────────┐
│  StatefulSet: norikv                                │
│  Replicas: 3                                        │
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │ norikv-0    │  │ norikv-1    │  │ norikv-2    │  │
│  │ PVC: 100Gi  │  │ PVC: 100Gi  │  │ PVC: ..     │  │
│  └─────────────┘  └─────────────┘  └─────────────┘  │
└─────────────────────────────────────────────────────┘
┌─────────────┴──────────────┐
│  Headless Service: norikv  │  (ClusterIP: None)
│  Port: 7447 (gRPC)         │
└────────────────────────────┘

ConfigMap

norikv-configmap.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: norikv-config
  namespace: default
data:
  config.yaml: |
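    # node_id is supplied per pod through the NORIKV_NODE_ID environment
    # variable set in the StatefulSet. Kubernetes does not expand $(POD_NAME)
    # inside ConfigMap data, so it is not set here.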
    rpc_addr: "0.0.0.0:7447"
    http_addr: "0.0.0.0:8080"
    data_dir: "/data"
    cluster:
      seed_nodes:
        - "norikv-0.norikv.default.svc.cluster.local:7447"
        - "norikv-1.norikv.default.svc.cluster.local:7447"
        - "norikv-2.norikv.default.svc.cluster.local:7447"
      total_shards: 1024
      replication_factor: 3
    telemetry:
      prometheus:
        enabled: true
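
The seed addresses follow the stable DNS pattern that Kubernetes gives StatefulSet pods behind a headless Service: pod-name.service.namespace.svc.cluster-domain. The Python sketch below generates the list for any replica count. The function name is hypothetical, and cluster.local is the default cluster domain, which your cluster may override.

```python
from typing import List


def seed_nodes(
    statefulset: str,
    service: str,
    namespace: str,
    replicas: int,
    port: int = 7447,
    cluster_domain: str = "cluster.local",
) -> List[str]:
    """Build host:port seed addresses for pods 0..replicas-1 of a StatefulSet."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}:{port}"
        for i in range(replicas)
    ]


if __name__ == "__main__":
    for addr in seed_nodes("norikv", "norikv", "default", 3):
        print(addr)
```

If you deploy into a namespace other than default, the seed list in the ConfigMap has to change with it.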

Headless Service

norikv-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: norikv
  namespace: default
  labels:
    app: norikv
spec:
  clusterIP: None  # Headless service
  ports:
    - name: grpc
      port: 7447
      targetPort: 7447
      protocol: TCP
    - name: http
      port: 8080
      targetPort: 8080
      protocol: TCP
  selector:
    app: norikv
---
apiVersion: v1
kind: Service
metadata:
  name: norikv-lb
  namespace: default
  labels:
    app: norikv
spec:
  type: LoadBalancer
  ports:
    - name: grpc
      port: 7447
      targetPort: 7447
      protocol: TCP
  selector:
    app: norikv

StatefulSet

norikv-statefulset.yaml:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: norikv
  namespace: default
spec:
  serviceName: norikv
  replicas: 3
  selector:
    matchLabels:
      app: norikv
  template:
    metadata:
      labels:
        app: norikv
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: norikv
        image: norikv/norikv-server:latest
        ports:
        - containerPort: 7447
          name: grpc
          protocol: TCP
        - containerPort: 8080
          name: http
          protocol: TCP
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NORIKV_NODE_ID
          value: "$(POD_NAME)"
        volumeMounts:
        - name: data
          mountPath: /data
        - name: config
          mountPath: /etc/norikv
        livenessProbe:
          httpGet:
            path: /health/quick
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /health/quick
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          successThreshold: 1
        resources:
          requests:
            cpu: "2"
            memory: "8Gi"
          limits:
            cpu: "4"
            memory: "16Gi"
      volumes:
      - name: config
        configMap:
          name: norikv-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast-ssd  # Adjust for your cloud
      resources:
        requests:
          storage: 100Gi

Deploy to Kubernetes

Apply manifests:

kubectl apply -f norikv-configmap.yaml
kubectl apply -f norikv-service.yaml
kubectl apply -f norikv-statefulset.yaml

Watch rollout:

kubectl rollout status statefulset/norikv
kubectl get pods -l app=norikv -w

Verify cluster:

# Check pods
kubectl get pods -l app=norikv

# Check health
kubectl exec norikv-0 -- curl -s http://localhost:8080/health | jq

# Check cluster size
kubectl exec norikv-0 -- curl -s http://localhost:8080/metrics | grep swim_cluster_size
# Expected: swim_cluster_size 3

Access from Outside K8s

Via LoadBalancer:

# Get LoadBalancer IP
kubectl get svc norikv-lb
# NAME        TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)
# norikv-lb   LoadBalancer   10.100.200.1   35.123.45.67     7447:31234/TCP

# Test
grpcurl -plaintext -d '{"key":"dGVzdA==","value":"dmFsdWU="}' \
  35.123.45.67:7447 norikv.Kv/Put

Via Port-Forward (development):

kubectl port-forward svc/norikv-lb 7447:7447  # blocks; run the next command in a second terminal
grpcurl -plaintext -d '{"key":"dGVzdA==","value":"dmFsdWU="}' \
  localhost:7447 norikv.Kv/Put

Cloud Provider Specific

AWS EKS

Storage Class (gp3 SSD):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer

Node affinity (i3en instances with NVMe):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
          - i3en.xlarge
          - i3en.2xlarge

GCP GKE

Storage Class (pd-ssd):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer

Azure AKS

Storage Class (Premium_LRS):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
volumeBindingMode: WaitForFirstConsumer

Production Checklist

Pre-Deployment

  • Size hardware (CPU, RAM, disk) based on workload
  • Provision SSD/NVMe storage (not HDD)
  • Configure firewall rules (ports 7447, 8080)
  • Set up monitoring (Prometheus + Grafana)
  • Configure log aggregation (ELK, Loki)
  • Plan backup strategy (snapshots)

Post-Deployment

  • Verify cluster formation (3 nodes visible)
  • Test replication (write on node0, read from node1)
  • Load test (simulate production traffic)
  • Test failure scenarios (kill node, network partition)
  • Set up alerts (Prometheus Alertmanager)
  • Document runbook (incident response)

Next Steps