NoriKV Go Client Troubleshooting Guide
Solutions to common issues when using the Go client SDK.
Table of Contents
- Connection Issues
- Performance Problems
- Version Conflicts
- Error Messages
- Configuration Issues
- Debugging Tips
Connection Issues
Problem: "connection refused" or "dial tcp: connect: connection refused"
Symptoms:
_, err := client.Put(ctx, key, value, nil)
// Error: rpc error: code = Unavailable desc = connection error:
// desc = "transport: Error while dialing dial tcp 127.0.0.1:9001:
// connect: connection refused"
Causes:
1. NoriKV server is not running
2. Wrong address/port in configuration
3. Firewall blocking connections
Solutions:
- Verify server is running
- Check client configuration
- Test connectivity
Problem: "context deadline exceeded"
Symptoms:
ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
defer cancel()
_, err := client.Get(ctx, key, nil)
// Error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Causes:
1. Timeout too short
2. Server overloaded
3. Network latency
4. Leader election in progress
Solutions:
- Increase timeout
- Check server health
- Enable retries
Problem: "transport is closing"
Symptoms:
_, err := client.Put(ctx, key, value, nil)
// Error: rpc error: code = Unavailable desc = transport is closing
Causes:
1. Server restarted
2. Network interruption
3. Connection idle timeout
Solutions:
- Enable automatic retries (already the default behavior)
- Configure keepalive
- Check network stability
Performance Problems
Problem: Slow Put/Get operations
Symptoms:
start := time.Now()
_, err := client.Put(ctx, key, value, nil)
duration := time.Since(start)
// duration > 100ms consistently
Causes:
1. Network latency
2. Large value sizes
3. Server overload
4. Wrong consistency level
5. Too many retries
Diagnosis:
- Measure network latency
- Check value sizes
- Monitor retry count
Solutions:
- Use an appropriate consistency level
- Reduce value sizes
- Use concurrent requests
Problem: High memory usage
Symptoms:
Causes:
1. Too many goroutines
2. Large values buffered
3. Connection leaks
4. Topology listeners not cleaned up
Solutions:
- Limit concurrent operations
- Clean up topology listeners
- Close client when done
- Use pprof to diagnose
Version Conflicts
Problem: "version mismatch" on CAS operations
Symptoms:
result, _ := client.Get(ctx, key, nil)
options := &norikv.PutOptions{
IfMatchVersion: result.Version,
}
_, err := client.Put(ctx, key, newValue, options)
if errors.Is(err, norikv.ErrVersionMismatch) {
// This happens frequently
}
Causes:
1. High contention on key
2. Concurrent updates from multiple clients
3. Insufficient retry logic
Solutions:
- Implement retry loop:
const maxRetries = 10
for i := 0; i < maxRetries; i++ {
	result, err := client.Get(ctx, key, nil)
	if err != nil {
		return err
	}
	// Compute new value
	newValue := transform(result.Value)
	options := &norikv.PutOptions{
		IfMatchVersion: result.Version,
	}
	_, err = client.Put(ctx, key, newValue, options)
	if err == nil {
		break // Success
	}
	if !errors.Is(err, norikv.ErrVersionMismatch) {
		return err
	}
	if i == maxRetries-1 {
		return fmt.Errorf("failed after %d retries", maxRetries)
	}
	// Exponential backoff
	time.Sleep(time.Duration(1<<i) * 10 * time.Millisecond)
}
- Reduce contention by sharding
- Use idempotency keys when appropriate
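One reading of "reduce contention by sharding" is to split a hot logical key into several physical keys at the application level, so that concurrent writers rarely touch the same version. This is separate from the cluster's own shard routing. The sketch below is standard library only; `shardedKey`, `allShardKeys`, and the `base#N` naming scheme are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardedKey spreads writes for one logical key across n physical keys.
// The shard is chosen by hashing writerID, so each writer always updates the
// same physical key and contends only with writers that hash to it.
// n must be greater than zero.
func shardedKey(base, writerID string, n uint32) []byte {
	h := fnv.New32a()
	h.Write([]byte(writerID)) // Write on a hash.Hash never returns an error
	return []byte(fmt.Sprintf("%s#%d", base, h.Sum32()%n))
}

// allShardKeys lists every physical key so a reader can fetch and combine them.
func allShardKeys(base string, n uint32) [][]byte {
	keys := make([][]byte, 0, n)
	for i := uint32(0); i < n; i++ {
		keys = append(keys, []byte(fmt.Sprintf("%s#%d", base, i)))
	}
	return keys
}

func main() {
	// Each writer increments its own shard with the CAS loop shown above.
	fmt.Printf("writer-1 updates %s\n", shardedKey("page-views", "writer-1", 8))
	fmt.Printf("writer-2 updates %s\n", shardedKey("page-views", "writer-2", 8))
	// A reader sums the values stored under all eight keys.
	fmt.Println("reader fetches", len(allShardKeys("page-views", 8)), "keys")
}
```

The trade-off is that reads become more expensive, since the logical value has to be reassembled from all shards, so this fits write-heavy counters better than frequently read values.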
Problem: Lost updates with concurrent writes
Symptoms:
Cause: Not using CAS for concurrent updates.
Solution:
Always use CAS for concurrent updates:
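// incrementCounter reads the counter, adds one, and writes it back with a
// version check, retrying when another writer got there first.
func incrementCounter(ctx context.Context, client *norikv.Client, key []byte) error {
	const maxRetries = 10
	for i := 0; i < maxRetries; i++ {
		result, err := client.Get(ctx, key, nil)
		if err != nil {
			return err
		}
		value, err := strconv.Atoi(string(result.Value))
		if err != nil {
			return fmt.Errorf("counter value is not an integer: %w", err)
		}
		newValue := []byte(strconv.Itoa(value + 1))
		options := &norikv.PutOptions{
			IfMatchVersion: result.Version,
		}
		_, err = client.Put(ctx, key, newValue, options)
		if err == nil {
			return nil
		}
		if !errors.Is(err, norikv.ErrVersionMismatch) {
			return err
		}
	}
	return fmt.Errorf("failed after %d retries", maxRetries)
}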
Error Messages
"key not found"
Error:
Meaning: Key does not exist in the database.
Solutions:
1. Check if key was written
2. Check for typos in key name
3. Handle gracefully:
result, err := client.Get(ctx, key, nil)
if errors.Is(err, norikv.ErrKeyNotFound) {
	// Use default value
	return defaultValue
}
"version mismatch"
Error:
_, err := client.Put(ctx, key, value, &norikv.PutOptions{
IfMatchVersion: expectedVersion,
})
if errors.Is(err, norikv.ErrVersionMismatch) {
// Handle
}
Meaning: CAS check failed - version changed between read and write.
Solutions: See Version Conflicts above.
"key already exists"
Error:
_, err := client.Put(ctx, key, value, &norikv.PutOptions{
IfNotExists: true,
})
if errors.Is(err, norikv.ErrAlreadyExists) {
// Handle
}
Meaning: Key already exists (used with IfNotExists).
Solutions:
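if errors.Is(err, norikv.ErrAlreadyExists) {
	// Either fetch the existing value and use it...
	result, getErr := client.Get(ctx, key, nil)
	if getErr != nil {
		return getErr
	}
	value = result.Value
	// ...or ignore the error if a duplicate write is acceptable.
}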
Configuration Issues
Problem: "invalid shard count"
Symptoms:
Cause: TotalShards must match cluster configuration.
Solution:
Problem: Hash mismatches causing wrong routing
Symptoms:
- Requests going to wrong shards
- Frequent NOT_LEADER errors
Cause: Hash function incompatibility.
Solution:
Verify hash compatibility:
All tests must pass to ensure compatibility with server.
Problem: Client not finding any nodes
Symptoms:
Cause: Empty or invalid Nodes configuration.
Solution:
config := &norikv.ClientConfig{
Nodes: []string{
"node1:9001",
"node2:9001",
"node3:9001",
},
// ...
}
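An empty or malformed node list can be caught before the client is created. The sketch below checks only the shape of each address using the standard library; `validateNodes` is a hypothetical helper, and passing it does not prove the nodes are reachable.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
)

// validateNodes checks that nodes is non-empty and that every entry has the
// form host:port with a port in 1..65535. It performs no DNS lookups.
func validateNodes(nodes []string) error {
	if len(nodes) == 0 {
		return errors.New("no nodes configured")
	}
	for _, n := range nodes {
		host, port, err := net.SplitHostPort(n)
		if err != nil {
			return fmt.Errorf("node %q: %w", n, err)
		}
		if host == "" {
			return fmt.Errorf("node %q: missing host", n)
		}
		p, err := strconv.Atoi(port)
		if err != nil || p < 1 || p > 65535 {
			return fmt.Errorf("node %q: invalid port %q", n, port)
		}
	}
	return nil
}

func main() {
	nodes := []string{"node1:9001", "node2:9001", "node3:9001"}
	if err := validateNodes(nodes); err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("node list looks well-formed")
}
```

Running this check at startup turns a vague "no nodes found" failure into a message that names the offending entry.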
Debugging Tips
Enable Debug Logging
// Add logging to track requests
import "log"
type loggingClient struct {
*norikv.Client
}
func (c *loggingClient) Put(ctx context.Context, key, value []byte, opts *norikv.PutOptions) (*norikv.Version, error) {
log.Printf("PUT key=%s, size=%d", key, len(value))
version, err := c.Client.Put(ctx, key, value, opts)
if err != nil {
log.Printf("PUT error: %v", err)
} else {
log.Printf("PUT success: version=%v", version)
}
return version, err
}
Monitor Client Statistics
// Periodically log stats
go func() {
ticker := time.NewTicker(10 * time.Second)
defer ticker.Stop()
for range ticker.C {
stats := client.Stats()
log.Printf("Client stats: %+v", stats)
}
}()
Test with Ephemeral Server
import "github.com/norikv/norikv-go/testing/ephemeral"
// Start in-memory server for testing
server := ephemeral.NewServer()
err := server.Start("127.0.0.1:0")
if err != nil {
log.Fatal(err)
}
defer server.Stop()
// Get actual address
address := server.Address()
// Create client
config := &norikv.ClientConfig{
Nodes: []string{address},
TotalShards: 1024,
}
client, err := norikv.NewClient(ctx, config)
if err != nil {
	log.Fatal(err)
}
defer client.Close()
Use pprof for Performance Analysis
import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
)
// Enable pprof
go func() {
	log.Println(http.ListenAndServe("localhost:6060", nil))
}()
// Analyze CPU profile
// go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
// Analyze memory
// go tool pprof http://localhost:6060/debug/pprof/heap
// Check goroutines
// go tool pprof http://localhost:6060/debug/pprof/goroutine
Trace Individual Requests
import (
	"context"
	"log"

	"github.com/google/uuid" // assumed source of uuid.New; any unique ID generator works
)
// Add trace ID to context
type traceKeyType string
const traceKey traceKeyType = "trace-id"
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceKey, id)
}
func getTraceID(ctx context.Context) string {
	if id, ok := ctx.Value(traceKey).(string); ok {
		return id
	}
	return ""
}
// Use in requests
traceID := uuid.New().String()
ctx := withTraceID(context.Background(), traceID)
log.Printf("[%s] Starting request", traceID)
_, err := client.Get(ctx, key, nil)
log.Printf("[%s] Request complete: err=%v", traceID, err)
Common Pitfalls
1. Not checking errors
// Bad
result, _ := client.Get(ctx, key, nil)
process(result.Value) // May panic if result is nil
// Good
result, err := client.Get(ctx, key, nil)
if err != nil {
log.Printf("Get failed: %v", err)
return err
}
process(result.Value)
2. Creating client per request
// Bad - opens and closes connections on every request
func handleRequest() {
	client, _ := norikv.NewClient(ctx, config)
	defer client.Close()
	client.Get(ctx, key, nil)
}
// Good - create the client once and reuse it
var globalClient *norikv.Client
func init() {
	var err error
	globalClient, err = norikv.NewClient(context.Background(), config)
	if err != nil {
		log.Fatalf("creating NoriKV client: %v", err)
	}
}
3. Ignoring context cancellation
// Bad
_, err := client.Get(context.Background(), key, nil)
// Good
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_, err := client.Get(ctx, key, nil)
4. Not using CAS for concurrent updates
// Bad - lost updates
result, _ := client.Get(ctx, key, nil)
value, _ := strconv.Atoi(string(result.Value))
client.Put(ctx, key, []byte(strconv.Itoa(value+1)), nil)
// Good - CAS prevents lost updates
result, err := client.Get(ctx, key, nil)
if err != nil {
	return err
}
value, err := strconv.Atoi(string(result.Value))
if err != nil {
	return err
}
_, err = client.Put(ctx, key, []byte(strconv.Itoa(value+1)), &norikv.PutOptions{
	IfMatchVersion: result.Version,
})
// On norikv.ErrVersionMismatch, re-read and retry (see Version Conflicts).
Getting Help
If you're still experiencing issues:
- Check the API Guide for correct usage
- Review the Architecture Guide to understand the internals
- See Advanced Patterns for complex use cases
- Open an issue on GitHub with:
  - Go version (go version)
  - Client SDK version
  - Minimal reproduction code
  - Error messages and logs
  - Server configuration