life-my--midst--in

Troubleshooting Guide

Common errors, debugging strategies, and solutions for the In Midst My Life platform.

Quick Diagnostics
Database Issues
Redis & Caching Issues
API Errors
Orchestrator & Job Queue Issues
Frontend Issues
Development Environment
Deployment Issues
Performance Issues
LLM & Agent Issues

Quick Diagnostics

Health Check Script

Run this first to diagnose system health:

#!/bin/bash
# save as: scripts/health-check.sh

echo "🔍 Running system diagnostics..."

printf '\n1. Checking services...\n'
docker-compose ps

printf '\n2. Testing API health...\n'
# -f makes curl fail on HTTP error responses, not only on connection errors
curl -sf http://localhost:3001/health || echo "❌ API unreachable"
curl -sf http://localhost:3001/ready || echo "❌ API not ready"

printf '\n3. Testing database connection...\n'
docker-compose exec -T postgres pg_isready -U midstsvc || echo "❌ PostgreSQL not ready"

printf '\n4. Testing Redis...\n'
docker-compose exec -T redis redis-cli ping || echo "❌ Redis not responding"

printf '\n5. Checking logs for errors...\n'
docker-compose logs --tail=50 api | grep -i error
docker-compose logs --tail=50 orchestrator | grep -i error

printf '\n✅ Diagnostics complete\n'

Common Commands

# View all container status
docker-compose ps

# Check service logs
docker-compose logs -f api
docker-compose logs -f orchestrator
docker-compose logs --tail=100 api

# Restart specific service
docker-compose restart api

# Full restart
docker-compose down && docker-compose up -d

# Check port usage
lsof -i :3001
lsof -i :5432

Database Issues

Error: `ECONNREFUSED` - Cannot Connect to Database

Symptoms:

Error: connect ECONNREFUSED 127.0.0.1:5432

Diagnosis:

# Check if PostgreSQL is running
docker-compose ps postgres

# Check PostgreSQL logs
docker-compose logs postgres

Solutions:

PostgreSQL not started:

./scripts/dev-up.sh
# or
docker-compose up postgres -d

Wrong connection string:

# Check environment variable
echo $DATABASE_URL
   
# Should be:
postgresql://midstsvc:password@localhost:5432/midst_dev
   
# For Docker services:
postgresql://midstsvc:password@postgres:5432/midst
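
To confirm which host and port a given DATABASE_URL points at without printing the password, a small helper can split the URL. This is a minimal sketch: `db_url_summary` is a hypothetical name, not one of the project's scripts, and it assumes the password contains no unescaped `@` or `/`.

```bash
# Hypothetical helper: summarize a PostgreSQL URL without echoing credentials.
# The body runs in a subshell so its variables do not leak into the caller.
db_url_summary() (
  rest=${1#*://}               # drop the scheme, e.g. "postgresql://"
  hostpart=${rest#*@}          # drop "user:password@" if present
  hostport=${hostpart%%/*}     # e.g. "localhost:5432"
  host=${hostport%%:*}
  case $hostport in
    *:*) port=${hostport##*:} ;;
    *)   port=5432 ;;          # PostgreSQL's default port
  esac
  case $hostpart in
    */*) db=${hostpart#*/} ;;
    *)   db= ;;
  esac
  printf 'host=%s port=%s db=%s\n' "$host" "$port" "$db"
)

db_url_summary 'postgresql://midstsvc:password@localhost:5432/midst_dev'
db_url_summary 'postgresql://midstsvc:password@postgres:5432/midst'
```

Run from the host, `db_url_summary "$DATABASE_URL"` should report `host=localhost`; inside a Compose service it should report `host=postgres`.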

Port conflict:

# Check what's using port 5432
lsof -i :5432
   
# Change port in .env
POSTGRES_PORT=5433
   
# Update DATABASE_URL accordingly

Container networking issue:

# Verify network
docker network ls
docker network inspect in-midst-my-life_default
   
# Recreate network
docker-compose down
docker-compose up -d

Error: `relation "profiles" does not exist`

Symptoms:

ERROR: relation "profiles" does not exist

Diagnosis:

# Connect to database
./scripts/dev-shell.sh postgres

# In psql:
\dt                    # List all tables

Solution:

# Run migrations
pnpm --filter @in-midst-my-life/api migrate
pnpm --filter @in-midst-my-life/orchestrator migrate

# Verify tables exist
./scripts/dev-shell.sh postgres
# In psql:
\dt
SELECT COUNT(*) FROM profiles;

Error: Migration Failed

Symptoms:

Migration failed: duplicate column name

Diagnosis:

# Check migration status
./scripts/dev-shell.sh postgres
# In psql:
SELECT * FROM migrations ORDER BY applied_at DESC;

Solutions:

Migration already applied:
- Migrations are idempotent by design
- Check if table/column already exists

Partial migration:

# Roll back the partial change if the migration file defines DOWN statements.
# If it does not, add the missing DOWN statements to the migration file first.
   
# Re-run migration
pnpm --filter @in-midst-my-life/api migrate

Database in bad state:

# DANGER: Nuclear option (dev only)
docker-compose down -v  # Removes volumes
docker-compose up postgres -d
pnpm --filter @in-midst-my-life/api migrate
pnpm --filter @in-midst-my-life/api seed

Error: `too many connections`

Symptoms:

FATAL: sorry, too many clients already

Solutions:

Close unused connections:

./scripts/dev-shell.sh postgres
# In psql:
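SELECT COUNT(*) FROM pg_stat_activity;

-- Terminate connections that have been idle for more than 5 minutes
-- (state_change records when the connection became idle; skip our own session)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND pid <> pg_backend_pid()
  AND state_change < NOW() - INTERVAL '5 minutes';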

Increase max_connections:

# docker-compose.yml
postgres:
  command: postgres -c max_connections=200

Connection pooling:
- Check API uses connection pooling (default in our setup)
- Verify pool size is appropriate (default: 10)
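
One way to judge whether a pool size is appropriate is to compare the connections that all pooled processes can open against PostgreSQL's limit. The sketch below is plain shell arithmetic. `max_pooled_instances` is a hypothetical helper, and the reserved count of 3 is PostgreSQL's default for `superuser_reserved_connections`, which may differ in your configuration.

```bash
# Hypothetical helper: how many pooled processes fit under max_connections?
# usage: max_pooled_instances <max_connections> <pool_size> [reserved]
max_pooled_instances() (
  max_conn=$1
  pool_size=$2
  reserved=${3:-3}   # assumption: default superuser_reserved_connections
  echo $(( (max_conn - reserved) / pool_size ))
)

max_pooled_instances 100 10   # PostgreSQL's default max_connections
max_pooled_instances 200 10   # after raising max_connections to 200
```

Every process that keeps its own pool counts as one instance, so with the default pool size of 10 the budget under PostgreSQL's default limit of 100 is used up after nine processes.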

Redis & Caching Issues

Error: `Redis connection failed`

Symptoms:

Error: Redis connection to redis:6379 failed

Diagnosis:

# Check Redis status
docker-compose ps redis

# Test connection
docker-compose exec redis redis-cli ping

Solutions:

Redis not started:

./scripts/dev-up.sh
# or
docker-compose up redis -d

Wrong Redis URL:

echo $REDIS_URL
   
# Should be:
redis://redis:6379          # From Docker containers
redis://localhost:6379      # From host machine

Redis auth required but not provided:

# If Redis has password
redis://:<password>@redis:6379

Error: Cache Inconsistency

Symptoms:

Stale data returned
Taxonomy changes not reflected

Solutions:

Clear cache manually:

./scripts/dev-shell.sh redis
# In redis-cli:
FLUSHDB         # Clear current database
FLUSHALL        # Clear all databases

Clear specific keys:

./scripts/dev-shell.sh redis
# In redis-cli:
KEYS taxonomy:*  # Find taxonomy keys
DEL taxonomy:masks taxonomy:epochs taxonomy:stages

Disable caching (dev only):

# Set in .env
REDIS_URL=  # Empty = use in-memory fallback

API Errors

Error: `401 Unauthorized`

Symptoms:

{
  "error": {
    "code": "UNAUTHORIZED",
    "message": "Missing or invalid authentication token"
  }
}

Solutions:

Missing Authorization header:

# Include JWT token
curl -H "Authorization: Bearer <your-jwt-token>" \
  http://localhost:3001/profiles

Expired token:
- Obtain new token from auth provider
- Check token expiry: https://jwt.io
Invalid token signature:
- Verify JWT_SECRET matches between services
- Check token issuer/audience claims
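
The expiry can also be read locally, which avoids pasting a live token into a website. The following is a sketch that decodes the payload segment and prints the `exp` claim. `jwt_exp` is a hypothetical helper, it assumes a `base64` command that accepts `-d` (GNU coreutils, recent macOS), and it does not verify the signature.

```bash
# Hypothetical helper: print the "exp" claim (seconds since the epoch) of a JWT.
# This only decodes the payload; it does NOT verify the signature.
jwt_exp() (
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so base64 -d accepts the input.
  case $(( ${#payload} % 4 )) in
    2) payload="$payload==" ;;
    3) payload="$payload=" ;;
  esac
  printf '%s' "$payload" | base64 -d 2>/dev/null |
    sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
)

# Example with a locally built, unsigned token (not a real credential):
sample_payload=$(printf '%s' '{"sub":"demo","exp":1700000000}' | base64 | tr -d '=\n' | tr '/+' '_-')
jwt_exp "header.$sample_payload.signature"
```

Compare the printed value with `date +%s`; if it is smaller, the token has expired.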

Error: `404 Not Found`

Symptoms:

{
  "error": {
    "code": "NOT_FOUND",
    "message": "Profile not found"
  }
}

Diagnosis:

# Check if resource exists in database
./scripts/dev-shell.sh postgres
# In psql:
SELECT * FROM profiles WHERE id = '<uuid>';

Solutions:

Resource doesn’t exist:
- Create the resource first
- Check if ID is correct (UUID format)

Soft-deleted:

SELECT * FROM profiles WHERE id = '<uuid>' AND is_active = true;

Error: `429 Too Many Requests` / `QUOTA_EXCEEDED`

Symptoms:

{
  "error": {
    "code": "QUOTA_EXCEEDED",
    "message": "Monthly quota exceeded for this feature",
    "details": {
      "feature": "resume_tailoring",
      "limit": 5,
      "used": 5,
      "resetDate": "2025-02-01T00:00:00Z"
    }
  }
}

Solutions:

Upgrade tier:
- FREE → PRO → ENTERPRISE
Wait for quota reset:
- Check resetDate in error response

Bypass in development:

# Set env var (dev only)
DISABLE_RATE_LIMITING=true

Error: `500 Internal Server Error`

Symptoms:

{
  "error": {
    "code": "INTERNAL_ERROR",
    "message": "An unexpected error occurred"
  }
}

Diagnosis:

# Check API logs
docker-compose logs api | tail -100

# Look for stack traces
docker-compose logs api | grep -A 20 "Error:"

Solutions:

Database connection issue:
- See Database Issues
Uncaught exception:
- Check logs for stack trace
- File bug report with reproduction steps
Resource exhaustion:
- Check memory/CPU usage
- Restart service: docker-compose restart api

Orchestrator & Job Queue Issues

Error: Jobs Not Processing

Symptoms:

Jobs stuck in pending status
Queue growing but not draining

Diagnosis:

# Check orchestrator is running
docker-compose ps orchestrator

# Check worker is enabled
docker-compose exec orchestrator printenv | grep ORCH_WORKER_ENABLED

# Check Redis queue length
./scripts/dev-shell.sh redis
# In redis-cli:
LLEN "bull:task-queue:waiting"
LLEN "bull:task-queue:active"
# With Bull/BullMQ the failed set is a sorted set, so use ZCARD (LLEN returns WRONGTYPE)
ZCARD "bull:task-queue:failed"

Solutions:

Worker not enabled:

# Set in .env
ORCH_WORKER_ENABLED=true
   
# Restart orchestrator
docker-compose restart orchestrator

Redis connection issue:

# Check REDIS_URL or ORCH_REDIS_URL
echo $ORCH_REDIS_URL
   
# Should be:
redis://redis:6379

Job handler not registered:

# Check orchestrator logs
docker-compose logs orchestrator | grep "Registering handler"
   
# Ensure task type matches registered handlers

Job failing silently:

# Check failed queue
./scripts/dev-shell.sh redis
# In redis-cli:
# With Bull/BullMQ the failed set is a sorted set, so use ZRANGE (LRANGE returns WRONGTYPE)
ZRANGE "bull:task-queue:failed" 0 -1

Error: LLM Agent Timeout

Symptoms:

Error: LLM request timeout after 30s

Solutions:

Local LLM not running:

# Check Ollama is running
curl http://localhost:11434/api/tags
   
# Start Ollama
ollama serve
   
# Pull model if needed
ollama pull llama3.1:8b

Wrong LLM URL:

# For local development
LOCAL_LLM_URL=http://localhost:11434
   
# For Docker containers
LOCAL_LLM_URL=http://host.docker.internal:11434
LOCAL_LLM_ALLOWED_HOSTS=host.docker.internal

Use stub executor (no LLM):
```
# Set in .env
ORCH_AGENT_EXECUTOR=stub
```

Frontend Issues

Error: `hydration failed` in Next.js

Symptoms:

Error: Hydration failed because the initial UI does not match what was rendered on the server.

Solutions:

Client-only rendering:

'use client';
   
import dynamic from 'next/dynamic';
   
const ClientComponent = dynamic(() => import('./ClientComponent'), {
  ssr: false,
});

Date/time formatting:

// Use consistent formatting
const date = new Date(dateString).toISOString();

Browser extensions:
- Disable browser extensions
- Test in incognito mode

Error: API calls failing from frontend

Symptoms:

Failed to fetch: net::ERR_CONNECTION_REFUSED

Diagnosis:

# Check API is running
curl http://localhost:3001/health

# Check NEXT_PUBLIC_API_BASE_URL
echo $NEXT_PUBLIC_API_BASE_URL

Solutions:

API not running:

pnpm --filter @in-midst-my-life/api dev

Wrong API URL:

# Set in .env
NEXT_PUBLIC_API_BASE_URL=http://localhost:3001

CORS issue:
- Check API CORS configuration
- Ensure origin is allowed

Error: D3 Graph Not Rendering

Symptoms:

Blank graph area
Console error: Cannot read property 'append' of null

Solutions:

Container ref not attached:

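import { useEffect, useRef } from 'react';

// "Graph" is a placeholder component name
export function Graph() {
  const containerRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    // The ref is null until the element has mounted
    if (!containerRef.current) return;
    // ... D3 code
  }, []);

  return <div ref={containerRef} />;
}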

Dynamic import for D3:

import dynamic from 'next/dynamic';

const D3Graph = dynamic(() => import('./D3Graph'), {
  ssr: false,
});

Fallback to radial layout:

# Set in .env
NEXT_PUBLIC_GRAPH_LAYOUT=radial

Development Environment

Error: `pnpm install` fails

Symptoms:

ERR_PNPM_LOCKFILE_BROKEN_NODE_MODULES

Solutions:

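# Clean install
rm -rf node_modules
# Deleting the lockfile makes pnpm re-resolve every dependency version;
# skip this step if you need to keep the pinned versions
rm pnpm-lock.yaml
pnpm install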

# If still failing, clear pnpm cache
pnpm store prune
pnpm install

Error: TypeScript errors after update

Symptoms:

Type 'X' is not assignable to type 'Y'

Solutions:

# Rebuild all packages
pnpm build

# Clear TypeScript cache (-f ignores files that do not exist)
rm -f tsconfig.tsbuildinfo
rm -f apps/*/tsconfig.tsbuildinfo
rm -f packages/*/tsconfig.tsbuildinfo

# Re-run typecheck
pnpm typecheck

Error: Port already in use

Symptoms:

Error: listen EADDRINUSE: address already in use :::3001

Solutions:

# Find process using port
lsof -i :3001

# Kill process
kill -9 <PID>

# Or change port
# In .env:
API_PORT=3011

Deployment Issues

Error: Kubernetes Pod CrashLoopBackOff

Diagnosis:

# Check pod status
kubectl get pods -n inmidst

# View pod logs
kubectl logs <pod-name> -n inmidst

# Describe pod for events
kubectl describe pod <pod-name> -n inmidst

Common Causes:

Missing environment variables:

# Check pod environment
kubectl exec <pod-name> -n inmidst -- printenv

Database connection failure:
- Check DATABASE_URL secret
- Verify database is accessible from cluster

Image pull error:

# Check image exists
docker pull <image-name>:<tag>
   
# Check image pull secrets
kubectl get secrets -n inmidst

Error: Helm Install Failed

Diagnosis:

# Check Helm release status
helm status inmidst -n inmidst

# View release history
helm history inmidst -n inmidst

# Show the chart's release notes (failure details appear in the status and history output above)
helm get notes inmidst -n inmidst

Solutions:

Rollback to previous version:
```
helm rollback inmidst -n inmidst
```

Uninstall and reinstall:

helm uninstall inmidst -n inmidst
helm install inmidst . -n inmidst -f values.yaml

Debug with dry-run:

helm install inmidst . --dry-run --debug -n inmidst

Performance Issues

Issue: Slow API Response Times

Diagnosis:

# Check API metrics
curl http://localhost:3001/metrics | grep http_request_duration

# Test specific endpoint
time curl http://localhost:3001/profiles/<id>

Solutions:

Database query optimization:

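# Enable query logging
# In postgresql.conf:
log_statement = 'all'
log_duration = on

# Check slow queries (requires the pg_stat_statements extension)
# In psql; on PostgreSQL 13+ the column is total_exec_time instead of total_time:
SELECT * FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;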

Add database indexes:

CREATE INDEX idx_profiles_slug ON profiles(slug);
CREATE INDEX idx_experiences_profile_id ON experiences(profile_id);

Enable Redis caching:

# Ensure REDIS_URL is set
echo $REDIS_URL
   
# Test Redis connection
./scripts/dev-shell.sh redis

Issue: High Memory Usage

Diagnosis:

# Check memory usage
docker stats

# Kubernetes
kubectl top pods -n inmidst

Solutions:

Increase memory limits:

# docker-compose.yml
api:
  deploy:
    resources:
      limits:
        memory: 2G

Connection pool tuning:

// Reduce pool size (assumes the node-postgres "pg" package)
import { Pool } from 'pg';

const pool = new Pool({
  max: 5, // instead of 10
});

LLM & Agent Issues

Error: LLM returns invalid JSON

Symptoms:

Error: Unexpected token in JSON at position 0

Solutions:

Use simpler model:

# Switch to smaller model
LOCAL_LLM_MODEL=gemma3:4b

Disable structured output:

# Use text mode
ORCH_LLM_RESPONSE_FORMAT=text

Retry with exponential backoff:
- Already implemented in agent executor
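
The executor's own retry logic is not reproduced in this guide. For manual checks against the LLM endpoint, the same pattern can be sketched in shell. `retry_backoff` and its arguments are hypothetical, and the delay doubles after each failed attempt.

```bash
# Hypothetical helper: retry a command, doubling the delay after each failure.
# usage: retry_backoff <max_attempts> <initial_delay_seconds> <command> [args...]
retry_backoff() {
  _rb_max=$1
  _rb_delay=$2
  shift 2
  _rb_attempt=1
  until "$@"; do
    if [ "$_rb_attempt" -ge "$_rb_max" ]; then
      echo "giving up after $_rb_attempt attempts: $*" >&2
      return 1
    fi
    sleep "$_rb_delay"
    _rb_delay=$((_rb_delay * 2))
    _rb_attempt=$((_rb_attempt + 1))
  done
}

# A command that succeeds immediately needs no retries.
retry_backoff 3 0 true && echo "succeeded"
```

For example, `retry_backoff 4 1 curl -sf http://localhost:11434/api/tags` waits 1, 2, and 4 seconds between its four attempts before giving up.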

Error: Agent tool not found

Symptoms:

Error: Tool 'rg' not in allowlist

Solutions:

# Enable tool in allowlist
ORCH_TOOL_ALLOWLIST=rg,ls,cat

# Or disable tools entirely
ORCH_TOOL_ALLOWLIST=  # Empty = no tools
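
How the orchestrator parses ORCH_TOOL_ALLOWLIST is not shown in this guide, but an exact-match check against a comma-separated list can be sketched as follows. `tool_allowed` is a hypothetical helper for checking a value by hand. With exact matching, a stray space (`rg, ls`) would keep `ls` from matching, so it is safest to write the list without spaces.

```bash
# Hypothetical helper: succeed if <tool> appears as an exact entry in a
# comma-separated allowlist. An empty list allows nothing.
# usage: tool_allowed <tool> <allowlist>
tool_allowed() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

if tool_allowed rg "rg,ls,cat"; then echo "rg is allowed"; fi
if ! tool_allowed rg ""; then echo "rg is blocked by an empty allowlist"; fi
```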

Getting More Help

Enable Debug Logging

# API
LOG_LEVEL=debug pnpm --filter @in-midst-my-life/api dev

# Orchestrator
LOG_LEVEL=debug pnpm --filter @in-midst-my-life/orchestrator dev

# Docker Compose (--verbose is a global flag and goes before the subcommand)
docker-compose --verbose up

Collect Diagnostic Information

# Create bug report bundle
./scripts/collect-diagnostics.sh > diagnostics.txt

# Includes:
# - Service status
# - Recent logs
# - Environment config (secrets redacted)
# - Database table counts
# - Redis stats
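
What collect-diagnostics.sh does internally is not shown here. When pasting logs or environment output by hand, a filter along these lines can mask passwords embedded in connection URLs. `redact_urls` is a hypothetical helper and only covers the `scheme://user:password@host` form.

```bash
# Hypothetical filter: mask the password part of URLs such as
# postgresql://user:password@host or redis://:password@host.
redact_urls() {
  sed -E 's#(://[^:/@[:space:]]*:)[^@/[:space:]]*@#\1***@#g'
}

printf '%s\n' \
  'DATABASE_URL=postgresql://midstsvc:password@localhost:5432/midst_dev' \
  'REDIS_URL=redis://:secret@redis:6379' \
  'NEXT_PUBLIC_API_BASE_URL=http://localhost:3001' | redact_urls
```

URLs without credentials, such as the API base URL in the last line, pass through unchanged. A typical use would be `docker-compose exec api printenv | redact_urls`.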

Contact Support

GitHub Issues: https://github.com/anthropics/in-midst-my-life/issues
Email: padavano.anthony@gmail.com
Documentation: https://github.com/anthropics/in-midst-my-life/docs

When reporting issues, include:

Error message (full stack trace)
Steps to reproduce
Environment (Docker/Kubernetes, OS, Node version)
Relevant logs
Diagnostic output

Additional Resources
