# Performance Tuning Guide

## Overview

This guide covers performance optimization strategies for Readur deployments, from small personal instances to large enterprise installations.

## Performance Baseline

### System Metrics to Monitor

| Metric | Target | Warning | Critical |
|--------|--------|---------|----------|
| CPU Usage | <60% | 60-80% | >80% |
| Memory Usage | <70% | 70-85% | >85% |
| Disk I/O Wait | <10% | 10-25% | >25% |
| Database Connections | <60% | 60-80% | >80% |
| Response Time (p95) | <500ms | 500-1000ms | >1000ms |
| OCR Queue Length | <100 | 100-500 | >500 |

## Database Optimization

### PostgreSQL Tuning

#### Connection Pool Settings

```bash
# postgresql.conf
max_connections = 200
shared_buffers = 256MB            # ~25% of available RAM
effective_cache_size = 1GB        # 50-75% of available RAM
work_mem = 4MB
maintenance_work_mem = 64MB

# Write performance
max_wal_size = 2GB                # replaces checkpoint_segments, removed in PostgreSQL 9.5
checkpoint_completion_target = 0.9
wal_buffers = 16MB

# Query optimization
random_page_cost = 1.1            # For SSD storage
effective_io_concurrency = 200    # For SSD
default_statistics_target = 100
```

#### Application Connection Pooling

```bash
# Readur configuration
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=10
DATABASE_POOL_TIMEOUT=30
DATABASE_POOL_RECYCLE=3600
DATABASE_STATEMENT_TIMEOUT=30000  # 30 seconds
```

### Query Optimization

#### Index Creation

```sql
-- Essential indexes for performance
CREATE INDEX CONCURRENTLY idx_documents_user_created
    ON documents(user_id, created_at DESC);

CREATE INDEX CONCURRENTLY idx_documents_ocr_status
    ON documents(ocr_status)
    WHERE ocr_status IN ('pending', 'processing');

CREATE INDEX CONCURRENTLY idx_documents_search
    ON documents USING gin(to_tsvector('english', content));

-- Partial index for recent documents. Index predicates must use only
-- immutable expressions (CURRENT_DATE is not), so use a literal cutoff
-- and recreate the index periodically with a newer date.
CREATE INDEX CONCURRENTLY idx_recent_documents
    ON documents(created_at DESC)
    WHERE created_at > DATE '2024-01-01';
```

#### Query Analysis

```sql
-- Log slow queries only. Avoid log_statement = 'all': it logs every
-- statement and itself degrades performance.
ALTER SYSTEM SET log_min_duration_statement = 1000;  -- Log queries over 1 second
SELECT pg_reload_conf();

-- Analyze query performance
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM documents
WHERE user_id = '123'
  AND created_at > '2024-01-01'
ORDER BY created_at DESC
LIMIT 100;
```

### Database Maintenance

```bash
#!/bin/bash
# maintenance.sh - Run weekly

# Vacuum and analyze
docker-compose exec postgres vacuumdb -U readur -d readur -z -v

# Reindex for better performance
docker-compose exec postgres reindexdb -U readur -d readur

# Update statistics
docker-compose exec postgres psql -U readur -d readur -c "ANALYZE;"

# Clean up old data via database queries
docker-compose exec postgres psql -U readur -d readur -c \
  "DELETE FROM sessions WHERE last_activity < NOW() - INTERVAL '30 days';"

# Check for orphaned files
docker-compose exec postgres psql -U readur -d readur -c \
  "SELECT COUNT(*) FROM documents WHERE file_path NOT IN (SELECT path FROM files);"
```

## OCR Performance

### OCR Worker Configuration

```bash
# Optimize based on CPU cores and RAM
OCR_WORKERS=4               # Number of parallel workers
OCR_MAX_PARALLEL=8          # Max concurrent OCR operations
OCR_QUEUE_SIZE=1000         # Queue buffer size
OCR_BATCH_SIZE=10           # Documents per batch
OCR_TIMEOUT=300             # Seconds per document

# Memory management
OCR_MAX_MEMORY_MB=1024      # Per-worker memory limit
OCR_TEMP_DIR=/tmp/ocr       # Use fast storage for temp files

# Tesseract optimization
TESSERACT_THREAD_LIMIT=2    # Threads per OCR job
TESSERACT_PSM=3             # Page segmentation mode
TESSERACT_OEM=1             # OCR engine mode (LSTM)
```
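A reasonable starting point is to size workers against whichever resource runs out first: CPU (each worker uses up to `TESSERACT_THREAD_LIMIT` threads) or RAM (each worker may hold up to `OCR_MAX_MEMORY_MB`). The sketch below is illustrative rather than Readur tooling; it derives conservative values from the host:

```bash
#!/bin/bash
# size_ocr_workers.sh - conservative starting values (a sketch, not Readur tooling)
CORES=$(nproc)
MEM_MB=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)

THREADS_PER_WORKER=2        # matches TESSERACT_THREAD_LIMIT above
MEM_PER_WORKER_MB=1024      # matches OCR_MAX_MEMORY_MB above

BY_CPU=$(( CORES / THREADS_PER_WORKER ))
BY_MEM=$(( MEM_MB / 2 / MEM_PER_WORKER_MB ))   # keep half of RAM for the app and DB

# Take the smaller of the two bounds, with a floor of one worker
WORKERS=$(( BY_CPU < BY_MEM ? BY_CPU : BY_MEM ))
echo "OCR_WORKERS=$(( WORKERS > 0 ? WORKERS : 1 ))"
```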
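The PSM and OEM settings correspond directly to Tesseract's CLI flags, so you can measure their effect on a representative scan before changing the worker configuration. A quick comparison, assuming `tesseract` is installed and `sample.png` is one of your documents:

```bash
# --oem 1 selects the LSTM engine; --psm 3 is fully automatic page segmentation
tesseract sample.png output --oem 1 --psm 3 -l eng

# Compare against single-column mode (--psm 4) or sparse text (--psm 11)
tesseract sample.png output_psm4 --oem 1 --psm 4 -l eng

# OMP_THREAD_LIMIT caps Tesseract's threads, like TESSERACT_THREAD_LIMIT above
OMP_THREAD_LIMIT=2 tesseract sample.png output_t2 --oem 1 --psm 3 -l eng
```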
### OCR Processing Strategies

#### Priority Queue Implementation

```python
# priority_queue.py
from celery import Celery
from kombu import Exchange, Queue

app = Celery('readur')

# Route normal and urgent OCR tasks to separate queues
app.conf.task_routes = {
    'ocr.process_document': {'queue': 'ocr', 'routing_key': 'ocr.normal'},
    'ocr.process_urgent': {'queue': 'ocr_priority', 'routing_key': 'ocr.high'},
}

# RabbitMQ priority queues are declared via x-max-priority (kombu's Queue
# has no plain `priority` argument); producers then set a task's priority
# with apply_async(priority=...).
app.conf.task_queues = (
    Queue('ocr', Exchange('ocr'), routing_key='ocr.normal',
          queue_arguments={'x-max-priority': 5}),
    Queue('ocr_priority', Exchange('ocr'), routing_key='ocr.high',
          queue_arguments={'x-max-priority': 10}),
)

# Fetch one task at a time and acknowledge only after completion,
# so long OCR jobs are neither lost nor hoarded by a single worker
app.conf.worker_prefetch_multiplier = 1
app.conf.task_acks_late = True
```

#### Batch Processing

```bash
# Re-queue pending OCR documents during off-hours
# (-T disables TTY allocation, which cron does not provide)
0 2 * * * docker-compose exec -T readur /app/enqueue_pending_ocr
```

## Storage Optimization

### File System Performance

#### Local Storage

```bash
# Mount options for better performance (/etc/fstab)
# Note: nobarrier trades crash safety for speed; use it only with a
# battery- or flash-backed write cache
/dev/sdb1 /data ext4 defaults,noatime,nodiratime,nobarrier 0 2

# For XFS
/dev/sdb1 /data xfs defaults,noatime,nodiratime,allocsize=64m 0 2

# Enable compression (Btrfs)
mount -o compress=lzo /dev/sdb1 /data
```

#### Storage Layout

```
/data/
├── readur/
│   ├── documents/    # Main storage (SSD recommended)
│   ├── temp/         # Temporary files (tmpfs or fast SSD)
│   ├── cache/        # Cache directory (SSD)
│   └── thumbnails/   # Generated thumbnails (can be slower storage)
```

### S3 Optimization

```bash
# S3 transfer optimization
S3_MAX_CONNECTIONS=100
S3_MAX_BANDWIDTH=100MB           # Limit bandwidth if needed
S3_MULTIPART_THRESHOLD=64MB
S3_MULTIPART_CHUNKSIZE=16MB
S3_MAX_CONCURRENCY=10
S3_USE_ACCELERATE_ENDPOINT=true  # AWS only

# Connection pooling
S3_CONNECTION_POOL_SIZE=50
S3_CONNECTION_TIMEOUT=30
S3_READ_TIMEOUT=60
```

## Caching Strategy

### Redis Configuration

```bash
# redis.conf
maxmemory 4gb
maxmemory-policy allkeys-lru
save ""             # Disable persistence for cache-only use
tcp-keepalive 60
timeout 300

# Performance tuning
tcp-backlog 511
databases 2
hz 10
```

### Application Caching

```bash
# Cache configuration
CACHE_TYPE=redis
CACHE_REDIS_URL=redis://localhost:6379/0
CACHE_DEFAULT_TIMEOUT=3600         # 1 hour
CACHE_THRESHOLD=1000               # Max cached items

# Specific cache TTLs
CACHE_SEARCH_RESULTS_TTL=600       # 10 minutes
CACHE_USER_SESSIONS_TTL=3600       # 1 hour
CACHE_DOCUMENT_METADATA_TTL=86400  # 24 hours
CACHE_THUMBNAILS_TTL=604800        # 7 days
```

### CDN Integration

```nginx
# Serve static files through CDN
location /static/ {
    expires 30d;
    add_header Cache-Control "public, immutable";
    add_header Vary "Accept-Encoding";
}

location /media/thumbnails/ {
    expires 7d;
    add_header Cache-Control "public";
}
```

## Application Optimization

### Gunicorn/Uvicorn Configuration

```bash
# Gunicorn settings
GUNICORN_WORKERS=4          # (2 x CPU cores) + 1 is a common starting point
GUNICORN_WORKER_CLASS=uvicorn.workers.UvicornWorker
GUNICORN_WORKER_CONNECTIONS=1000
GUNICORN_MAX_REQUESTS=1000
GUNICORN_MAX_REQUESTS_JITTER=50
GUNICORN_TIMEOUT=30
GUNICORN_KEEPALIVE=5

# Thread pool (threads apply to sync workers only; UvicornWorker ignores them)
GUNICORN_THREADS=4
GUNICORN_THREAD_WORKERS=2
```

### Async Processing

```python
# async_config.py
from concurrent.futures import ThreadPoolExecutor

# Thread pool for blocking work offloaded from the event loop
ASYNC_MAX_WORKERS = 10
ASYNC_QUEUE_SIZE = 100

executor = ThreadPoolExecutor(max_workers=ASYNC_MAX_WORKERS)

# Background task processing
CELERY_WORKER_CONCURRENCY = 4
CELERY_WORKER_PREFETCH_MULTIPLIER = 1
CELERY_TASK_TIME_LIMIT = 300
CELERY_TASK_SOFT_TIME_LIMIT = 240
```

## Network Optimization

### HTTP/2 Configuration

```nginx
server {
    listen 443 ssl http2;

    # HTTP/2 settings (obsolete since nginx 1.19.7; newer versions use
    # large_client_header_buffers and keepalive_requests instead)
    http2_max_field_size 16k;
    http2_max_header_size 32k;
    http2_max_requests 1000;

    # Keep-alive
    keepalive_timeout 65;
    keepalive_requests 100;
}
```
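After reloading nginx, confirm that HTTP/2 is actually negotiated end to end; a quick check with curl (replace `readur.example.com` with your deployment's hostname):

```bash
# Force HTTP/2 and show the negotiated protocol line (e.g. "HTTP/2 200")
curl -sI --http2 https://readur.example.com/ | head -1

# Verbose output shows the ALPN negotiation ("server accepted ... h2")
curl -sv --http2 -o /dev/null https://readur.example.com/ 2>&1 | grep -i alpn
```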
### Load Balancing

```nginx
upstream readur_backend {
    least_conn;  # Or ip_hash for session affinity

    server backend1:8000 weight=5 max_fails=3 fail_timeout=30s;
    server backend2:8000 weight=5 max_fails=3 fail_timeout=30s;
    server backend3:8000 weight=3 backup;

    keepalive 32;
}
```

## Monitoring and Profiling

### Performance Monitoring Stack

```yaml
# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin

  node_exporter:
    image: prom/node-exporter
    ports:
      - "9100:9100"
```

### Application Profiling

```python
# profile_middleware.py
import cProfile
import io
import pstats

class ProfilingMiddleware:
    """WSGI middleware that profiles a request when ?profile is in the URL."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if 'profile' not in environ.get('QUERY_STRING', ''):
            return self.app(environ, start_response)

        profiler = cProfile.Profile()
        profiler.enable()
        response = self.app(environ, start_response)
        profiler.disable()

        # Print the 20 most expensive calls by cumulative time
        stream = io.StringIO()
        stats = pstats.Stats(profiler, stream=stream)
        stats.sort_stats('cumulative')
        stats.print_stats(20)
        print(stream.getvalue())

        return response
```

## Scaling Strategies

### Horizontal Scaling

```yaml
# docker-compose.scale.yml
services:
  readur:
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G

  ocr-worker:
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: '1'
          memory: 2G
```

### Vertical Scaling Guidelines

| Users | CPU | RAM | Storage | Database |
|-------|-----|-----|---------|----------|
| 1-10 | 2 cores | 4GB | 100GB | Shared |
| 10-50 | 4 cores | 8GB | 500GB | Dedicated 2 cores, 4GB |
| 50-100 | 8 cores | 16GB | 1TB | Dedicated 4 cores, 8GB |
| 100-500 | 16 cores | 32GB | 5TB | Cluster |
| 500+ | Multiple servers | 64GB+ | Object storage | Cluster with replicas |

## Optimization Checklist

### Quick Wins

- [ ] Enable gzip compression
- [ ] Set appropriate cache headers
- [ ] Configure connection pooling
- [ ] Enable query result caching
- [ ] Optimize database indexes
- [ ] Tune OCR worker count
- [ ] Configure Redis caching
- [ ] Enable HTTP/2

### Advanced Optimizations

- [ ] Implement read replicas
- [ ] Set up CDN for static files
- [ ] Enable database partitioning
- [ ] Implement queue priorities
- [ ] Configure auto-scaling
- [ ] Set up performance monitoring
- [ ] Implement rate limiting
- [ ] Enable connection multiplexing

## Troubleshooting Performance Issues

### High CPU Usage

```bash
# Identify CPU-intensive threads
top -H -p $(pgrep -d',' readur)

# Check OCR worker load
docker-compose exec readur celery inspect active

# Profile Python code
python -m cProfile -o profile.stats app.py
```

### Memory Issues

```bash
# Check memory usage
free -h
docker stats

# Trace memory allocations. tracemalloc has no CLI entry point; enable it
# at interpreter startup, then take snapshots from within the application.
docker-compose exec readur python -X tracemalloc app.py

# Adjust memory limits
docker update --memory 4g readur_container
```

### Slow Queries

The first query below requires the `pg_stat_statements` extension; a setup sketch follows below.

```sql
-- Find slow queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Find high-cardinality, poorly correlated columns that may be missing indexes
SELECT schemaname, tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND n_distinct > 100
  AND correlation < 0.1
ORDER BY n_distinct DESC;
```
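PostgreSQL does not load `pg_stat_statements` by default. A one-time setup sketch, reusing the container and database names from the maintenance script (adjust to your deployment):

```bash
# Load the module at server start; this needs a restart, not pg_reload_conf()
docker-compose exec postgres psql -U readur -d readur -c \
  "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"
docker-compose restart postgres

# Create the extension, then verify it is collecting statistics
docker-compose exec postgres psql -U readur -d readur -c \
  "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
docker-compose exec postgres psql -U readur -d readur -c \
  "SELECT count(*) FROM pg_stat_statements;"

# Optional: reset counters after a tuning change for a clean baseline
docker-compose exec postgres psql -U readur -d readur -c \
  "SELECT pg_stat_statements_reset();"
```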
## Related Documentation

- [Architecture Overview](../architecture.md)
- [Monitoring Guide](./monitoring.md)
- [Database Guardrails](../dev/DATABASE_GUARDRAILS.md)
- [OCR Optimization](../dev/OCR_OPTIMIZATION_GUIDE.md)