5.7 KiB

Raw Blame History

Testing Guide

This document describes the testing strategy for the Readur OCR document management system.

🧪 Testing Strategy

We have a clean three-tier testing approach:

Unit Tests (Rust) - Fast, no dependencies, test individual components
Integration Tests (Python) - Test against running services, user workflow validation
Frontend Tests (JavaScript) - Component and API integration testing

🚀 Quick Start

Using the Rust Test Runner (Recommended)

# Run all tests
cargo run --bin test_runner

# Run specific test types
cargo run --bin test_runner unit         # Unit tests only
cargo run --bin test_runner integration  # Integration tests only  
cargo run --bin test_runner frontend     # Frontend tests only

Manual Test Execution

# Unit tests (fast, no dependencies)
cargo test --test unit_tests

# Integration tests (requires running server)
# 1. Start server: cargo run
# 2. Run tests: cargo test --test integration_tests

# Frontend tests
cd frontend && npm test -- --run

📋 Test Categories

Unit Tests (`tests/unit_tests.rs`)

Rust-based tests for core data structures and conversions without external dependencies:

✅ Document response conversion (with/without OCR)
✅ OCR field validation (confidence, word count, processing time)
✅ User response conversion (security - no password leaks)
✅ Search mode defaults and enums

Run with: cargo test --test unit_tests or cargo run --bin test_runner unit

Integration Tests (`tests/integration_tests.rs`)

Rust-based tests for complete user workflows against running services:

✅ User registration and authentication (using CreateUser, LoginRequest types)
✅ Document upload via multipart form (returns DocumentResponse)
✅ OCR processing completion (with timeout and type validation)
✅ OCR text retrieval via API endpoint (validates response structure)
✅ Error handling (401, 404 responses)
✅ Health endpoint validation

Run with: cargo test --test integration_tests or cargo run --bin test_runner integration

Advantages of Rust Integration Tests:

🔒 Type Safety - Uses same models/types as main application
🚀 Performance - Faster execution than Python scripts
🛠️ IDE Support - Full autocomplete and refactoring support
🔗 Code Reuse - Can import validation logic and test helpers

Frontend Tests

Located in frontend/src/:

✅ Document details page with OCR functionality
✅ API service mocking and integration
✅ Component behavior and user interactions

Run with: cd frontend && npm test

🔧 Test Configuration

Server Requirements

Integration tests expect the server running at:

URL: http://localhost:8080
Health endpoint: /api/health returns {"status": "ok"}

Test Data

Integration tests use:

Test user: integrationtest@test.com
Test document: Simple text file with known content
Timeout: 30 seconds for OCR processing

📊 Test Coverage

What We Test

OCR Functionality:

Document upload → OCR processing → text retrieval
OCR metadata validation (confidence, word count, timing)
Error handling for failed OCR processing

API Endpoints:

Authentication flow (register/login)
Document management (upload/list)
OCR text retrieval (/api/documents/{id}/ocr)
Error responses (401, 404, 500)

Data Models:

Type safety and field validation
Response structure consistency
Security (no password leaks)

Frontend Components:

OCR dialog behavior
API integration and error handling
User interaction flows

What We Don't Test

Tesseract OCR accuracy (external library)
Database schema migrations (handled by SQLx)
File system operations (handled by OS)
Network failures (covered by error handling)

🐛 Debugging Test Failures

Integration Test Failures

"Server is not running"

# Start the server first
cargo run
# Then run tests
./run_user_tests.sh

"OCR processing timed out"
- Check server logs for OCR errors
- Ensure Tesseract is installed and configured
- Increase timeout in test if needed
"Authentication failed"
- Check JWT secret configuration
- Verify database is accessible

Unit Test Failures

Unit tests should never fail due to external dependencies. If they do:

Check for compilation errors in models
Verify type definitions match expectations
Review recent changes to data structures

🔄 Continuous Integration

For CI/CD pipelines:

# Example GitHub Actions workflow
- name: Run Unit Tests
  run: cargo test --lib

- name: Start Services  
  run: docker-compose up -d

- name: Wait for Health
  run: timeout 60s bash -c 'until curl -s http://localhost:8080/api/health | grep -q "ok"; do sleep 2; done'

- name: Run Integration Tests
  run: cargo test --test integration_ocr_test

📈 Adding New Tests

For New API Endpoints

Add unit tests for data models in tests/unit_tests.rs
Add integration test in tests/integration_ocr_test.rs
Add frontend tests if UI components involved

For New OCR Features

Test the happy path (document → processing → retrieval)
Test error conditions (file format, processing failures)
Test performance/timeout scenarios
Validate response structure changes

🎯 Test Philosophy

Fast Feedback: Unit tests run in milliseconds, integration tests in seconds.

Real User Scenarios: Integration tests simulate actual user workflows.

Maintainable: Tests are simple, focused, and well-documented.

Reliable: Tests pass consistently and fail for good reasons.

Comprehensive: Critical paths are covered, edge cases are handled.

5.7 KiB Raw Blame History