# Testing Guide

This document describes the testing strategy for the Readur OCR document management system.

## 🧪 Testing Strategy

Testing is organized into three tiers:

1. **Unit Tests** (Rust) - Fast, no dependencies, test individual components
2. **Integration Tests** (Rust) - Test against running services, user workflow validation
3. **Frontend Tests** (JavaScript) - Component and API integration testing

## 🚀 Quick Start

### Using the Rust Test Runner (Recommended)

```bash
# Run all tests
cargo run --bin test_runner

# Run specific test types
cargo run --bin test_runner unit         # Unit tests only
cargo run --bin test_runner integration  # Integration tests only
cargo run --bin test_runner frontend     # Frontend tests only
```

### Manual Test Execution

```bash
# Unit tests (fast, no dependencies)
cargo test --test unit_tests

# Integration tests (requires running server)
# 1. Start server: cargo run
# 2. Run tests: cargo test --test integration_tests

# Frontend tests
cd frontend && npm test -- --run
```

## 📋 Test Categories

### Unit Tests (`tests/unit_tests.rs`)

Rust-based tests for core data structures and conversions without external dependencies:

- ✅ Document response conversion (with/without OCR)
- ✅ OCR field validation (confidence, word count, processing time)
- ✅ User response conversion (security - no password leaks)
- ✅ Search mode defaults and enums

**Run with:** `cargo test --test unit_tests` or `cargo run --bin test_runner unit`

### Integration Tests (`tests/integration_tests.rs`)

Rust-based tests for complete user workflows against running services:

- ✅ User registration and authentication (using `CreateUser`, `LoginRequest` types)
- ✅ Document upload via multipart form (returns `DocumentResponse`)
- ✅ OCR processing completion (with timeout and type validation)
- ✅ OCR text retrieval via API endpoint (validates response structure)
- ✅ Error handling (401, 404 responses)
- ✅ Health endpoint validation

**Run with:** `cargo test --test integration_tests` or `cargo run --bin test_runner integration`

**Advantages of Rust Integration Tests:**

- 🔒 **Type Safety** - Uses the same models/types as the main application
- 🚀 **Performance** - Faster execution than Python scripts
- 🛠️ **IDE Support** - Full autocomplete and refactoring support
- 🔗 **Code Reuse** - Can import validation logic and test helpers

### Frontend Tests

Located in `frontend/src/`:

- ✅ Document details page with OCR functionality
- ✅ API service mocking and integration
- ✅ Component behavior and user interactions

**Run with:** `cd frontend && npm test`

## 🔧 Test Configuration

### Server Requirements

Integration tests expect the server running at:

- **URL:** `http://localhost:8080`
- **Health endpoint:** `/api/health` returns `{"status": "ok"}`

### Test Data

Integration tests use:

- **Test user:** `integrationtest@test.com`
- **Test document:** Simple text file with known content
- **Timeout:** 30 seconds for OCR processing

## 📊 Test Coverage

### What We Test

**OCR Functionality:**

- Document upload → OCR processing → text retrieval
- OCR metadata validation (confidence, word count, timing)
- Error handling for failed OCR processing

**API Endpoints:**

- Authentication flow (register/login)
- Document management (upload/list)
- OCR text retrieval (`/api/documents/{id}/ocr`)
- Error responses (401, 404, 500)

**Data Models:**

- Type safety and field validation
- Response structure consistency
- Security (no password leaks)

**Frontend Components:**

- OCR dialog behavior
- API integration and error handling
- User interaction flows

### What We Don't Test

- Tesseract OCR accuracy (external library)
- Database schema migrations (handled by SQLx)
- File system operations (handled by OS)
- Network failures (covered by error handling)

## 🐛 Debugging Test Failures

### Integration Test Failures

1. **"Server is not running"**

   ```bash
   # Start the server first
   cargo run

   # Then run tests
   cargo test --test integration_tests
   ```

2. **"OCR processing timed out"**
   - Check server logs for OCR errors
   - Ensure Tesseract is installed and configured
   - Increase the timeout in the test if needed

3. **"Authentication failed"**
   - Check JWT secret configuration
   - Verify the database is accessible

### Unit Test Failures

Unit tests should never fail due to external dependencies. If they do:

1. Check for compilation errors in models
2. Verify type definitions match expectations
3. Review recent changes to data structures

## 🔄 Continuous Integration

For CI/CD pipelines:

```yaml
# Example GitHub Actions workflow steps
- name: Run Unit Tests
  run: cargo test --test unit_tests

- name: Start Services
  run: docker-compose up -d

- name: Wait for Health
  run: timeout 60s bash -c 'until curl -s http://localhost:8080/api/health | grep -q "ok"; do sleep 2; done'

- name: Run Integration Tests
  run: cargo test --test integration_tests
```

## 📈 Adding New Tests

### For New API Endpoints

1. Add unit tests for data models in `tests/unit_tests.rs`
2. Add an integration test in `tests/integration_tests.rs`
3. Add frontend tests if UI components are involved

### For New OCR Features

1. Test the happy path (document → processing → retrieval)
2. Test error conditions (file format, processing failures)
3. Test performance/timeout scenarios
4. Validate response structure changes

## 🎯 Test Philosophy

**Fast Feedback:** Unit tests run in milliseconds, integration tests in seconds.

**Real User Scenarios:** Integration tests simulate actual user workflows.

**Maintainable:** Tests are simple, focused, and well-documented.

**Reliable:** Tests pass consistently and fail for good reasons.

**Comprehensive:** Critical paths are covered, edge cases are handled.
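
This guide gives OCR processing a 30-second timeout, which suggests that an integration test polls for completion until a deadline. The following is a minimal sketch of such a polling loop using only the standard library. The helper name `poll_until` and the closure standing in for the HTTP request are hypothetical and are not taken from the Readur test suite; a real test would presumably call the OCR endpoint inside the closure.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `check` repeatedly, sleeping `interval` between attempts, until it
/// returns `Some(value)` or `timeout` has elapsed. Returns `None` on timeout.
/// (Hypothetical helper for illustration only.)
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut check: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = check() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        thread::sleep(interval);
    }
}

fn main() {
    // Stand-in for a request to the OCR endpoint: "completes" on the third poll.
    let mut polls = 0;
    let result = poll_until(Duration::from_secs(30), Duration::from_millis(10), || {
        polls += 1;
        if polls >= 3 { Some("ocr text") } else { None }
    });
    println!("{:?} after {} polls", result, polls);
}
```

Returning `Option` rather than panicking inside the helper lets the calling test decide how to report the failure, for example with a message like "OCR processing timed out" that matches the debugging section.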