Testing Guide
This document describes the testing strategy for the Readur OCR document management system.
🧪 Testing Strategy
We have a clean three-tier testing approach:
- Unit Tests (Rust) - Fast, no dependencies, test individual components
- Integration Tests (Rust) - Test against running services, user workflow validation
- Frontend Tests (JavaScript) - Component and API integration testing
🚀 Quick Start
Using the Rust Test Runner (Recommended)
# Run all tests
cargo run --bin test_runner
# Run specific test types
cargo run --bin test_runner unit # Unit tests only
cargo run --bin test_runner integration # Integration tests only
cargo run --bin test_runner frontend # Frontend tests only
Manual Test Execution
# Unit tests (fast, no dependencies)
cargo test --test unit_tests
# Integration tests (requires running server)
# 1. Start server: cargo run
# 2. Run tests: cargo test --test integration_tests
# Frontend tests
cd frontend && npm test -- --run
📋 Test Categories
Unit Tests (tests/unit_tests.rs)
Rust-based tests for core data structures and conversions without external dependencies:
- ✅ Document response conversion (with/without OCR)
- ✅ OCR field validation (confidence, word count, processing time)
- ✅ User response conversion (security - no password leaks)
- ✅ Search mode defaults and enums
Run with: cargo test --test unit_tests or cargo run --bin test_runner unit
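As an illustration of the "no password leaks" check listed above, here is a minimal sketch of such a unit test. The `User` and `UserResponse` structs, their fields, and `sample_user` are hypothetical stand-ins, not the actual Readur models; only the idea (convert, then assert that the secret is absent) comes from this guide.

```rust
// Hypothetical stand-ins for the application's models. The real types
// live in the Readur crate and may have different fields.
#[derive(Debug, Clone)]
pub struct User {
    pub id: u32,
    pub username: String,
    pub email: String,
    pub password_hash: String,
}

// The API-facing type deliberately has no password field at all.
#[derive(Debug, Clone, PartialEq)]
pub struct UserResponse {
    pub id: u32,
    pub username: String,
    pub email: String,
}

impl From<User> for UserResponse {
    fn from(user: User) -> Self {
        UserResponse {
            id: user.id,
            username: user.username,
            email: user.email,
        }
    }
}

// Builds a fixed user for tests (illustrative values only).
pub fn sample_user() -> User {
    User {
        id: 1,
        username: "alice".to_string(),
        email: "alice@example.com".to_string(),
        password_hash: "hash-that-must-not-leak".to_string(),
    }
}

#[test]
fn user_response_omits_password_hash() {
    let user = sample_user();
    let secret = user.password_hash.clone();
    let response = UserResponse::from(user);

    // The public fields survive the conversion.
    assert_eq!(response.username, "alice");
    // The debug rendering of the response never contains the secret.
    assert!(!format!("{:?}", response).contains(&secret));
}
```

Because the response type has no password field, a leak would have to be introduced by changing the struct itself, which this test would then catch.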
Integration Tests (tests/integration_tests.rs)
Rust-based tests for complete user workflows against running services:
- ✅ User registration and authentication (using CreateUser, LoginRequest types)
- ✅ Document upload via multipart form (returns DocumentResponse)
- ✅ OCR processing completion (with timeout and type validation)
- ✅ OCR text retrieval via API endpoint (validates response structure)
- ✅ Error handling (401, 404 responses)
- ✅ Health endpoint validation
Run with: cargo test --test integration_tests or cargo run --bin test_runner integration
Advantages of Rust Integration Tests:
- 🔒 Type Safety - Uses same models/types as main application
- 🚀 Performance - Faster execution than Python scripts
- 🛠️ IDE Support - Full autocomplete and refactoring support
- 🔗 Code Reuse - Can import validation logic and test helpers
Frontend Tests
Located in frontend/src/:
- ✅ Document details page with OCR functionality
- ✅ API service mocking and integration
- ✅ Component behavior and user interactions
Run with: cd frontend && npm test
🔧 Test Configuration
Server Requirements
Integration tests expect the server running at:
- URL: http://localhost:8080
- Health endpoint: /api/health returns {"status": "ok"}
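A pre-flight check against the health endpoint could look like the sketch below. It uses only the standard library and speaks plain HTTP/1.1 over a `TcpStream`; the real test suite may well use an HTTP client crate instead. The function names `is_healthy_body`, `parse_response`, and `server_is_healthy` are hypothetical, and the body check is a simple string match rather than real JSON parsing.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Returns true if the body contains `"status":"ok"`, ignoring whitespace.
/// This is a string match, not a JSON parser.
pub fn is_healthy_body(body: &str) -> bool {
    let compact: String = body.chars().filter(|c| !c.is_whitespace()).collect();
    compact.contains("\"status\":\"ok\"")
}

/// Splits a raw HTTP/1.1 response into (status code, body).
pub fn parse_response(raw: &str) -> Option<(u16, &str)> {
    let (head, body) = raw.split_once("\r\n\r\n")?;
    let status_line = head.lines().next()?;
    let code = status_line.split_whitespace().nth(1)?.parse().ok()?;
    Some((code, body))
}

/// Sends `GET /api/health` to `addr` (for example "localhost:8080") and
/// reports whether the server answered 200 with a healthy body.
pub fn server_is_healthy(addr: &str) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;

    let request = format!(
        "GET /api/health HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        addr
    );
    stream.write_all(request.as_bytes())?;

    // `Connection: close` lets us read until the server closes the socket.
    let mut raw = String::new();
    stream.read_to_string(&mut raw)?;

    Ok(matches!(parse_response(&raw), Some((200, body)) if is_healthy_body(body)))
}
```

Calling `server_is_healthy("localhost:8080")` before the integration tests start gives a clearer failure ("server is not running") than a connection error deep inside a test.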
Test Data
Integration tests use:
- Test user: integrationtest@test.com
- Test document: Simple text file with known content
- Timeout: 30 seconds for OCR processing
📊 Test Coverage
What We Test
OCR Functionality:
- Document upload → OCR processing → text retrieval
- OCR metadata validation (confidence, word count, timing)
- Error handling for failed OCR processing
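The metadata checks in the list above can be expressed as a small validation function. This is a sketch under stated assumptions: the `OcrMetadata` struct and its field names are hypothetical, confidence is assumed to be a percentage between 0 and 100, and the 30-second ceiling on processing time simply mirrors the integration test timeout.

```rust
// Hypothetical shape of the OCR metadata returned with a document.
#[derive(Debug, Clone, PartialEq)]
pub struct OcrMetadata {
    pub confidence: f32,         // assumed to be a percentage, 0.0..=100.0
    pub word_count: u32,
    pub processing_time_ms: u64,
}

/// Checks that the metadata is internally consistent with the OCR text.
pub fn validate_ocr_metadata(meta: &OcrMetadata, text: &str) -> Result<(), String> {
    // A NaN confidence also fails this range check.
    if !(0.0..=100.0).contains(&meta.confidence) {
        return Err(format!("confidence out of range: {}", meta.confidence));
    }

    // Word count is compared against a whitespace split of the text.
    let actual_words = text.split_whitespace().count() as u32;
    if meta.word_count != actual_words {
        return Err(format!(
            "word count mismatch: reported {}, counted {}",
            meta.word_count, actual_words
        ));
    }

    // Assumption: processing should finish within the 30 s test timeout.
    if meta.processing_time_ms > 30_000 {
        return Err(format!("processing took {} ms", meta.processing_time_ms));
    }
    Ok(())
}

#[test]
fn accepts_consistent_metadata() {
    let meta = OcrMetadata { confidence: 92.5, word_count: 3, processing_time_ms: 1200 };
    assert_eq!(validate_ocr_metadata(&meta, "hello OCR world"), Ok(()));
}
```

Returning a `Result` with a message, rather than a bare `bool`, makes a failing assertion say which field was wrong.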
API Endpoints:
- Authentication flow (register/login)
- Document management (upload/list)
- OCR text retrieval (/api/documents/{id}/ocr)
- Error responses (401, 404, 500)
Data Models:
- Type safety and field validation
- Response structure consistency
- Security (no password leaks)
Frontend Components:
- OCR dialog behavior
- API integration and error handling
- User interaction flows
What We Don't Test
- Tesseract OCR accuracy (external library)
- Database schema migrations (handled by SQLx)
- File system operations (handled by OS)
- Network failures (covered by error handling)
🐛 Debugging Test Failures
Integration Test Failures
- "Server is not running"
  # Start the server first
  cargo run
  # Then run tests
  ./run_user_tests.sh
- "OCR processing timed out"
  - Check server logs for OCR errors
  - Ensure Tesseract is installed and configured
  - Increase timeout in test if needed
- "Authentication failed"
  - Check JWT secret configuration
  - Verify database is accessible
Unit Test Failures
Unit tests have no external dependencies, so a failure points at the code itself. If one fails:
- Check for compilation errors in models
- Verify type definitions match expectations
- Review recent changes to data structures
🔄 Continuous Integration
For CI/CD pipelines:
# Example GitHub Actions workflow
- name: Run Unit Tests
run: cargo test --lib
- name: Start Services
run: docker-compose up -d
- name: Wait for Health
run: timeout 60s bash -c 'until curl -s http://localhost:8080/api/health | grep -q "ok"; do sleep 2; done'
- name: Run Integration Tests
run: cargo test --test integration_ocr_test
📈 Adding New Tests
For New API Endpoints
- Add unit tests for data models in tests/unit_tests.rs
- Add integration test in tests/integration_ocr_test.rs
- Add frontend tests if UI components involved
For New OCR Features
- Test the happy path (document → processing → retrieval)
- Test error conditions (file format, processing failures)
- Test performance/timeout scenarios
- Validate response structure changes
🎯 Test Philosophy
Fast Feedback: Unit tests run in milliseconds, integration tests in seconds.
Real User Scenarios: Integration tests simulate actual user workflows.
Maintainable: Tests are simple, focused, and well-documented.
Reliable: Tests pass consistently and fail for good reasons.
Comprehensive: Critical paths are covered, edge cases are handled.