# Readur 📄

A powerful, modern document management system built with Rust and React. Readur provides intelligent document processing with OCR capabilities, full-text search, and a beautiful web interface designed for 2026 tech standards.

## ✨ Features

- 🔐 **Secure Authentication**: JWT-based user authentication with bcrypt password hashing
- 📤 **Smart File Upload**: Drag-and-drop support for PDF, images, text files, and Office documents
- 🔍 **Advanced OCR**: Automatic text extraction using Tesseract for searchable document content
- 🔎 **Powerful Search**: PostgreSQL full-text search with advanced filtering and ranking
- 👁️ **Folder Monitoring**: Non-destructive file watching (unlike paperless-ngx, doesn't consume source files)
- 🎨 **Modern UI**: Beautiful React frontend with Material-UI components and responsive design
- 🐳 **Docker Ready**: Complete containerization with production-ready multi-stage builds
- ⚡ **High Performance**: Rust backend for speed and reliability
- 📊 **Analytics Dashboard**: Document statistics and processing status overview

## 🚀 Quick Start

### Using Docker Compose (Recommended)

The fastest way to get Readur running:

```bash
# Clone the repository
git clone <repository-url>
cd readur

# Start all services
docker compose up --build

# Access the application
open http://localhost:8000
```

**Default login credentials:**
- Username: `admin`
- Password: `admin123`

> ⚠️ **Important**: Change the default admin password immediately after first login!
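
### Checking the Login Endpoint (Sketch)

Once the containers are up, one way to confirm that the backend is answering is to post the default credentials to the `/api/auth/login` endpoint listed in the API Reference. The snippet below is a minimal sketch, not part of Readur itself: the helper names `login_body` and `readur_login` are hypothetical, and the shape of the server's response is not documented in this README, so the output is printed as-is.

```bash
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical and do not ship with Readur.

# Build the JSON body for POST /api/auth/login.
# Assumes the values contain no quotes or backslashes (no JSON escaping is done).
login_body() {
  printf '{"username": "%s", "password": "%s"}' "$1" "$2"
}

# POST the credentials and print whatever the server returns.
# --fail makes curl exit non-zero on HTTP error responses (status >= 400).
readur_login() {
  local base_url="$1" username="$2" password="$3"
  curl --silent --show-error --fail \
    --request POST "${base_url}/api/auth/login" \
    --header 'Content-Type: application/json' \
    --data "$(login_body "$username" "$password")"
}
```

For a local Docker Compose deployment, this could be called as `readur_login http://localhost:8000 admin admin123`. A zero exit status suggests the API and database are reachable; a non-zero status points to a startup or credentials problem worth checking in the container logs.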

### What You Get

After deployment, you'll have:

- **Web Interface**: Modern document management UI at `http://localhost:8000`
- **PostgreSQL Database**: Document metadata and full-text search indexes
- **File Storage**: Persistent document storage with OCR processing
- **REST API**: Full API access for integrations

## 🏗️ Architecture

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  React Frontend │────│   Rust Backend  │────│  PostgreSQL DB  │
│   (Port 8000)   │    │   (Axum API)    │    │   (Port 5433)   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                                              │
         │              ┌─────────────────┐             │
         └──────────────│  File Storage   │─────────────┘
                        │  + OCR Engine   │
                        └─────────────────┘
```

## 📋 System Requirements

### Minimum Requirements

- **CPU**: 2 cores
- **RAM**: 2GB
- **Storage**: 10GB free space
- **OS**: Linux, macOS, or Windows with Docker

### Recommended for Production

- **CPU**: 4+ cores
- **RAM**: 4GB+
- **Storage**: 50GB+ SSD
- **Network**: Stable internet connection for OCR processing

## 🛠️ Manual Installation

For development or custom deployments without Docker:

### Prerequisites

Install these dependencies on your system:

```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y \
    tesseract-ocr tesseract-ocr-eng \
    libtesseract-dev libleptonica-dev \
    postgresql postgresql-contrib \
    pkg-config libclang-dev

# macOS (requires Homebrew)
brew install tesseract leptonica postgresql rust nodejs npm

# Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

### Backend Setup

1. **Configure Database**:
   ```bash
   # Create database and user
   sudo -u postgres psql
   CREATE DATABASE readur;
   CREATE USER readur_user WITH ENCRYPTED PASSWORD 'your_password';
   GRANT ALL PRIVILEGES ON DATABASE readur TO readur_user;
   \q
   ```

2. **Environment Configuration**:
   ```bash
   # Copy environment template
   cp .env.example .env

   # Edit configuration
   nano .env
   ```

   Required environment variables:
   ```env
   DATABASE_URL=postgresql://readur_user:your_password@localhost/readur
   JWT_SECRET=your-super-secret-jwt-key-change-this
   SERVER_ADDRESS=0.0.0.0:8000
   UPLOAD_PATH=./uploads
   WATCH_FOLDER=./watch
   ALLOWED_FILE_TYPES=pdf,png,jpg,jpeg,gif,bmp,tiff,txt,rtf,doc,docx
   ```

3. **Build and Run Backend**:
   ```bash
   # Build in release mode, then run the same release build
   cargo build --release
   cargo run --release
   ```

### Frontend Setup

1. **Install Dependencies**:
   ```bash
   cd frontend
   npm install
   ```

2. **Development Mode**:
   ```bash
   npm run dev
   # Frontend available at http://localhost:5173
   ```

3. **Production Build**:
   ```bash
   npm run build
   # Built files in frontend/dist/
   ```

## 📖 User Guide

### Getting Started

1. **First Login**: Use the default admin credentials to access the system
2. **Upload Documents**: Drag and drop files or use the upload button
3. **Wait for Processing**: OCR processing happens automatically in the background
4. **Search and Organize**: Use the powerful search features to find your documents

### Supported File Types

| Type | Extensions | OCR Support | Notes |
|------|-----------|-------------|-------|
| **PDF** | `.pdf` | ✅ | Text extraction + OCR for scanned pages |
| **Images** | `.png`, `.jpg`, `.jpeg`, `.tiff`, `.bmp`, `.gif` | ✅ | Full OCR text extraction |
| **Text** | `.txt`, `.rtf` | ❌ | Direct text indexing |
| **Office** | `.doc`, `.docx` | ⚠️ | Limited support |

### Using the Interface

#### Dashboard

- **Document Statistics**: Total documents, storage usage, OCR status
- **Recent Activity**: Latest uploads and processing status
- **Quick Actions**: Fast access to upload and search

#### Document Management

- **List/Grid View**: Toggle between different viewing modes
- **Sorting**: Sort by date, name, size, or file type
- **Filtering**: Filter by tags, file types, and OCR status
- **Bulk Actions**: Select multiple documents for batch operations

#### Advanced Search

- **Full-text Search**: Search within document content
- **Metadata Filters**: Filter by upload date, file size, type
- **Tag System**: Organize documents with custom tags
- **OCR Status**: Find processed vs. pending documents

#### Folder Watching

- **Non-destructive**: Unlike paperless-ngx, source files remain untouched
- **Automatic Processing**: New files are detected and processed automatically
- **Configurable**: Set custom watch directories

### Tips for Best Results

1. **OCR Quality**: Higher resolution images (300+ DPI) produce better OCR results
2. **File Organization**: Use consistent naming conventions for easier searching
3. **Regular Backups**: Back up both the database and file storage regularly
4. **Performance**: For large document collections, consider increasing server resources

## 🔧 Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `DATABASE_URL` | - | PostgreSQL connection string |
| `JWT_SECRET` | - | Secret key for JWT tokens (required) |
| `SERVER_ADDRESS` | `0.0.0.0:8000` | Server bind address |
| `UPLOAD_PATH` | `./uploads` | Document storage directory |
| `WATCH_FOLDER` | `./watch` | Folder monitoring directory |
| `ALLOWED_FILE_TYPES` | `pdf,png,jpg,jpeg,txt,doc,docx` | Allowed file extensions |

### Docker Configuration

Customize `docker-compose.yml` for your environment:

```yaml
services:
  readur:
    environment:
      - JWT_SECRET=change-this-secret-key
      - UPLOAD_PATH=/app/uploads
    volumes:
      - ./data/uploads:/app/uploads
      - ./data/watch:/app/watch
    ports:
      - "8000:8000"
```

### Database Tuning

For better search performance with large document collections:

```sql
-- Increase shared_buffers for better caching
ALTER SYSTEM SET shared_buffers = '256MB';

-- Optimize for full-text search
ALTER SYSTEM SET default_text_search_config = 'pg_catalog.english';

-- Restart PostgreSQL after changes
```

## 🔌 API Reference

### Authentication Endpoints

```bash
# Register new user
POST /api/auth/register
Content-Type: application/json

{
  "username": "john_doe",
  "email": "john@example.com",
  "password": "secure_password"
}

# Login
POST /api/auth/login
Content-Type: application/json

{
  "username": "john_doe",
  "password": "secure_password"
}

# Get current user
GET /api/auth/me
Authorization: Bearer <token>
```

### Document Management

```bash
# Upload document
POST /api/documents
Authorization: Bearer <token>
Content-Type: multipart/form-data

file: <file>

# List documents
GET /api/documents?limit=50&offset=0
Authorization: Bearer <token>

# Download document
GET /api/documents/{id}/download
Authorization: Bearer <token>
```

### Search

```bash
# Search documents
GET /api/search?query=contract&limit=20
Authorization: Bearer <token>

# Advanced search with filters
GET /api/search?query=invoice&mime_types=application/pdf&tags=important
Authorization: Bearer <token>
```

## 🧪 Testing

### Run All Tests

```bash
# Backend tests
cargo test

# Frontend tests
cd frontend && npm test

# Integration tests with Docker
docker compose -f docker-compose.test.yml up --build
```

### Test Coverage

```bash
# Install cargo-tarpaulin for coverage
cargo install cargo-tarpaulin

# Generate coverage report
cargo tarpaulin --out Html
```

## 🔒 Security Considerations

### Production Deployment

1. **Change Default Credentials**: Update admin password immediately
2. **Use Strong JWT Secret**: Generate a secure random key
3. **Enable HTTPS**: Use a reverse proxy with SSL/TLS
4. **Database Security**: Use strong passwords and restrict network access
5. **File Permissions**: Ensure proper file system permissions
6. **Regular Updates**: Keep dependencies and base images updated

### Recommended Production Setup

```bash
# Use environment-specific secrets
JWT_SECRET=$(openssl rand -base64 64)

# Restrict database access
# Only allow connections from application container

# Use read-only file system where possible
# Mount uploads and watch folders as separate volumes
```

## 🚀 Deployment Options

### Docker Swarm

```yaml
version: '3.8'
services:
  readur:
    image: readur:latest
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure
    networks:
      - readur-network
    secrets:
      - jwt_secret
      - db_password
```

### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: readur
spec:
  replicas: 3
  selector:
    matchLabels:
      app: readur
  template:
    metadata:
      labels:
        app: readur  # must match spec.selector.matchLabels
    spec:
      containers:
      - name: readur
        image: readur:latest
        env:
        - name: JWT_SECRET
          valueFrom:
            secretKeyRef:
              name: readur-secrets
              key: jwt-secret
```

### Cloud Platforms

- **AWS**: Use ECS with RDS PostgreSQL
- **Google Cloud**: Deploy to Cloud Run with Cloud SQL
- **Azure**: Use Container Instances with Azure Database
- **DigitalOcean**: App Platform with Managed Database

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Development Setup

```bash
# Fork and clone the repository
git clone https://github.com/yourusername/readur.git
cd readur

# Create a feature branch
git checkout -b feature/amazing-feature

# Make your changes and test
cargo test
cd frontend && npm test

# Submit a pull request
```

### Code Style

- **Rust**: Follow `rustfmt` and `clippy` recommendations
- **Frontend**: Use Prettier and ESLint configurations
- **Commits**: Use conventional commit format

## 📝 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) for text extraction
- [Axum](https://github.com/tokio-rs/axum) for the web framework
- [Material-UI](https://mui.com/) for the beautiful frontend components
- [PostgreSQL](https://www.postgresql.org/) for robust full-text search

## 📞 Support

- **Documentation**: Check this README and inline code comments
- **Issues**: Report bugs and request features on GitHub Issues
- **Discussions**: Join community discussions on GitHub Discussions

---

**Made with ❤️ and ☕ by the Readur team**