Readur
A Rust-based document management system similar to paperless-ngx, featuring OCR, full-text search, and a modern React frontend.
Features
- Authentication: JWT-based user authentication
- File Upload: Support for PDF, text, image, and Office documents
- OCR Processing: Automatic text extraction using Tesseract
- Full-text Search: PostgreSQL-powered search with ranking
- Folder Monitoring: Automatic processing of files in a watch folder
- Web Interface: Modern React frontend with drag-and-drop uploads
- Docker Support: Complete containerization with multi-stage builds
Quick Start
Using Docker Compose (Recommended)
- Clone the repository:
git clone <repository-url>
cd readur
- Start the services:
docker-compose up -d
- Access the application at
http://localhost:8000
Manual Setup
Prerequisites
- Rust 1.75+
- PostgreSQL 15+
- Tesseract OCR
- Node.js 18+
Backend Setup
- Install system dependencies:
# Ubuntu/Debian
sudo apt-get install tesseract-ocr tesseract-ocr-eng libtesseract-dev libleptonica-dev
# macOS
brew install tesseract leptonica
- Set up environment variables:
cp .env.example .env
# Edit .env with your database URL and other settings
- Run database migrations:
cargo run
- Start the backend:
cargo run
Frontend Setup
- Navigate to frontend directory:
cd frontend
- Install dependencies:
npm install
- Start development server:
npm run dev
- Build for production:
npm run build
API Endpoints
Authentication
- POST /api/auth/register - User registration
- POST /api/auth/login - User login
- GET /api/auth/me - Get current user
Documents
- POST /api/documents - Upload document
- GET /api/documents - List user documents
- GET /api/documents/:id/download - Download document
Search
- GET /api/search?query=text - Search documents
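Search terms containing spaces or non-ASCII characters need to be percent-encoded before they go into the query parameter. The following is a minimal client-side sketch using only the standard library; the helper names `encode_query_value` and `search_url` are hypothetical and not part of Readur, and the base URL is whatever address the server is reachable at.

```rust
/// Percent-encode a string for use as a URL query value.
/// RFC 3986 "unreserved" characters are kept; every other byte becomes %XX.
pub fn encode_query_value(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the search URL for a given server base address (hypothetical helper).
pub fn search_url(base: &str, query: &str) -> String {
    format!(
        "{}/api/search?query={}",
        base.trim_end_matches('/'),
        encode_query_value(query)
    )
}
```

For example, `search_url("http://localhost:8000", "tax invoice")` yields `http://localhost:8000/api/search?query=tax%20invoice`.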
Configuration
Environment variables:
- DATABASE_URL - PostgreSQL connection string
- JWT_SECRET - Secret key for JWT tokens
- SERVER_ADDRESS - Server bind address (default: 0.0.0.0:8000)
- UPLOAD_PATH - Directory for uploaded files (default: ./uploads)
- WATCH_FOLDER - Directory to monitor for new files (default: ./watch)
- ALLOWED_FILE_TYPES - Comma-separated list of allowed extensions
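As an illustration of how the variables and defaults listed above might be read at startup, here is a rough sketch using only the standard library. The `Config` struct, its field names, and the functions are hypothetical, not Readur's actual code. Treating `DATABASE_URL` and `JWT_SECRET` as required is an assumption based on no default being documented for them.

```rust
/// Settings named in the configuration list. Struct and field names are
/// illustrative only, not taken from the Readur source.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub database_url: String,
    pub jwt_secret: String,
    pub server_address: String,
    pub upload_path: String,
    pub watch_folder: String,
    pub allowed_file_types: Vec<String>,
}

/// Build a Config from any key/value lookup, so the logic can be exercised
/// without touching the real process environment.
pub fn config_from(lookup: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
    // Assumption: variables without a documented default must be set.
    let required = |key: &str| lookup(key).ok_or_else(|| format!("{} must be set", key));
    let or_default =
        |key: &str, default: &str| lookup(key).unwrap_or_else(|| default.to_string());

    // Split the comma-separated extension list, ignoring blanks and case.
    let raw_types = lookup("ALLOWED_FILE_TYPES").unwrap_or_default();
    let allowed_file_types = raw_types
        .split(',')
        .map(|ext| ext.trim().to_lowercase())
        .filter(|ext| !ext.is_empty())
        .collect();

    Ok(Config {
        database_url: required("DATABASE_URL")?,
        jwt_secret: required("JWT_SECRET")?,
        server_address: or_default("SERVER_ADDRESS", "0.0.0.0:8000"),
        upload_path: or_default("UPLOAD_PATH", "./uploads"),
        watch_folder: or_default("WATCH_FOLDER", "./watch"),
        allowed_file_types,
    })
}

/// Read the same settings from the process environment.
pub fn config_from_env() -> Result<Config, String> {
    config_from(|key| std::env::var(key).ok())
}
```

Passing the lookup as a closure keeps the defaulting logic separate from `std::env`, which makes it straightforward to test with fixed values.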
File Processing
The system supports:
- PDFs: Text extraction and OCR for scanned documents
- Images: OCR text extraction (PNG, JPG, JPEG, TIFF, BMP)
- Text files: Direct content indexing
- Office documents: DOC, DOCX support
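The mapping from file type to processing step described above can be sketched as a lookup on the file extension. This is only an illustration: the `Processing` enum and `classify` function are hypothetical names, and treating `.txt` as the text-file extension is an assumption, since the list does not name one.

```rust
use std::path::Path;

/// Processing categories from the list above (names are illustrative).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Processing {
    /// PDFs: text extraction, with OCR for scanned documents.
    PdfTextOrOcr,
    /// Images: OCR text extraction.
    ImageOcr,
    /// Text files: direct content indexing.
    PlainText,
    /// Office documents: DOC and DOCX.
    OfficeDocument,
}

/// Pick a processing category from the file extension, case-insensitively.
/// Returns None for files without a recognized extension.
pub fn classify(path: &Path) -> Option<Processing> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "pdf" => Some(Processing::PdfTextOrOcr),
        "png" | "jpg" | "jpeg" | "tiff" | "bmp" => Some(Processing::ImageOcr),
        // Assumption: plain text files use the .txt extension.
        "txt" => Some(Processing::PlainText),
        "doc" | "docx" => Some(Processing::OfficeDocument),
        _ => None,
    }
}
```

A real pipeline would likely also inspect file contents, since an extension alone cannot tell a scanned PDF from one with embedded text.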
Testing
Backend Tests
cargo test
Frontend Tests
cd frontend
npm test
Development
Project Structure
readur/
  src/                  # Rust backend source
    auth.rs             # Authentication logic
    db.rs               # Database operations
    models.rs           # Data models
    ocr.rs              # OCR processing
    routes/             # API routes
    tests/              # Unit tests
  frontend/             # React frontend
    src/
      components/       # React components
      contexts/         # React contexts
      services/         # API services
    package.json
  Dockerfile            # Multi-stage Docker build
  docker-compose.yml    # Development environment
Adding New Features
- Backend changes go in src/
- Frontend changes go in frontend/src/
- Add tests for new functionality
- Update API documentation
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
License
This project is licensed under the MIT License.