diff --git a/docs/images/gallery/image_2.png b/docs/images/gallery/image_2.png new file mode 100644 index 0000000..97ed026 Binary files /dev/null and b/docs/images/gallery/image_2.png differ diff --git a/docs/images/gallery/image_2.png:Zone.Identifier b/docs/images/gallery/image_2.png:Zone.Identifier new file mode 100644 index 0000000..a45e1ac --- /dev/null +++ b/docs/images/gallery/image_2.png:Zone.Identifier @@ -0,0 +1,2 @@ +[ZoneTransfer] +ZoneId=3 diff --git a/docs/images/gallery/image_3.png b/docs/images/gallery/image_3.png new file mode 100644 index 0000000..51bb0df Binary files /dev/null and b/docs/images/gallery/image_3.png differ diff --git a/docs/images/gallery/image_3.png:Zone.Identifier b/docs/images/gallery/image_3.png:Zone.Identifier new file mode 100644 index 0000000..a45e1ac --- /dev/null +++ b/docs/images/gallery/image_3.png:Zone.Identifier @@ -0,0 +1,2 @@ +[ZoneTransfer] +ZoneId=3 diff --git a/docs/images/gallery/image_4.png b/docs/images/gallery/image_4.png new file mode 100644 index 0000000..048aac8 Binary files /dev/null and b/docs/images/gallery/image_4.png differ diff --git a/docs/images/gallery/image_4.png:Zone.Identifier b/docs/images/gallery/image_4.png:Zone.Identifier new file mode 100644 index 0000000..a45e1ac --- /dev/null +++ b/docs/images/gallery/image_4.png:Zone.Identifier @@ -0,0 +1,2 @@ +[ZoneTransfer] +ZoneId=3 diff --git a/docs/images/gallery/image_5.png b/docs/images/gallery/image_5.png new file mode 100644 index 0000000..f902496 Binary files /dev/null and b/docs/images/gallery/image_5.png differ diff --git a/docs/images/gallery/image_5.png:Zone.Identifier b/docs/images/gallery/image_5.png:Zone.Identifier new file mode 100644 index 0000000..a45e1ac --- /dev/null +++ b/docs/images/gallery/image_5.png:Zone.Identifier @@ -0,0 +1,2 @@ +[ZoneTransfer] +ZoneId=3 diff --git a/docs/images/gallery/image_6.png b/docs/images/gallery/image_6.png new file mode 100644 index 0000000..535dde1 Binary files /dev/null and 
b/docs/images/gallery/image_6.png differ diff --git a/docs/images/gallery/image_6.png:Zone.Identifier b/docs/images/gallery/image_6.png:Zone.Identifier new file mode 100644 index 0000000..a45e1ac --- /dev/null +++ b/docs/images/gallery/image_6.png:Zone.Identifier @@ -0,0 +1,2 @@ +[ZoneTransfer] +ZoneId=3 diff --git a/docs/images/gallery/image_7.png b/docs/images/gallery/image_7.png new file mode 100644 index 0000000..5687851 Binary files /dev/null and b/docs/images/gallery/image_7.png differ diff --git a/docs/images/gallery/image_7.png:Zone.Identifier b/docs/images/gallery/image_7.png:Zone.Identifier new file mode 100644 index 0000000..a45e1ac --- /dev/null +++ b/docs/images/gallery/image_7.png:Zone.Identifier @@ -0,0 +1,2 @@ +[ZoneTransfer] +ZoneId=3 diff --git a/docs/per-user-watch-directories.md b/docs/per-user-watch-directories.md new file mode 100644 index 0000000..5a752eb --- /dev/null +++ b/docs/per-user-watch-directories.md @@ -0,0 +1,902 @@ +# Per-User Watch Directories Documentation + +## Table of Contents + +1. [Overview](#overview) +2. [Architecture and Components](#architecture-and-components) +3. [Prerequisites and Requirements](#prerequisites-and-requirements) +4. [Administrator Setup Guide](#administrator-setup-guide) +5. [User Guide](#user-guide) +6. [API Reference](#api-reference) +7. [Configuration Reference](#configuration-reference) +8. [Security Considerations](#security-considerations) +9. [Troubleshooting](#troubleshooting) +10. [Examples and Best Practices](#examples-and-best-practices) + +## Overview + +The Per-User Watch Directories feature in Readur allows each user to have their own dedicated folder for automatic document ingestion. When enabled, documents placed in a user's watch directory are automatically processed, OCR'd, and associated with that specific user's account. 
+ +### Key Benefits + +- **User Isolation**: Each user's documents remain private and separate +- **Automatic Attribution**: Documents are automatically assigned to the correct user +- **Simplified Workflow**: Users can drop files into their folder without manual upload +- **Batch Processing**: Process multiple documents simultaneously +- **Integration Support**: Works with network shares, sync tools, and automated workflows + +### How It Works + +1. Administrator enables per-user watch directories in configuration +2. System creates a dedicated folder for each user (e.g., `/data/user_watch/username/`) +3. Users place documents in their watch folder +4. Readur's file watcher detects new files +5. Documents are automatically ingested and associated with the user +6. OCR processing extracts text for searching +7. Documents appear in the user's library + +## Architecture and Components + +### System Components + +1. **UserWatchService** (`src/services/user_watch_service.rs`) + - Manages user-specific watch directories + - Handles directory creation, validation, and cleanup + - Provides secure path operations + +2. **UserWatchManager** (`src/scheduling/user_watch_manager.rs`) + - Coordinates between file watcher and user management + - Maps file paths to users + - Manages user cache for performance + +3. **File Watcher** (`src/scheduling/watcher.rs`) + - Monitors both global and per-user directories + - Determines file ownership based on directory location + - Triggers document ingestion pipeline + +4. 
**API Endpoints** (`src/routes/users.rs`) + - REST API for managing user watch directories + - Provides status, creation, and deletion operations + +### Directory Structure + +``` +user_watch_base_dir/ # Base directory (configurable) +├── alice/ # User alice's watch directory +│ ├── document1.pdf +│ └── report.docx +├── bob/ # User bob's watch directory +│ └── invoice.pdf +└── charlie/ # User charlie's watch directory + ├── presentation.pptx + └── notes.txt +``` + +## Prerequisites and Requirements + +### System Requirements + +- **Operating System**: Linux, macOS, or Windows with proper file permissions +- **Storage**: Sufficient disk space for user directories and documents +- **File System**: Support for directory permissions (recommended: ext4, NTFS, APFS) +- **Readur Version**: 2.5.4 or later + +### Software Requirements + +- PostgreSQL database +- Readur server with file watching enabled +- Proper file system permissions for the Readur process + +### Network Requirements (Optional) + +- Network file system support (NFS, SMB/CIFS) for remote directories +- Stable network connection for remote file access + +## Administrator Setup Guide + +### Step 1: Enable Per-User Watch Directories + +Edit your `.env` file or set environment variables: + +```bash +# Enable the feature +ENABLE_PER_USER_WATCH=true + +# Set the base directory for user watch folders +USER_WATCH_BASE_DIR=/data/user_watch + +# Configure watch interval (optional, default: 60 seconds) +WATCH_INTERVAL_SECONDS=30 + +# Set file stability check (optional, default: 2000ms) +FILE_STABILITY_CHECK_MS=3000 + +# Set maximum file age to process (optional, default: 24 hours) +MAX_FILE_AGE_HOURS=48 +``` + +### Step 2: Create Base Directory + +Ensure the base directory exists with proper permissions: + +```bash +# Create the base directory +sudo mkdir -p /data/user_watch + +# Set ownership to the user running Readur +sudo chown readur:readur /data/user_watch + +# Set permissions (owner: read/write/execute, group: 
read/execute) +sudo chmod 755 /data/user_watch +``` + +### Step 3: Configure Directory Permissions + +For production environments, configure appropriate permissions: + +```bash +# Option 1: Shared group access +sudo groupadd readur-users +sudo usermod -a -G readur-users readur +sudo chgrp -R readur-users /data/user_watch +sudo chmod -R 2775 /data/user_watch # SGID bit ensures new files inherit group + +# Option 2: ACL-based permissions (more granular) +sudo setfacl -R -m u:readur:rwx /data/user_watch +sudo setfacl -R -d -m u:readur:rwx /data/user_watch +``` + +### Step 4: Network Share Setup (Optional) + +To allow users to access their watch directories via network shares: + +#### SMB/CIFS Share Configuration + +```ini +# /etc/samba/smb.conf +[readur-watch] + path = /data/user_watch + valid users = @readur-users + writable = yes + browseable = yes + create mask = 0660 + directory mask = 0770 + force group = readur-users +``` + +#### NFS Export Configuration + +```bash +# /etc/exports +/data/user_watch *(rw,sync,no_subtree_check,no_root_squash) +``` + +### Step 5: Restart Readur + +After configuration, restart the Readur service: + +```bash +# Systemd +sudo systemctl restart readur + +# Docker (recreates the container; a plain restart does not pick up changed environment variables) +docker-compose up -d readur + +# Direct execution +# Stop the current process and start with new configuration +``` + +### Step 6: Verify Configuration + +Check the Readur logs to confirm per-user watch is enabled: + +```bash +# Check logs for confirmation +grep -E "Per-user watch enabled|User watch base directory" /var/log/readur/readur.log + +# Expected output: +# ✅ Per-user watch enabled: true +# 📂 User watch base directory: /data/user_watch +``` + +## User Guide + +### Accessing Your Watch Directory + +#### Method 1: Direct File System Access + +If you have direct access to the server: + +```bash +# Navigate to your watch directory +cd /data/user_watch/your-username/ + +# Copy files +cp ~/Documents/*.pdf /data/user_watch/your-username/ + +# Move files +mv ~/Downloads/report.docx 
/data/user_watch/your-username/ +``` + +#### Method 2: Network Share Access + +Access via SMB/CIFS on Windows: + +1. Open File Explorer +2. Type in address bar: `\\server-name\readur-watch\your-username` +3. Drag and drop files into your folder + +Access via SMB/CIFS on macOS: + +1. Open Finder +2. Press Cmd+K +3. Enter: `smb://server-name/readur-watch/your-username` +4. Drag and drop files into your folder + +#### Method 3: Sync Tools + +Use synchronization tools for automatic uploads: + +```bash +# Using rsync +rsync -avz ~/Documents/*.pdf server:/data/user_watch/your-username/ + +# Using rclone +rclone copy ~/Documents server:user_watch/your-username/ + +# Using Syncthing (configure folder sync) +# Add /data/user_watch/your-username as a sync folder +``` + +### Managing Your Watch Directory via Web Interface + +1. **Check Directory Status** + - Navigate to Settings → Watch Folder + - View your watch directory path and status + - See if directory exists and is enabled + +2. **Create Your Directory** + - Click "Create Watch Directory" button + - System will create your personal folder + - Confirmation message will appear + +3. **View Directory Path** + - Your directory path is displayed + - Copy path for reference + - Share with IT for network access setup + +### Supported File Types + +Place any of these file types in your watch directory: + +- **Documents**: PDF, TXT, DOC, DOCX, ODT, RTF +- **Images**: PNG, JPG, JPEG, TIFF, BMP +- **Presentations**: PPT, PPTX, ODP +- **Spreadsheets**: XLS, XLSX, ODS + +### File Processing Workflow + +1. **File Detection**: System checks for new files every 30-60 seconds +2. **Stability Check**: Waits for file to stop changing (2-3 seconds) +3. **Validation**: Verifies file type and size +4. **Ingestion**: Creates document record in database +5. **OCR Queue**: Adds to processing queue +6. **Text Extraction**: OCR processes the document +7. **Search Index**: Document becomes searchable + +### Best Practices for Users + +1. 
**File Naming**: Use descriptive names for easier identification +2. **File Size**: Keep files under 50MB for optimal processing +3. **Batch Upload**: Can upload multiple files simultaneously +4. **Organization**: Create subfolders within your watch directory +5. **Patience**: Allow 1-5 minutes for processing depending on file size + +## API Reference + +### Get User Watch Directory Information + +Retrieve information about a user's watch directory. + +**Endpoint**: `GET /api/users/{user_id}/watch-directory` + +**Headers**: +```http +Authorization: Bearer {jwt_token} +``` + +**Response** (200 OK): +```json +{ + "user_id": "550e8400-e29b-41d4-a716-446655440000", + "username": "alice", + "watch_directory_path": "/data/user_watch/alice", + "exists": true, + "enabled": true +} +``` + +**Error Responses**: +- `401 Unauthorized`: Missing or invalid authentication +- `403 Forbidden`: Insufficient permissions +- `404 Not Found`: User not found +- `500 Internal Server Error`: Per-user watch disabled + +### Create User Watch Directory + +Create or ensure a user's watch directory exists. + +**Endpoint**: `POST /api/users/{user_id}/watch-directory` + +**Headers**: +```http +Authorization: Bearer {jwt_token} +Content-Type: application/json +``` + +**Request Body**: +```json +{ + "ensure_created": true +} +``` + +**Response** (200 OK): +```json +{ + "success": true, + "message": "Watch directory ready for user 'alice'", + "watch_directory_path": "/data/user_watch/alice" +} +``` + +**Error Responses**: +- `401 Unauthorized`: Missing or invalid authentication +- `403 Forbidden`: Insufficient permissions +- `404 Not Found`: User not found +- `500 Internal Server Error`: Creation failed or feature disabled + +### Delete User Watch Directory + +Remove a user's watch directory and its contents. + +**Endpoint**: `DELETE /api/users/{user_id}/watch-directory` + +**Headers**: +```http +Authorization: Bearer {jwt_token} +``` + +**Note**: Only administrators can delete watch directories. 
+ +**Response** (200 OK): +```json +{ + "success": true, + "message": "Watch directory removed for user 'alice'", + "watch_directory_path": null +} +``` + +**Error Responses**: +- `401 Unauthorized`: Missing or invalid authentication +- `403 Forbidden`: Admin access required +- `404 Not Found`: User not found +- `500 Internal Server Error`: Deletion failed + +### API Usage Examples + +#### Python Example + +```python +import requests + +# Configuration +base_url = "https://readur.example.com/api" +token = "your-jwt-token" +user_id = "550e8400-e29b-41d4-a716-446655440000" + +headers = { + "Authorization": f"Bearer {token}", + "Content-Type": "application/json" +} + +# Get watch directory info +response = requests.get( + f"{base_url}/users/{user_id}/watch-directory", + headers=headers +) +info = response.json() +print(f"Watch directory: {info['watch_directory_path']}") +print(f"Exists: {info['exists']}") + +# Create watch directory +response = requests.post( + f"{base_url}/users/{user_id}/watch-directory", + headers=headers, + json={"ensure_created": True} +) +result = response.json() +if result['success']: + print(f"Created: {result['watch_directory_path']}") +``` + +#### JavaScript/TypeScript Example + +```typescript +// Using the provided API service +import { userWatchService } from './services/api'; + +// Get watch directory information +const getWatchInfo = async (userId: string) => { + try { + const response = await userWatchService.getUserWatchDirectory(userId); + console.log('Watch directory:', response.data.watch_directory_path); + console.log('Exists:', response.data.exists); + return response.data; + } catch (error) { + console.error('Failed to get watch directory info:', error); + } +}; + +// Create watch directory +const createWatchDirectory = async (userId: string) => { + try { + const response = await userWatchService.createUserWatchDirectory(userId); + if (response.data.success) { + console.log('Created:', response.data.watch_directory_path); + } + 
return response.data; + } catch (error) { + console.error('Failed to create watch directory:', error); + } +}; +``` + +#### cURL Examples + +```bash +# Get watch directory information +curl -X GET "https://readur.example.com/api/users/${USER_ID}/watch-directory" \ + -H "Authorization: Bearer ${TOKEN}" + +# Create watch directory +curl -X POST "https://readur.example.com/api/users/${USER_ID}/watch-directory" \ + -H "Authorization: Bearer ${TOKEN}" \ + -H "Content-Type: application/json" \ + -d '{"ensure_created": true}' + +# Delete watch directory (admin only) +curl -X DELETE "https://readur.example.com/api/users/${USER_ID}/watch-directory" \ + -H "Authorization: Bearer ${TOKEN}" +``` + +## Configuration Reference + +### Environment Variables + +| Variable | Type | Default | Description | +|----------|------|---------|-------------| +| `ENABLE_PER_USER_WATCH` | Boolean | `false` | Enable/disable per-user watch directories | +| `USER_WATCH_BASE_DIR` | String | `./user_watch` | Base directory for all user watch folders | +| `WATCH_INTERVAL_SECONDS` | Integer | `60` | How often to scan for new files (seconds) | +| `FILE_STABILITY_CHECK_MS` | Integer | `2000` | Time to wait for file size stability (milliseconds) | +| `MAX_FILE_AGE_HOURS` | Integer | `24` | Maximum age of files to process (hours) | + +### Configuration Validation + +The system performs several validation checks: + +1. **Path Validation**: Ensures paths are distinct and non-overlapping +2. 
**Directory Conflicts**: Prevents USER_WATCH_BASE_DIR from being: + - The same as UPLOAD_PATH + - The same as WATCH_FOLDER + - Inside UPLOAD_PATH + - Containing UPLOAD_PATH + +### Docker Configuration + +When using Docker, mount the user watch directory: + +```yaml +version: '3.8' + +services: + readur: + image: readur:latest + environment: + - ENABLE_PER_USER_WATCH=true + - USER_WATCH_BASE_DIR=/app/user_watch + - WATCH_INTERVAL_SECONDS=30 + volumes: + - ./user_watch:/app/user_watch + - ./uploads:/app/uploads + - ./watch:/app/watch + ports: + - "8000:8000" +``` + +### Kubernetes Configuration + +For Kubernetes deployments: + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: readur-config +data: + ENABLE_PER_USER_WATCH: "true" + USER_WATCH_BASE_DIR: "/data/user_watch" + WATCH_INTERVAL_SECONDS: "30" +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: readur +spec: + template: + spec: + containers: + - name: readur + image: readur:latest + envFrom: + - configMapRef: + name: readur-config + volumeMounts: + - name: user-watch + mountPath: /data/user_watch + volumes: + - name: user-watch + persistentVolumeClaim: + claimName: readur-user-watch-pvc +``` + +## Security Considerations + +### Username Validation + +The system enforces strict username validation to prevent security issues: + +- **Length**: 1-64 characters +- **Allowed Characters**: Alphanumeric, underscore (_), dash (-) +- **Prohibited Patterns**: + - Path traversal attempts (.., /) + - Hidden directories (starting with .) + - Null bytes or special characters + +### Directory Permissions + +1. **User Isolation**: Each user's directory is separate +2. **Permission Model**: 755 (owner: rwx, group: r-x, others: r-x) +3. **Ownership**: Readur process owns all directories +4. 
**SGID Bit**: Optional for group inheritance + +### Path Security + +- **Canonicalization**: All paths are canonicalized to prevent traversal +- **Boundary Checking**: Files must be within designated directories +- **Validation**: Extracted usernames are validated before use + +### Access Control + +- **API Protection**: JWT authentication required +- **Permission Levels**: + - Users: Can only access their own directory + - Admins: Can manage all directories +- **Directory Creation**: Users can create their own, admins can create any +- **Directory Deletion**: Admin-only operation + +### Audit Considerations + +1. **Logging**: All directory operations are logged +2. **File Attribution**: Documents tracked to source user +3. **Access Tracking**: API access logged with user context + +## Troubleshooting + +### Common Issues and Solutions + +#### Issue: Per-user watch directories not working + +**Symptoms**: Files in user directories are not processed + +**Solutions**: +1. Verify feature is enabled: + ```bash + grep ENABLE_PER_USER_WATCH .env + # Should show: ENABLE_PER_USER_WATCH=true + ``` + +2. Check base directory exists and has correct permissions: + ```bash + ls -la /data/user_watch + # Should show readur as owner with 755 permissions + ``` + +3. Review logs for errors: + ```bash + grep -i "user watch" /var/log/readur/readur.log + ``` + +#### Issue: "User watch service not initialized" error + +**Symptoms**: API returns 500 error when accessing watch directories + +**Solutions**: +1. Ensure ENABLE_PER_USER_WATCH=true in configuration +2. Restart Readur service +3. Check initialization logs for errors + +#### Issue: Files not being detected + +**Symptoms**: Files placed in watch directory are not processed + +**Solutions**: +1. Check file permissions: + ```bash + ls -la /data/user_watch/username/ + # Files should be readable by readur user + ``` + +2. 
Verify file type is supported: + ```bash + echo $ALLOWED_FILE_TYPES + # Ensure your file extension is included + ``` + +3. Check file age restriction: + ```bash + # Files older than MAX_FILE_AGE_HOURS are ignored + find /data/user_watch -type f -mtime +1 + ``` + +#### Issue: Permission denied errors + +**Symptoms**: Users cannot write to their watch directories + +**Solutions**: +1. Fix directory ownership: + ```bash + sudo chown -R readur:readur /data/user_watch + ``` + +2. Set correct permissions: + ```bash + sudo chmod -R 755 /data/user_watch + ``` + +3. For shared access, use group permissions: + ```bash + sudo chmod -R 775 /data/user_watch + sudo chgrp -R readur-users /data/user_watch + ``` + +#### Issue: Duplicate documents created + +**Symptoms**: Same file creates multiple documents + +**Solutions**: +1. Ensure file stability check is adequate: + ```bash + # Increase if files are still being written + FILE_STABILITY_CHECK_MS=5000 + ``` + +2. Check for file system issues (timestamps, inode changes) +3. 
Review deduplication settings in configuration + +### Diagnostic Commands + +```bash +# Check if user watch is enabled +curl -H "Authorization: Bearer $TOKEN" \ + https://readur.example.com/api/users/$USER_ID/watch-directory + +# List all user directories +ls -la /data/user_watch/ + +# Check file watcher logs +journalctl -u readur | grep -i "watch" + +# Monitor file processing in real-time +tail -f /var/log/readur/readur.log | grep -E "(Processing new file|watch)" + +# Check directory permissions +namei -l /data/user_watch/username/ + +# Find recently modified files +find /data/user_watch -type f -mmin -60 + +# Check disk space +df -h /data/user_watch +``` + +## Examples and Best Practices + +### Example 1: Small Team Setup + +For a team of 5-10 users with local file access: + +```bash +# .env configuration +ENABLE_PER_USER_WATCH=true +USER_WATCH_BASE_DIR=/srv/readur/user_watches +WATCH_INTERVAL_SECONDS=60 +FILE_STABILITY_CHECK_MS=2000 +MAX_FILE_AGE_HOURS=72 + +# Directory structure +/srv/readur/user_watches/ +├── alice/ +├── bob/ +├── charlie/ +├── diana/ +└── edward/ +``` + +### Example 2: Enterprise Network Share Integration + +For larger organizations with network shares: + +```bash +# Mount network share +sudo mount -t cifs //fileserver/readur /mnt/readur \ + -o username=readur,domain=COMPANY + +# .env configuration +ENABLE_PER_USER_WATCH=true +USER_WATCH_BASE_DIR=/mnt/readur/user_watches +WATCH_INTERVAL_SECONDS=120 # Slower for network +FILE_STABILITY_CHECK_MS=5000 # Higher for network delays +``` + +### Example 3: Automated Document Workflow + +Script for automatic document routing: + +```python +#!/usr/bin/env python3 +""" +Auto-route documents to user watch directories based on metadata +""" +import os +import shutil +from pathlib import Path + +def route_document(file_path, user_mapping): + """Route document to appropriate user watch directory""" + + # Extract metadata (example: from filename) + filename = os.path.basename(file_path) + + # Determine 
target user (implement your logic) + if "invoice" in filename.lower(): + target_user = "accounting" + elif "report" in filename.lower(): + target_user = "management" + else: + target_user = "general" + + # Move to user's watch directory + user_watch_dir = Path(f"/data/user_watch/{target_user}") + if user_watch_dir.exists(): + dest = user_watch_dir / filename + shutil.move(file_path, dest) + print(f"Moved {filename} to {target_user}'s watch directory") + else: + print(f"User {target_user} watch directory does not exist") + +# Monitor incoming directory +incoming_dir = Path("/srv/incoming") +for file_path in incoming_dir.glob("*.pdf"): + route_document(file_path, user_mapping={}) +``` + +### Example 4: Bulk User Setup + +PowerShell script for creating multiple user directories: + +```powershell +# bulk-create-watch-dirs.ps1 +$baseUrl = "https://readur.example.com/api" +$adminToken = "your-admin-token" + +$users = @("alice", "bob", "charlie", "diana", "edward") + +foreach ($username in $users) { + # Get user ID + $userResponse = Invoke-RestMethod ` + -Uri "$baseUrl/users" ` + -Headers @{Authorization="Bearer $adminToken"} + + $user = $userResponse | Where-Object {$_.username -eq $username} + + if ($user) { + # Create watch directory + $body = @{ensure_created=$true} | ConvertTo-Json + + $result = Invoke-RestMethod ` + -Method Post ` + -Uri "$baseUrl/users/$($user.id)/watch-directory" ` + -Headers @{ + Authorization="Bearer $adminToken" + "Content-Type"="application/json" + } ` + -Body $body + + Write-Host "Created watch directory for $username at $($result.watch_directory_path)" + } +} +``` + +### Best Practices Summary + +#### For Administrators + +1. **Capacity Planning**: Allocate 1-5GB per user for watch directories +2. **Backup Strategy**: Include user watch directories in backup plans +3. **Monitoring**: Set up alerts for disk space and processing failures +4. **Documentation**: Maintain user guide with network paths +5. 
**Testing**: Test with various file types and sizes before deployment + +#### For Users + +1. **File Organization**: Use meaningful filenames and folder structure +2. **File Formats**: Prefer PDF for best OCR results +3. **Batch Processing**: Group related documents for upload +4. **Size Limits**: Split large documents if over 50MB +5. **Patience**: Allow processing time before expecting search results + +#### For Developers + +1. **API Integration**: Use provided client libraries when available +2. **Error Handling**: Implement retry logic for transient failures +3. **Validation**: Validate file types before placing in watch directories +4. **Monitoring**: Track processing status via WebSocket updates +5. **Caching**: Cache user directory paths to reduce API calls + +### Performance Optimization + +1. **File System**: Use SSD storage for watch directories +2. **Network**: Minimize latency for network-mounted directories +3. **Scheduling**: Adjust watch interval based on usage patterns +4. **Concurrency**: Configure OCR workers based on CPU cores +5. **Cleanup**: Implement retention policies for processed files + +### Migration from Global Watch Directory + +To migrate from a single global watch directory to per-user directories: + +1. **Preparation**: + ```bash + # Backup existing watch directory + tar -czf watch_backup.tar.gz /data/watch/ + ``` + +2. **Enable Feature**: + ```bash + # Update configuration + ENABLE_PER_USER_WATCH=true + USER_WATCH_BASE_DIR=/data/user_watch + ``` + +3. **Create User Directories**: + ```bash + # Script to create directories for existing users + for user in $(psql -d readur -c "SELECT username FROM users" -t); do + mkdir -p "/data/user_watch/$user" + chown readur:readur "/data/user_watch/$user" + done + ``` + +4. **Migrate Documents** (optional): + - Keep existing documents in place + - Or reassign to appropriate users through the UI + +5. 
**Update Documentation**: + - Notify users of new directory locations + - Update any automation scripts + - Revise backup procedures + +This completes the comprehensive documentation for the Per-User Watch Directories feature in Readur. \ No newline at end of file