feat(docs): update documentation for quite a few things

perf3ct 2025-07-11 19:55:18 +00:00
parent 13d868ab36
commit 98b116c68c
6 changed files with 2310 additions and 43 deletions


@@ -7,11 +7,16 @@ A powerful, modern document management system built with Rust and React. Readur
## ✨ Features
- 🔐 **Secure Authentication**: JWT-based user authentication with bcrypt password hashing
- 🔐 **Secure Authentication**: JWT-based user authentication with bcrypt password hashing + OIDC/SSO support
- 👥 **User Management**: Role-based access control with Admin and User roles
- 📤 **Smart File Upload**: Drag-and-drop support for PDF, images, text files, and Office documents
- 🔍 **Advanced OCR**: Automatic text extraction using Tesseract for searchable document content
- 🔎 **Powerful Search**: PostgreSQL full-text search with advanced filtering and ranking
- 👁️ **Folder Monitoring**: Non-destructive file watching (unlike paperless-ngx, doesn't consume source files)
- 🔎 **Powerful Search**: PostgreSQL full-text search with multiple modes (simple, phrase, fuzzy, boolean)
- 🔗 **Multi-Source Sync**: WebDAV, Local Folders, and S3-compatible storage integration
- 🏷️ **Labels & Organization**: Comprehensive tagging system with color-coding and hierarchical structure
- 👁️ **Folder Monitoring**: Non-destructive file watching with intelligent sync scheduling
- 📊 **Health Monitoring**: Proactive source validation and system health tracking
- 🔔 **Notifications**: Real-time alerts for sync events, OCR completion, and system status
- 🎨 **Modern UI**: Beautiful React frontend with Material-UI components and responsive design
- 🐳 **Docker Ready**: Complete containerization with production-ready multi-stage builds
- ⚡ **High Performance**: Rust backend for speed and reliability
@@ -44,6 +49,13 @@ open http://localhost:8000
- [🔧 Configuration](docs/configuration.md) - Environment variables and settings
- [📖 User Guide](docs/user-guide.md) - How to use Readur effectively
### Core Features
- [🔗 Sources Guide](docs/sources-guide.md) - WebDAV, Local Folders, and S3 integration
- [👥 User Management](docs/user-management-guide.md) - Authentication, roles, and administration
- [🏷️ Labels & Organization](docs/labels-and-organization.md) - Document tagging and categorization
- [🔎 Advanced Search](docs/advanced-search.md) - Search modes, syntax, and optimization
- [🔐 OIDC Setup](docs/oidc-setup.md) - Single Sign-On integration
### Deployment & Operations
- [🚀 Deployment Guide](docs/deployment.md) - Production deployment, SSL, monitoring
- [🔄 Reverse Proxy Setup](docs/REVERSE_PROXY.md) - Nginx, Traefik, and more

docs/advanced-search.md Normal file

@@ -0,0 +1,687 @@
# Advanced Search Guide
Readur's search goes well beyond simple text matching. This guide covers the search modes, query syntax, filtering options, and optimization techniques.
## Table of Contents
- [Overview](#overview)
- [Search Modes](#search-modes)
- [Query Syntax](#query-syntax)
- [Advanced Filtering](#advanced-filtering)
- [Search Interface](#search-interface)
- [Search Optimization](#search-optimization)
- [Saved Searches](#saved-searches)
- [Search Analytics](#search-analytics)
- [API Search](#api-search)
- [Troubleshooting](#troubleshooting)
## Overview
Readur's search system is built on PostgreSQL's full-text search capabilities with additional enhancements for document-specific requirements.
### Search Capabilities
- **Full-Text Search**: Search within document content and OCR-extracted text
- **Multiple Search Modes**: Simple, phrase, fuzzy, and boolean search options
- **Advanced Filtering**: Filter by file type, date, size, labels, and source
- **Real-Time Suggestions**: Auto-complete and query suggestions as you type
- **Faceted Search**: Browse documents by categories and properties
- **Cross-Language Support**: Search document and OCR text in multiple languages
- **Relevance Ranking**: Intelligent scoring and result ordering
### Search Sources
Readur searches across multiple content sources:
1. **Document Content**: Original text from text files and PDFs
2. **OCR Text**: Extracted text from images and scanned documents
3. **Metadata**: File names, descriptions, and document properties
4. **Labels**: User-created and system-generated tags
5. **Source Information**: Upload source and file paths
## Search Modes
### Simple Search (Smart Search)
**Best for**: General purpose searching and quick document discovery
**How it works**:
- Automatically applies stemming and fuzzy matching
- Searches across all text content and metadata
- Provides intelligent relevance scoring
- Handles common typos and variations
**Example**:
```
invoice 2024
```
Finds: "Invoice Q1 2024", "invoicing for 2024", "2024 invoice data"
**Features**:
- **Auto-stemming**: "running" matches "run" and "runs"
- **Fuzzy tolerance**: "recieve" matches "receive"
- **Partial matching**: "doc" matches "document", "documentation"
- **Relevance ranking**: More relevant matches appear first
### Phrase Search (Exact Match)
**Best for**: Finding exact phrases or specific terminology
**How it works**:
- Searches for the exact sequence of words
- Case-insensitive but order-sensitive
- Useful for finding specific quotes, names, or technical terms
**Syntax**: Use quotes around the phrase
```
"quarterly financial report"
"John Smith"
"error code 404"
```
**Features**:
- **Exact word order**: Only matches the precise sequence
- **Case insensitive**: "John Smith" matches "john smith"
- **Punctuation ignored**: "error-code" matches "error code"
### Fuzzy Search (Approximate Matching)
**Best for**: Handling typos, OCR errors, and spelling variations
**How it works**:
- Uses trigram similarity to find approximate matches
- Configurable similarity threshold (default: 0.8)
- Particularly useful for OCR-processed documents with errors
**Syntax**: Use the `~` operator
```
invoice~ # Finds "invoice", "invoce", "invoise"
contract~ # Finds "contract", "contarct", "conract"
```
**Configuration**:
- **Threshold adjustment**: Configure sensitivity via user settings
- **Language-specific**: Different languages may need different thresholds
- **OCR optimization**: Higher tolerance for OCR-processed documents
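
To make the similarity threshold more concrete, here is a minimal sketch of trigram similarity in plain Python. It assumes the scoring resembles PostgreSQL's `pg_trgm` extension (lowercase the word, pad it, compare sets of three-character windows); whether Readur computes scores exactly this way is not confirmed by this guide, and the function names are illustrative only.

```python
def trigrams(text):
    """Return the set of pg_trgm-style trigrams for a single word."""
    # Assumption: pad with two spaces in front and one behind, as pg_trgm does,
    # then slice the padded word into overlapping 3-character windows.
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fuzzy_match(term, candidates, threshold=0.8):
    """Keep the candidates whose similarity to `term` meets the threshold."""
    return [c for c in candidates if similarity(term, c) >= threshold]
```

Under this measure a single dropped letter in a short word lowers the score noticeably, so it is worth trying a few real OCR misreadings before settling on a threshold.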
### Boolean Search (Logical Operators)
**Best for**: Complex queries with multiple conditions and precise control
**Operators**:
- **AND**: Both terms must be present
- **OR**: Either term can be present
- **NOT**: Exclude documents with the term
- **Parentheses**: Group conditions
**Examples**:
```
budget AND 2024 # Both "budget" and "2024"
invoice OR receipt # Either "invoice" or "receipt"
contract NOT draft # "contract" but not "draft"
(budget OR financial) AND 2024 # Complex grouping
marketing AND (campaign OR strategy) # Marketing documents about campaigns or strategy
```
**Advanced Boolean Examples**:
```
# Find completed project documents
project AND (final OR completed OR approved) NOT draft
# Financial documents excluding personal items
(invoice OR receipt OR budget) NOT personal
# Recent important documents
(urgent OR priority OR critical) AND label:"this month"
```
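
Because Readur's search is built on PostgreSQL full-text search, one way to picture boolean mode is as a translation into PostgreSQL's `tsquery` operators (`&`, `|`, `!`). The sketch below is an illustration under that assumption, not Readur's actual parser; `to_tsquery_text` is a hypothetical helper, it handles only plain words, uppercase `AND`/`OR`/`NOT`, and parentheses, and it does not escape special characters.

```python
import re

# A token is a single parenthesis or a run of non-space, non-parenthesis characters.
TOKEN = re.compile(r"[()]|[^\s()]+")


def to_tsquery_text(query):
    """Translate 'a AND (b OR c) NOT d' into tsquery-style text."""
    out = []
    negate_next = False
    for tok in TOKEN.findall(query):
        if tok == "AND":
            out.append("&")
        elif tok == "OR":
            out.append("|")
        elif tok == "NOT":
            # A binary "a NOT b" is read as "a AND NOT b".
            if out and out[-1] not in ("&", "|", "("):
                out.append("&")
            negate_next = True
        else:
            out.append(("!" if negate_next else "") + tok)
            negate_next = False
    return " ".join(out)
```

For instance, `to_tsquery_text("contract NOT draft")` yields text that could be passed to PostgreSQL's `to_tsquery` as a bound parameter.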
## Query Syntax
### Field-Specific Search
Search within specific document fields for precise targeting.
#### Available Fields
| Field | Description | Example |
|-------|-------------|---------|
| `filename:` | Search in file names | `filename:invoice` |
| `content:` | Search in document text | `content:"project status"` |
| `label:` | Search by labels | `label:urgent` |
| `type:` | Search by file type | `type:pdf` |
| `source:` | Search by upload source | `source:webdav` |
| `size:` | Search by file size | `size:>10MB` |
| `date:` | Search by date | `date:2024-01-01` |
#### Field Search Examples
```
filename:contract AND date:2024 # Contracts from 2024
label:"high priority" OR label:urgent # Priority documents
type:pdf AND content:budget # PDF files containing "budget"
source:webdav AND label:approved # Approved docs from WebDAV
```
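
The field syntax can be pictured as a split between `field:value` filters and free text. The following is a rough sketch of such a split, using the field names from the table above; `parse_query` and `FieldFilter` are hypothetical names, and grouped values such as `label:(urgent OR critical)` are deliberately not handled.

```python
import re
from typing import NamedTuple

KNOWN_FIELDS = {"filename", "content", "label", "type", "source", "size", "date"}

# A token is either a word (optionally ending in a quoted value, as in
# label:"high priority") or a standalone quoted phrase.
TOKEN = re.compile(r'[^\s"]+(?:"[^"]*")?|"[^"]*"')


class FieldFilter(NamedTuple):
    field: str
    value: str
    negated: bool


def parse_query(query):
    """Split a query into field filters and the remaining free-text tokens."""
    filters, terms = [], []
    for token in TOKEN.findall(query):
        negated = token.startswith("-")
        field, sep, value = token.lstrip("-").partition(":")
        if sep and field in KNOWN_FIELDS and value:
            filters.append(FieldFilter(field, value.strip('"'), negated))
        else:
            # Free text, boolean operators, and quoted phrases pass through as-is.
            terms.append(token)
    return filters, terms
```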
### Range Queries
#### Date Ranges
```
date:2024-01-01..2024-03-31 # Q1 2024 documents
date:>2024-01-01 # After January 1, 2024
date:<2024-12-31 # Before December 31, 2024
```
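
As a rough illustration of how the three date forms could be interpreted, the sketch below turns a `date:` value into a `(start, end)` pair. `parse_date_range` is a hypothetical helper, and whether the `>` and `<` bounds include the named day is an assumption left to the caller.

```python
from datetime import date


def parse_date_range(expr):
    """Return (start, end) for a date expression; None means unbounded."""
    if ".." in expr:
        start, end = expr.split("..", 1)
        return date.fromisoformat(start), date.fromisoformat(end)
    if expr.startswith(">"):
        return date.fromisoformat(expr[1:]), None
    if expr.startswith("<"):
        return None, date.fromisoformat(expr[1:])
    # A single date is treated as a one-day range.
    day = date.fromisoformat(expr)
    return day, day
```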
#### Size Ranges
```
size:1MB..10MB # Between 1MB and 10MB
size:>50MB # Larger than 50MB
size:<1KB # Smaller than 1KB
```
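
Size expressions can be read the same way. This sketch converts them to byte counts, assuming binary units (1 MB = 1,048,576 bytes); the helper names are illustrative and the unit convention Readur actually applies is not stated here.

```python
import re

# Assumption: binary units, so 1KB = 1024 bytes and 1MB = 1024 * 1024 bytes.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
SIZE = re.compile(r"^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$", re.IGNORECASE)


def parse_size(text):
    """Convert a size such as '10MB' into a number of bytes."""
    match = SIZE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit.upper()])


def parse_size_range(expr):
    """Return (min_bytes, max_bytes); None means unbounded."""
    if ".." in expr:
        low, high = expr.split("..", 1)
        return parse_size(low), parse_size(high)
    if expr.startswith(">"):
        return parse_size(expr[1:]), None
    if expr.startswith("<"):
        return None, parse_size(expr[1:])
    size = parse_size(expr)
    return size, size
```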
### Wildcard Search
Use wildcards for partial matching:
```
proj* # Matches "project", "projects", "projection"
*report # Matches "annual report", "status report"
doc?ment # Matches "document" but not "documents" (? = exactly one character)
```
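
If the wildcards follow the usual glob conventions (an assumption; this guide does not specify the matching engine), their behaviour can be tried out with Python's standard `fnmatch` module. `wildcard_match` is an illustrative wrapper, and note that `fnmatch` also gives `[...]` a special meaning that the syntax above does not mention.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern, text):
    """Case-insensitive glob match: * is any run of characters, ? is exactly one."""
    return fnmatchcase(text.lower(), pattern.lower())


words = ["project", "projection", "document", "annual report"]
prefix_hits = [w for w in words if wildcard_match("proj*", w)]
single_char_hits = [w for w in words if wildcard_match("doc?ment", w)]
```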
### Exclusion Operators
Exclude unwanted results:
```
invoice -draft # Invoices but not drafts
budget NOT personal # Budget documents excluding personal
-label:archive proposal # Proposals not in archive
```
## Advanced Filtering
### File Type Filters
Filter by specific file formats:
**Common File Types**:
- **Documents**: PDF, DOC, DOCX, TXT, RTF
- **Images**: PNG, JPG, JPEG, TIFF, BMP, GIF
- **Spreadsheets**: XLS, XLSX, CSV
- **Presentations**: PPT, PPTX
**Filter Interface**:
1. **Checkbox Filters**: Select multiple file types
2. **MIME Type Groups**: Filter by general categories
3. **Custom Extensions**: Add specific file extensions
**Search Syntax**:
```
type:pdf # Only PDF files
type:(pdf OR doc) # PDF or Word documents
-type:image # Exclude all images
```
### Date and Time Filters
**Predefined Ranges**:
- Today, Yesterday, This Week, Last Week
- This Month, Last Month, This Quarter, Last Quarter
- This Year, Last Year
**Custom Date Ranges**:
- **Start Date**: Documents uploaded after specific date
- **End Date**: Documents uploaded before specific date
- **Date Range**: Documents within specific period
**Advanced Date Syntax**:
```
created:today # Documents uploaded today
modified:>2024-01-01 # Modified after January 1st
accessed:last-week # Accessed in the last week
```
### Size Filters
**Size Categories**:
- **Small**: < 1MB
- **Medium**: 1MB - 10MB
- **Large**: 10MB - 50MB
- **Very Large**: > 50MB
**Custom Size Ranges**:
```
size:>10MB # Larger than 10MB
size:1MB..5MB # Between 1MB and 5MB
size:<100KB # Smaller than 100KB
```
### Label Filters
**Label Selection**:
- **Multiple Labels**: Select multiple labels with AND/OR logic
- **Label Hierarchy**: Navigate nested label structures
- **Label Suggestions**: Auto-complete based on existing labels
**Label Search Syntax**:
```
label:project # Documents with "project" label
label:"high priority" # Multi-word labels in quotes
label:(urgent OR critical) # Documents with either label
-label:archive # Exclude archived documents
```
### Source Filters
Filter by document source or origin:
**Source Types**:
- **Manual Upload**: Documents uploaded directly
- **WebDAV Sync**: Documents from WebDAV sources
- **Local Folder**: Documents from watched folders
- **S3 Sync**: Documents from S3 buckets
**Source-Specific Filters**:
```
source:webdav # WebDAV synchronized documents
source:manual # Manually uploaded documents
source:"My Nextcloud" # Specific named source
```
### OCR Status Filters
Filter by OCR processing status:
**Status Options**:
- **Completed**: OCR successfully completed
- **Pending**: Waiting for OCR processing
- **Failed**: OCR processing failed
- **Not Applicable**: Text documents that don't need OCR
**OCR Quality Filters**:
- **High Confidence**: OCR confidence > 90%
- **Medium Confidence**: OCR confidence 70-90%
- **Low Confidence**: OCR confidence < 70%
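
The confidence bands above translate into a small classification function. This is a sketch only: `confidence_bucket` is a hypothetical name, and placing exactly 90% in the medium band follows the "> 90%" wording above.

```python
def confidence_bucket(confidence):
    """Map an OCR confidence percentage (0-100) to a filter bucket."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence > 90:
        return "high"
    if confidence >= 70:
        return "medium"
    return "low"
```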
## Search Interface
### Global Search Bar
**Location**: Available in the header on all pages
**Features**:
- **Real-time suggestions**: Shows results as you type
- **Quick results**: Top 5 matches with snippets
- **Fast navigation**: Direct access to documents
- **Search history**: Recent searches for quick access
**Usage**:
1. Click on the search bar in the header
2. Start typing your query
3. View instant suggestions and results
4. Click a result to navigate directly to the document
### Advanced Search Page
**Location**: Dedicated search page with full interface
**Features**:
- **Multiple search modes**: Toggle between search types
- **Filter sidebar**: All filtering options in one place
- **Result options**: Sorting, pagination, view modes
- **Export capabilities**: Export search results
**Interface Sections**:
#### Search Input Area
- **Query builder**: Visual query construction
- **Mode selector**: Choose search type (simple, phrase, fuzzy, boolean)
- **Suggestions**: Auto-complete and query recommendations
#### Filter Sidebar
- **File type filters**: Checkboxes for different formats
- **Date range picker**: Calendar interface for date selection
- **Size sliders**: Visual size range selection
- **Label selector**: Hierarchical label browser
- **Source filters**: Filter by upload source
#### Results Area
- **Sort options**: Relevance, date, filename, size
- **View modes**: List view, grid view, detail view
- **Pagination**: Navigate through result pages
- **Export options**: CSV, JSON export of results
### Search Results
#### Result Display Elements
**Document Cards**:
- **Filename**: Primary document identifier
- **Snippet**: Highlighted text excerpt showing search matches
- **Metadata**: File size, type, upload date, labels
- **Relevance Score**: Numerical relevance ranking
- **Quick Actions**: Download, view, edit labels
**Highlighting**:
- **Search terms**: Highlighted in yellow
- **Context**: Surrounding text for context
- **Multiple matches**: All instances highlighted
- **Snippet length**: Configurable in user settings
#### Result Sorting
**Sort Options**:
- **Relevance**: Best matches first (default)
- **Date**: Newest or oldest first
- **Filename**: Alphabetical order
- **Size**: Largest or smallest first
- **Score**: Highest search score first
**Secondary Sorting**:
- Apply secondary criteria when primary sort values are equal
- Example: Sort by relevance, then by date
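
Client code that post-processes results can reproduce "relevance, then date" ordering with a compound sort key. The sketch below assumes result objects shaped like the API response shown later in this guide (`score` and `metadata.uploaded_at`); the function names are illustrative.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def sort_results(results):
    """Order by score (highest first), breaking ties by upload time (newest first)."""
    return sorted(
        results,
        key=lambda r: (r["score"], parse_timestamp(r["metadata"]["uploaded_at"])),
        reverse=True,
    )
```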
### Search Configuration
#### User Preferences
**Search Settings** (accessible via Settings → Search):
- **Results per page**: 10, 25, 50, 100
- **Snippet length**: 100, 200, 300, 500 characters
- **Fuzzy threshold**: Sensitivity for approximate matching
- **Default sort**: Preferred default sorting option
- **Search history**: Enable/disable query history
#### Search Behavior
- **Auto-complete**: Enable search suggestions
- **Real-time search**: Search as you type
- **Search highlighting**: Highlight search terms in results
- **Context snippets**: Show surrounding text in results
## Search Optimization
### Query Optimization
#### Best Practices
1. **Use Specific Terms**: More specific queries yield better results
```
Good: "quarterly sales report Q1"
Poor: "document"
```
2. **Combine Search Modes**: Use appropriate mode for your needs
```
Exact phrases: "status update"
Flexible terms: project~
Complex logic: (budget OR financial) AND 2024
```
3. **Leverage Filters**: Combine text search with filters
```
Query: budget
Filters: Type = PDF, Date = This Quarter, Label = Finance
```
4. **Use Field Search**: Target specific document aspects
```
filename:invoice date:2024
content:"project milestone" label:important
```
### Performance Tips
#### Efficient Searching
1. **Start Broad, Then Narrow**: Begin with general terms, then add filters
2. **Use Filters Early**: Apply filters before complex text queries
3. **Avoid Wildcards at Start**: `*report` is slower than `report*`
4. **Combine Short Queries**: Use multiple short terms rather than long phrases
#### Search Index Optimization
The search system automatically optimizes for:
- **Frequent Terms**: Common words are indexed for fast retrieval
- **Document Updates**: New documents are indexed immediately
- **Language Support**: Multi-language stemming and analysis
- **Cache Management**: Frequent searches are cached
### OCR Search Optimization
#### Handling OCR Text
OCR-extracted text may contain errors that affect search:
**Strategies**:
1. **Use Fuzzy Search**: Handle OCR errors with approximate matching
2. **Try Variations**: Search for common OCR mistakes
3. **Use Context**: Include surrounding words for better matches
4. **Check Original**: Compare with original document when possible
**Common OCR Issues**:
- **Character confusion**: "m" vs "rn", "cl" vs "d"
- **Word boundaries**: "some thing" vs "something"
- **Special characters**: Missing or incorrect punctuation
**Optimization Examples**:
```
# Original: "invoice"
# OCR might produce: "irwoice", "invoce", "mvoice"
# Solution: Use fuzzy search
invoice~
# Or search for context
"invoice number" OR "irwoice number" OR "invoce number"
```
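
When fuzzy search is not enough, variant queries like the one above can be generated rather than typed by hand. This is a minimal sketch that uses only the two character confusions listed earlier in this section; the table and function names are illustrative and would need extending for real OCR output.

```python
# Confusion pairs from the "Common OCR Issues" list; extend as needed.
CONFUSIONS = [("m", "rn"), ("cl", "d")]


def ocr_variants(word):
    """Return plausible OCR misreadings of `word`, excluding the word itself."""
    variants = set()
    for a, b in CONFUSIONS:
        variants.add(word.replace(a, b))
        variants.add(word.replace(b, a))
    variants.discard(word)
    return sorted(variants)


def ocr_tolerant_query(word):
    """Join the word and its variants into a single OR query."""
    return " OR ".join([word] + ocr_variants(word))
```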
## Saved Searches
### Creating Saved Searches
1. **Build Your Query**: Create a search with desired parameters
2. **Test Results**: Verify the search returns expected documents
3. **Save Search**: Click "Save Search" button
4. **Name Search**: Provide descriptive name
5. **Configure Options**: Set update frequency and notifications
### Managing Saved Searches
**Saved Search Features**:
- **Quick Access**: Available in sidebar or dashboard
- **Automatic Updates**: Results update as new documents are added
- **Shared Access**: Share searches with other users (future feature)
- **Export Options**: Export results automatically
**Search Organization**:
- **Categories**: Group related searches
- **Favorites**: Mark frequently used searches
- **Recent**: Quick access to recently used searches
### Smart Collections
Saved searches that automatically include new documents:
**Examples**:
- **"This Month's Reports"**: `type:pdf AND content:report AND date:this-month`
- **"Pending Review"**: `label:"needs review" AND -label:completed`
- **"High Priority Items"**: `label:(urgent OR critical OR "high priority")`
## Search Analytics
### Search Performance Metrics
**Available Metrics**:
- **Query Performance**: Average search response times
- **Popular Searches**: Most frequently used search terms
- **Result Quality**: Click-through rates and user engagement
- **Search Patterns**: Common search behaviors and trends
### User Search History
**History Features**:
- **Recent Searches**: Quick access to previous queries
- **Search Suggestions**: Based on search history
- **Query Refinement**: Improve searches based on past patterns
- **Export History**: Download search history for analysis
## API Search
### Basic Search API
```http
GET /api/search?query=invoice&limit=20
Authorization: Bearer <jwt_token>
```
**Query Parameters**:
- `query`: Search query string
- `limit`: Number of results (default: 50, max: 100)
- `offset`: Pagination offset
- `sort`: Sort order (relevance, date, filename, size)
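
A request like this could be assembled with Python's standard library, as sketched below. The endpoint and parameter names come from this section; `build_search_request` is a hypothetical helper, and the base URL is whatever address your Readur instance is served from.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, token, query, limit=20, offset=0, sort="relevance"):
    """Build (but do not send) a GET request for /api/search."""
    params = urlencode({"query": query, "limit": limit, "offset": offset, "sort": sort})
    return Request(
        f"{base_url.rstrip('/')}/api/search?{params}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; `urlencode` takes care of escaping spaces and other special characters in the query string.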
### Advanced Search API
```http
POST /api/search/advanced
Authorization: Bearer <jwt_token>
Content-Type: application/json

{
  "query": "budget report",
  "mode": "phrase",
  "filters": {
    "file_types": ["pdf", "docx"],
    "labels": ["Q1 2024", "Finance"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-03-31"
    },
    "size_range": {
      "min": 1048576,
      "max": 52428800
    }
  },
  "options": {
    "fuzzy_threshold": 0.8,
    "snippet_length": 200,
    "highlight": true
  }
}
```
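
The request body can be built programmatically so that sizes are entered in megabytes and converted to bytes. The sketch below mirrors the field names in the example above; `advanced_search_body` is a hypothetical helper, and omitting unset filters (rather than sending `null`) is an assumption about what the endpoint accepts.

```python
import json

MB = 1024 * 1024  # the example above uses 1048576 bytes for 1 MB


def advanced_search_body(query, mode="simple", file_types=(), labels=(),
                         start=None, end=None, min_mb=None, max_mb=None):
    """Return a JSON string for POST /api/search/advanced."""
    filters = {}
    if file_types:
        filters["file_types"] = list(file_types)
    if labels:
        filters["labels"] = list(labels)
    date_range = {k: v for k, v in (("start", start), ("end", end)) if v is not None}
    if date_range:
        filters["date_range"] = date_range
    size_range = {
        k: int(v * MB) for k, v in (("min", min_mb), ("max", max_mb)) if v is not None
    }
    if size_range:
        filters["size_range"] = size_range
    return json.dumps({"query": query, "mode": mode, "filters": filters})
```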
### Search Response Format
```json
{
  "results": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "filename": "Q1_Budget_Report.pdf",
      "snippet": "The quarterly <mark>budget</mark> <mark>report</mark> shows a 10% increase in revenue...",
      "score": 0.95,
      "highlights": ["budget", "report"],
      "metadata": {
        "size": 2048576,
        "type": "application/pdf",
        "uploaded_at": "2024-01-15T10:30:00Z",
        "labels": ["Q1 2024", "Finance", "Budget"],
        "source": "WebDAV Sync"
      }
    }
  ],
  "total": 42,
  "limit": 20,
  "offset": 0,
  "query_time": 0.085
}
```
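
Two small helpers are often useful when consuming this response: one to work out the next page from `total`, `limit`, and `offset`, and one to strip the `<mark>` tags when plain text is wanted. Both are sketches with hypothetical names, based only on the fields shown above.

```python
import re

MARK = re.compile(r"</?mark>")


def plain_snippet(snippet):
    """Remove <mark> highlighting tags from a snippet."""
    return MARK.sub("", snippet)


def next_offset(page):
    """Return the offset of the next page, or None when this is the last page."""
    following = page["offset"] + page["limit"]
    return following if following < page["total"] else None
```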
## Troubleshooting
### Common Search Issues
#### No Results Found
**Possible Causes**:
1. **Typos**: Check spelling in search query
2. **Too Specific**: Query might be too restrictive
3. **Wrong Mode**: Using exact search when fuzzy would be better
4. **Filters**: Remove filters to check if they're excluding results
**Solutions**:
1. **Simplify Query**: Start with broader terms
2. **Check Spelling**: Use fuzzy search for typo tolerance
3. **Remove Filters**: Test without date, type, or label filters
4. **Try Synonyms**: Use alternative terms for the same concept
#### Irrelevant Results
**Possible Causes**:
1. **Too Broad**: Query matches too many unrelated documents
2. **Common Terms**: Using very common words that appear everywhere
3. **Wrong Mode**: Using fuzzy when exact match is needed
**Solutions**:
1. **Add Specificity**: Include more specific terms or context
2. **Use Filters**: Add file type, date, or label filters
3. **Phrase Search**: Use quotes for exact phrases
4. **Boolean Logic**: Use AND/OR/NOT for better control
#### Slow Search Performance
**Possible Causes**:
1. **Complex Queries**: Very complex boolean queries
2. **Large Result Sets**: Queries matching many documents
3. **Wildcard Overuse**: Starting queries with wildcards
**Solutions**:
1. **Simplify Queries**: Break complex queries into simpler ones
2. **Add Filters**: Use filters to reduce result set size
3. **Avoid Leading Wildcards**: Use `term*` instead of `*term`
4. **Use Pagination**: Request smaller result sets
### OCR Search Issues
#### OCR Text Not Searchable
**Symptoms**: Can't find text that's visible in document images
**Solutions**:
1. **Check OCR Status**: Verify OCR processing completed
2. **Retry OCR**: Manually retry OCR processing
3. **Use Fuzzy Search**: OCR might have character recognition errors
4. **Check Language Settings**: Ensure correct OCR language is configured
#### Poor OCR Search Quality
**Symptoms**: Fuzzy search required for most queries on scanned documents
**Solutions**:
1. **Improve Source Quality**: Use higher resolution scans (300+ DPI)
2. **OCR Language**: Verify correct language setting for documents
3. **Image Enhancement**: Enable OCR preprocessing options
4. **Manual Correction**: Consider manual text correction for important documents
### Search Configuration Issues
#### Settings Not Applied
**Symptoms**: Search settings changes don't take effect
**Solutions**:
1. **Reload Page**: Refresh browser to apply settings
2. **Clear Cache**: Clear browser cache and cookies
3. **Check Permissions**: Ensure user has permission to modify settings
4. **Database Issues**: Check if settings are being saved to database
#### Filter Problems
**Symptoms**: Filters not working as expected
**Solutions**:
1. **Clear All Filters**: Reset filters and apply one at a time
2. **Check Filter Logic**: Ensure AND/OR logic is correct
3. **Label Validation**: Verify labels exist and are spelled correctly
4. **Date Format**: Ensure dates are in correct format
## Next Steps
- Explore [labels and organization](labels-and-organization.md) for better search categorization
- Set up [sources](sources-guide.md) for automatic content ingestion
- Review [user guide](user-guide.md) for general search tips
- Check [API reference](api-reference.md) for programmatic search integration
- Configure [OCR optimization](dev/OCR_OPTIMIZATION_GUIDE.md) for better text extraction


@@ -0,0 +1,501 @@
# Labels and Organization Guide
Readur's labeling system provides powerful document organization and categorization capabilities. This guide covers creating, managing, and using labels to organize your document collection effectively.
## Table of Contents
- [Overview](#overview)
- [Label Types](#label-types)
- [Creating and Managing Labels](#creating-and-managing-labels)
- [Assigning Labels to Documents](#assigning-labels-to-documents)
- [Label-Based Search and Filtering](#label-based-search-and-filtering)
- [Label Organization Strategies](#label-organization-strategies)
- [Advanced Label Features](#advanced-label-features)
- [Best Practices](#best-practices)
- [API Integration](#api-integration)
## Overview
Labels in Readur provide a flexible tagging system that allows you to:
- **Categorize Documents**: Organize documents by type, project, department, or any custom criteria
- **Enhanced Search**: Filter search results by specific labels for precise document discovery
- **Visual Organization**: Color-coded labels provide instant visual categorization
- **Bulk Operations**: Apply or remove labels from multiple documents simultaneously
- **Project Management**: Track documents across projects, workflows, or time periods
### Key Features
- **Hierarchical Organization**: Create nested label structures for complex categorization
- **Color Coding**: Visual identification with customizable label colors
- **System Labels**: Automatic labels generated by Readur for administrative purposes
- **User Labels**: Custom labels created and managed by users
- **Smart Collections**: Save searches that automatically include documents with specific labels
- **Label Statistics**: Track document counts and usage analytics per label
## Label Types
### User Labels
**Custom labels** created and managed by users for personal or organizational categorization.
**Features:**
- **Full Control**: Create, edit, rename, and delete user-created labels
- **Color Customization**: Choose from a wide range of colors for visual organization
- **Flexible Naming**: Use any descriptive names that fit your workflow
- **Sharing**: Labels are visible to all users with access to labeled documents
**Common Use Cases:**
- Project names (e.g., "Project Alpha", "Q1 Budget")
- Document types (e.g., "Invoices", "Contracts", "Reports")
- Departments (e.g., "HR", "Engineering", "Marketing")
- Priority levels (e.g., "Urgent", "Review Needed", "Archive")
- Status indicators (e.g., "Draft", "Final", "Approved")
### System Labels
**Automatic labels** generated by Readur based on document properties and processing status.
**Examples:**
- **OCR Status**: "OCR Completed", "OCR Failed", "OCR Pending"
- **File Type**: "PDF", "Image", "Text Document"
- **Source Origin**: "WebDAV Upload", "Local Folder", "Manual Upload"
- **Processing Status**: "Recently Added", "High Confidence OCR", "Needs Review"
- **Size Categories**: "Large File", "Small File"
- **Date-based**: "This Week", "This Month", "This Year"
**Characteristics:**
- **Read-only**: Cannot be edited or deleted by users
- **Automatic Assignment**: Applied automatically based on document properties
- **System Managed**: Updated automatically when document properties change
- **Consistent Formatting**: Standardized naming and color scheme
## Creating and Managing Labels
### Creating New Labels
#### Via Label Management Page
1. **Navigate to Labels**: Go to Settings → Labels
2. **Click "Create Label"**
3. **Configure Label Properties**:
```
Name: Project Documentation
Color: Blue (#2196F3)
Description: Documents related to current projects
```
4. **Save** to create the label
#### During Document Upload
1. **Upload Document(s)**: Use the upload interface
2. **Add Labels Field**: In the upload form
3. **Create New Label**: Type a new label name
4. **Assign Color**: Choose color for the new label
5. **Complete Upload**: Label is created and assigned automatically
#### Quick Label Creation
- **Search Interface**: Create labels while filtering search results
- **Document Details**: Add new labels directly from document pages
- **Bulk Operations**: Create labels during bulk document operations
### Editing Labels
#### Renaming Labels
1. **Access Label Management**: Settings → Labels
2. **Find Target Label**: Use search or browse the label list
3. **Click "Edit"** or double-click the label name
4. **Modify Name**: Change to new descriptive name
5. **Save Changes**: Updates all documents using this label
#### Changing Colors
1. **Edit Label**: Follow renaming steps above
2. **Select New Color**: Choose from color palette or enter hex code
3. **Preview Changes**: See how the color looks in different contexts
4. **Apply**: Color updates immediately across all interfaces
#### Merging Labels
1. **Identify Similar Labels**: Find labels with overlapping purposes
2. **Select Target Label**: Choose the label to keep
3. **Merge Operation**: Use "Merge with..." option
4. **Confirm Merge**: All documents transfer to target label
5. **Source Label Deletion**: Original label is removed after merge
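
Conceptually, a merge reassigns every document from the source label to the target label and then drops the source. The sketch below models that on an in-memory mapping of document IDs to label sets; it is an illustration of the idea with a hypothetical `merge_labels` function, not a description of how Readur stores labels.

```python
def merge_labels(doc_labels, source, target):
    """Move every document from `source` to `target`; return how many changed.

    `doc_labels` maps a document ID to the set of label names on that document.
    """
    changed = 0
    for labels in doc_labels.values():
        if source in labels:
            labels.discard(source)
            labels.add(target)  # a set keeps the target label from being duplicated
            changed += 1
    return changed
```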
### Deleting Labels
#### Individual Label Deletion
1. **Label Management Page**: Access via Settings → Labels
2. **Select Label**: Find the label to delete
3. **Delete Action**: Click delete button or menu option
4. **Confirm Deletion**: Confirm removal (this cannot be undone)
5. **Document Update**: Label is removed from all associated documents
#### Bulk Label Cleanup
- **Unused Labels**: Automatically identify and remove labels with no documents
- **Duplicate Labels**: Find and merge labels with similar names
- **Batch Deletion**: Select multiple labels for simultaneous removal
## Assigning Labels to Documents
### Single Document Labeling
#### Document Details Page
1. **Open Document**: Click on any document to view details
2. **Labels Section**: Find the labels area in document metadata
3. **Add Labels**: Click "+" or "Add Label" button
4. **Select or Create**: Choose existing labels or create new ones
5. **Apply Changes**: Labels are assigned immediately
#### Quick Label Assignment
- **Hover Actions**: Quick label buttons appear when hovering over documents
- **Right-Click Menu**: Context menu with common label operations
- **Keyboard Shortcuts**: Assign frequently used labels with key combinations
### Bulk Label Operations
#### Multi-Document Selection
1. **Document Browser**: Navigate to documents page
2. **Select Documents**: Use checkboxes to select multiple documents
3. **Bulk Actions**: Click "Actions" or "Labels" in the toolbar
4. **Apply Labels**: Choose labels to add or remove
5. **Execute**: Apply changes to all selected documents
#### Search-Based Labeling
1. **Search for Documents**: Use search to find specific document sets
2. **Select All Results**: Choose all documents matching criteria
3. **Bulk Label Assignment**: Apply labels to entire result set
4. **Confirmation**: Review and confirm bulk changes
### Label Assignment During Upload
#### Upload Interface Labeling
1. **File Selection**: Choose files to upload
2. **Label Assignment**: Add labels before starting upload
3. **Label Creation**: Create new labels during upload process
4. **Automatic Application**: Labels assigned to all uploaded files
#### Drag and Drop Labeling
- **Pre-configured Areas**: Drag files to labeled drop zones
- **Automatic Tagging**: Labels applied based on drop location
- **Batch Processing**: Assign labels to multiple files simultaneously
## Label-Based Search and Filtering
### Label Filters in Search
#### Basic Label Filtering
1. **Search Interface**: Access the main search page
2. **Label Filter Section**: Find label filters in the sidebar
3. **Select Labels**: Check boxes for desired labels
4. **Apply Filter**: Search results automatically update
5. **Multiple Labels**: Combine multiple labels with AND/OR logic
#### Advanced Label Queries
**Search Syntax Examples:**
```
label:urgent # Documents with "urgent" label
label:"project alpha" # Documents with multi-word label
label:urgent AND label:review # Documents with both labels
label:draft OR label:final # Documents with either label
-label:archive # Exclude archived documents
```
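
Queries in this syntax can also be assembled from lists of labels, which helps when building Smart Collections from code. The helper below is a hypothetical sketch that quotes multi-word labels and combines required, optional, and excluded labels.

```python
def label_term(name):
    """Format one label filter, quoting names that contain spaces."""
    quoted = f'"{name}"' if " " in name else name
    return f"label:{quoted}"


def build_label_query(all_of=(), any_of=(), none_of=()):
    """Combine required, alternative, and excluded labels into one query string."""
    parts = [label_term(n) for n in all_of]
    if any_of:
        group = " OR ".join(label_term(n) for n in any_of)
        # Parenthesise alternatives so OR does not leak into the surrounding ANDs.
        parts.append(f"({group})" if len(any_of) > 1 else group)
    parts += [f"-{label_term(n)}" for n in none_of]
    return " AND ".join(parts)
```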
### Smart Collections
#### Creating Smart Collections
1. **Build Search Query**: Create search with label filters
2. **Save Search**: Use "Save Search" option
3. **Name Collection**: Give descriptive name (e.g., "Active Projects")
4. **Automatic Updates**: Collection updates as documents are labeled
5. **Quick Access**: Access collections from sidebar or dashboard
#### Collection Examples
**Project-Based Collections:**
- "Q1 Budget Documents": `label:"Q1 budget" OR label:"financial planning"`
- "Marketing Materials": `label:marketing AND (label:final OR label:approved)`
- "Pending Review": `label:"needs review" AND -label:completed`
**Status-Based Collections:**
- "Recent Uploads": `label:"this month" AND -label:processed`
- "High Priority": `label:urgent OR label:critical`
- "Archive Ready": `label:completed AND label:final`
### Label-Based Dashboard Views
#### Custom Dashboard Widgets
- **Label Statistics**: Show document counts per label
- **Recent Activity**: Display recently labeled documents
- **Label Trends**: Track labeling patterns over time
- **Quick Access**: Direct links to frequently used label filters
## Label Organization Strategies
### Hierarchical Labeling
#### Category-Based Organization
**Structure Example:**
```
Projects/
├── Project Alpha/
│   ├── Requirements
│   ├── Design
│   └── Implementation
├── Project Beta/
│   ├── Research
│   ├── Proposals
│   └── Contracts
└── Infrastructure/
    ├── Servers
    ├── Network
    └── Security
```
#### Implementation Approach
1. **Top-Level Categories**: Create broad organizational labels
2. **Subcategories**: Use descriptive naming for specific areas
3. **Consistent Naming**: Establish naming conventions across categories
4. **Cross-References**: Documents can belong to multiple hierarchies
### Functional Organization
#### Document Lifecycle Labels
**Workflow Stages:**
- **Creation**: "Draft", "In Progress", "Under Review"
- **Approval**: "Pending Approval", "Approved", "Rejected"
- **Distribution**: "Published", "Distributed", "Archived"
- **Maintenance**: "Current", "Outdated", "Superseded"
#### Department-Based Labeling
**Organizational Structure:**
- **Human Resources**: "HR Policy", "Employee Records", "Benefits"
- **Finance**: "Invoices", "Budget", "Audit", "Tax Documents"
- **Legal**: "Contracts", "Compliance", "IP Documents"
- **Operations**: "Procedures", "Manuals", "Incident Reports"
### Time-Based Organization
#### Date-Driven Labels
- **Fiscal Periods**: "Q1 2024", "FY2024", "H1 2024"
- **Project Phases**: "Phase 1", "Phase 2", "Final Phase"
- **Event-Based**: "Pre-Launch", "Launch", "Post-Launch"
- **Seasonal**: "Annual Review", "Budget Season", "Audit Period"
## Advanced Label Features
### Label Analytics
#### Usage Statistics
**Metrics Available:**
- **Document Count**: Number of documents per label
- **Recent Activity**: Labels used in recent uploads or assignments
- **Growth Trends**: How label usage changes over time
- **Popular Labels**: Most frequently used labels
- **Unused Labels**: Labels with no current document assignments
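Document counts and unused labels are simple to derive once you have each document's label list, for example from an export. A small sketch under that assumption; `label_usage` is a hypothetical helper, not a Readur function:

```python
from collections import Counter

def label_usage(all_labels, documents):
    """Count documents per label and list labels with no documents.

    all_labels: iterable of label names
    documents:  iterable of per-document label lists
    """
    # set(labels) guards against a label listed twice on one document
    counts = Counter(label for labels in documents for label in set(labels))
    unused = sorted(name for name in all_labels if counts[name] == 0)
    return counts, unused
```

`counts.most_common(5)` then gives the five most popular labels.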
#### Label Performance
- **Search Frequency**: How often labels are used in searches
- **Click-Through Rates**: User engagement with labeled content
- **Organization Effectiveness**: How labels improve document discovery
### Label Automation
#### Auto-Labeling Rules
**OCR-Based Labeling:**
- **Content Detection**: Automatically label documents based on detected text
- **Template Recognition**: Recognize document types and apply appropriate labels
- **Entity Extraction**: Label documents based on detected entities (names, dates, amounts)
**Source-Based Labeling:**
- **Upload Location**: Apply labels based on upload source or folder
- **File Type**: Automatic labels based on file format and structure
- **Metadata**: Labels derived from file properties and EXIF data
#### Workflow Integration
- **Process Triggers**: Apply labels based on workflow stage completion
- **Approval Status**: Automatic labeling based on approval workflows
- **Time-Based Rules**: Apply labels based on document age or schedule
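To make the idea of content-based and source-based rules concrete, here is a rough sketch. The rule tables and the `suggest_labels` function are invented for illustration; this guide does not show Readur's actual rule format:

```python
from pathlib import PurePosixPath

# Hypothetical rule tables, not Readur configuration.
KEYWORD_RULES = {"invoice": "Invoices", "contract": "Contracts"}
FOLDER_RULES = {"hr": "HR Policy", "finance": "Budget"}

def suggest_labels(ocr_text, source_path):
    """Suggest labels from OCR text keywords and the upload folder."""
    text = ocr_text.lower()
    labels = {label for keyword, label in KEYWORD_RULES.items() if keyword in text}
    # Any folder name along the path can trigger a source-based label
    for part in PurePosixPath(source_path).parts:
        if part.lower() in FOLDER_RULES:
            labels.add(FOLDER_RULES[part.lower()])
    return sorted(labels)
```

Plain substring matching is the simplest possible rule; template recognition or entity extraction would need considerably more than this.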
### Label Import/Export
#### Bulk Label Operations
**Import Scenarios:**
- **Migration**: Import existing label structures from other systems
- **Template Application**: Apply predefined label sets to document collections
- **Organizational Standards**: Implement company-wide labeling standards
**Export Capabilities:**
- **Backup**: Export label definitions for backup purposes
- **Reporting**: Generate reports of label usage and document organization
- **Integration**: Share label structures with other systems
## Best Practices
### Label Design
#### Naming Conventions
1. **Descriptive Names**: Use clear, self-explanatory label names
2. **Consistent Format**: Establish and follow naming patterns
3. **Avoid Ambiguity**: Choose names that won't be confused with similar concepts
4. **Length Consideration**: Keep names concise but informative
5. **Special Characters**: Avoid special characters that may cause issues
**Good Examples:**
- "Q1-2024-Budget" ✅
- "Legal-Contract-Template" ✅
- "Marketing-Campaign-Assets" ✅
**Poor Examples:**
- "Stuff" ❌ (too vague)
- "Q1 Budget Documents for 2024 Financial Planning" ❌ (too long)
- "Legal/Contract#Template@2024" ❌ (special characters)
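Length and special-character rules can be checked mechanically before a label is created. A sketch, assuming a 30-character limit and an allowed set of letters, digits, spaces, and hyphens; both are choices made for this example rather than limits enforced by Readur:

```python
import re

MAX_LENGTH = 30  # assumed limit; pick what suits your team
ALLOWED = re.compile(r"[A-Za-z0-9][A-Za-z0-9 -]*")

def label_name_problems(name):
    """Return a list of convention violations for a proposed label name."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("too long")
    if not ALLOWED.fullmatch(name):
        problems.append("empty or contains special characters")
    return problems
```

A vague name such as "Stuff" passes these checks, so a review by a person is still needed for clarity.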
#### Color Strategy
1. **Consistent Color Families**: Use similar colors for related label categories
2. **High Contrast**: Ensure labels are readable against various backgrounds
3. **Color Meaning**: Establish color conventions (e.g., red for urgent, green for completed)
4. **Accessibility**: Consider color-blind users when choosing colors
5. **Limited Palette**: Don't use too many different colors
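"High contrast" can be measured with the WCAG 2.x contrast ratio between a label's background color and its text color. The sketch below implements that published formula; the function names are our own:

```python
def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color like '#2196F3'."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve before weighting the channels
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4 for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)
```

WCAG level AA asks for a ratio of at least 4.5 for normal-size text, which is a reasonable target for label text on a colored chip.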
### Organization Strategy
#### Start Simple
1. **Basic Categories**: Begin with broad, obvious categories
2. **Organic Growth**: Add labels as needs become apparent
3. **User Feedback**: Incorporate user suggestions for new labels
4. **Regular Review**: Periodically assess and refine label structure
#### Maintain Consistency
1. **Documentation**: Document labeling standards and conventions
2. **Training**: Educate users on proper labeling practices
3. **Regular Cleanup**: Remove unused or redundant labels
4. **Standardization**: Ensure consistent application across teams
### Performance Optimization
#### Label Management
1. **Avoid Over-Labeling**: Don't create too many similar labels
2. **Regular Cleanup**: Remove unused labels to reduce clutter
3. **Search Optimization**: Focus on labels that improve searchability
4. **User Training**: Educate users on effective labeling practices
#### System Performance
- **Index Optimization**: Labels are indexed for fast search performance
- **Bulk Operations**: Use bulk assignment for better efficiency
- **Caching**: Frequently used labels are cached for quick access
## API Integration
### Label Management API
#### Creating Labels
```bash
curl -X POST "http://localhost:8000/api/labels" \
  -H "Authorization: Bearer <jwt_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Project Documentation",
    "color": "#2196F3"
  }'
```
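The same request can be prepared from Python with only the standard library. This sketch builds the request without sending it; `BASE_URL` assumes a local instance on port 8000, and `build_create_label_request` is a name chosen for this example:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local instance

def build_create_label_request(token, name, color):
    """Build (but do not send) the POST /api/labels request."""
    body = json.dumps({"name": name, "color": color}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/labels",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends it; wrap that call in your own error handling.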
#### Listing Labels
```bash
curl "http://localhost:8000/api/labels" \
  -H "Authorization: Bearer <jwt_token>"
```
Response:
```json
{
  "labels": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "name": "Project Documentation",
      "color": "#2196F3",
      "document_count": 42,
      "created_at": "2024-01-01T00:00:00Z"
    }
  ]
}
```
#### Assigning Labels to Documents
```bash
curl -X PATCH "http://localhost:8000/api/documents/<document_id>" \
  -H "Authorization: Bearer <jwt_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "labels": ["Project Documentation", "Q1 2024", "High Priority"]
  }'
```
### Search Integration
#### Label-Based Search
```bash
# Quote the URL so the shell does not interpret the & character
curl "http://localhost:8000/api/search?query=invoice&labels=urgent,review" \
  -H "Authorization: Bearer <jwt_token>"
```
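When the query text or label names contain spaces, the URL has to be encoded. A small sketch using the standard library; `build_search_url` is a hypothetical helper, and whether the server accepts `+` for spaces in label names is an assumption worth checking against your instance:

```python
from urllib.parse import urlencode

def build_search_url(base_url, query, labels):
    """Build the GET /api/search URL with a comma-separated labels filter."""
    params = {"query": query, "labels": ",".join(labels)}
    # safe="," keeps the commas readable instead of encoding them as %2C
    return f"{base_url}/api/search?{urlencode(params, safe=',')}"
```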
#### Advanced Label Queries
```bash
curl -X POST "http://localhost:8000/api/search/advanced" \
  -H "Authorization: Bearer <jwt_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "budget",
    "filters": {
      "labels": ["Q1 2024", "Finance"],
      "label_logic": "AND"
    }
  }'
```
## Next Steps
- Configure [advanced search](advanced-search.md) with label-based filtering
- Set up [sources](sources-guide.md) with automatic labeling rules
- Explore [user management](user-management-guide.md) for collaborative labeling
- Review [API reference](api-reference.md) for programmatic label management
- Check [best practices](user-guide.md#tips-for-best-results) for document organization

docs/sources-guide.md Normal file

@ -0,0 +1,498 @@
# Sources Guide
Readur's Sources feature ingests documents automatically from external storage systems. This guide covers each supported source type and how to configure it.
## Table of Contents
- [Overview](#overview)
- [Source Types](#source-types)
- [WebDAV Sources](#webdav-sources)
- [Local Folder Sources](#local-folder-sources)
- [S3 Sources](#s3-sources)
- [Getting Started](#getting-started)
- [Configuration](#configuration)
- [Sync Operations](#sync-operations)
- [Health Monitoring](#health-monitoring)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
## Overview
Sources allow Readur to automatically discover, download, and process documents from external storage systems. Key features include:
- **Multi-Protocol Support**: WebDAV, Local Folders, and S3-compatible storage
- **Automated Syncing**: Scheduled synchronization with configurable intervals
- **Health Monitoring**: Proactive monitoring and validation of source connections
- **Intelligent Processing**: Duplicate detection, incremental syncs, and OCR integration
- **Real-time Status**: Live sync progress and comprehensive statistics
### How Sources Work
1. **Configuration**: Set up a source with connection details and preferences
2. **Discovery**: Readur scans the source for supported file types
3. **Synchronization**: New and changed files are downloaded and processed
4. **OCR Processing**: Documents are automatically queued for text extraction
5. **Search Integration**: Processed documents become searchable in your collection
## Source Types
### WebDAV Sources
WebDAV sources connect to cloud storage services and self-hosted servers that support the WebDAV protocol.
#### Supported WebDAV Servers
| Server Type | Status | Notes |
|-------------|--------|-------|
| **Nextcloud** | ✅ Fully Supported | Optimized discovery and authentication |
| **ownCloud** | ✅ Fully Supported | Native integration with server detection |
| **Apache WebDAV** | ✅ Supported | Generic WebDAV implementation |
| **nginx WebDAV** | ✅ Supported | Works with nginx dav module |
| **Box.com** | ⚠️ Limited | Basic WebDAV support |
| **Other WebDAV** | ✅ Supported | Generic WebDAV protocol compliance |
#### WebDAV Configuration
**Required Fields:**
- **Name**: Descriptive name for the source
- **Server URL**: Full WebDAV server URL (e.g., `https://cloud.example.com/remote.php/dav/files/username/`)
- **Username**: WebDAV authentication username
- **Password**: WebDAV authentication password or app password
**Optional Configuration:**
- **Watch Folders**: Specific directories to monitor (leave empty to sync entire accessible space)
- **File Extensions**: Limit to specific file types (default: all supported types)
- **Auto Sync**: Enable automatic scheduled synchronization
- **Sync Interval**: How often to check for changes (15 minutes to 24 hours)
- **Server Type**: Specify server type for optimizations (auto-detected)
#### Setting Up WebDAV Sources
1. **Navigate to Sources**: Go to Settings → Sources in the Readur interface
2. **Add New Source**: Click "Add Source" and select "WebDAV"
3. **Configure Connection**:
```
Name: My Nextcloud Documents
Server URL: https://cloud.mycompany.com/remote.php/dav/files/john/
Username: john
Password: app-password-here
```
4. **Test Connection**: Use the "Test Connection" button to verify credentials
5. **Configure Folders**: Specify directories to monitor:
```
Watch Folders:
- Documents/
- Projects/2024/
- Invoices/
```
6. **Set Sync Schedule**: Choose automatic sync interval (recommended: 30 minutes)
7. **Save and Sync**: Save configuration and trigger initial sync
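If you want to check credentials outside Readur first, a WebDAV folder listing is a `PROPFIND` request with HTTP Basic authentication. The sketch below only builds that request; it is a generic WebDAV check, not a description of what Readur's "Test Connection" button does internally, and `build_propfind_request` is a name made up for this example:

```python
import base64
import urllib.request

def build_propfind_request(server_url, username, password):
    """Build (but do not send) a depth-1 PROPFIND for a WebDAV folder."""
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        server_url,
        method="PROPFIND",
        headers={
            "Authorization": f"Basic {credentials}",
            "Depth": "1",  # list the folder and its direct children only
        },
    )
```

A WebDAV server normally answers a successful `PROPFIND` with status `207 Multi-Status`; a `401` points to a credential problem.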
#### WebDAV Best Practices
- **Use App Passwords**: Create dedicated app passwords instead of using main account passwords
- **Limit Scope**: Specify watch folders to avoid syncing unnecessary files
- **Server Optimization**: Let Readur auto-detect server type for optimal performance
- **Network Considerations**: Use longer sync intervals for slow connections
### Local Folder Sources
Local folder sources monitor directories on the Readur server's filesystem, including mounted network drives.
#### Use Cases
- **Watch Folders**: Monitor directories where documents are dropped
- **Network Mounts**: Sync from NFS, SMB/CIFS, or other mounted filesystems
- **Batch Processing**: Automatically process documents placed in specific folders
- **Archive Integration**: Monitor existing document archives
#### Local Folder Configuration
**Required Fields:**
- **Name**: Descriptive name for the source
- **Watch Folders**: Absolute paths of the directories to monitor
**Optional Configuration:**
- **File Extensions**: Filter by specific file types
- **Auto Sync**: Enable scheduled monitoring
- **Sync Interval**: Frequency of directory scans
- **Recursive**: Include subdirectories in scans
- **Follow Symlinks**: Follow symbolic links (use with caution)
#### Setting Up Local Folder Sources
1. **Prepare Directory**: Ensure the directory exists and is accessible
```bash
# Create watch folder
mkdir -p /mnt/documents/inbox
# Set permissions (if needed)
chmod 755 /mnt/documents/inbox
```
2. **Configure Source**:
```
Name: Document Inbox
Watch Folders: /mnt/documents/inbox
File Extensions: pdf,jpg,png,txt,docx
Auto Sync: Enabled
Sync Interval: 5 minutes
Recursive: Yes
```
3. **Test Setup**: Place a test document in the folder and verify detection
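To preview which files a configuration like the one above would pick up, you can run a quick scan yourself. This is an illustrative sketch, not Readur's scanner; `find_documents` is a hypothetical helper that matches extensions case-insensitively:

```python
from pathlib import Path

def find_documents(watch_folder, extensions, recursive=True):
    """Return files under watch_folder whose extension is in `extensions`."""
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    root = Path(watch_folder)
    # rglob descends into subdirectories; glob stays at the top level
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in wanted)
```

For example, `find_documents("/mnt/documents/inbox", ["pdf", "jpg", "png", "txt", "docx"])` lists the candidates for the "Document Inbox" source.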
#### Network Mount Examples
**NFS Mount:**
```bash
# Mount NFS share
sudo mount -t nfs 192.168.1.100:/documents /mnt/nfs-docs
# Then, in the Readur source configuration, set:
#   Watch Folders: /mnt/nfs-docs/inbox
```
**SMB/CIFS Mount:**
```bash
# Mount SMB share
sudo mount -t cifs //server/documents /mnt/smb-docs -o username=user
# Then, in the Readur source configuration, set:
#   Watch Folders: /mnt/smb-docs/processing
```
### S3 Sources
S3 sources connect to Amazon S3 or S3-compatible storage services for document synchronization.
#### Supported S3 Services
| Service | Status | Configuration |
|---------|--------|---------------|
| **Amazon S3** | ✅ Fully Supported | Standard AWS configuration |
| **MinIO** | ✅ Fully Supported | Custom endpoint URL |
| **DigitalOcean Spaces** | ✅ Supported | S3-compatible API |
| **Wasabi** | ✅ Supported | Custom endpoint configuration |
| **Google Cloud Storage** | ⚠️ Limited | S3-compatible mode only |
#### S3 Configuration
**Required Fields:**
- **Name**: Descriptive name for the source
- **Bucket Name**: S3 bucket to monitor
- **Region**: AWS region (e.g., `us-east-1`)
- **Access Key ID**: AWS/S3 access key
- **Secret Access Key**: AWS/S3 secret key
**Optional Configuration:**
- **Endpoint URL**: Custom endpoint for S3-compatible services
- **Prefix**: Bucket path prefix to limit scope
- **Watch Folders**: Specific S3 "directories" to monitor
- **File Extensions**: Filter by file types
- **Auto Sync**: Enable scheduled synchronization
- **Sync Interval**: Frequency of bucket scans
#### Setting Up S3 Sources
1. **Prepare S3 Bucket**: Ensure bucket exists and credentials have access
2. **Configure Source**:
```
Name: Company Documents S3
Bucket Name: company-documents
Region: us-west-2
Access Key ID: AKIAIOSFODNN7EXAMPLE
Secret Access Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Prefix: documents/
Watch Folders:
- invoices/
- contracts/
- reports/
```
3. **Test Connection**: Verify credentials and bucket access
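S3 has no real directories, so a prefix and watch folders are just leading parts of object keys. The sketch below shows one plausible interpretation, assuming watch folders are resolved relative to the prefix; that relationship is an assumption of this example, and `keys_in_scope` is not a Readur function:

```python
def keys_in_scope(keys, prefix="", watch_folders=()):
    """Keep object keys under `prefix`, optionally limited to watch folders.

    Assumes watch folders are relative to the prefix, e.g. prefix
    "documents/" plus folder "invoices/" matches "documents/invoices/...".
    """
    scopes = [prefix + folder for folder in watch_folders] or [prefix]
    return [key for key in keys if any(key.startswith(scope) for scope in scopes)]
```

Under that assumption, the configuration above would cover keys such as `documents/invoices/2024-01.pdf` but not `documents/misc/notes.pdf`.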
#### S3-Compatible Services
**MinIO Configuration:**
```
Endpoint URL: https://minio.example.com:9000
Bucket Name: documents
Region: us-east-1 (can be any value for MinIO)
```
**DigitalOcean Spaces:**
```
Endpoint URL: https://nyc3.digitaloceanspaces.com
Bucket Name: my-documents
Region: nyc3
```
## Getting Started
### Adding Your First Source
1. **Access Sources Management**: Navigate to Settings → Sources
2. **Choose Source Type**: Select WebDAV, Local Folder, or S3 based on your needs
3. **Configure Connection**: Enter required credentials and connection details
4. **Test Connection**: Verify connectivity before saving
5. **Configure Sync**: Set up folders to monitor and sync schedule
6. **Initial Sync**: Trigger first synchronization to import existing documents
### Quick Setup Examples
#### Nextcloud WebDAV
```
Name: Nextcloud Documents
Server URL: https://cloud.company.com/remote.php/dav/files/username/
Username: username
Password: app-password
Watch Folders: Documents/, Shared/
Auto Sync: Every 30 minutes
```
#### Local Network Drive
```
Name: Network Archive
Watch Folders: /mnt/network/documents
File Extensions: pdf,doc,docx,txt
Recursive: Yes
Auto Sync: Every 15 minutes
```
#### AWS S3 Bucket
```
Name: AWS Document Bucket
Bucket: company-docs-bucket
Region: us-east-1
Access Key: [AWS Access Key]
Secret Key: [AWS Secret Key]
Prefix: active-documents/
Auto Sync: Every 1 hour
```
## Configuration
### Sync Settings
**Sync Intervals:**
- **Real-time**: Immediate processing (local folders only)
- **5-15 minutes**: High-frequency monitoring
- **30-60 minutes**: Standard monitoring (recommended)
- **2-24 hours**: Low-frequency, large dataset sync
**File Filtering:**
- **File Extensions**: `pdf,jpg,jpeg,png,txt,doc,docx,rtf`
- **Size Limits**: Configurable maximum file size (default: 50MB)
- **Path Exclusions**: Skip specific directories or file patterns
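The three filters combine into a single yes/no decision per file. A minimal sketch, reading the 50MB default as 50 MiB and using shell-style patterns for exclusions; both readings are assumptions, and `should_sync` is a hypothetical name:

```python
from fnmatch import fnmatch

MAX_SIZE_BYTES = 50 * 1024 * 1024  # the 50MB default, interpreted as MiB here

def should_sync(path, size_bytes, extensions, exclude_patterns=()):
    """Apply extension, size, and path-exclusion filters to one file."""
    extension = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    if extension not in extensions:
        return False
    if size_bytes > MAX_SIZE_BYTES:
        return False
    # A file is skipped as soon as any exclusion pattern matches its path
    return not any(fnmatch(path, pattern) for pattern in exclude_patterns)
```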
### Advanced Configuration
**Concurrency Settings:**
- **Concurrent Files**: Number of files processed simultaneously (default: 5)
- **Network Timeout**: Connection timeout for network sources
- **Retry Logic**: Automatic retry for failed downloads
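Bounded concurrency and retries are a common pairing for network sync. The following is a generic sketch of the pattern, not Readur's Rust implementation; the retry count and backoff delays are example values:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def with_retries(task, attempts=3, base_delay=0.5):
    """Run task(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the last error
            time.sleep(base_delay * 2 ** attempt)

def process_all(files, process_file, concurrent_files=5):
    """Process files with at most `concurrent_files` running at once."""
    with ThreadPoolExecutor(max_workers=concurrent_files) as pool:
        return list(pool.map(lambda f: with_retries(lambda: process_file(f)), files))
```

Lowering `concurrent_files` trades throughput for less load on the source server, which is the same trade-off the setting above controls.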
**Deduplication:**
- **Hash-based**: SHA-256 content hashing prevents duplicate storage
- **Cross-source**: Duplicates detected across all sources
- **Metadata Preservation**: Tracks file origins while avoiding storage duplication
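Hash-based deduplication can be pictured as an index keyed by content hash. A sketch of the idea, using an in-memory dictionary where a real system would use a database; `register` and the index layout are invented for illustration:

```python
import hashlib

def content_hash(data):
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()

def register(index, source, path, data):
    """Record a file; return True if its content is new, False if duplicate.

    `index` maps content hash -> list of (source, path) origins, so every
    origin is remembered while the content would be stored only once.
    """
    digest = content_hash(data)
    is_new = digest not in index
    index.setdefault(digest, []).append((source, path))
    return is_new
```

Because the key is the content rather than the filename, the same PDF arriving from WebDAV and from S3 produces one entry with two origins.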
## Sync Operations
### Manual Sync
**Trigger Immediate Sync:**
1. Navigate to Sources page
2. Find the source to sync
3. Click the "Sync Now" button
4. Monitor progress in real-time
**Deep Scan:**
- Forces complete re-scan of entire source
- Useful for detecting changes in large directories
- Automatically triggered periodically
### Sync Status
**Status Indicators:**
- 🟢 **Idle**: Source ready, no sync in progress
- 🟡 **Syncing**: Active synchronization in progress
- 🔴 **Error**: Sync failed, requires attention
- ⚪ **Disabled**: Source disabled, no automatic sync
**Progress Information:**
- Files discovered vs. processed
- Current operation (scanning, downloading, processing)
- Estimated completion time
- Transfer speeds and statistics
### Stopping Sync
**Graceful Cancellation:**
1. Click "Stop Sync" button during active sync
2. Current file processing completes
3. Sync stops cleanly without corruption
4. Partial progress is saved
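The steps above describe cooperative cancellation: the stop request is only honored between files. A generic sketch of that pattern with a `threading.Event`; the function name and structure are illustrative rather than taken from Readur:

```python
import threading

def run_sync(files, process_file, stop_requested):
    """Process files in order, stopping cleanly once a stop is requested.

    The flag is checked only between files, so a file that has started
    always finishes. Returns the list of files completed so far.
    """
    completed = []
    for name in files:
        if stop_requested.is_set():
            break
        process_file(name)
        completed.append(name)
    return completed
```

The returned list is the "partial progress" a later sync could resume from.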
## Health Monitoring
### Health Scores
Sources are continuously monitored and assigned health scores (0-100):
- **90-100**: ✅ Excellent - No issues detected
- **75-89**: ⚠️ Good - Minor issues or warnings
- **50-74**: ⚠️ Fair - Moderate issues requiring attention
- **25-49**: ❌ Poor - Significant problems
- **0-24**: ❌ Critical - Severe issues, manual intervention required
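If you consume health scores in your own monitoring, the bands above translate directly into a lookup. A small sketch; `health_category` is a hypothetical helper:

```python
def health_category(score):
    """Map a 0-100 health score to the category names listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Thresholds are the lower bound of each band, checked from best to worst
    for threshold, name in ((90, "Excellent"), (75, "Good"), (50, "Fair"), (25, "Poor")):
        if score >= threshold:
            return name
    return "Critical"
```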
### Health Checks
**Automatic Validation** (every 30 minutes):
- Connection testing
- Credential verification
- Configuration validation
- Sync pattern analysis
- Error rate monitoring
**Common Health Issues:**
- Authentication failures
- Network connectivity problems
- Permission or access issues
- Configuration errors
- Rate limiting or throttling
### Health Notifications
**Alert Types:**
- Connection failures
- Authentication expiry
- Sync errors
- Performance degradation
- Configuration warnings
## Troubleshooting
### Common Issues
#### WebDAV Connection Problems
**Symptom**: "Connection failed" or authentication errors
**Solutions**:
1. Verify server URL format:
- Nextcloud: `https://server.com/remote.php/dav/files/username/`
- ownCloud: `https://server.com/remote.php/dav/files/username/`
- Generic: `https://server.com/webdav/`
2. Check credentials:
- Use app passwords instead of main passwords
- Verify username/password combination
- Test credentials in web browser or WebDAV client
3. Network issues:
- Verify server is accessible from Readur
- Check firewall and SSL certificate issues
- Test with curl: `curl -u username:password https://server.com/webdav/`
#### Local Folder Issues
**Symptom**: "Permission denied" or "Directory not found"
**Solutions**:
1. Check directory permissions:
```bash
ls -la /path/to/watch/folder
chmod 755 /path/to/watch/folder # If needed
```
2. Verify path exists:
```bash
stat /path/to/watch/folder
```
3. For network mounts:
```bash
mount | grep /path/to/mount # Verify mount
ls -la /path/to/mount # Test access
```
#### S3 Access Problems
**Symptom**: "Access denied" or "Bucket not found"
**Solutions**:
1. Verify credentials and permissions:
```bash
aws s3 ls s3://bucket-name --profile your-profile
```
2. Check bucket policy and IAM permissions
3. Verify region configuration matches bucket region
4. For S3-compatible services, ensure correct endpoint URL
### Performance Issues
#### Slow Sync Performance
**Causes and Solutions**:
1. **Large file sizes**: Increase timeout values, consider file size limits
2. **Network latency**: Reduce concurrent connections, increase intervals
3. **Server throttling**: Implement longer delays between requests
4. **Large directories**: Use watch folders to limit scope
#### High Resource Usage
**Optimization Strategies**:
1. **Reduce concurrency**: Lower concurrent file processing
2. **Increase intervals**: Less frequent sync checks
3. **Filter files**: Limit to specific file types and sizes
4. **Stagger syncs**: Avoid multiple sources syncing simultaneously
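One simple way to stagger syncs is to spread the sources evenly across a single interval. The sketch below computes start offsets you could apply by hand when scheduling; it assumes all sources share one interval, and `staggered_offsets` is not a Readur feature:

```python
def staggered_offsets(source_names, interval_minutes):
    """Spread sources evenly across one sync interval (offsets in minutes)."""
    if not source_names:
        return {}
    step = interval_minutes / len(source_names)
    return {name: round(i * step) for i, name in enumerate(source_names)}
```

With three sources on a 30-minute interval, the syncs would start 10 minutes apart instead of all at once.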
### Error Recovery
**Automatic Recovery:**
- Failed files are automatically retried
- Temporary network issues are handled gracefully
- Sync resumes from last successful point
**Manual Recovery:**
1. Check source health status
2. Review error logs in source details
3. Test connection manually
4. Trigger deep scan to reset sync state
## Best Practices
### Security
1. **Use Dedicated Credentials**: Create app-specific passwords and access keys
2. **Limit Permissions**: Grant minimum required access to source accounts
3. **Regular Rotation**: Periodically update passwords and access keys
4. **Network Security**: Use HTTPS/TLS for all connections
### Performance
1. **Strategic Scheduling**: Stagger sync times for multiple sources
2. **Scope Limitation**: Use watch folders to limit sync scope
3. **File Filtering**: Exclude unnecessary file types and large files
4. **Monitor Resources**: Watch CPU, memory, and network usage
### Organization
1. **Descriptive Names**: Use clear, descriptive source names
2. **Consistent Structure**: Maintain consistent folder organization
3. **Documentation**: Document source purposes and configurations
4. **Regular Maintenance**: Periodically review and clean up sources
### Reliability
1. **Health Monitoring**: Regularly check source health scores
2. **Backup Configuration**: Document source configurations
3. **Test Scenarios**: Periodically test sync and recovery procedures
4. **Monitor Logs**: Review sync logs for patterns or issues
## Next Steps
- Configure [notifications](notifications.md) for sync events
- Set up [advanced search](advanced-search.md) to find synced documents
- Review [OCR optimization](dev/OCR_OPTIMIZATION_GUIDE.md) for processing improvements
- Explore [labels and organization](labels-and-organization.md) for document management


@ -10,11 +10,12 @@ A comprehensive guide to using Readur's features for document management, OCR pr
- [Dashboard](#dashboard)
- [Document Management](#document-management)
- [Advanced Search](#advanced-search)
- [Folder Watching](#folder-watching)
- [Sources and Synchronization](#sources-and-synchronization)
- [Document Upload](#document-upload)
- [OCR Processing](#ocr-processing)
- [Search Features](#search-features)
- [Tags and Organization](#tags-and-organization)
- [Labels and Organization](#labels-and-organization)
- [User Management](#user-management)
- [User Settings](#user-settings)
- [Tips for Best Results](#tips-for-best-results)
@ -117,20 +118,30 @@ tag:important invoice # Search within tagged documents
type:pdf contract # Search only PDFs
```
### Folder Watching
### Sources and Synchronization
The folder watching feature automatically imports documents:
Readur's Sources feature provides automated document ingestion from multiple external storage systems:
1. **Non-destructive**: Source files remain untouched
2. **Automatic Processing**: New files are detected and processed
3. **Configurable Intervals**: Adjust scan frequency
4. **Multiple Sources**: Watch local folders, network drives, cloud storage
1. **Multi-Protocol Support**: WebDAV, Local Folders, and S3-compatible storage
2. **Non-destructive**: Source files remain untouched in their original locations
3. **Automated Syncing**: Scheduled synchronization with configurable intervals
4. **Health Monitoring**: Proactive monitoring and validation of source connections
5. **Intelligent Processing**: Duplicate detection, incremental syncs, and OCR integration
#### Setting Up Watch Folders
1. Go to Settings → Sources
2. Add a new source with type "Local Folder"
3. Configure the path and scan interval
4. Enable/disable the source as needed
#### Supported Source Types
- **WebDAV Sources**: Nextcloud, ownCloud, generic WebDAV servers
- **Local Folder Sources**: Local filesystem directories and network mounts
- **S3 Sources**: Amazon S3 and S3-compatible storage (MinIO, DigitalOcean Spaces)
#### Setting Up Sources
1. Navigate to Settings → Sources
2. Click "Add Source" and select source type
3. Configure connection details and credentials
4. Test connection and configure sync settings
5. Set up folders to monitor and sync schedule
> 📖 **For comprehensive source configuration**, see the [Sources Guide](sources-guide.md)
## Document Upload
@ -171,43 +182,147 @@ The folder watching feature automatically imports documents:
## Search Features
### Quick Search
Readur provides powerful search capabilities with multiple modes and advanced filtering options.
### Search Modes
- **Simple Search**: General purpose searching with automatic stemming and fuzzy matching
- **Phrase Search**: Find exact phrases using quotes (e.g., `"quarterly report"`)
- **Fuzzy Search**: Handle typos and OCR errors with approximate matching (e.g., `invoice~`)
- **Boolean Search**: Complex queries with AND, OR, NOT operators
### Search Interface
#### Quick Search
- Available in the header on all pages
- Instant results as you type
- Shows top 5 matches with snippets
- Real-time suggestions
### Advanced Search Page
#### Advanced Search Page
- Full search interface with all filters
- Multiple search modes selector
- Comprehensive filtering options
- Export search results
- Save frequently used searches
- Search history
- Search history and analytics
### Advanced Filtering
- **File Types**: Filter by PDF, images, documents, etc.
- **Date Ranges**: Search within specific time periods
- **Labels**: Filter by document tags and categories
- **Sources**: Search within specific sync sources
- **File Size**: Filter by document size ranges
- **OCR Status**: Filter by text extraction status
### Search Tips
1. Use quotes for exact phrases
2. Combine filters for precise results
3. Use wildcards: `inv*` matches invoice, inventory
4. Search in specific fields: `filename:report`
1. Use quotes for exact phrases: `"project status"`
2. Combine text search with filters for precision
3. Use wildcards: `proj*` matches project, projects, projection
4. Search specific fields: `filename:report`, `label:urgent`
5. Use boolean logic: `(budget OR financial) AND 2024`
## Tags and Organization
> 🔍 **For detailed search techniques**, see the [Advanced Search Guide](advanced-search.md)
### Creating Tags
1. Select document(s)
2. Click "Add Tag"
3. Enter tag name or select existing
4. Tags are color-coded for easy identification
## Labels and Organization
### Tag Management
- Rename tags globally
- Merge similar tags
- Delete unused tags
- Set tag colors
Readur's labeling system provides comprehensive document organization and categorization capabilities.
### Label Types
- **User Labels**: Custom labels created and managed by users with full control
- **System Labels**: Automatic labels generated by Readur (OCR status, file type, etc.)
- **Color Coding**: Visual identification with customizable label colors
- **Hierarchical Structure**: Organize labels in categories and subcategories
### Creating and Managing Labels
#### Creating Labels
1. **Via Settings**: Go to Settings → Labels and click "Create Label"
2. **During Upload**: Add labels while uploading documents
3. **Document Details**: Add labels directly from document pages
4. **Bulk Operations**: Create and assign labels to multiple documents
#### Label Operations
- **Rename**: Change label names (updates all documents)
- **Merge**: Combine similar labels into one
- **Color Management**: Customize label colors for visual organization
- **Bulk Assignment**: Apply labels to multiple documents at once
### Organization Strategies
#### Category-Based Organization
- **Projects**: "Project Alpha", "Q1 Budget", "Infrastructure"
- **Departments**: "HR", "Finance", "Legal", "Marketing"
- **Document Types**: "Invoices", "Contracts", "Reports", "Policies"
- **Status**: "Draft", "Final", "Approved", "Archived"
#### Time-Based Organization
- **Fiscal Periods**: "Q1 2024", "FY2024", "Annual Review"
- **Project Phases**: "Planning", "Implementation", "Review"
- **Event-Based**: "Pre-Launch", "Launch", "Post-Launch"
### Smart Collections
Create saved searches based on:
- Tag combinations
- Date ranges
- File types
- Custom criteria
Create saved searches that automatically include documents with specific labels:
- **Active Projects**: Documents with current project labels
- **Pending Review**: Documents labeled for review
- **High Priority**: Documents with urgent or critical labels
> 🏷️ **For comprehensive labeling strategies**, see the [Labels and Organization Guide](labels-and-organization.md)
## User Management
Readur provides comprehensive user management with support for both local authentication and enterprise SSO integration.
### Authentication Methods
#### Local Authentication
- **Traditional Login**: Username and password authentication
- **Secure Storage**: Passwords hashed with bcrypt for security
- **Self Registration**: Users can create their own accounts (if enabled)
#### OIDC/SSO Authentication
- **Enterprise Integration**: Single Sign-On with corporate identity providers
- **Supported Providers**: Microsoft Azure AD, Google Workspace, Okta, Auth0, Keycloak
- **Automatic Provisioning**: User accounts created automatically on first login
- **Seamless Experience**: Users authenticate with existing corporate credentials
### User Roles and Permissions
#### User Role
Standard users with access to core document management functionality:
- Upload and manage documents
- Search and view documents
- Configure personal settings
- Create and manage labels
- Set up personal sources
#### Admin Role
Administrators with full system access and user management capabilities:
- **User Management**: Create, modify, and delete user accounts
- **System Settings**: Configure global system parameters
- **Role Management**: Assign and modify user roles
- **System Monitoring**: View system health and performance metrics
### Administrative Features
Administrators can access user management via Settings → Users:
- **Create Users**: Add new user accounts with role assignment
- **Modify Users**: Update user information, roles, and passwords
- **User Overview**: View all users with creation dates and roles
- **Authentication Methods**: Manage both local and OIDC users
- **Bulk Operations**: Perform operations on multiple users
### Mixed Authentication Environments
Readur supports both local and OIDC users in the same installation:
- Local admin accounts for system management
- OIDC user accounts for regular enterprise users
- Flexible role assignment regardless of authentication method
> 👥 **For detailed user administration**, see the [User Management Guide](user-management-guide.md)
> 🔐 **For OIDC configuration**, see the [OIDC Setup Guide](oidc-setup.md)
## User Settings
@ -276,7 +391,21 @@ Create saved searches based on:
## Next Steps
- Explore the [API Reference](api-reference.md) for automation
- Learn about [advanced configuration](configuration.md)
- Set up [automated workflows](WATCH_FOLDER.md)
- Optimize [OCR performance](dev/OCR_OPTIMIZATION_GUIDE.md)
### Explore Advanced Features
- [🔗 Sources Guide](sources-guide.md) - Set up WebDAV, Local Folder, and S3 synchronization
- [🔎 Advanced Search](advanced-search.md) - Master search modes, syntax, and optimization
- [🏷️ Labels & Organization](labels-and-organization.md) - Implement effective document organization
- [👥 User Management](user-management-guide.md) - Configure authentication and user administration
- [🔐 OIDC Setup](oidc-setup.md) - Integrate with enterprise identity providers
### System Administration
- [📦 Installation Guide](installation.md) - Full installation and setup instructions
- [🔧 Configuration](configuration.md) - Environment variables and advanced configuration
- [🚀 Deployment Guide](deployment.md) - Production deployment with SSL and monitoring
- [📁 Watch Folder Guide](WATCH_FOLDER.md) - Legacy folder watching setup
### Development and Integration
- [🔌 API Reference](api-reference.md) - REST API for automation and integration
- [🏗️ Developer Documentation](dev/) - Architecture and development setup
- [🔍 OCR Optimization](dev/OCR_OPTIMIZATION_GUIDE.md) - Improve OCR performance
- [📊 Queue Architecture](dev/QUEUE_IMPROVEMENTS.md) - Background processing optimization


@ -0,0 +1,440 @@
# User Management Guide
This comprehensive guide covers user administration, authentication, role-based access control, and user preferences in Readur.
## Table of Contents
- [Overview](#overview)
- [Authentication Methods](#authentication-methods)
- [User Roles and Permissions](#user-roles-and-permissions)
- [Admin User Management](#admin-user-management)
- [User Settings and Preferences](#user-settings-and-preferences)
- [OIDC/SSO Integration](#oidcsso-integration)
- [Security Best Practices](#security-best-practices)
- [Troubleshooting](#troubleshooting)
## Overview
Readur provides a comprehensive user management system with support for both local authentication and enterprise SSO integration. The system features:
- **Dual Authentication**: Local accounts and OIDC/SSO support
- **Role-Based Access Control**: Admin and User roles with distinct permissions
- **User Preferences**: Extensive per-user configuration options
- **Enterprise Integration**: OIDC support for corporate identity providers
- **Security Features**: JWT tokens, bcrypt password hashing, and session management
## Authentication Methods
### Local Authentication
Local authentication uses traditional username/password combinations stored securely in Readur's database.
#### Features:
- **Secure Storage**: Passwords hashed with bcrypt (cost factor 12)
- **JWT Tokens**: 24-hour token validity with secure signing
- **User Registration**: Self-service account creation (if enabled)
- **Password Requirements**: Configurable complexity requirements
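
To make the 24-hour token lifetime concrete, here is a minimal sketch of issuing and verifying an HS256-signed JWT using only the Python standard library. It is not Readur's implementation (the backend is written in Rust): the claim names `sub` and `role`, the HS256 algorithm, and the function names are all assumptions chosen for illustration.

```python
import base64
import hashlib
import hmac
import json

TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # 24-hour validity, per the feature list


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(signing_input: str, secret: bytes) -> bytes:
    """HMAC-SHA256 over the 'header.claims' string."""
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(username: str, role: str, secret: bytes, now: int) -> str:
    """Build a token that expires 24 hours after `now` (Unix seconds)."""
    header = {"alg": "HS256", "typ": "JWT"}
    # Claim names are assumptions, not Readur's documented token layout.
    claims = {"sub": username, "role": role, "iat": now,
              "exp": now + TOKEN_LIFETIME_SECONDS}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + b64url_encode(sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: int) -> dict:
    """Return the claims if the signature is valid and the token is unexpired."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, signature_b64 = parts
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = sign(header_b64 + "." + claims_b64, secret)
    # compare_digest avoids leaking how many signature bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(claims_b64))
    if now >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read the system clock, which is why an inaccurate server time can make valid tokens appear expired.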
#### Creating Local Users:
1. **Admin Creation** (via Settings):
   - Navigate to Settings → Users (Admin only)
   - Click "Add User"
   - Enter username, email, and initial password
   - Assign user role (Admin or User)
2. **Self Registration** (if enabled):
   - Visit the registration page
   - Provide username, email, and password
   - Account created with default User role
### OIDC/SSO Authentication
OIDC (OpenID Connect) authentication integrates with enterprise identity providers for single sign-on.
#### Supported Features:
- **Standard OIDC Flow**: Authorization code flow with PKCE
- **Automatic Discovery**: Reads provider configuration from `.well-known/openid-configuration`
- **User Provisioning**: Automatic user creation on first login
- **Identity Linking**: Maps OIDC identities to local user accounts
- **Profile Sync**: Updates user information from OIDC provider
#### Supported Providers:
- **Microsoft Azure AD**: Enterprise identity management
- **Google Workspace**: Google's enterprise SSO
- **Okta**: Popular enterprise identity provider
- **Auth0**: Developer-friendly authentication platform
- **Keycloak**: Open-source identity management
- **Generic OIDC**: Any standards-compliant OIDC provider
See the [OIDC Setup Guide](oidc-setup.md) for detailed configuration instructions.
## User Roles and Permissions
### User Role
**Standard Users** have access to core document management functionality:
**Permissions:**
- ✅ Upload and manage own documents
- ✅ Search all documents (based on sharing settings)
- ✅ Configure personal settings and preferences
- ✅ Create and manage personal labels
- ✅ Use OCR processing features
- ✅ Access personal sources (WebDAV, local folders, S3)
- ✅ View personal notifications
- ❌ User management (cannot create/modify other users)
- ❌ System-wide settings or configuration
- ❌ Access to other users' private documents
### Admin Role
**Administrators** have full system access and user management capabilities:
**Additional Permissions:**
- ✅ **User Management**: Create, modify, and delete user accounts
- ✅ **System Settings**: Configure global system parameters
- ✅ **User Impersonation**: Access other users' documents (if needed)
- ✅ **System Monitoring**: View system health and performance metrics
- ✅ **Advanced Configuration**: OCR settings, source configurations
- ✅ **Security Management**: Token management, authentication settings
**Default Admin Account:**
- Username: `admin`
- Default Password: `readur2024` ⚠️ **Change immediately in production!**
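
The role split described in this section can be modeled as two permission sets, with Admin inheriting everything a User can do. The sketch below assumes hypothetical permission identifiers derived from the lists above; Readur's actual authorization code may be organized differently.

```python
from enum import Enum


class Role(Enum):
    USER = "User"
    ADMIN = "Admin"


# Hypothetical identifiers, one per permission in the User list above.
USER_PERMISSIONS = frozenset({
    "manage_own_documents",
    "search_documents",
    "personal_settings",
    "personal_labels",
    "use_ocr",
    "personal_sources",
    "view_notifications",
})

# Hypothetical identifiers for the additional Admin permissions.
ADMIN_ONLY_PERMISSIONS = frozenset({
    "manage_users",
    "system_settings",
    "access_other_users_documents",
    "system_monitoring",
    "advanced_configuration",
    "security_management",
})


def permissions_for(role: Role) -> frozenset:
    """Admins get the User permissions plus the admin-only ones."""
    if role is Role.ADMIN:
        return USER_PERMISSIONS | ADMIN_ONLY_PERMISSIONS
    return USER_PERMISSIONS


def is_allowed(role: Role, permission: str) -> bool:
    """Check a single permission for a role."""
    return permission in permissions_for(role)
```

For example, `is_allowed(Role.USER, "manage_users")` is false while `is_allowed(Role.ADMIN, "manage_users")` is true, mirroring the ❌ and ✅ entries above.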
## Admin User Management
### Accessing User Management
1. Log in as an administrator
2. Navigate to **Settings** → **Users**
3. The user management interface displays all system users
### User Management Operations
#### Creating Users
1. **Click "Add User"** in the Users section
2. **Fill out user information**:
   ```
   Username: john.doe
   Email: john.doe@company.com
   Password: [secure-password]
   Role: User (or Admin)
   ```
3. **Save** to create the account
4. **Notify the user** of their credentials
#### Modifying Users
1. **Find the user** in the user list
2. **Click "Edit"** or the user row
3. **Update information**:
   - Change email address
   - Reset password
   - Modify role (User ↔ Admin)
   - Update username (if needed)
4. **Save changes**
#### Deleting Users
1. **Select the user** to delete
2. **Click "Delete"**
3. **Confirm deletion** (this action cannot be undone)
**Important Notes:**
- Users cannot delete their own accounts
- Deleting a user removes all their documents and settings
- To keep a user's documents and settings, consider disabling the account instead of deleting it (account status is listed as a future feature)
#### Bulk Operations
**Future Feature**: Bulk user operations for enterprise deployments:
- Bulk user import from CSV
- Bulk role changes
- Bulk user deactivation
### User Information Display
The user management interface shows:
- **Username and Email**: Primary identification
- **Role**: Current role assignment
- **Created Date**: Account creation timestamp
- **Last Login**: Recent activity indicator
- **Auth Provider**: Local or OIDC authentication method
- **Status**: Active/disabled status (future feature)
## User Settings and Preferences
### Personal Settings Access
Users can configure their preferences via:
1. **User Menu** → **Settings** (top-right corner)
2. **Settings Page** → **Personal** tab
### Settings Categories
#### OCR Preferences
**Language Settings:**
- **OCR Language**: Primary language for text recognition (25+ languages)
- **Fallback Languages**: Secondary languages for mixed documents
- **Auto-Detection**: Automatic language detection (if supported)
**Processing Options:**
- **Image Enhancement**: Enable preprocessing for better OCR results
- **Auto-Rotation**: Automatically rotate images for optimal text recognition
- **Confidence Threshold**: Minimum confidence level for OCR acceptance
- **Processing Priority**: User's OCR queue priority level
#### Search Preferences
**Display Settings:**
- **Results Per Page**: Number of search results to display (10-100)
- **Snippet Length**: Length of text previews in search results
- **Fuzzy Search Threshold**: Sensitivity for fuzzy/approximate matching
- **Search History**: Enable/disable search query history
**Search Behavior:**
- **Default Sort Order**: Relevance, date, filename, size
- **Auto-Complete**: Enable search suggestions
- **Real-time Search**: Search as you type functionality
#### File Processing
**Upload Settings:**
- **Default File Types**: Preferred file types for uploads
- **Auto-OCR**: Automatically queue uploads for OCR processing
- **Duplicate Handling**: How to handle duplicate file uploads
- **File Size Limits**: Personal file size restrictions
**Storage Preferences:**
- **Compression**: Enable compression for storage savings
- **Retention Period**: How long to keep documents (if configured)
- **Archive Behavior**: Automatic archiving of old documents
#### Interface Preferences
**Display Options:**
- **Theme**: Light/dark mode preference
- **Timezone**: Local timezone for timestamp display
- **Date Format**: Preferred date/time display format
- **Language**: Interface language (separate from OCR language)
**Navigation:**
- **Default View**: List or grid view for document browser
- **Sidebar Collapsed**: Default sidebar state
- **Items Per Page**: Default pagination size
#### Notification Settings
**Notification Types:**
- **OCR Completion**: Notify when document processing completes
- **Source Sync**: Notifications for source synchronization events
- **System Alerts**: Important system messages and warnings
- **Storage Warnings**: Alerts for storage space or quota issues
**Delivery Methods:**
- **In-App Notifications**: Browser notifications within Readur
- **Email Notifications**: Email delivery for important events (future)
- **Desktop Notifications**: Browser push notifications (future)
### Source-Specific Settings
**WebDAV Preferences:**
- **Connection Timeout**: How long to wait for WebDAV responses
- **Retry Attempts**: Number of retries for failed downloads
- **Sync Schedule**: Preferred automatic sync frequency
**Local Folder Settings:**
- **Watch Interval**: How often to scan local directories
- **File Permissions**: Permission handling for processed files
- **Symlink Handling**: Follow symbolic links during scans
### Saving and Applying Settings
1. **Modify preferences** in the settings interface
2. **Click "Save Settings"** to apply changes
3. **Settings take effect immediately** for most options
4. **Some settings** may require logout/login to fully apply
## OIDC/SSO Integration
### Overview
OIDC integration allows users to authenticate using their corporate credentials without creating separate passwords for Readur.
### User Experience with OIDC
#### First-Time Login
1. **User clicks "Login with SSO"** on login page
2. **Redirected to corporate identity provider** (e.g., Azure AD, Okta)
3. **User authenticates** with corporate credentials
4. **Readur creates user account automatically** with information from OIDC provider
5. **User is logged in** and can immediately start using Readur
#### Subsequent Logins
1. **Click "Login with SSO"**
2. **Automatic redirect** to identity provider
3. **Single sign-on** (may not require re-authentication)
4. **Immediate access** to Readur
### OIDC User Account Details
**Automatic Account Creation:**
- **Username**: Derived from OIDC `preferred_username` or `sub` claim
- **Email**: Uses OIDC `email` claim
- **Role**: Default "User" role (admins can promote later)
- **Auth Provider**: Marked as "OIDC" in user management
**Identity Mapping:**
- **OIDC Subject**: Unique identifier from identity provider
- **OIDC Issuer**: Identity provider URL
- **Linked Accounts**: Maps OIDC identity to Readur user
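
The mapping rules above (username from `preferred_username` with `sub` as the fallback, email from `email`, default role "User") can be expressed as a small function. This is a sketch under those stated rules; the `ProvisionedUser` type and its field names are hypothetical and do not reflect Readur's database schema.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProvisionedUser:
    username: str
    email: Optional[str]
    role: str
    auth_provider: str
    oidc_subject: str
    oidc_issuer: str


def provision_from_claims(claims: Mapping[str, str]) -> ProvisionedUser:
    """Build a new account record from the claims of a first-time OIDC login."""
    subject = claims.get("sub")
    issuer = claims.get("iss")
    if not subject or not issuer:
        raise ValueError("claims must include 'sub' and 'iss'")
    # Prefer the human-readable name; fall back to the stable subject ID.
    username = claims.get("preferred_username") or subject
    return ProvisionedUser(
        username=username,
        email=claims.get("email"),
        role="User",            # default role; admins can promote later
        auth_provider="OIDC",
        oidc_subject=subject,
        oidc_issuer=issuer,
    )
```

Storing the subject together with the issuer matters because a `sub` value is only guaranteed to be unique within a single identity provider.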
### Mixed Authentication Environments
Readur supports both local and OIDC users in the same installation:
- **Local Admin Accounts**: For initial setup and emergency access
- **OIDC User Accounts**: For regular enterprise users
- **Role Management**: Admins can promote OIDC users to admin role
- **Account Linking**: Future feature to link local and OIDC accounts
### OIDC Configuration
See the detailed [OIDC Setup Guide](oidc-setup.md) for complete configuration instructions.
## Security Best Practices
### Password Security
**For Local Accounts:**
1. **Use Strong Passwords**: Minimum 12 characters with mixed case, numbers, symbols
2. **Regular Rotation**: Change passwords periodically
3. **Unique Passwords**: Don't reuse passwords from other systems
4. **Admin Passwords**: Use extra-strong passwords for administrator accounts
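
The first guideline (at least 12 characters with mixed case, numbers, and symbols) can be checked mechanically. The sketch below encodes that guideline only; it is not Readur's password validator, whose complexity rules are configurable, and the function name is hypothetical.

```python
import string

MIN_LENGTH = 12  # from the guideline above


def password_problems(password: str) -> list:
    """Return the unmet guideline items; an empty list means all are met."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("shorter than %d characters" % MIN_LENGTH)
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    # string.punctuation covers the ASCII symbols only.
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    return problems
```

A check like this says nothing about reuse or rotation, so it complements the remaining guidelines rather than replacing them.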
### JWT Token Security
**Token Management:**
- **Browser Storage**: Tokens are kept in browser localStorage, which any script running on the page can read, so protection against cross-site scripting matters
- **Automatic Expiration**: 24-hour token lifetime
- **Secure Transmission**: HTTPS required for production
- **Token Rotation**: Regular token refresh (future feature)
### Access Control
**Role Management:**
1. **Principle of Least Privilege**: Grant minimum necessary permissions
2. **Regular Review**: Periodically audit user roles and permissions
3. **Admin Accounts**: Limit number of administrator accounts
4. **Account Deactivation**: Disable accounts for departed users
### OIDC Security
**Provider Configuration:**
1. **Use HTTPS**: Ensure all OIDC endpoints use HTTPS
2. **Client Secret Protection**: Secure storage of OIDC client secrets
3. **Scope Limitation**: Request only necessary OIDC scopes
4. **Token Validation**: Proper verification of OIDC tokens
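
The HTTPS recommendation can be verified against a provider's discovery document. The sketch below inspects an already-fetched document (a plain dictionary) and reports non-HTTPS endpoints and an issuer mismatch. The metadata keys are the standard OpenID Connect discovery fields; the function name is hypothetical and fetching the document is left out.

```python
from urllib.parse import urlparse

# Standard OpenID Connect discovery metadata fields that hold URLs.
ENDPOINT_KEYS = (
    "issuer",
    "authorization_endpoint",
    "token_endpoint",
    "userinfo_endpoint",
    "jwks_uri",
)


def discovery_problems(discovery_doc: dict, configured_issuer: str) -> list:
    """List configuration problems found in a discovery document."""
    problems = []
    for key in ENDPOINT_KEYS:
        url = discovery_doc.get(key)
        if url is None:
            continue  # optional fields may be absent
        if urlparse(url).scheme != "https":
            problems.append("%s is not HTTPS: %s" % (key, url))
    # The discovery spec requires the issuer to match exactly.
    if discovery_doc.get("issuer") != configured_issuer:
        problems.append("issuer mismatch: expected %s" % configured_issuer)
    return problems
```

An empty result only confirms the document's shape; token signatures and claims still need to be validated during login.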
### Monitoring and Auditing
**Access Monitoring:**
- **Login Tracking**: Monitor successful and failed login attempts
- **Role Changes**: Audit administrator role assignments
- **Account Activity**: Track user document access patterns
- **Security Events**: Log authentication and authorization events
## Troubleshooting
### Common Authentication Issues
#### Local Login Problems
**Symptom**: "Invalid username or password"
**Solutions**:
1. **Verify credentials**: Check username/password carefully
2. **Account existence**: Confirm account exists in user management
3. **Password reset**: Admin can reset user password
4. **Account status**: Ensure account is active/enabled
#### OIDC Login Problems
**Symptom**: OIDC login fails or redirects incorrectly
**Solutions**:
1. **Check OIDC configuration**: Verify client ID, secret, and issuer URL
2. **Redirect URI**: Ensure redirect URI is registered with OIDC provider
3. **Provider status**: Confirm OIDC provider is operational
4. **Network connectivity**: Verify Readur can reach OIDC endpoints
#### JWT Token Issues
**Symptom**: "Invalid token" or frequent logouts
**Solutions**:
1. **Check system time**: Ensure server time is accurate
2. **JWT secret**: Verify JWT_SECRET environment variable
3. **Token expiration**: Tokens expire after 24 hours
4. **Browser storage**: Clear localStorage and re-login
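
When diagnosing the token issues above, it can help to read the expiry claim directly from a token. The sketch below decodes the payload without verifying the signature, so it is suitable for debugging only, never for authorization decisions. It assumes the token carries a standard `exp` claim; the function name is hypothetical.

```python
import base64
import json


def token_expiry_status(token: str, now: int) -> str:
    """Describe a JWT's expiry relative to `now` (Unix seconds), unverified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    payload = parts[1]
    # Restore the base64url padding that JWTs strip.
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    exp = claims.get("exp")
    if exp is None:
        return "no exp claim"
    remaining = exp - now
    if remaining <= 0:
        return "expired %ds ago" % -remaining
    return "valid for %ds" % remaining
```

If a freshly issued token already reports as expired, compare the server clock with the client clock before looking elsewhere.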
### User Management Issues
#### Cannot Create Users
**Symptom**: User creation fails
**Solutions**:
1. **Admin permissions**: Ensure logged in as administrator
2. **Duplicate usernames**: Check for existing username/email
3. **Database connectivity**: Verify database connection
4. **Input validation**: Ensure all required fields are provided
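
The duplicate and required-field checks above can be run before submitting a new account. The sketch below is a hypothetical preflight helper working on plain dictionaries; the field names follow the creation form shown earlier, and the case-insensitive comparison is an assumption rather than documented Readur behavior.

```python
REQUIRED_FIELDS = ("username", "email", "password", "role")
VALID_ROLES = ("User", "Admin")


def creation_problems(new_user: dict, existing_users: list) -> list:
    """Return reasons a user creation request would likely be rejected."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(new_user.get(field) or "").strip():
            problems.append("missing required field: " + field)
    role = new_user.get("role")
    if role and role not in VALID_ROLES:
        problems.append("unknown role: " + str(role))
    # Assumption: usernames and emails are compared case-insensitively.
    username = str(new_user.get("username") or "").strip().lower()
    email = str(new_user.get("email") or "").strip().lower()
    for user in existing_users:
        if username and user["username"].lower() == username:
            problems.append("username already exists: " + username)
        if email and user["email"].lower() == email:
            problems.append("email already exists: " + email)
    return problems
```

An empty list does not guarantee success, since database connectivity and administrator permissions still apply.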
#### User Settings Not Saving
**Symptom**: Settings changes don't persist
**Solutions**:
1. **Check permissions**: Ensure user has permission to modify settings
2. **Database issues**: Verify database write permissions
3. **Browser issues**: Try clearing browser cache
4. **Network connectivity**: Ensure stable connection during save
### Role and Permission Issues
#### Users Cannot Access Features
**Symptom**: User reports missing functionality
**Solutions**:
1. **Check user role**: Verify user has appropriate role assignment
2. **Permission scope**: Confirm feature is available to user role
3. **Session refresh**: User may need to logout/login after role change
4. **Feature availability**: Ensure feature is enabled in system configuration
#### Admin Access Problems
**Symptom**: Admin cannot access management features
**Solutions**:
1. **Role verification**: Confirm user has Admin role
2. **Token validity**: Ensure JWT token contains correct role information
3. **Database consistency**: Verify role is correctly stored in database
4. **Login refresh**: Try logging out and logging back in
### Performance Issues
#### Slow User Operations
**Symptom**: User management operations are slow
**Solutions**:
1. **Database performance**: Check database query performance
2. **User count**: Large user counts may require pagination
3. **Network latency**: OIDC operations may be affected by provider latency
4. **System resources**: Monitor CPU and memory usage
## Next Steps
- Configure [OIDC integration](oidc-setup.md) for enterprise authentication
- Set up [sources](sources-guide.md) for document synchronization
- Review [security best practices](deployment.md#security-considerations)
- Explore [advanced search](advanced-search.md) capabilities
- Configure [labels and organization](labels-and-organization.md) for document management