feat(docs): update documentation for quite a few things
parent
13d868ab36
commit
98b116c68c
18
README.md
|
|
@ -7,11 +7,16 @@ A powerful, modern document management system built with Rust and React. Readur
|
|||
|
||||
## ✨ Features
|
||||
|
||||
- 🔐 **Secure Authentication**: JWT-based user authentication with bcrypt password hashing
|
||||
- 🔐 **Secure Authentication**: JWT-based user authentication with bcrypt password hashing + OIDC/SSO support
|
||||
- 👥 **User Management**: Role-based access control with Admin and User roles
|
||||
- 📤 **Smart File Upload**: Drag-and-drop support for PDF, images, text files, and Office documents
|
||||
- 🔍 **Advanced OCR**: Automatic text extraction using Tesseract for searchable document content
|
||||
- 🔎 **Powerful Search**: PostgreSQL full-text search with advanced filtering and ranking
|
||||
- 👁️ **Folder Monitoring**: Non-destructive file watching (unlike paperless-ngx, doesn't consume source files)
|
||||
- 🔎 **Powerful Search**: PostgreSQL full-text search with multiple modes (simple, phrase, fuzzy, boolean)
|
||||
- 🔗 **Multi-Source Sync**: WebDAV, Local Folders, and S3-compatible storage integration
|
||||
- 🏷️ **Labels & Organization**: Comprehensive tagging system with color-coding and hierarchical structure
|
||||
- 👁️ **Folder Monitoring**: Non-destructive file watching with intelligent sync scheduling
|
||||
- 📊 **Health Monitoring**: Proactive source validation and system health tracking
|
||||
- 🔔 **Notifications**: Real-time alerts for sync events, OCR completion, and system status
|
||||
- 🎨 **Modern UI**: Beautiful React frontend with Material-UI components and responsive design
|
||||
- 🐳 **Docker Ready**: Complete containerization with production-ready multi-stage builds
|
||||
- ⚡ **High Performance**: Rust backend for speed and reliability
|
||||
|
|
@ -44,6 +49,13 @@ open http://localhost:8000
|
|||
- [🔧 Configuration](docs/configuration.md) - Environment variables and settings
|
||||
- [📖 User Guide](docs/user-guide.md) - How to use Readur effectively
|
||||
|
||||
### Core Features
|
||||
- [🔗 Sources Guide](docs/sources-guide.md) - WebDAV, Local Folders, and S3 integration
|
||||
- [👥 User Management](docs/user-management-guide.md) - Authentication, roles, and administration
|
||||
- [🏷️ Labels & Organization](docs/labels-and-organization.md) - Document tagging and categorization
|
||||
- [🔎 Advanced Search](docs/advanced-search.md) - Search modes, syntax, and optimization
|
||||
- [🔐 OIDC Setup](docs/oidc-setup.md) - Single Sign-On integration
|
||||
|
||||
### Deployment & Operations
|
||||
- [🚀 Deployment Guide](docs/deployment.md) - Production deployment, SSL, monitoring
|
||||
- [🔄 Reverse Proxy Setup](docs/REVERSE_PROXY.md) - Nginx, Traefik, and more
|
||||
|
|
|
|||
|
|
@ -0,0 +1,687 @@
|
|||
# Advanced Search Guide
|
||||
|
||||
Readur provides powerful search capabilities that go far beyond simple text matching. This comprehensive guide covers all search modes, advanced filtering, query syntax, and optimization techniques.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Search Modes](#search-modes)
|
||||
- [Query Syntax](#query-syntax)
|
||||
- [Advanced Filtering](#advanced-filtering)
|
||||
- [Search Interface](#search-interface)
|
||||
- [Search Optimization](#search-optimization)
|
||||
- [Saved Searches](#saved-searches)
|
||||
- [Search Analytics](#search-analytics)
|
||||
- [API Search](#api-search)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
|
||||
## Overview
|
||||
|
||||
Readur's search system is built on PostgreSQL's full-text search capabilities with additional enhancements for document-specific requirements.
|
||||
|
||||
### Search Capabilities
|
||||
|
||||
- **Full-Text Search**: Search within document content and OCR-extracted text
|
||||
- **Multiple Search Modes**: Simple, phrase, fuzzy, and boolean search options
|
||||
- **Advanced Filtering**: Filter by file type, date, size, labels, and source
|
||||
- **Real-Time Suggestions**: Auto-complete and query suggestions as you type
|
||||
- **Faceted Search**: Browse documents by categories and properties
|
||||
- **Cross-Language Support**: Search in multiple languages with OCR text
|
||||
- **Relevance Ranking**: Intelligent scoring and result ordering
|
||||
|
||||
### Search Sources
|
||||
|
||||
Readur searches across multiple content sources:
|
||||
|
||||
1. **Document Content**: Original text from text files and PDFs
|
||||
2. **OCR Text**: Extracted text from images and scanned documents
|
||||
3. **Metadata**: File names, descriptions, and document properties
|
||||
4. **Labels**: User-created and system-generated tags
|
||||
5. **Source Information**: Upload source and file paths
|
||||
|
||||
## Search Modes
|
||||
|
||||
### Simple Search (Smart Search)
|
||||
|
||||
**Best for**: General purpose searching and quick document discovery
|
||||
|
||||
**How it works**:
|
||||
- Automatically applies stemming and fuzzy matching
|
||||
- Searches across all text content and metadata
|
||||
- Provides intelligent relevance scoring
|
||||
- Handles common typos and variations
|
||||
|
||||
**Example**:
|
||||
```
|
||||
invoice 2024
|
||||
```
|
||||
Finds: "Invoice Q1 2024", "invoicing for 2024", "2024 invoice data"
|
||||
|
||||
**Features**:
|
||||
- **Auto-stemming**: "running" matches "run" and "runs"
|
||||
- **Fuzzy tolerance**: "recieve" matches "receive"
|
||||
- **Partial matching**: "doc" matches "document", "documentation"
|
||||
- **Relevance ranking**: More relevant matches appear first
|
||||
|
||||
### Phrase Search (Exact Match)
|
||||
|
||||
**Best for**: Finding exact phrases or specific terminology
|
||||
|
||||
**How it works**:
|
||||
- Searches for the exact sequence of words
|
||||
- Case-insensitive but order-sensitive
|
||||
- Useful for finding specific quotes, names, or technical terms
|
||||
|
||||
**Syntax**: Use quotes around the phrase
|
||||
```
|
||||
"quarterly financial report"
|
||||
"John Smith"
|
||||
"error code 404"
|
||||
```
|
||||
|
||||
**Features**:
|
||||
- **Exact word order**: Only matches the precise sequence
|
||||
- **Case insensitive**: "John Smith" matches "john smith"
|
||||
- **Punctuation ignored**: "error-code" matches "error code"
|
||||
|
||||
### Fuzzy Search (Approximate Matching)
|
||||
|
||||
**Best for**: Handling typos, OCR errors, and spelling variations
|
||||
|
||||
**How it works**:
|
||||
- Uses trigram similarity to find approximate matches
|
||||
- Configurable similarity threshold (default: 0.8)
|
||||
- Particularly useful for OCR-processed documents with errors
|
||||
|
||||
**Syntax**: Use the `~` operator
|
||||
```
|
||||
invoice~ # Finds "invoice", "invoce", "invoise"
|
||||
contract~ # Finds "contract", "contarct", "conract"
|
||||
```
|
||||
|
||||
**Configuration**:
|
||||
- **Threshold adjustment**: Configure sensitivity via user settings
|
||||
- **Language-specific**: Different languages may need different thresholds
|
||||
- **OCR optimization**: Higher tolerance for OCR-processed documents
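
To make the similarity threshold more concrete, here is a minimal sketch of trigram similarity in the style of PostgreSQL's `pg_trgm` extension. It assumes single-word input, and the helper names `trigrams` and `similarity` are illustrative only. Readur's actual scoring, and how its threshold setting maps onto this score, may differ.

```python
def trigrams(word: str) -> set[str]:
    """Return the set of 3-character windows of a word.

    Assumption: pg_trgm-style padding (two leading spaces, one trailing space).
    """
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


print(similarity("invoice", "invoice"))  # 1.0
print(similarity("invoice", "invoce"))   # 0.5
```

A single dropped letter already lowers the score noticeably for short words, which is why the threshold may need tuning for OCR-heavy collections.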
|
||||
|
||||
### Boolean Search (Logical Operators)
|
||||
|
||||
**Best for**: Complex queries with multiple conditions and precise control
|
||||
|
||||
**Operators**:
|
||||
- **AND**: Both terms must be present
|
||||
- **OR**: Either term can be present
|
||||
- **NOT**: Exclude documents with the term
|
||||
- **Parentheses**: Group conditions
|
||||
|
||||
**Examples**:
|
||||
```
|
||||
budget AND 2024 # Both "budget" and "2024"
|
||||
invoice OR receipt # Either "invoice" or "receipt"
|
||||
contract NOT draft # "contract" but not "draft"
|
||||
(budget OR financial) AND 2024 # Complex grouping
|
||||
marketing AND (campaign OR strategy) # Marketing documents about campaigns or strategy
|
||||
```
|
||||
|
||||
**Advanced Boolean Examples**:
|
||||
```
|
||||
# Find completed project documents
|
||||
project AND (final OR completed OR approved) NOT draft
|
||||
|
||||
# Financial documents excluding personal items
|
||||
(invoice OR receipt OR budget) NOT personal
|
||||
|
||||
# Recent important documents
|
||||
(urgent OR priority OR critical) AND label:"this month"
|
||||
```
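
Since the search system builds on PostgreSQL full-text search, one way to picture boolean queries is as a translation into `tsquery` operators (`&`, `|`, `!`). The sketch below is an illustration only: `to_tsquery_text` is a hypothetical helper, it assumes uppercase operators, and it does not handle field prefixes such as `label:` or quoted phrases. Readur's real query parser is not shown in this guide.

```python
import re


def to_tsquery_text(query: str) -> str:
    """Translate AND/OR/NOT and parentheses into tsquery-style operators."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    out = []
    for tok in tokens:
        if tok == "AND":
            out.append("&")
        elif tok == "OR":
            out.append("|")
        elif tok == "NOT":
            # "a NOT b" is read as "a AND NOT b"; a leading NOT is a bare negation.
            if out and out[-1] not in ("&", "|", "(", "!"):
                out.append("&")
            out.append("!")
        else:
            out.append(tok)
    text = " ".join(out)
    # Tighten spacing around parentheses and negation.
    return text.replace("( ", "(").replace(" )", ")").replace("! ", "!")


print(to_tsquery_text("contract NOT draft"))
# contract & !draft
print(to_tsquery_text("(budget OR financial) AND 2024"))
# (budget | financial) & 2024
```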
|
||||
|
||||
## Query Syntax
|
||||
|
||||
### Field-Specific Search
|
||||
|
||||
Search within specific document fields for precise targeting.
|
||||
|
||||
#### Available Fields
|
||||
|
||||
| Field | Description | Example |
|-------|-------------|---------|
| `filename:` | Search in file names | `filename:invoice` |
| `content:` | Search in document text | `content:"project status"` |
| `label:` | Search by labels | `label:urgent` |
| `type:` | Search by file type | `type:pdf` |
| `source:` | Search by upload source | `source:webdav` |
| `size:` | Search by file size | `size:>10MB` |
| `date:` | Search by date | `date:2024-01-01` |
|
||||
|
||||
#### Field Search Examples
|
||||
|
||||
```
|
||||
filename:contract AND date:2024 # Contracts from 2024
|
||||
label:"high priority" OR label:urgent # Priority documents
|
||||
type:pdf AND content:budget # PDF files containing "budget"
|
||||
source:webdav AND label:approved # Approved docs from WebDAV
|
||||
```
|
||||
|
||||
### Range Queries
|
||||
|
||||
#### Date Ranges
|
||||
```
|
||||
date:2024-01-01..2024-03-31 # Q1 2024 documents
|
||||
date:>2024-01-01 # After January 1, 2024
|
||||
date:<2024-12-31 # Before December 31, 2024
|
||||
```
|
||||
|
||||
#### Size Ranges
|
||||
```
|
||||
size:1MB..10MB # Between 1MB and 10MB
|
||||
size:>50MB # Larger than 50MB
|
||||
size:<1KB # Smaller than 1KB
|
||||
```
|
||||
|
||||
### Wildcard Search
|
||||
|
||||
Use wildcards for partial matching:
|
||||
|
||||
```
proj*         # Matches "project", "projects", "projection"
*report       # Matches "annual report", "status report"
doc?ment      # Matches "document" (? = exactly one character)
```
|
||||
|
||||
### Exclusion Operators
|
||||
|
||||
Exclude unwanted results:
|
||||
|
||||
```
|
||||
invoice -draft # Invoices but not drafts
|
||||
budget NOT personal # Budget documents excluding personal
|
||||
-label:archive proposal # Proposals not in archive
|
||||
```
|
||||
|
||||
## Advanced Filtering
|
||||
|
||||
### File Type Filters
|
||||
|
||||
Filter by specific file formats:
|
||||
|
||||
**Common File Types**:
|
||||
- **Documents**: PDF, DOC, DOCX, TXT, RTF
|
||||
- **Images**: PNG, JPG, JPEG, TIFF, BMP, GIF
|
||||
- **Spreadsheets**: XLS, XLSX, CSV
|
||||
- **Presentations**: PPT, PPTX
|
||||
|
||||
**Filter Interface**:
|
||||
1. **Checkbox Filters**: Select multiple file types
|
||||
2. **MIME Type Groups**: Filter by general categories
|
||||
3. **Custom Extensions**: Add specific file extensions
|
||||
|
||||
**Search Syntax**:
|
||||
```
|
||||
type:pdf # Only PDF files
|
||||
type:(pdf OR doc) # PDF or Word documents
|
||||
-type:image # Exclude all images
|
||||
```
|
||||
|
||||
### Date and Time Filters
|
||||
|
||||
**Predefined Ranges**:
|
||||
- Today, Yesterday, This Week, Last Week
|
||||
- This Month, Last Month, This Quarter, Last Quarter
|
||||
- This Year, Last Year
|
||||
|
||||
**Custom Date Ranges**:
|
||||
- **Start Date**: Documents uploaded after specific date
|
||||
- **End Date**: Documents uploaded before specific date
|
||||
- **Date Range**: Documents within specific period
|
||||
|
||||
**Advanced Date Syntax**:
|
||||
```
|
||||
created:today # Documents uploaded today
|
||||
modified:>2024-01-01 # Modified after January 1st
|
||||
accessed:last-week # Accessed in the last week
|
||||
```
|
||||
|
||||
### Size Filters
|
||||
|
||||
**Size Categories**:
|
||||
- **Small**: < 1MB
|
||||
- **Medium**: 1MB - 10MB
|
||||
- **Large**: 10MB - 50MB
|
||||
- **Very Large**: > 50MB
|
||||
|
||||
**Custom Size Ranges**:
|
||||
```
|
||||
size:>10MB # Larger than 10MB
|
||||
size:1MB..5MB # Between 1MB and 5MB
|
||||
size:<100KB # Smaller than 100KB
|
||||
```
|
||||
|
||||
### Label Filters
|
||||
|
||||
**Label Selection**:
|
||||
- **Multiple Labels**: Select multiple labels with AND/OR logic
|
||||
- **Label Hierarchy**: Navigate nested label structures
|
||||
- **Label Suggestions**: Auto-complete based on existing labels
|
||||
|
||||
**Label Search Syntax**:
|
||||
```
|
||||
label:project # Documents with "project" label
|
||||
label:"high priority" # Multi-word labels in quotes
|
||||
label:(urgent OR critical) # Documents with either label
|
||||
-label:archive # Exclude archived documents
|
||||
```
|
||||
|
||||
### Source Filters
|
||||
|
||||
Filter by document source or origin:
|
||||
|
||||
**Source Types**:
|
||||
- **Manual Upload**: Documents uploaded directly
|
||||
- **WebDAV Sync**: Documents from WebDAV sources
|
||||
- **Local Folder**: Documents from watched folders
|
||||
- **S3 Sync**: Documents from S3 buckets
|
||||
|
||||
**Source-Specific Filters**:
|
||||
```
|
||||
source:webdav # WebDAV synchronized documents
|
||||
source:manual # Manually uploaded documents
|
||||
source:"My Nextcloud" # Specific named source
|
||||
```
|
||||
|
||||
### OCR Status Filters
|
||||
|
||||
Filter by OCR processing status:
|
||||
|
||||
**Status Options**:
|
||||
- **Completed**: OCR successfully completed
|
||||
- **Pending**: Waiting for OCR processing
|
||||
- **Failed**: OCR processing failed
|
||||
- **Not Applicable**: Text documents that don't need OCR
|
||||
|
||||
**OCR Quality Filters**:
|
||||
- **High Confidence**: OCR confidence > 90%
|
||||
- **Medium Confidence**: OCR confidence 70-90%
|
||||
- **Low Confidence**: OCR confidence < 70%
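
The confidence bands above leave the exact boundaries open, so the sketch below makes one possible reading explicit: 90% and 70% both fall into the medium band. The function name and return values are illustrative, not part of Readur.

```python
def ocr_confidence_bucket(confidence: float) -> str:
    """Map an OCR confidence percentage (0-100) to a quality band.

    Assumption: exactly 90 and exactly 70 both count as "medium".
    """
    if confidence > 90:
        return "high"
    if confidence >= 70:
        return "medium"
    return "low"


print(ocr_confidence_bucket(95.5))  # high
print(ocr_confidence_bucket(90))    # medium
print(ocr_confidence_bucket(42))    # low
```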
|
||||
|
||||
## Search Interface
|
||||
|
||||
### Global Search Bar
|
||||
|
||||
**Location**: Available in the header on all pages
|
||||
**Features**:
|
||||
- **Real-time suggestions**: Shows results as you type
|
||||
- **Quick results**: Top 5 matches with snippets
|
||||
- **Fast navigation**: Direct access to documents
|
||||
- **Search history**: Recent searches for quick access
|
||||
|
||||
**Usage**:
|
||||
1. Click on the search bar in the header
|
||||
2. Start typing your query
|
||||
3. View instant suggestions and results
|
||||
4. Click a result to navigate directly to the document
|
||||
|
||||
### Advanced Search Page
|
||||
|
||||
**Location**: Dedicated search page with full interface
|
||||
**Features**:
|
||||
- **Multiple search modes**: Toggle between search types
|
||||
- **Filter sidebar**: All filtering options in one place
|
||||
- **Result options**: Sorting, pagination, view modes
|
||||
- **Export capabilities**: Export search results
|
||||
|
||||
**Interface Sections**:
|
||||
|
||||
#### Search Input Area
|
||||
- **Query builder**: Visual query construction
|
||||
- **Mode selector**: Choose search type (simple, phrase, fuzzy, boolean)
|
||||
- **Suggestions**: Auto-complete and query recommendations
|
||||
|
||||
#### Filter Sidebar
|
||||
- **File type filters**: Checkboxes for different formats
|
||||
- **Date range picker**: Calendar interface for date selection
|
||||
- **Size sliders**: Visual size range selection
|
||||
- **Label selector**: Hierarchical label browser
|
||||
- **Source filters**: Filter by upload source
|
||||
|
||||
#### Results Area
|
||||
- **Sort options**: Relevance, date, filename, size
|
||||
- **View modes**: List view, grid view, detail view
|
||||
- **Pagination**: Navigate through result pages
|
||||
- **Export options**: CSV, JSON export of results
|
||||
|
||||
### Search Results
|
||||
|
||||
#### Result Display Elements
|
||||
|
||||
**Document Cards**:
|
||||
- **Filename**: Primary document identifier
|
||||
- **Snippet**: Highlighted text excerpt showing search matches
|
||||
- **Metadata**: File size, type, upload date, labels
|
||||
- **Relevance Score**: Numerical relevance ranking
|
||||
- **Quick Actions**: Download, view, edit labels
|
||||
|
||||
**Highlighting**:
|
||||
- **Search terms**: Highlighted in yellow
|
||||
- **Context**: Surrounding text for context
|
||||
- **Multiple matches**: All instances highlighted
|
||||
- **Snippet length**: Configurable in user settings
|
||||
|
||||
#### Result Sorting
|
||||
|
||||
**Sort Options**:
|
||||
- **Relevance**: Best matches first (default)
|
||||
- **Date**: Newest or oldest first
|
||||
- **Filename**: Alphabetical order
|
||||
- **Size**: Largest or smallest first
|
||||
- **Score**: Highest search score first
|
||||
|
||||
**Secondary Sorting**:
|
||||
- Apply secondary criteria when primary sort values are equal
|
||||
- Example: Sort by relevance, then by date
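
The "relevance, then date" example can be pictured as two stable sorts applied in reverse priority order. This is only a sketch over hypothetical result dictionaries (the `score` and `uploaded_at` keys are borrowed from the response format later in this guide); it assumes all timestamps share one ISO 8601 format so that string comparison matches chronological order.

```python
def sort_by_relevance_then_date(results: list[dict]) -> list[dict]:
    """Highest score first; equal scores are ordered newest first."""
    # Sort by the secondary key first; Python's sort is stable, so the
    # second pass keeps this order among results with equal scores.
    by_date = sorted(results, key=lambda r: r["uploaded_at"], reverse=True)
    return sorted(by_date, key=lambda r: r["score"], reverse=True)


docs = [
    {"filename": "a.pdf", "score": 0.90, "uploaded_at": "2024-01-15T10:30:00Z"},
    {"filename": "b.pdf", "score": 0.90, "uploaded_at": "2024-03-01T08:00:00Z"},
    {"filename": "c.pdf", "score": 0.95, "uploaded_at": "2023-11-20T12:00:00Z"},
]
print([d["filename"] for d in sort_by_relevance_then_date(docs)])
# ['c.pdf', 'b.pdf', 'a.pdf']
```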
|
||||
|
||||
### Search Configuration
|
||||
|
||||
#### User Preferences
|
||||
|
||||
**Search Settings** (accessible via Settings → Search):
|
||||
- **Results per page**: 10, 25, 50, 100
|
||||
- **Snippet length**: 100, 200, 300, 500 characters
|
||||
- **Fuzzy threshold**: Sensitivity for approximate matching
|
||||
- **Default sort**: Preferred default sorting option
|
||||
- **Search history**: Enable/disable query history
|
||||
|
||||
#### Search Behavior
|
||||
- **Auto-complete**: Enable search suggestions
|
||||
- **Real-time search**: Search as you type
|
||||
- **Search highlighting**: Highlight search terms in results
|
||||
- **Context snippets**: Show surrounding text in results
|
||||
|
||||
## Search Optimization
|
||||
|
||||
### Query Optimization
|
||||
|
||||
#### Best Practices
|
||||
|
||||
1. **Use Specific Terms**: More specific queries yield better results
|
||||
```
|
||||
Good: "quarterly sales report Q1"
|
||||
Poor: "document"
|
||||
```
|
||||
|
||||
2. **Combine Search Modes**: Use appropriate mode for your needs
|
||||
```
|
||||
Exact phrases: "status update"
|
||||
Flexible terms: project~
|
||||
Complex logic: (budget OR financial) AND 2024
|
||||
```
|
||||
|
||||
3. **Leverage Filters**: Combine text search with filters
|
||||
```
|
||||
Query: budget
|
||||
Filters: Type = PDF, Date = This Quarter, Label = Finance
|
||||
```
|
||||
|
||||
4. **Use Field Search**: Target specific document aspects
|
||||
```
|
||||
filename:invoice date:2024
|
||||
content:"project milestone" label:important
|
||||
```
|
||||
|
||||
### Performance Tips
|
||||
|
||||
#### Efficient Searching
|
||||
|
||||
1. **Start Broad, Then Narrow**: Begin with general terms, then add filters
|
||||
2. **Use Filters Early**: Apply filters before complex text queries
|
||||
3. **Avoid Wildcards at Start**: `*report` is slower than `report*`
|
||||
4. **Combine Short Queries**: Use multiple short terms rather than long phrases
|
||||
|
||||
#### Search Index Optimization
|
||||
|
||||
The search system automatically optimizes for:
|
||||
- **Frequent Terms**: Common words are indexed for fast retrieval
|
||||
- **Document Updates**: New documents are indexed immediately
|
||||
- **Language Support**: Multi-language stemming and analysis
|
||||
- **Cache Management**: Frequent searches are cached
|
||||
|
||||
### OCR Search Optimization
|
||||
|
||||
#### Handling OCR Text
|
||||
|
||||
OCR-extracted text may contain errors that affect search:
|
||||
|
||||
**Strategies**:
|
||||
1. **Use Fuzzy Search**: Handle OCR errors with approximate matching
|
||||
2. **Try Variations**: Search for common OCR mistakes
|
||||
3. **Use Context**: Include surrounding words for better matches
|
||||
4. **Check Original**: Compare with original document when possible
|
||||
|
||||
**Common OCR Issues**:
|
||||
- **Character confusion**: "m" vs "rn", "cl" vs "d"
|
||||
- **Word boundaries**: "some thing" vs "something"
|
||||
- **Special characters**: Missing or incorrect punctuation
|
||||
|
||||
**Optimization Examples**:
|
||||
```
|
||||
# Original: "invoice"
|
||||
# OCR might produce: "irwoice", "invoce", "mvoice"
|
||||
# Solution: Use fuzzy search
|
||||
invoice~
|
||||
|
||||
# Or search for context
|
||||
"invoice number" OR "irwoice number" OR "invoce number"
|
||||
```
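
When fuzzy search is not an option, the "try variations" strategy can be automated. The sketch below generates single-substitution variants from the character confusions listed above and joins them into an OR query. `CONFUSIONS`, `ocr_variants`, and `or_query` are illustrative names, and the confusion table is deliberately tiny; real OCR error patterns are broader.

```python
# Confusion pairs taken from the "Common OCR Issues" list, in both directions.
CONFUSIONS = [("m", "rn"), ("rn", "m"), ("cl", "d"), ("d", "cl")]


def ocr_variants(term: str) -> list[str]:
    """Return spellings that differ from term by one OCR-style substitution."""
    variants = set()
    for src, dst in CONFUSIONS:
        start = term.find(src)
        while start != -1:
            variants.add(term[:start] + dst + term[start + len(src):])
            start = term.find(src, start + 1)
    variants.discard(term)
    return sorted(variants)


def or_query(term: str) -> str:
    """Build a boolean query matching the term or any generated variant."""
    return " OR ".join([term, *ocr_variants(term)])


print(or_query("modern"))
# modern OR moclern OR modem OR rnodern
```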
|
||||
|
||||
## Saved Searches
|
||||
|
||||
### Creating Saved Searches
|
||||
|
||||
1. **Build Your Query**: Create a search with desired parameters
|
||||
2. **Test Results**: Verify the search returns expected documents
|
||||
3. **Save Search**: Click "Save Search" button
|
||||
4. **Name Search**: Provide descriptive name
|
||||
5. **Configure Options**: Set update frequency and notifications
|
||||
|
||||
### Managing Saved Searches
|
||||
|
||||
**Saved Search Features**:
|
||||
- **Quick Access**: Available in sidebar or dashboard
|
||||
- **Automatic Updates**: Results update as new documents are added
|
||||
- **Shared Access**: Share searches with other users (future feature)
|
||||
- **Export Options**: Export results automatically
|
||||
|
||||
**Search Organization**:
|
||||
- **Categories**: Group related searches
|
||||
- **Favorites**: Mark frequently used searches
|
||||
- **Recent**: Quick access to recently used searches
|
||||
|
||||
### Smart Collections
|
||||
|
||||
Saved searches that automatically include new documents:
|
||||
|
||||
**Examples**:
|
||||
- **"This Month's Reports"**: `type:pdf AND content:report AND date:this-month`
|
||||
- **"Pending Review"**: `label:"needs review" AND -label:completed`
|
||||
- **"High Priority Items"**: `label:(urgent OR critical OR "high priority")`
|
||||
|
||||
## Search Analytics
|
||||
|
||||
### Search Performance Metrics
|
||||
|
||||
**Available Metrics**:
|
||||
- **Query Performance**: Average search response times
|
||||
- **Popular Searches**: Most frequently used search terms
|
||||
- **Result Quality**: Click-through rates and user engagement
|
||||
- **Search Patterns**: Common search behaviors and trends
|
||||
|
||||
### User Search History
|
||||
|
||||
**History Features**:
|
||||
- **Recent Searches**: Quick access to previous queries
|
||||
- **Search Suggestions**: Based on search history
|
||||
- **Query Refinement**: Improve searches based on past patterns
|
||||
- **Export History**: Download search history for analysis
|
||||
|
||||
## API Search
|
||||
|
||||
### Basic Search API
|
||||
|
||||
```http
GET /api/search?query=invoice&limit=20
Authorization: Bearer <jwt_token>
```
|
||||
|
||||
**Query Parameters**:
|
||||
- `query`: Search query string
|
||||
- `limit`: Number of results (default: 50, max: 100)
|
||||
- `offset`: Pagination offset
|
||||
- `sort`: Sort order (relevance, date, filename, size)
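
A minimal sketch of building this request with Python's standard library follows. The endpoint and parameter names come from the list above; `build_search_request` itself is a hypothetical helper, the base URL is whatever your deployment uses, and the limit is clamped client-side to the documented maximum of 100.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url: str, token: str, query: str,
                         limit: int = 50, offset: int = 0,
                         sort: str = "relevance") -> Request:
    """Build (but do not send) a GET request for the basic search endpoint."""
    params = urlencode({
        "query": query,
        "limit": min(limit, 100),  # documented maximum
        "offset": offset,
        "sort": sort,
    })
    return Request(
        f"{base_url.rstrip('/')}/api/search?{params}",
        headers={"Authorization": f"Bearer {token}"},
    )


request = build_search_request("http://localhost:8000", "<jwt_token>", "invoice", limit=20)
print(request.full_url)
# http://localhost:8000/api/search?query=invoice&limit=20&offset=0&sort=relevance
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because it needs a running server and a valid token.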
|
||||
|
||||
### Advanced Search API
|
||||
|
||||
```http
POST /api/search/advanced
Authorization: Bearer <jwt_token>
Content-Type: application/json

{
  "query": "budget report",
  "mode": "phrase",
  "filters": {
    "file_types": ["pdf", "docx"],
    "labels": ["Q1 2024", "Finance"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-03-31"
    },
    "size_range": {
      "min": 1048576,
      "max": 52428800
    }
  },
  "options": {
    "fuzzy_threshold": 0.8,
    "snippet_length": 200,
    "highlight": true
  }
}
```
|
||||
|
||||
### Search Response Format
|
||||
|
||||
```json
{
  "results": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "filename": "Q1_Budget_Report.pdf",
      "snippet": "The quarterly <mark>budget</mark> <mark>report</mark> shows a 10% increase in revenue...",
      "score": 0.95,
      "highlights": ["budget", "report"],
      "metadata": {
        "size": 2048576,
        "type": "application/pdf",
        "uploaded_at": "2024-01-15T10:30:00Z",
        "labels": ["Q1 2024", "Finance", "Budget"],
        "source": "WebDAV Sync"
      }
    }
  ],
  "total": 42,
  "limit": 20,
  "offset": 0,
  "query_time": 0.085
}
```
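
The `total`, `limit`, and `offset` fields are enough to page through a result set. The sketch below shows one way to compute the next offset from a response body; `next_offset` is a hypothetical helper, and it assumes `total` counts all matching documents rather than only the current page.

```python
from typing import Optional


def next_offset(page: dict) -> Optional[int]:
    """Return the offset for the following page, or None on the last page."""
    candidate = page["offset"] + page["limit"]
    return candidate if candidate < page["total"] else None


# Walk the offsets for the sample response above (42 results, 20 per page).
offset: Optional[int] = 0
while offset is not None:
    print(offset)  # 0, then 20, then 40
    offset = next_offset({"total": 42, "limit": 20, "offset": offset})
```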
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Search Issues
|
||||
|
||||
#### No Results Found
|
||||
|
||||
**Possible Causes**:
|
||||
1. **Typos**: Check spelling in search query
|
||||
2. **Too Specific**: Query might be too restrictive
|
||||
3. **Wrong Mode**: Using exact search when fuzzy would be better
|
||||
4. **Restrictive Filters**: Active filters may be excluding matching documents
|
||||
|
||||
**Solutions**:
|
||||
1. **Simplify Query**: Start with broader terms
|
||||
2. **Check Spelling**: Use fuzzy search for typo tolerance
|
||||
3. **Remove Filters**: Test without date, type, or label filters
|
||||
4. **Try Synonyms**: Use alternative terms for the same concept
|
||||
|
||||
#### Irrelevant Results
|
||||
|
||||
**Possible Causes**:
|
||||
1. **Too Broad**: Query matches too many unrelated documents
|
||||
2. **Common Terms**: Using very common words that appear everywhere
|
||||
3. **Wrong Mode**: Using fuzzy when exact match is needed
|
||||
|
||||
**Solutions**:
|
||||
1. **Add Specificity**: Include more specific terms or context
|
||||
2. **Use Filters**: Add file type, date, or label filters
|
||||
3. **Phrase Search**: Use quotes for exact phrases
|
||||
4. **Boolean Logic**: Use AND/OR/NOT for better control
|
||||
|
||||
#### Slow Search Performance
|
||||
|
||||
**Possible Causes**:
|
||||
1. **Complex Queries**: Very complex boolean queries
|
||||
2. **Large Result Sets**: Queries matching many documents
|
||||
3. **Wildcard Overuse**: Starting queries with wildcards
|
||||
|
||||
**Solutions**:
|
||||
1. **Simplify Queries**: Break complex queries into simpler ones
|
||||
2. **Add Filters**: Use filters to reduce result set size
|
||||
3. **Avoid Leading Wildcards**: Use `term*` instead of `*term`
|
||||
4. **Use Pagination**: Request smaller result sets
|
||||
|
||||
### OCR Search Issues
|
||||
|
||||
#### OCR Text Not Searchable
|
||||
|
||||
**Symptoms**: Can't find text that's visible in document images
|
||||
**Solutions**:
|
||||
1. **Check OCR Status**: Verify OCR processing completed
|
||||
2. **Retry OCR**: Manually retry OCR processing
|
||||
3. **Use Fuzzy Search**: OCR might have character recognition errors
|
||||
4. **Check Language Settings**: Ensure correct OCR language is configured
|
||||
|
||||
#### Poor OCR Search Quality
|
||||
|
||||
**Symptoms**: Fuzzy search required for most queries on scanned documents
|
||||
**Solutions**:
|
||||
1. **Improve Source Quality**: Use higher resolution scans (300+ DPI)
|
||||
2. **OCR Language**: Verify correct language setting for documents
|
||||
3. **Image Enhancement**: Enable OCR preprocessing options
|
||||
4. **Manual Correction**: Consider manual text correction for important documents
|
||||
|
||||
### Search Configuration Issues
|
||||
|
||||
#### Settings Not Applied
|
||||
|
||||
**Symptoms**: Search settings changes don't take effect
|
||||
**Solutions**:
|
||||
1. **Reload Page**: Refresh browser to apply settings
|
||||
2. **Clear Cache**: Clear browser cache and cookies
|
||||
3. **Check Permissions**: Ensure user has permission to modify settings
|
||||
4. **Database Issues**: Check if settings are being saved to database
|
||||
|
||||
#### Filter Problems
|
||||
|
||||
**Symptoms**: Filters not working as expected
|
||||
**Solutions**:
|
||||
1. **Clear All Filters**: Reset filters and apply one at a time
|
||||
2. **Check Filter Logic**: Ensure AND/OR logic is correct
|
||||
3. **Label Validation**: Verify labels exist and are spelled correctly
|
||||
4. **Date Format**: Ensure dates are in correct format
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Explore [labels and organization](labels-and-organization.md) for better search categorization
|
||||
- Set up [sources](sources-guide.md) for automatic content ingestion
|
||||
- Review [user guide](user-guide.md) for general search tips
|
||||
- Check [API reference](api-reference.md) for programmatic search integration
|
||||
- Configure [OCR optimization](dev/OCR_OPTIMIZATION_GUIDE.md) for better text extraction
|
||||
|
|
@ -0,0 +1,501 @@
|
|||
# Labels and Organization Guide
|
||||
|
||||
Readur's labeling system provides powerful document organization and categorization capabilities. This guide covers creating, managing, and using labels to organize your document collection effectively.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Label Types](#label-types)
|
||||
- [Creating and Managing Labels](#creating-and-managing-labels)
|
||||
- [Assigning Labels to Documents](#assigning-labels-to-documents)
|
||||
- [Label-Based Search and Filtering](#label-based-search-and-filtering)
|
||||
- [Label Organization Strategies](#label-organization-strategies)
|
||||
- [Advanced Label Features](#advanced-label-features)
|
||||
- [Best Practices](#best-practices)
|
||||
- [API Integration](#api-integration)
|
||||
|
||||
## Overview
|
||||
|
||||
Labels in Readur provide a flexible tagging system that allows you to:
|
||||
|
||||
- **Categorize Documents**: Organize documents by type, project, department, or any custom criteria
|
||||
- **Enhanced Search**: Filter search results by specific labels for precise document discovery
|
||||
- **Visual Organization**: Color-coded labels provide instant visual categorization
|
||||
- **Bulk Operations**: Apply or remove labels from multiple documents simultaneously
|
||||
- **Project Management**: Track documents across projects, workflows, or time periods
|
||||
|
||||
### Key Features
|
||||
|
||||
- **Hierarchical Organization**: Create nested label structures for complex categorization
|
||||
- **Color Coding**: Visual identification with customizable label colors
|
||||
- **System Labels**: Automatic labels generated by Readur for administrative purposes
|
||||
- **User Labels**: Custom labels created and managed by users
|
||||
- **Smart Collections**: Save searches that automatically include documents with specific labels
|
||||
- **Label Statistics**: Track document counts and usage analytics per label
|
||||
|
||||
## Label Types
|
||||
|
||||
### User Labels
|
||||
|
||||
**Custom labels** created and managed by users for personal or organizational categorization.
|
||||
|
||||
**Features:**
|
||||
- **Full Control**: Create, edit, rename, and delete user-created labels
|
||||
- **Color Customization**: Choose from a wide range of colors for visual organization
|
||||
- **Flexible Naming**: Use any descriptive names that fit your workflow
|
||||
- **Sharing**: Labels are visible to all users with access to labeled documents
|
||||
|
||||
**Common Use Cases:**
|
||||
- Project names (e.g., "Project Alpha", "Q1 Budget")
|
||||
- Document types (e.g., "Invoices", "Contracts", "Reports")
|
||||
- Departments (e.g., "HR", "Engineering", "Marketing")
|
||||
- Priority levels (e.g., "Urgent", "Review Needed", "Archive")
|
||||
- Status indicators (e.g., "Draft", "Final", "Approved")
|
||||
|
||||
### System Labels
|
||||
|
||||
**Automatic labels** generated by Readur based on document properties and processing status.
|
||||
|
||||
**Examples:**
|
||||
- **OCR Status**: "OCR Completed", "OCR Failed", "OCR Pending"
|
||||
- **File Type**: "PDF", "Image", "Text Document"
|
||||
- **Source Origin**: "WebDAV Upload", "Local Folder", "Manual Upload"
|
||||
- **Processing Status**: "Recently Added", "High Confidence OCR", "Needs Review"
|
||||
- **Size Categories**: "Large File", "Small File"
|
||||
- **Date-based**: "This Week", "This Month", "This Year"
|
||||
|
||||
**Characteristics:**
|
||||
- **Read-only**: Cannot be edited or deleted by users
|
||||
- **Automatic Assignment**: Applied automatically based on document properties
|
||||
- **System Managed**: Updated automatically when document properties change
|
||||
- **Consistent Formatting**: Standardized naming and color scheme
|
||||
|
||||
## Creating and Managing Labels
|
||||
|
||||
### Creating New Labels
|
||||
|
||||
#### Via Label Management Page
|
||||
|
||||
1. **Navigate to Labels**: Go to Settings → Labels
|
||||
2. **Click "Create Label"**
|
||||
3. **Configure Label Properties**:
|
||||
```
|
||||
Name: Project Documentation
|
||||
Color: Blue (#2196F3)
|
||||
Description: Documents related to current projects
|
||||
```
|
||||
4. **Save** to create the label
|
||||
|
||||
#### During Document Upload
|
||||
|
||||
1. **Upload Document(s)**: Use the upload interface
|
||||
2. **Add Labels Field**: Find the labels field in the upload form
|
||||
3. **Create New Label**: Type a new label name
|
||||
4. **Assign Color**: Choose color for the new label
|
||||
5. **Complete Upload**: Label is created and assigned automatically
|
||||
|
||||
#### Quick Label Creation
|
||||
|
||||
- **Search Interface**: Create labels while filtering search results
|
||||
- **Document Details**: Add new labels directly from document pages
|
||||
- **Bulk Operations**: Create labels during bulk document operations
|
||||
|
||||
### Editing Labels
|
||||
|
||||
#### Renaming Labels
|
||||
|
||||
1. **Access Label Management**: Settings → Labels
|
||||
2. **Find Target Label**: Use search or browse the label list
|
||||
3. **Click "Edit"** or double-click the label name
|
||||
4. **Modify Name**: Change to new descriptive name
|
||||
5. **Save Changes**: Updates all documents using this label
|
||||
|
||||
#### Changing Colors
|
||||
|
||||
1. **Edit Label**: Follow renaming steps above
|
||||
2. **Select New Color**: Choose from color palette or enter hex code
|
||||
3. **Preview Changes**: See how the color looks in different contexts
|
||||
4. **Apply**: Color updates immediately across all interfaces
|
||||
|
||||
#### Merging Labels
|
||||
|
||||
1. **Identify Similar Labels**: Find labels with overlapping purposes
|
||||
2. **Select Target Label**: Choose the label to keep
|
||||
3. **Merge Operation**: Use "Merge with..." option
|
||||
4. **Confirm Merge**: All documents transfer to target label
|
||||
5. **Source Label Deletion**: Original label is removed after merge
|
||||
|
||||
### Deleting Labels
|
||||
|
||||
#### Individual Label Deletion
|
||||
|
||||
1. **Label Management Page**: Access via Settings → Labels
|
||||
2. **Select Label**: Find the label to delete
|
||||
3. **Delete Action**: Click delete button or menu option
|
||||
4. **Confirm Deletion**: Confirm removal (this cannot be undone)
|
||||
5. **Document Update**: Label is removed from all associated documents
|
||||
|
||||
#### Bulk Label Cleanup
|
||||
|
||||
- **Unused Labels**: Automatically identify and remove labels with no documents
|
||||
- **Duplicate Labels**: Find and merge labels with similar names
|
||||
- **Batch Deletion**: Select multiple labels for simultaneous removal
|
||||
|
||||
## Assigning Labels to Documents
|
||||
|
||||
### Single Document Labeling
|
||||
|
||||
#### Document Details Page
|
||||
|
||||
1. **Open Document**: Click on any document to view details
|
||||
2. **Labels Section**: Find the labels area in document metadata
|
||||
3. **Add Labels**: Click "+" or "Add Label" button
|
||||
4. **Select or Create**: Choose existing labels or create new ones
|
||||
5. **Apply Changes**: Labels are assigned immediately
|
||||
|
||||
#### Quick Label Assignment
|
||||
|
||||
- **Hover Actions**: Quick label buttons appear when hovering over documents
|
||||
- **Right-Click Menu**: Context menu with common label operations
|
||||
- **Keyboard Shortcuts**: Assign frequently used labels with key combinations
|
||||
|
||||
### Bulk Label Operations
|
||||
|
||||
#### Multi-Document Selection
|
||||
|
||||
1. **Document Browser**: Navigate to documents page
|
||||
2. **Select Documents**: Use checkboxes to select multiple documents
|
||||
3. **Bulk Actions**: Click "Actions" or "Labels" in the toolbar
|
||||
4. **Apply Labels**: Choose labels to add or remove
|
||||
5. **Execute**: Apply changes to all selected documents
|
||||
|
||||
#### Search-Based Labeling
|
||||
|
||||
1. **Search for Documents**: Use search to find specific document sets
|
||||
2. **Select All Results**: Choose all documents matching criteria
|
||||
3. **Bulk Label Assignment**: Apply labels to entire result set
|
||||
4. **Confirmation**: Review and confirm bulk changes
|
||||
|
||||
### Label Assignment During Upload
|
||||
|
||||
#### Upload Interface Labeling
|
||||
|
||||
1. **File Selection**: Choose files to upload
|
||||
2. **Label Assignment**: Add labels before starting upload
|
||||
3. **Label Creation**: Create new labels during upload process
|
||||
4. **Automatic Application**: Labels assigned to all uploaded files
|
||||
|
||||
#### Drag and Drop Labeling
|
||||
|
||||
- **Pre-configured Areas**: Drag files to labeled drop zones
|
||||
- **Automatic Tagging**: Labels applied based on drop location
|
||||
- **Batch Processing**: Assign labels to multiple files simultaneously
|
||||
|
||||
## Label-Based Search and Filtering
|
||||
|
||||
### Label Filters in Search
|
||||
|
||||
#### Basic Label Filtering
|
||||
|
||||
1. **Search Interface**: Access the main search page
|
||||
2. **Label Filter Section**: Find label filters in the sidebar
|
||||
3. **Select Labels**: Check boxes for desired labels
|
||||
4. **Apply Filter**: Search results automatically update
|
||||
5. **Multiple Labels**: Combine multiple labels with AND/OR logic
|
||||
|
||||
#### Advanced Label Queries
|
||||
|
||||
**Search Syntax Examples:**
|
||||
```
|
||||
label:urgent # Documents with "urgent" label
|
||||
label:"project alpha" # Documents with multi-word label
|
||||
label:urgent AND label:review # Documents with both labels
|
||||
label:draft OR label:final # Documents with either label
|
||||
-label:archive # Exclude archived documents
|
||||
```
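
To make the combination rules concrete, the sketch below evaluates a simplified form of this syntax against one document's labels. It is not Readur's query parser: the `matches` helper is hypothetical, it handles only a flat chain of terms joined by a single operator (all `AND` or all `OR`), it ignores parentheses, and it assumes label matching is case-insensitive.

```python
import shlex

def matches(query: str, doc_labels: set) -> bool:
    """Evaluate a flat label query such as 'label:urgent AND -label:archive'."""
    labels = {name.lower() for name in doc_labels}
    # shlex keeps a quoted multi-word label such as "project alpha" in one token
    tokens = shlex.split(query)
    operators = {t for t in tokens if t in ("AND", "OR")}
    if len(operators) > 1:
        raise ValueError("mixing AND and OR needs a real parser")
    results = []
    for token in tokens:
        if token in ("AND", "OR"):
            continue
        negated = token.startswith("-")
        # Drop the leading '-' and the 'label:' prefix, keep the label name
        _, _, name = token.lstrip("-").partition(":")
        results.append((name.lower() in labels) != negated)
    return any(results) if "OR" in operators else all(results)
```

For example, `matches('label:draft OR label:final', {"Final"})` is true, while `matches('-label:archive', {"Archive"})` is false.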
|
||||
|
||||
### Smart Collections
|
||||
|
||||
#### Creating Smart Collections
|
||||
|
||||
1. **Build Search Query**: Create search with label filters
|
||||
2. **Save Search**: Use "Save Search" option
|
||||
3. **Name Collection**: Give descriptive name (e.g., "Active Projects")
|
||||
4. **Automatic Updates**: Collection updates as documents are labeled
|
||||
5. **Quick Access**: Access collections from sidebar or dashboard
|
||||
|
||||
#### Collection Examples
|
||||
|
||||
**Project-Based Collections:**
|
||||
- "Q1 Budget Documents": `label:"Q1 budget" OR label:"financial planning"`
|
||||
- "Marketing Materials": `label:marketing AND (label:final OR label:approved)`
|
||||
- "Pending Review": `label:"needs review" AND -label:completed`
|
||||
|
||||
**Status-Based Collections:**
|
||||
- "Recent Uploads": `label:"this month" AND -label:processed`
|
||||
- "High Priority": `label:urgent OR label:critical`
|
||||
- "Archive Ready": `label:completed AND label:final`
|
||||
|
||||
### Label-Based Dashboard Views
|
||||
|
||||
#### Custom Dashboard Widgets
|
||||
|
||||
- **Label Statistics**: Show document counts per label
|
||||
- **Recent Activity**: Display recently labeled documents
|
||||
- **Label Trends**: Track labeling patterns over time
|
||||
- **Quick Access**: Direct links to frequently used label filters
|
||||
|
||||
## Label Organization Strategies
|
||||
|
||||
### Hierarchical Labeling
|
||||
|
||||
#### Category-Based Organization
|
||||
|
||||
**Structure Example:**
|
||||
```
|
||||
Projects/
|
||||
├── Project Alpha/
|
||||
│ ├── Requirements
|
||||
│ ├── Design
|
||||
│ └── Implementation
|
||||
├── Project Beta/
|
||||
│ ├── Research
|
||||
│ ├── Proposals
|
||||
│ └── Contracts
|
||||
└── Infrastructure/
|
||||
├── Servers
|
||||
├── Network
|
||||
└── Security
|
||||
```
|
||||
|
||||
#### Implementation Approach
|
||||
|
||||
1. **Top-Level Categories**: Create broad organizational labels
|
||||
2. **Subcategories**: Use descriptive naming for specific areas
|
||||
3. **Consistent Naming**: Establish naming conventions across categories
|
||||
4. **Cross-References**: Documents can belong to multiple hierarchies
|
||||
|
||||
### Functional Organization
|
||||
|
||||
#### Document Lifecycle Labels
|
||||
|
||||
**Workflow Stages:**
|
||||
- **Creation**: "Draft", "In Progress", "Under Review"
|
||||
- **Approval**: "Pending Approval", "Approved", "Rejected"
|
||||
- **Distribution**: "Published", "Distributed", "Archived"
|
||||
- **Maintenance**: "Current", "Outdated", "Superseded"
|
||||
|
||||
#### Department-Based Labeling
|
||||
|
||||
**Organizational Structure:**
|
||||
- **Human Resources**: "HR Policy", "Employee Records", "Benefits"
|
||||
- **Finance**: "Invoices", "Budget", "Audit", "Tax Documents"
|
||||
- **Legal**: "Contracts", "Compliance", "IP Documents"
|
||||
- **Operations**: "Procedures", "Manuals", "Incident Reports"
|
||||
|
||||
### Time-Based Organization
|
||||
|
||||
#### Date-Driven Labels
|
||||
|
||||
- **Fiscal Periods**: "Q1 2024", "FY2024", "H1 2024"
|
||||
- **Project Phases**: "Phase 1", "Phase 2", "Final Phase"
|
||||
- **Event-Based**: "Pre-Launch", "Launch", "Post-Launch"
|
||||
- **Seasonal**: "Annual Review", "Budget Season", "Audit Period"
|
||||
|
||||
## Advanced Label Features
|
||||
|
||||
### Label Analytics
|
||||
|
||||
#### Usage Statistics
|
||||
|
||||
**Metrics Available:**
|
||||
- **Document Count**: Number of documents per label
|
||||
- **Recent Activity**: Labels used in recent uploads or assignments
|
||||
- **Growth Trends**: How label usage changes over time
|
||||
- **Popular Labels**: Most frequently used labels
|
||||
- **Unused Labels**: Labels with no current document assignments
|
||||
|
||||
#### Label Performance
|
||||
|
||||
- **Search Frequency**: How often labels are used in searches
|
||||
- **Click-Through Rates**: User engagement with labeled content
|
||||
- **Organization Effectiveness**: How labels improve document discovery
|
||||
|
||||
### Label Automation
|
||||
|
||||
#### Auto-Labeling Rules
|
||||
|
||||
**OCR-Based Labeling:**
|
||||
- **Content Detection**: Automatically label documents based on detected text
|
||||
- **Template Recognition**: Recognize document types and apply appropriate labels
|
||||
- **Entity Extraction**: Label documents based on detected entities (names, dates, amounts)
|
||||
|
||||
**Source-Based Labeling:**
|
||||
- **Upload Location**: Apply labels based on upload source or folder
|
||||
- **File Type**: Automatic labels based on file format and structure
|
||||
- **Metadata**: Labels derived from file properties and EXIF data
|
||||
|
||||
#### Workflow Integration
|
||||
|
||||
- **Process Triggers**: Apply labels based on workflow stage completion
|
||||
- **Approval Status**: Automatic labeling based on approval workflows
|
||||
- **Time-Based Rules**: Apply labels based on document age or schedule
|
||||
|
||||
### Label Import/Export
|
||||
|
||||
#### Bulk Label Operations
|
||||
|
||||
**Import Scenarios:**
|
||||
- **Migration**: Import existing label structures from other systems
|
||||
- **Template Application**: Apply predefined label sets to document collections
|
||||
- **Organizational Standards**: Implement company-wide labeling standards
|
||||
|
||||
**Export Capabilities:**
|
||||
- **Backup**: Export label definitions for backup purposes
|
||||
- **Reporting**: Generate reports of label usage and document organization
|
||||
- **Integration**: Share label structures with other systems
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Label Design
|
||||
|
||||
#### Naming Conventions
|
||||
|
||||
1. **Descriptive Names**: Use clear, self-explanatory label names
|
||||
2. **Consistent Format**: Establish and follow naming patterns
|
||||
3. **Avoid Ambiguity**: Choose names that won't be confused with similar concepts
|
||||
4. **Length Consideration**: Keep names concise but informative
|
||||
5. **Special Characters**: Avoid characters such as `/`, `#`, and `@`, which have special meaning in URLs and may cause issues
|
||||
|
||||
**Good Examples:**
|
||||
- "Q1-2024-Budget" ✅
|
||||
- "Legal-Contract-Template" ✅
|
||||
- "Marketing-Campaign-Assets" ✅
|
||||
|
||||
**Poor Examples:**
|
||||
- "Stuff" ❌ (too vague)
|
||||
- "Q1 Budget Documents for 2024 Financial Planning" ❌ (too long)
|
||||
- "Legal/Contract#Template@2024" ❌ (special characters)
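
Some of these conventions can be checked mechanically. The following is a minimal sketch, not something Readur enforces: `check_label_name` is a hypothetical helper, and both the 30-character limit and the allowed character set (letters, digits, spaces, hyphens) are assumptions chosen to fit the examples above. Vague names such as "Stuff" cannot be caught this way and still need human review.

```python
import re

MAX_LENGTH = 30  # assumed limit, not a Readur setting
ALLOWED = re.compile(r"[A-Za-z0-9 -]+")  # letters, digits, spaces, hyphens

def check_label_name(name: str) -> list:
    """Return the convention problems found; an empty list means the name passes."""
    problems = []
    stripped = name.strip()
    if not stripped:
        problems.append("empty")
    if len(stripped) > MAX_LENGTH:
        problems.append("too long")
    if stripped and not ALLOWED.fullmatch(stripped):
        problems.append("special characters")
    return problems
```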
|
||||
|
||||
#### Color Strategy
|
||||
|
||||
1. **Consistent Color Families**: Use similar colors for related label categories
|
||||
2. **High Contrast**: Ensure labels are readable against various backgrounds
|
||||
3. **Color Meaning**: Establish color conventions (e.g., red for urgent, green for completed)
|
||||
4. **Accessibility**: Consider color-blind users when choosing colors
|
||||
5. **Limited Palette**: Don't use too many different colors
|
||||
|
||||
### Organization Strategy
|
||||
|
||||
#### Start Simple
|
||||
|
||||
1. **Basic Categories**: Begin with broad, obvious categories
|
||||
2. **Organic Growth**: Add labels as needs become apparent
|
||||
3. **User Feedback**: Incorporate user suggestions for new labels
|
||||
4. **Regular Review**: Periodically assess and refine label structure
|
||||
|
||||
#### Maintain Consistency
|
||||
|
||||
1. **Documentation**: Document labeling standards and conventions
|
||||
2. **Training**: Educate users on proper labeling practices
|
||||
3. **Regular Cleanup**: Remove unused or redundant labels
|
||||
4. **Standardization**: Ensure consistent application across teams
|
||||
|
||||
### Performance Optimization
|
||||
|
||||
#### Label Management
|
||||
|
||||
1. **Avoid Over-Labeling**: Don't create too many similar labels
|
||||
2. **Regular Cleanup**: Remove unused labels to reduce clutter
|
||||
3. **Search Optimization**: Focus on labels that improve searchability
|
||||
4. **User Training**: Educate users on effective labeling practices
|
||||
|
||||
#### System Performance
|
||||
|
||||
- **Index Optimization**: Labels are indexed for fast search performance
|
||||
- **Bulk Operations**: Use bulk assignment for better efficiency
|
||||
- **Caching**: Frequently used labels are cached for quick access
|
||||
|
||||
## API Integration
|
||||
|
||||
### Label Management API
|
||||
|
||||
#### Creating Labels
|
||||
|
||||
```http
|
||||
POST /api/labels
|
||||
Authorization: Bearer <jwt_token>
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
"name": "Project Documentation",
|
||||
"color": "#2196F3"
|
||||
}
|
||||
```
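
From a script, the same request could be assembled with Python's standard library. This is a sketch under assumptions: `build_create_label_request` is a hypothetical helper, the base URL `http://localhost:8000` and the token are placeholders, and the response format is not shown here because the source does not document it.

```python
import json
import urllib.request

def build_create_label_request(base_url: str, token: str, name: str, color: str):
    """Build (but do not send) the POST /api/labels request shown above."""
    body = json.dumps({"name": name, "color": color}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/labels",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholder values; replace with your server address and a real JWT
request = build_create_label_request(
    "http://localhost:8000", "<jwt_token>", "Project Documentation", "#2196F3"
)
# To send it: urllib.request.urlopen(request)
```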
|
||||
|
||||
#### Listing Labels
|
||||
|
||||
```http
|
||||
GET /api/labels
|
||||
Authorization: Bearer <jwt_token>
|
||||
```
|
||||
|
||||
Response:
|
||||
```json
|
||||
{
|
||||
"labels": [
|
||||
{
|
||||
"id": "550e8400-e29b-41d4-a716-446655440000",
|
||||
"name": "Project Documentation",
|
||||
"color": "#2196F3",
|
||||
"document_count": 42,
|
||||
"created_at": "2024-01-01T00:00:00Z"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
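
One use of this response is finding cleanup candidates. The sketch below assumes the response shape shown above (a top-level `labels` array whose entries carry `name` and `document_count`); `unused_labels` is a hypothetical helper, not part of any Readur client.

```python
import json

def unused_labels(response_body: str) -> list:
    """Return the names of labels that have no documents assigned."""
    payload = json.loads(response_body)
    return [
        label["name"]
        for label in payload["labels"]
        if label["document_count"] == 0
    ]
```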
|
||||
|
||||
#### Assigning Labels to Documents
|
||||
|
||||
```http
|
||||
PATCH /api/documents/{document_id}
|
||||
Authorization: Bearer <jwt_token>
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
"labels": ["Project Documentation", "Q1 2024", "High Priority"]
|
||||
}
|
||||
```
|
||||
|
||||
### Search Integration
|
||||
|
||||
#### Label-Based Search
|
||||
|
||||
```http
|
||||
GET /api/search?query=invoice&labels=urgent,review
|
||||
Authorization: Bearer <jwt_token>
|
||||
```
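
When building this URL in code, note that a default URL encoder escapes the comma between labels. The sketch below keeps the comma literal so the result matches the form shown above; `build_search_url` is a hypothetical helper and the base URL is a placeholder. It assumes label names themselves contain no commas.

```python
from urllib.parse import urlencode

def build_search_url(base_url: str, query: str, labels: list) -> str:
    """Build the GET /api/search URL with a comma-separated labels parameter."""
    params = {"query": query, "labels": ",".join(labels)}
    # safe="," leaves the separator unescaped; spaces in names become '+'
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params, safe=',')}"
```

For instance, `build_search_url("http://localhost:8000", "invoice", ["urgent", "review"])` produces a URL ending in `/api/search?query=invoice&labels=urgent,review`.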
|
||||
|
||||
#### Advanced Label Queries
|
||||
|
||||
```http
|
||||
POST /api/search/advanced
|
||||
Authorization: Bearer <jwt_token>
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
"query": "budget",
|
||||
"filters": {
|
||||
"labels": ["Q1 2024", "Finance"],
|
||||
"label_logic": "AND"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Configure [advanced search](advanced-search.md) with label-based filtering
|
||||
- Set up [sources](sources-guide.md) with automatic labeling rules
|
||||
- Explore [user management](user-management-guide.md) for collaborative labeling
|
||||
- Review [API reference](api-reference.md) for programmatic label management
|
||||
- Check [best practices](user-guide.md#tips-for-best-results) for document organization
|
||||
|
|
@ -0,0 +1,498 @@
|
|||
# Sources Guide
|
||||
|
||||
Readur's Sources feature automates document ingestion from WebDAV servers, local folders, and S3-compatible storage. This guide covers each supported source type and how to configure it.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Source Types](#source-types)
|
||||
- [WebDAV Sources](#webdav-sources)
|
||||
- [Local Folder Sources](#local-folder-sources)
|
||||
- [S3 Sources](#s3-sources)
|
||||
- [Getting Started](#getting-started)
|
||||
- [Configuration](#configuration)
|
||||
- [Sync Operations](#sync-operations)
|
||||
- [Health Monitoring](#health-monitoring)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
- [Best Practices](#best-practices)
|
||||
|
||||
## Overview
|
||||
|
||||
Sources allow Readur to automatically discover, download, and process documents from external storage systems. Key features include:
|
||||
|
||||
- **Multi-Protocol Support**: WebDAV, Local Folders, and S3-compatible storage
|
||||
- **Automated Syncing**: Scheduled synchronization with configurable intervals
|
||||
- **Health Monitoring**: Proactive monitoring and validation of source connections
|
||||
- **Intelligent Processing**: Duplicate detection, incremental syncs, and OCR integration
|
||||
- **Real-time Status**: Live sync progress and comprehensive statistics
|
||||
|
||||
### How Sources Work
|
||||
|
||||
1. **Configuration**: Set up a source with connection details and preferences
|
||||
2. **Discovery**: Readur scans the source for supported file types
|
||||
3. **Synchronization**: New and changed files are downloaded and processed
|
||||
4. **OCR Processing**: Documents are automatically queued for text extraction
|
||||
5. **Search Integration**: Processed documents become searchable in your collection
|
||||
|
||||
## Source Types
|
||||
|
||||
### WebDAV Sources
|
||||
|
||||
WebDAV sources connect to cloud storage services and self-hosted servers that support the WebDAV protocol.
|
||||
|
||||
#### Supported WebDAV Servers
|
||||
|
||||
| Server Type | Status | Notes |
|
||||
|-------------|--------|-------|
|
||||
| **Nextcloud** | ✅ Fully Supported | Optimized discovery and authentication |
|
||||
| **ownCloud** | ✅ Fully Supported | Native integration with server detection |
|
||||
| **Apache WebDAV** | ✅ Supported | Generic WebDAV implementation |
|
||||
| **nginx WebDAV** | ✅ Supported | Works with nginx dav module |
|
||||
| **Box.com** | ⚠️ Limited | Basic WebDAV support |
|
||||
| **Other WebDAV** | ✅ Supported | Generic WebDAV protocol compliance |
|
||||
|
||||
#### WebDAV Configuration
|
||||
|
||||
**Required Fields:**
|
||||
- **Name**: Descriptive name for the source
|
||||
- **Server URL**: Full WebDAV server URL (e.g., `https://cloud.example.com/remote.php/dav/files/username/`)
|
||||
- **Username**: WebDAV authentication username
|
||||
- **Password**: WebDAV authentication password or app password
|
||||
|
||||
**Optional Configuration:**
|
||||
- **Watch Folders**: Specific directories to monitor (leave empty to sync entire accessible space)
|
||||
- **File Extensions**: Limit to specific file types (default: all supported types)
|
||||
- **Auto Sync**: Enable automatic scheduled synchronization
|
||||
- **Sync Interval**: How often to check for changes (15 minutes to 24 hours)
|
||||
- **Server Type**: Set the server type explicitly to enable server-specific optimizations (auto-detected by default)
|
||||
|
||||
#### Setting Up WebDAV Sources
|
||||
|
||||
1. **Navigate to Sources**: Go to Settings → Sources in the Readur interface
|
||||
2. **Add New Source**: Click "Add Source" and select "WebDAV"
|
||||
3. **Configure Connection**:
|
||||
```
|
||||
Name: My Nextcloud Documents
|
||||
Server URL: https://cloud.mycompany.com/remote.php/dav/files/john/
|
||||
Username: john
|
||||
Password: app-password-here
|
||||
```
|
||||
4. **Test Connection**: Use the "Test Connection" button to verify credentials
|
||||
5. **Configure Folders**: Specify directories to monitor:
|
||||
```
|
||||
Watch Folders:
|
||||
- Documents/
|
||||
- Projects/2024/
|
||||
- Invoices/
|
||||
```
|
||||
6. **Set Sync Schedule**: Choose automatic sync interval (recommended: 30 minutes)
|
||||
7. **Save and Sync**: Save configuration and trigger initial sync
|
||||
|
||||
#### WebDAV Best Practices
|
||||
|
||||
- **Use App Passwords**: Create dedicated app passwords instead of using main account passwords
|
||||
- **Limit Scope**: Specify watch folders to avoid syncing unnecessary files
|
||||
- **Server Optimization**: Let Readur auto-detect server type for optimal performance
|
||||
- **Network Considerations**: Use longer sync intervals for slow connections
|
||||
|
||||
### Local Folder Sources
|
||||
|
||||
Local folder sources monitor directories on the Readur server's filesystem, including mounted network drives.
|
||||
|
||||
#### Use Cases
|
||||
|
||||
- **Watch Folders**: Monitor directories where documents are dropped
|
||||
- **Network Mounts**: Sync from NFS, SMB/CIFS, or other mounted filesystems
|
||||
- **Batch Processing**: Automatically process documents placed in specific folders
|
||||
- **Archive Integration**: Monitor existing document archives
|
||||
|
||||
#### Local Folder Configuration
|
||||
|
||||
**Required Fields:**
|
||||
- **Name**: Descriptive name for the source
|
||||
- **Watch Folders**: Absolute paths of the directories to monitor
|
||||
|
||||
**Optional Configuration:**
|
||||
- **File Extensions**: Filter by specific file types
|
||||
- **Auto Sync**: Enable scheduled monitoring
|
||||
- **Sync Interval**: Frequency of directory scans
|
||||
- **Recursive**: Include subdirectories in scans
|
||||
- **Follow Symlinks**: Follow symbolic links (use with caution)
|
||||
|
||||
#### Setting Up Local Folder Sources
|
||||
|
||||
1. **Prepare Directory**: Ensure the directory exists and is accessible
|
||||
```bash
|
||||
# Create watch folder
|
||||
mkdir -p /mnt/documents/inbox
|
||||
|
||||
# Set permissions (if needed)
|
||||
chmod 755 /mnt/documents/inbox
|
||||
```
|
||||
|
||||
2. **Configure Source**:
|
||||
```
|
||||
Name: Document Inbox
|
||||
Watch Folders: /mnt/documents/inbox
|
||||
File Extensions: pdf,jpg,png,txt,docx
|
||||
Auto Sync: Enabled
|
||||
Sync Interval: 5 minutes
|
||||
Recursive: Yes
|
||||
```
|
||||
|
||||
3. **Test Setup**: Place a test document in the folder and verify detection
|
||||
|
||||
#### Network Mount Examples
|
||||
|
||||
**NFS Mount:**
|
||||
```bash
|
||||
# Mount NFS share
|
||||
sudo mount -t nfs 192.168.1.100:/documents /mnt/nfs-docs
|
||||
|
||||
# Configure in Readur
|
||||
#   Watch Folders: /mnt/nfs-docs/inbox
|
||||
```
|
||||
|
||||
**SMB/CIFS Mount:**
|
||||
```bash
|
||||
# Mount SMB share
|
||||
sudo mount -t cifs //server/documents /mnt/smb-docs -o username=user
|
||||
|
||||
# Configure in Readur
|
||||
#   Watch Folders: /mnt/smb-docs/processing
|
||||
```
|
||||
|
||||
### S3 Sources
|
||||
|
||||
S3 sources connect to Amazon S3 or S3-compatible storage services for document synchronization.
|
||||
|
||||
#### Supported S3 Services
|
||||
|
||||
| Service | Status | Configuration |
|
||||
|---------|--------|---------------|
|
||||
| **Amazon S3** | ✅ Fully Supported | Standard AWS configuration |
|
||||
| **MinIO** | ✅ Fully Supported | Custom endpoint URL |
|
||||
| **DigitalOcean Spaces** | ✅ Supported | S3-compatible API |
|
||||
| **Wasabi** | ✅ Supported | Custom endpoint configuration |
|
||||
| **Google Cloud Storage** | ⚠️ Limited | S3-compatible mode only |
|
||||
|
||||
#### S3 Configuration
|
||||
|
||||
**Required Fields:**
|
||||
- **Name**: Descriptive name for the source
|
||||
- **Bucket Name**: S3 bucket to monitor
|
||||
- **Region**: AWS region (e.g., `us-east-1`)
|
||||
- **Access Key ID**: AWS/S3 access key
|
||||
- **Secret Access Key**: AWS/S3 secret key
|
||||
|
||||
**Optional Configuration:**
|
||||
- **Endpoint URL**: Custom endpoint for S3-compatible services
|
||||
- **Prefix**: Bucket path prefix to limit scope
|
||||
- **Watch Folders**: Specific S3 "directories" to monitor
|
||||
- **File Extensions**: Filter by file types
|
||||
- **Auto Sync**: Enable scheduled synchronization
|
||||
- **Sync Interval**: Frequency of bucket scans
|
||||
|
||||
#### Setting Up S3 Sources
|
||||
|
||||
1. **Prepare S3 Bucket**: Ensure bucket exists and credentials have access
|
||||
2. **Configure Source**:
|
||||
```
|
||||
Name: Company Documents S3
|
||||
Bucket Name: company-documents
|
||||
Region: us-west-2
|
||||
Access Key ID: AKIAIOSFODNN7EXAMPLE
|
||||
Secret Access Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
|
||||
Prefix: documents/
|
||||
Watch Folders:
|
||||
- invoices/
|
||||
- contracts/
|
||||
- reports/
|
||||
```
|
||||
|
||||
3. **Test Connection**: Verify credentials and bucket access
|
||||
|
||||
#### S3-Compatible Services
|
||||
|
||||
**MinIO Configuration:**
|
||||
```
|
||||
Endpoint URL: https://minio.example.com:9000
|
||||
Bucket Name: documents
|
||||
Region: us-east-1 (can be any value for MinIO)
|
||||
```
|
||||
|
||||
**DigitalOcean Spaces:**
|
||||
```
|
||||
Endpoint URL: https://nyc3.digitaloceanspaces.com
|
||||
Bucket Name: my-documents
|
||||
Region: nyc3
|
||||
```
|
||||
|
||||
## Getting Started
|
||||
|
||||
### Adding Your First Source
|
||||
|
||||
1. **Access Sources Management**: Navigate to Settings → Sources
|
||||
2. **Choose Source Type**: Select WebDAV, Local Folder, or S3 based on your needs
|
||||
3. **Configure Connection**: Enter required credentials and connection details
|
||||
4. **Test Connection**: Verify connectivity before saving
|
||||
5. **Configure Sync**: Set up folders to monitor and sync schedule
|
||||
6. **Initial Sync**: Trigger first synchronization to import existing documents
|
||||
|
||||
### Quick Setup Examples
|
||||
|
||||
#### Nextcloud WebDAV
|
||||
```
|
||||
Name: Nextcloud Documents
|
||||
Server URL: https://cloud.company.com/remote.php/dav/files/username/
|
||||
Username: username
|
||||
Password: app-password
|
||||
Watch Folders: Documents/, Shared/
|
||||
Auto Sync: Every 30 minutes
|
||||
```
|
||||
|
||||
#### Local Network Drive
|
||||
```
|
||||
Name: Network Archive
|
||||
Watch Folders: /mnt/network/documents
|
||||
File Extensions: pdf,doc,docx,txt
|
||||
Recursive: Yes
|
||||
Auto Sync: Every 15 minutes
|
||||
```
|
||||
|
||||
#### AWS S3 Bucket
|
||||
```
|
||||
Name: AWS Document Bucket
|
||||
Bucket Name: company-docs-bucket
|
||||
Region: us-east-1
|
||||
Access Key ID: [AWS Access Key ID]
|
||||
Secret Access Key: [AWS Secret Access Key]
|
||||
Prefix: active-documents/
|
||||
Auto Sync: Every 1 hour
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### Sync Settings
|
||||
|
||||
**Sync Intervals:**
|
||||
- **Real-time**: Immediate processing (local folders only)
|
||||
- **5-15 minutes**: High-frequency monitoring
|
||||
- **30-60 minutes**: Standard monitoring (recommended)
|
||||
- **2-24 hours**: Low-frequency, large dataset sync
|
||||
|
||||
**File Filtering:**
|
||||
- **File Extensions**: `pdf,jpg,jpeg,png,txt,doc,docx,rtf`
|
||||
- **Size Limits**: Configurable maximum file size (default: 50MB)
|
||||
- **Path Exclusions**: Skip specific directories or file patterns
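
The filtering rules above amount to a simple per-file check. The following sketch is illustrative only: `should_sync` is a hypothetical helper, the extension list is copied from the bullet above, the 50MB default is interpreted as 50 × 1024 × 1024 bytes (an assumption), and exclusions are modeled as directory names rather than full patterns.

```python
from pathlib import PurePosixPath

DEFAULT_EXTENSIONS = {"pdf", "jpg", "jpeg", "png", "txt", "doc", "docx", "rtf"}
MAX_SIZE_BYTES = 50 * 1024 * 1024  # assumed reading of the 50MB default

def should_sync(path, size_bytes, extensions=DEFAULT_EXTENSIONS,
                max_size=MAX_SIZE_BYTES, excluded_dirs=()):
    """Return True if a discovered file passes the extension, size, and path filters."""
    p = PurePosixPath(path)
    if p.suffix.lower().lstrip(".") not in extensions:
        return False  # unsupported or missing extension
    if size_bytes > max_size:
        return False  # larger than the configured limit
    # Skip the file if any parent directory is on the exclusion list
    return not any(part in excluded_dirs for part in p.parts[:-1])
```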
|
||||
|
||||
### Advanced Configuration
|
||||
|
||||
**Concurrency Settings:**
|
||||
- **Concurrent Files**: Number of files processed simultaneously (default: 5)
|
||||
- **Network Timeout**: Connection timeout for network sources
|
||||
- **Retry Logic**: Automatic retry for failed downloads
|
||||
|
||||
**Deduplication:**
|
||||
- **Hash-based**: SHA-256 content hashing prevents duplicate storage
|
||||
- **Cross-source**: Duplicates detected across all sources
|
||||
- **Metadata Preservation**: Tracks file origins while avoiding storage duplication
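
The idea behind hash-based deduplication can be sketched in a few lines. This is not Readur's implementation: `DedupIndex` is a hypothetical in-memory structure that stores each distinct content hash once while recording every source and path where that content was seen.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(data).hexdigest()

class DedupIndex:
    def __init__(self):
        self._origins = {}  # content hash -> list of (source, path)

    def add(self, source: str, path: str, data: bytes) -> bool:
        """Record a file; return True only if its content has not been seen before."""
        digest = content_hash(data)
        is_new = digest not in self._origins
        self._origins.setdefault(digest, []).append((source, path))
        return is_new

    def origins(self, data: bytes) -> list:
        """List every (source, path) pair where this content was found."""
        return list(self._origins.get(content_hash(data), []))
```

Because the key is the content hash rather than the file name, the same invoice arriving from a WebDAV share and an S3 bucket would be stored once but keep both origins.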
|
||||
|
||||
## Sync Operations
|
||||
|
||||
### Manual Sync
|
||||
|
||||
**Trigger Immediate Sync:**
|
||||
1. Navigate to Sources page
|
||||
2. Find the source to sync
|
||||
3. Click the "Sync Now" button
|
||||
4. Monitor progress in real-time
|
||||
|
||||
**Deep Scan:**
|
||||
- Forces a complete re-scan of the entire source
|
||||
- Useful for detecting changes in large directories
|
||||
- Automatically triggered periodically
|
||||
|
||||
### Sync Status
|
||||
|
||||
**Status Indicators:**
|
||||
- 🟢 **Idle**: Source ready, no sync in progress
|
||||
- 🟡 **Syncing**: Active synchronization in progress
|
||||
- 🔴 **Error**: Sync failed, requires attention
|
||||
- ⚪ **Disabled**: Source disabled, no automatic sync
|
||||
|
||||
**Progress Information:**
|
||||
- Files discovered vs. processed
|
||||
- Current operation (scanning, downloading, processing)
|
||||
- Estimated completion time
|
||||
- Transfer speeds and statistics
|
||||
|
||||
### Stopping Sync
|
||||
|
||||
**Graceful Cancellation:**
|
||||
1. Click "Stop Sync" button during active sync
|
||||
2. Current file processing completes
|
||||
3. Sync stops cleanly without corruption
|
||||
4. Partial progress is saved
|
||||
|
||||
## Health Monitoring
|
||||
|
||||
### Health Scores
|
||||
|
||||
Sources are continuously monitored and assigned health scores (0-100):
|
||||
|
||||
- **90-100**: ✅ Excellent - No issues detected
|
||||
- **75-89**: ⚠️ Good - Minor issues or warnings
|
||||
- **50-74**: ⚠️ Fair - Moderate issues requiring attention
|
||||
- **25-49**: ❌ Poor - Significant problems
|
||||
- **0-24**: ❌ Critical - Severe issues, manual intervention required
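
For scripts that consume health scores, the bands above map to a small lookup. `health_band` is a hypothetical helper; how Readur computes the score itself is not described here.

```python
def health_band(score: int) -> str:
    """Map a 0-100 health score to the band names listed above."""
    if not 0 <= score <= 100:
        raise ValueError("health score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Good"
    if score >= 50:
        return "Fair"
    if score >= 25:
        return "Poor"
    return "Critical"
```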
|
||||
|
||||
### Health Checks
|
||||
|
||||
**Automatic Validation** (every 30 minutes):
|
||||
- Connection testing
|
||||
- Credential verification
|
||||
- Configuration validation
|
||||
- Sync pattern analysis
|
||||
- Error rate monitoring
|
||||
|
||||
**Common Health Issues:**
|
||||
- Authentication failures
|
||||
- Network connectivity problems
|
||||
- Permission or access issues
|
||||
- Configuration errors
|
||||
- Rate limiting or throttling
|
||||
|
||||
### Health Notifications
|
||||
|
||||
**Alert Types:**
|
||||
- Connection failures
|
||||
- Authentication expiry
|
||||
- Sync errors
|
||||
- Performance degradation
|
||||
- Configuration warnings
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
#### WebDAV Connection Problems
|
||||
|
||||
**Symptom**: "Connection failed" or authentication errors
|
||||
**Solutions**:
|
||||
1. Verify server URL format:
|
||||
- Nextcloud: `https://server.com/remote.php/dav/files/username/`
|
||||
- ownCloud: `https://server.com/remote.php/dav/files/username/`
|
||||
- Generic: `https://server.com/webdav/`
|
||||
|
||||
2. Check credentials:
|
||||
- Use app passwords instead of main passwords
|
||||
- Verify username/password combination
|
||||
- Test credentials in web browser or WebDAV client
|
||||
|
||||
3. Network issues:
|
||||
- Verify server is accessible from Readur
|
||||
- Check firewall and SSL certificate issues
|
||||
- Test with curl: `curl -u username:password https://server.com/webdav/`
|
||||
|
||||
#### Local Folder Issues
|
||||
|
||||
**Symptom**: "Permission denied" or "Directory not found"
|
||||
**Solutions**:
|
||||
1. Check directory permissions:
|
||||
```bash
|
||||
ls -la /path/to/watch/folder
|
||||
chmod 755 /path/to/watch/folder # If needed
|
||||
```
|
||||
|
||||
2. Verify path exists:
|
||||
```bash
|
||||
stat /path/to/watch/folder
|
||||
```
|
||||
|
||||
3. For network mounts:
|
||||
```bash
|
||||
mount | grep /path/to/mount # Verify mount
|
||||
ls -la /path/to/mount # Test access
|
||||
```
|
||||
|
||||
#### S3 Access Problems
|
||||
|
||||
**Symptom**: "Access denied" or "Bucket not found"
|
||||
**Solutions**:
|
||||
1. Verify credentials and permissions:
|
||||
```bash
|
||||
aws s3 ls s3://bucket-name --profile your-profile
|
||||
```
|
||||
|
||||
2. Check bucket policy and IAM permissions
|
||||
3. Verify region configuration matches bucket region
|
||||
4. For S3-compatible services, ensure correct endpoint URL
|
||||
|
||||
### Performance Issues
|
||||
|
||||
#### Slow Sync Performance
|
||||
|
||||
**Causes and Solutions**:
|
||||
1. **Large file sizes**: Increase timeout values, consider file size limits
|
||||
2. **Network latency**: Reduce concurrent connections, increase intervals
|
||||
3. **Server throttling**: Implement longer delays between requests
|
||||
4. **Large directories**: Use watch folders to limit scope
|
||||
|
||||
#### High Resource Usage
|
||||
|
||||
**Optimization Strategies**:
|
||||
1. **Reduce concurrency**: Lower concurrent file processing
|
||||
2. **Increase intervals**: Less frequent sync checks
|
||||
3. **Filter files**: Limit to specific file types and sizes
|
||||
4. **Stagger syncs**: Avoid multiple sources syncing simultaneously
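One simple way to stagger syncs is to spread the start times evenly across the sync interval. The sketch below only computes the offsets; how they are applied depends on each source's schedule settings. The name `stagger_offsets` is hypothetical.

```bash
# Print one start offset (in minutes) per source, spread evenly
# across the sync interval.
# Usage: stagger_offsets <source-count> <interval-minutes>
# NOTE: illustrative helper only, not a Readur command.
stagger_offsets() (
  count=$1 interval=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    # Integer division keeps every offset inside the interval.
    echo $((i * interval / count))
    i=$((i + 1))
  done
)
```

With four sources on a 60-minute interval, `stagger_offsets 4 60` prints 0, 15, 30, and 45, so no two sources start in the same quarter hour.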
|
||||
|
||||
### Error Recovery
|
||||
|
||||
**Automatic Recovery:**
|
||||
- Failed files are automatically retried
|
||||
- Temporary network issues are handled gracefully
|
||||
- Sync resumes from last successful point
|
||||
|
||||
**Manual Recovery:**
|
||||
1. Check source health status
|
||||
2. Review error logs in source details
|
||||
3. Test connection manually
|
||||
4. Trigger deep scan to reset sync state
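When testing a connection manually (step 3), a flaky network can make a single attempt misleading. The sketch below retries a command with a doubling delay. It illustrates the general technique only and is not Readur's internal retry logic; the name `retry` is hypothetical.

```bash
# Retry a command up to <attempts> times, doubling the delay each time.
# Usage: retry <attempts> <initial-delay-seconds> <command> [args...]
# NOTE: illustrative helper only, not how Readur retries internally.
retry() (
  attempts=$1 delay=$2
  shift 2
  n=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && exit 0
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      exit 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    n=$((n + 1))
  done
)
```

For example, `retry 3 5 curl -fsS -o /dev/null -u username:password https://server.com/webdav/` makes up to three attempts, waiting 5 and then 10 seconds between them.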
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Security
|
||||
|
||||
1. **Use Dedicated Credentials**: Create app-specific passwords and access keys
|
||||
2. **Limit Permissions**: Grant minimum required access to source accounts
|
||||
3. **Regular Rotation**: Periodically update passwords and access keys
|
||||
4. **Network Security**: Use HTTPS/TLS for all connections
|
||||
|
||||
### Performance
|
||||
|
||||
1. **Strategic Scheduling**: Stagger sync times for multiple sources
|
||||
2. **Scope Limitation**: Use watch folders to limit sync scope
|
||||
3. **File Filtering**: Exclude unnecessary file types and large files
|
||||
4. **Monitor Resources**: Watch CPU, memory, and network usage
|
||||
|
||||
### Organization
|
||||
|
||||
1. **Descriptive Names**: Use clear, descriptive source names
|
||||
2. **Consistent Structure**: Maintain consistent folder organization
|
||||
3. **Documentation**: Document source purposes and configurations
|
||||
4. **Regular Maintenance**: Periodically review and clean up sources
|
||||
|
||||
### Reliability
|
||||
|
||||
1. **Health Monitoring**: Regularly check source health scores
|
||||
2. **Backup Configuration**: Document source configurations
|
||||
3. **Test Scenarios**: Periodically test sync and recovery procedures
|
||||
4. **Monitor Logs**: Review sync logs for patterns or issues
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Configure [notifications](notifications.md) for sync events
|
||||
- Set up [advanced search](advanced-search.md) to find synced documents
|
||||
- Review [OCR optimization](dev/OCR_OPTIMIZATION_GUIDE.md) for processing improvements
|
||||
- Explore [labels and organization](labels-and-organization.md) for document management
|
||||
|
|
@ -10,11 +10,12 @@ A comprehensive guide to using Readur's features for document management, OCR pr
|
|||
- [Dashboard](#dashboard)
|
||||
- [Document Management](#document-management)
|
||||
- [Advanced Search](#advanced-search)
|
||||
- [Folder Watching](#folder-watching)
|
||||
- [Sources and Synchronization](#sources-and-synchronization)
|
||||
- [Document Upload](#document-upload)
|
||||
- [OCR Processing](#ocr-processing)
|
||||
- [Search Features](#search-features)
|
||||
- [Tags and Organization](#tags-and-organization)
|
||||
- [Labels and Organization](#labels-and-organization)
|
||||
- [User Management](#user-management)
|
||||
- [User Settings](#user-settings)
|
||||
- [Tips for Best Results](#tips-for-best-results)
|
||||
|
||||
|
|
@ -117,20 +118,30 @@ tag:important invoice # Search within tagged documents
|
|||
type:pdf contract # Search only PDFs
|
||||
```
|
||||
|
||||
### Folder Watching
|
||||
### Sources and Synchronization
|
||||
|
||||
The folder watching feature automatically imports documents:
|
||||
Readur's Sources feature provides automated document ingestion from multiple external storage systems:
|
||||
|
||||
1. **Non-destructive**: Source files remain untouched
|
||||
2. **Automatic Processing**: New files are detected and processed
|
||||
3. **Configurable Intervals**: Adjust scan frequency
|
||||
4. **Multiple Sources**: Watch local folders, network drives, cloud storage
|
||||
1. **Multi-Protocol Support**: WebDAV, Local Folders, and S3-compatible storage
|
||||
2. **Non-destructive**: Source files remain untouched in their original locations
|
||||
3. **Automated Syncing**: Scheduled synchronization with configurable intervals
|
||||
4. **Health Monitoring**: Proactive monitoring and validation of source connections
|
||||
5. **Intelligent Processing**: Duplicate detection, incremental syncs, and OCR integration
|
||||
|
||||
#### Setting Up Watch Folders
|
||||
1. Go to Settings → Sources
|
||||
2. Add a new source with type "Local Folder"
|
||||
3. Configure the path and scan interval
|
||||
4. Enable/disable the source as needed
|
||||
#### Supported Source Types
|
||||
|
||||
- **WebDAV Sources**: Nextcloud, ownCloud, generic WebDAV servers
|
||||
- **Local Folder Sources**: Local filesystem directories and network mounts
|
||||
- **S3 Sources**: Amazon S3 and S3-compatible storage (MinIO, DigitalOcean Spaces)
|
||||
|
||||
#### Setting Up Sources
|
||||
1. Navigate to Settings → Sources
|
||||
2. Click "Add Source" and select source type
|
||||
3. Configure connection details and credentials
|
||||
4. Test connection and configure sync settings
|
||||
5. Set up folders to monitor and sync schedule
|
||||
|
||||
> 📖 **For comprehensive source configuration**, see the [Sources Guide](sources-guide.md)
|
||||
|
||||
## Document Upload
|
||||
|
||||
|
|
@ -171,43 +182,147 @@ The folder watching feature automatically imports documents:
|
|||
|
||||
## Search Features
|
||||
|
||||
### Quick Search
|
||||
Readur provides powerful search capabilities with multiple modes and advanced filtering options.
|
||||
|
||||
### Search Modes
|
||||
|
||||
- **Simple Search**: General purpose searching with automatic stemming and fuzzy matching
|
||||
- **Phrase Search**: Find exact phrases using quotes (e.g., `"quarterly report"`)
|
||||
- **Fuzzy Search**: Handle typos and OCR errors with approximate matching (e.g., `invoice~`)
|
||||
- **Boolean Search**: Complex queries with AND, OR, NOT operators
|
||||
|
||||
### Search Interface
|
||||
|
||||
#### Quick Search
|
||||
- Available in the header on all pages
|
||||
- Instant results as you type
|
||||
- Shows top 5 matches with snippets
|
||||
- Real-time suggestions
|
||||
|
||||
### Advanced Search Page
|
||||
#### Advanced Search Page
|
||||
- Full search interface with all filters
|
||||
- Multiple search modes selector
|
||||
- Comprehensive filtering options
|
||||
- Export search results
|
||||
- Save frequently used searches
|
||||
- Search history
|
||||
- Search history and analytics
|
||||
|
||||
### Advanced Filtering
|
||||
|
||||
- **File Types**: Filter by PDF, images, documents, etc.
|
||||
- **Date Ranges**: Search within specific time periods
|
||||
- **Labels**: Filter by document tags and categories
|
||||
- **Sources**: Search within specific sync sources
|
||||
- **File Size**: Filter by document size ranges
|
||||
- **OCR Status**: Filter by text extraction status
|
||||
|
||||
### Search Tips
|
||||
1. Use quotes for exact phrases
|
||||
2. Combine filters for precise results
|
||||
3. Use wildcards: `inv*` matches invoice, inventory
|
||||
4. Search in specific fields: `filename:report`
|
||||
1. Use quotes for exact phrases: `"project status"`
|
||||
2. Combine text search with filters for precision
|
||||
3. Use wildcards: `proj*` matches project, projects, projection
|
||||
4. Search specific fields: `filename:report`, `label:urgent`
|
||||
5. Use boolean logic: `(budget OR financial) AND 2024`
|
||||
|
||||
## Tags and Organization
|
||||
> 🔍 **For detailed search techniques**, see the [Advanced Search Guide](advanced-search.md)
|
||||
|
||||
### Creating Tags
|
||||
1. Select document(s)
|
||||
2. Click "Add Tag"
|
||||
3. Enter tag name or select existing
|
||||
4. Tags are color-coded for easy identification
|
||||
## Labels and Organization
|
||||
|
||||
### Tag Management
|
||||
- Rename tags globally
|
||||
- Merge similar tags
|
||||
- Delete unused tags
|
||||
- Set tag colors
|
||||
Readur's labeling system provides comprehensive document organization and categorization capabilities.
|
||||
|
||||
### Label Types
|
||||
|
||||
- **User Labels**: Custom labels created and managed by users with full control
|
||||
- **System Labels**: Automatic labels generated by Readur (OCR status, file type, etc.)
|
||||
- **Color Coding**: Visual identification with customizable label colors
|
||||
- **Hierarchical Structure**: Organize labels in categories and subcategories
|
||||
|
||||
### Creating and Managing Labels
|
||||
|
||||
#### Creating Labels
|
||||
1. **Via Settings**: Go to Settings → Labels and click "Create Label"
|
||||
2. **During Upload**: Add labels while uploading documents
|
||||
3. **Document Details**: Add labels directly from document pages
|
||||
4. **Bulk Operations**: Create and assign labels to multiple documents
|
||||
|
||||
#### Label Operations
|
||||
- **Rename**: Change label names (updates all documents)
|
||||
- **Merge**: Combine similar labels into one
|
||||
- **Color Management**: Customize label colors for visual organization
|
||||
- **Bulk Assignment**: Apply labels to multiple documents at once
|
||||
|
||||
### Organization Strategies
|
||||
|
||||
#### Category-Based Organization
|
||||
- **Projects**: "Project Alpha", "Q1 Budget", "Infrastructure"
|
||||
- **Departments**: "HR", "Finance", "Legal", "Marketing"
|
||||
- **Document Types**: "Invoices", "Contracts", "Reports", "Policies"
|
||||
- **Status**: "Draft", "Final", "Approved", "Archived"
|
||||
|
||||
#### Time-Based Organization
|
||||
- **Fiscal Periods**: "Q1 2024", "FY2024", "Annual Review"
|
||||
- **Project Phases**: "Planning", "Implementation", "Review"
|
||||
- **Event-Based**: "Pre-Launch", "Launch", "Post-Launch"
|
||||
|
||||
### Smart Collections
|
||||
Create saved searches based on:
|
||||
- Tag combinations
|
||||
- Date ranges
|
||||
- File types
|
||||
- Custom criteria
|
||||
Create saved searches that automatically include documents with specific labels:
|
||||
- **Active Projects**: Documents with current project labels
|
||||
- **Pending Review**: Documents labeled for review
|
||||
- **High Priority**: Documents with urgent or critical labels
|
||||
|
||||
> 🏷️ **For comprehensive labeling strategies**, see the [Labels and Organization Guide](labels-and-organization.md)
|
||||
|
||||
## User Management
|
||||
|
||||
Readur provides comprehensive user management with support for both local authentication and enterprise SSO integration.
|
||||
|
||||
### Authentication Methods
|
||||
|
||||
#### Local Authentication
|
||||
- **Traditional Login**: Username and password authentication
|
||||
- **Secure Storage**: Passwords hashed with bcrypt for security
|
||||
- **Self Registration**: Users can create their own accounts (if enabled)
|
||||
|
||||
#### OIDC/SSO Authentication
|
||||
- **Enterprise Integration**: Single Sign-On with corporate identity providers
|
||||
- **Supported Providers**: Microsoft Azure AD, Google Workspace, Okta, Auth0, Keycloak
|
||||
- **Automatic Provisioning**: User accounts created automatically on first login
|
||||
- **Seamless Experience**: Users authenticate with existing corporate credentials
|
||||
|
||||
### User Roles and Permissions
|
||||
|
||||
#### User Role
|
||||
Standard users with access to core document management functionality:
|
||||
- Upload and manage documents
|
||||
- Search and view documents
|
||||
- Configure personal settings
|
||||
- Create and manage labels
|
||||
- Set up personal sources
|
||||
|
||||
#### Admin Role
|
||||
Administrators with full system access and user management capabilities:
|
||||
- **User Management**: Create, modify, and delete user accounts
|
||||
- **System Settings**: Configure global system parameters
|
||||
- **Role Management**: Assign and modify user roles
|
||||
- **System Monitoring**: View system health and performance metrics
|
||||
|
||||
### Administrative Features
|
||||
|
||||
Administrators can access user management via Settings → Users:
|
||||
- **Create Users**: Add new user accounts with role assignment
|
||||
- **Modify Users**: Update user information, roles, and passwords
|
||||
- **User Overview**: View all users with creation dates and roles
|
||||
- **Authentication Methods**: Manage both local and OIDC users
|
||||
- **Bulk Operations**: Perform operations on multiple users
|
||||
|
||||
### Mixed Authentication Environments
|
||||
|
||||
Readur supports both local and OIDC users in the same installation:
|
||||
- Local admin accounts for system management
|
||||
- OIDC user accounts for regular enterprise users
|
||||
- Flexible role assignment regardless of authentication method
|
||||
|
||||
> 👥 **For detailed user administration**, see the [User Management Guide](user-management-guide.md)
|
||||
> 🔐 **For OIDC configuration**, see the [OIDC Setup Guide](oidc-setup.md)
|
||||
|
||||
## User Settings
|
||||
|
||||
|
|
@ -276,7 +391,21 @@ Create saved searches based on:
|
|||
|
||||
## Next Steps
|
||||
|
||||
- Explore the [API Reference](api-reference.md) for automation
|
||||
- Learn about [advanced configuration](configuration.md)
|
||||
- Set up [automated workflows](WATCH_FOLDER.md)
|
||||
- Optimize [OCR performance](dev/OCR_OPTIMIZATION_GUIDE.md)
|
||||
### Explore Advanced Features
|
||||
- [🔗 Sources Guide](sources-guide.md) - Set up WebDAV, Local Folder, and S3 synchronization
|
||||
- [🔎 Advanced Search](advanced-search.md) - Master search modes, syntax, and optimization
|
||||
- [🏷️ Labels & Organization](labels-and-organization.md) - Implement effective document organization
|
||||
- [👥 User Management](user-management-guide.md) - Configure authentication and user administration
|
||||
- [🔐 OIDC Setup](oidc-setup.md) - Integrate with enterprise identity providers
|
||||
|
||||
### System Administration
|
||||
- [📦 Installation Guide](installation.md) - Full installation and setup instructions
|
||||
- [🔧 Configuration](configuration.md) - Environment variables and advanced configuration
|
||||
- [🚀 Deployment Guide](deployment.md) - Production deployment with SSL and monitoring
|
||||
- [📁 Watch Folder Guide](WATCH_FOLDER.md) - Legacy folder watching setup
|
||||
|
||||
### Development and Integration
|
||||
- [🔌 API Reference](api-reference.md) - REST API for automation and integration
|
||||
- [🏗️ Developer Documentation](dev/) - Architecture and development setup
|
||||
- [🔍 OCR Optimization](dev/OCR_OPTIMIZATION_GUIDE.md) - Improve OCR performance
|
||||
- [📊 Queue Architecture](dev/QUEUE_IMPROVEMENTS.md) - Background processing optimization
|
||||
|
|
@ -0,0 +1,440 @@
|
|||
# User Management Guide
|
||||
|
||||
This comprehensive guide covers user administration, authentication, role-based access control, and user preferences in Readur.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Authentication Methods](#authentication-methods)
|
||||
- [User Roles and Permissions](#user-roles-and-permissions)
|
||||
- [Admin User Management](#admin-user-management)
|
||||
- [User Settings and Preferences](#user-settings-and-preferences)
|
||||
- [OIDC/SSO Integration](#oidcsso-integration)
|
||||
- [Security Best Practices](#security-best-practices)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
|
||||
## Overview
|
||||
|
||||
Readur provides a comprehensive user management system with support for both local authentication and enterprise SSO integration. The system features:
|
||||
|
||||
- **Dual Authentication**: Local accounts and OIDC/SSO support
|
||||
- **Role-Based Access Control**: Admin and User roles with distinct permissions
|
||||
- **User Preferences**: Extensive per-user configuration options
|
||||
- **Enterprise Integration**: OIDC support for corporate identity providers
|
||||
- **Security Features**: JWT tokens, bcrypt password hashing, and session management
|
||||
|
||||
## Authentication Methods
|
||||
|
||||
### Local Authentication
|
||||
|
||||
Local authentication uses traditional username/password combinations stored securely in Readur's database.
|
||||
|
||||
#### Features:
|
||||
- **Secure Storage**: Passwords hashed with bcrypt (cost factor 12)
|
||||
- **JWT Tokens**: 24-hour token validity with secure signing
|
||||
- **User Registration**: Self-service account creation (if enabled)
|
||||
- **Password Requirements**: Configurable complexity requirements
|
||||
|
||||
#### Creating Local Users:
|
||||
1. **Admin Creation** (via Settings):
|
||||
- Navigate to Settings → Users (Admin only)
|
||||
- Click "Add User"
|
||||
- Enter username, email, and initial password
|
||||
- Assign user role (Admin or User)
|
||||
|
||||
2. **Self Registration** (if enabled):
|
||||
- Visit the registration page
|
||||
- Provide username, email, and password
|
||||
- Account created with default User role
|
||||
|
||||
### OIDC/SSO Authentication
|
||||
|
||||
OIDC (OpenID Connect) authentication integrates with enterprise identity providers for single sign-on.
|
||||
|
||||
#### Supported Features:
|
||||
- **Standard OIDC Flow**: Authorization code flow with PKCE
|
||||
- **Automatic Discovery**: Reads provider configuration from `.well-known/openid-configuration`
|
||||
- **User Provisioning**: Automatic user creation on first login
|
||||
- **Identity Linking**: Maps OIDC identities to local user accounts
|
||||
- **Profile Sync**: Updates user information from OIDC provider
|
||||
|
||||
#### Supported Providers:
|
||||
- **Microsoft Azure AD**: Enterprise identity management
|
||||
- **Google Workspace**: Google's enterprise SSO
|
||||
- **Okta**: Popular enterprise identity provider
|
||||
- **Auth0**: Developer-friendly authentication platform
|
||||
- **Keycloak**: Open-source identity management
|
||||
- **Generic OIDC**: Any standards-compliant OIDC provider
|
||||
|
||||
See the [OIDC Setup Guide](oidc-setup.md) for detailed configuration instructions.
|
||||
|
||||
## User Roles and Permissions
|
||||
|
||||
### User Role
|
||||
|
||||
**Standard Users** have access to core document management functionality:
|
||||
|
||||
**Permissions:**
|
||||
- ✅ Upload and manage own documents
|
||||
- ✅ Search all documents (based on sharing settings)
|
||||
- ✅ Configure personal settings and preferences
|
||||
- ✅ Create and manage personal labels
|
||||
- ✅ Use OCR processing features
|
||||
- ✅ Access personal sources (WebDAV, local folders, S3)
|
||||
- ✅ View personal notifications
|
||||
- ❌ User management (cannot create/modify other users)
|
||||
- ❌ System-wide settings or configuration
|
||||
- ❌ Access to other users' private documents
|
||||
|
||||
### Admin Role
|
||||
|
||||
**Administrators** have full system access and user management capabilities:
|
||||
|
||||
**Additional Permissions:**
|
||||
- ✅ **User Management**: Create, modify, and delete user accounts
|
||||
- ✅ **System Settings**: Configure global system parameters
|
||||
- ✅ **User Impersonation**: Access other users' documents (if needed)
|
||||
- ✅ **System Monitoring**: View system health and performance metrics
|
||||
- ✅ **Advanced Configuration**: OCR settings, source configurations
|
||||
- ✅ **Security Management**: Token management, authentication settings
|
||||
|
||||
**Default Admin Account:**
|
||||
- Username: `admin`
|
||||
- Default Password: `readur2024` ⚠️ **Change immediately in production!**
|
||||
|
||||
## Admin User Management
|
||||
|
||||
### Accessing User Management
|
||||
|
||||
1. Log in as an administrator
|
||||
2. Navigate to **Settings** → **Users**
|
||||
3. The user management interface displays all system users
|
||||
|
||||
### User Management Operations
|
||||
|
||||
#### Creating Users
|
||||
|
||||
1. **Click "Add User"** in the Users section
|
||||
2. **Fill out user information**:
|
||||
```
|
||||
Username: john.doe
|
||||
Email: john.doe@company.com
|
||||
Password: [secure-password]
|
||||
Role: User (or Admin)
|
||||
```
|
||||
3. **Save** to create the account
|
||||
4. **Notify the user** of their credentials
|
||||
|
||||
#### Modifying Users
|
||||
|
||||
1. **Find the user** in the user list
|
||||
2. **Click "Edit"** or the user row
|
||||
3. **Update information**:
|
||||
- Change email address
|
||||
- Reset password
|
||||
- Modify role (User ↔ Admin)
|
||||
- Update username (if needed)
|
||||
4. **Save changes**
|
||||
|
||||
#### Deleting Users
|
||||
|
||||
1. **Select the user** to delete
|
||||
2. **Click "Delete"**
|
||||
3. **Confirm deletion** (this action cannot be undone)
|
||||
|
||||
**Important Notes:**
|
||||
- Users cannot delete their own accounts
|
||||
- Deleting a user removes all their documents and settings
|
||||
- Consider disabling an account instead of deleting it when the user's documents and settings should be kept
|
||||
|
||||
#### Bulk Operations
|
||||
|
||||
**Future Feature**: Bulk user operations for enterprise deployments:
|
||||
- Bulk user import from CSV
|
||||
- Bulk role changes
|
||||
- Bulk user deactivation
|
||||
|
||||
### User Information Display
|
||||
|
||||
The user management interface shows:
|
||||
- **Username and Email**: Primary identification
|
||||
- **Role**: Current role assignment
|
||||
- **Created Date**: Account creation timestamp
|
||||
- **Last Login**: Recent activity indicator
|
||||
- **Auth Provider**: Local or OIDC authentication method
|
||||
- **Status**: Active/disabled status (future feature)
|
||||
|
||||
## User Settings and Preferences
|
||||
|
||||
### Personal Settings Access
|
||||
|
||||
Users can configure their preferences via:
|
||||
1. **User Menu** → **Settings** (top-right corner)
|
||||
2. **Settings Page** → **Personal** tab
|
||||
|
||||
### Settings Categories
|
||||
|
||||
#### OCR Preferences
|
||||
|
||||
**Language Settings:**
|
||||
- **OCR Language**: Primary language for text recognition (25+ languages)
|
||||
- **Fallback Languages**: Secondary languages for mixed documents
|
||||
- **Auto-Detection**: Automatic language detection (if supported)
|
||||
|
||||
**Processing Options:**
|
||||
- **Image Enhancement**: Enable preprocessing for better OCR results
|
||||
- **Auto-Rotation**: Automatically rotate images for optimal text recognition
|
||||
- **Confidence Threshold**: Minimum confidence level for OCR acceptance
|
||||
- **Processing Priority**: User's OCR queue priority level
|
||||
|
||||
#### Search Preferences
|
||||
|
||||
**Display Settings:**
|
||||
- **Results Per Page**: Number of search results to display (10-100)
|
||||
- **Snippet Length**: Length of text previews in search results
|
||||
- **Fuzzy Search Threshold**: Sensitivity for fuzzy/approximate matching
|
||||
- **Search History**: Enable/disable search query history
|
||||
|
||||
**Search Behavior:**
|
||||
- **Default Sort Order**: Relevance, date, filename, size
|
||||
- **Auto-Complete**: Enable search suggestions
|
||||
- **Real-time Search**: Search as you type functionality
|
||||
|
||||
#### File Processing
|
||||
|
||||
**Upload Settings:**
|
||||
- **Default File Types**: Preferred file types for uploads
|
||||
- **Auto-OCR**: Automatically queue uploads for OCR processing
|
||||
- **Duplicate Handling**: How to handle duplicate file uploads
|
||||
- **File Size Limits**: Personal file size restrictions
|
||||
|
||||
**Storage Preferences:**
|
||||
- **Compression**: Enable compression for storage savings
|
||||
- **Retention Period**: How long to keep documents (if configured)
|
||||
- **Archive Behavior**: Automatic archiving of old documents
|
||||
|
||||
#### Interface Preferences
|
||||
|
||||
**Display Options:**
|
||||
- **Theme**: Light/dark mode preference
|
||||
- **Timezone**: Local timezone for timestamp display
|
||||
- **Date Format**: Preferred date/time display format
|
||||
- **Language**: Interface language (separate from OCR language)
|
||||
|
||||
**Navigation:**
|
||||
- **Default View**: List or grid view for document browser
|
||||
- **Sidebar Collapsed**: Default sidebar state
|
||||
- **Items Per Page**: Default pagination size
|
||||
|
||||
#### Notification Settings
|
||||
|
||||
**Notification Types:**
|
||||
- **OCR Completion**: Notify when document processing completes
|
||||
- **Source Sync**: Notifications for source synchronization events
|
||||
- **System Alerts**: Important system messages and warnings
|
||||
- **Storage Warnings**: Alerts for storage space or quota issues
|
||||
|
||||
**Delivery Methods:**
|
||||
- **In-App Notifications**: Browser notifications within Readur
|
||||
- **Email Notifications**: Email delivery for important events (future)
|
||||
- **Desktop Notifications**: Browser push notifications (future)
|
||||
|
||||
### Source-Specific Settings
|
||||
|
||||
**WebDAV Preferences:**
|
||||
- **Connection Timeout**: How long to wait for WebDAV responses
|
||||
- **Retry Attempts**: Number of retries for failed downloads
|
||||
- **Sync Schedule**: Preferred automatic sync frequency
|
||||
|
||||
**Local Folder Settings:**
|
||||
- **Watch Interval**: How often to scan local directories
|
||||
- **File Permissions**: Permission handling for processed files
|
||||
- **Symlink Handling**: Follow symbolic links during scans
|
||||
|
||||
### Saving and Applying Settings
|
||||
|
||||
1. **Modify preferences** in the settings interface
|
||||
2. **Click "Save Settings"** to apply changes
|
||||
3. **Settings take effect immediately** for most options
|
||||
4. **Some settings** may require logout/login to fully apply
|
||||
|
||||
## OIDC/SSO Integration
|
||||
|
||||
### Overview
|
||||
|
||||
OIDC integration allows users to authenticate using their corporate credentials without creating separate passwords for Readur.
|
||||
|
||||
### User Experience with OIDC
|
||||
|
||||
#### First-Time Login
|
||||
|
||||
1. **User clicks "Login with SSO"** on login page
|
||||
2. **Redirected to corporate identity provider** (e.g., Azure AD, Okta)
|
||||
3. **User authenticates** with corporate credentials
|
||||
4. **Readur creates user account automatically** with information from OIDC provider
|
||||
5. **User is logged in** and can immediately start using Readur
|
||||
|
||||
#### Subsequent Logins
|
||||
|
||||
1. **Click "Login with SSO"**
|
||||
2. **Automatic redirect** to identity provider
|
||||
3. **Single sign-on** (re-authentication may be skipped if the identity provider session is still active)
|
||||
4. **Immediate access** to Readur
|
||||
|
||||
### OIDC User Account Details
|
||||
|
||||
**Automatic Account Creation:**
|
||||
- **Username**: Derived from OIDC `preferred_username` or `sub` claim
|
||||
- **Email**: Uses OIDC `email` claim
|
||||
- **Role**: Default "User" role (admins can promote later)
|
||||
- **Auth Provider**: Marked as "OIDC" in user management
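The username rule above can be read as a simple fallback. The sketch below assumes `preferred_username` is tried first and `sub` is used only when it is empty. The order is an assumption, since the guide only says "or", and the name `derive_username` is hypothetical.

```bash
# Pick a username from OIDC claims: prefer preferred_username, else sub.
# Usage: derive_username <preferred_username> <sub>
# NOTE: the preference order is an assumption, not confirmed Readur behavior.
derive_username() (
  preferred=${1:-} sub=${2:-}
  if [ -n "$preferred" ]; then
    printf '%s\n' "$preferred"
  elif [ -n "$sub" ]; then
    printf '%s\n' "$sub"
  else
    echo "no usable claim" >&2
    exit 1
  fi
)
```

Providers that omit `preferred_username` would therefore produce usernames that look like opaque identifiers, which is worth checking before rolling out SSO widely.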
|
||||
|
||||
**Identity Mapping:**
|
||||
- **OIDC Subject**: Unique identifier from identity provider
|
||||
- **OIDC Issuer**: Identity provider URL
|
||||
- **Linked Accounts**: Maps OIDC identity to Readur user
|
||||
|
||||
### Mixed Authentication Environments
|
||||
|
||||
Readur supports both local and OIDC users in the same installation:
|
||||
|
||||
- **Local Admin Accounts**: For initial setup and emergency access
|
||||
- **OIDC User Accounts**: For regular enterprise users
|
||||
- **Role Management**: Admins can promote OIDC users to admin role
|
||||
- **Account Linking**: Future feature to link local and OIDC accounts
|
||||
|
||||
### OIDC Configuration
|
||||
|
||||
See the detailed [OIDC Setup Guide](oidc-setup.md) for complete configuration instructions.
|
||||
|
||||
## Security Best Practices
|
||||
|
||||
### Password Security
|
||||
|
||||
**For Local Accounts:**
|
||||
1. **Use Strong Passwords**: Minimum 12 characters with mixed case, numbers, symbols
|
||||
2. **Regular Rotation**: Change passwords periodically
|
||||
3. **Unique Passwords**: Don't reuse passwords from other systems
|
||||
4. **Admin Passwords**: Use extra-strong passwords for administrator accounts
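The guideline in item 1 can be expressed as a quick local check. This is a sketch of that guideline only, not Readur's configurable password validation; the name `check_password` is hypothetical. Avoid passing real passwords on the command line, since they may end up in shell history.

```bash
# Return 0 if the password meets the guideline above: at least 12
# characters with a lower-case letter, upper-case letter, digit, and symbol.
# NOTE: illustrative helper only, not Readur's validation logic.
check_password() (
  pw=$1
  [ "${#pw}" -ge 12 ] || { echo "too short"; exit 1; }
  case $pw in *[[:lower:]]*) ;; *) echo "needs a lower-case letter"; exit 1 ;; esac
  case $pw in *[[:upper:]]*) ;; *) echo "needs an upper-case letter"; exit 1 ;; esac
  case $pw in *[[:digit:]]*) ;; *) echo "needs a digit"; exit 1 ;; esac
  # Anything that is not a letter or digit counts as a symbol here.
  case $pw in *[![:alnum:]]*) ;; *) echo "needs a symbol"; exit 1 ;; esac
  echo "ok"
)
```

Length matters more than character variety, so treat 12 characters as a floor rather than a target.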
|
||||
|
||||
### JWT Token Security
|
||||
|
||||
**Token Management:**
|
||||
- **Secure Storage**: Tokens stored securely in browser localStorage
|
||||
- **Automatic Expiration**: 24-hour token lifetime
|
||||
- **Secure Transmission**: HTTPS required for production
|
||||
- **Token Rotation**: Regular token refresh (future feature)
|
||||
|
||||
### Access Control
|
||||
|
||||
**Role Management:**
|
||||
1. **Principle of Least Privilege**: Grant minimum necessary permissions
|
||||
2. **Regular Review**: Periodically audit user roles and permissions
|
||||
3. **Admin Accounts**: Limit number of administrator accounts
|
||||
4. **Account Deactivation**: Disable accounts for departed users
|
||||
|
||||
### OIDC Security
|
||||
|
||||
**Provider Configuration:**
|
||||
1. **Use HTTPS**: Ensure all OIDC endpoints use HTTPS
|
||||
2. **Client Secret Protection**: Secure storage of OIDC client secrets
|
||||
3. **Scope Limitation**: Request only necessary OIDC scopes
|
||||
4. **Token Validation**: Proper verification of OIDC tokens
|
||||
|
||||
### Monitoring and Auditing
|
||||
|
||||
**Access Monitoring:**
|
||||
- **Login Tracking**: Monitor successful and failed login attempts
|
||||
- **Role Changes**: Audit administrator role assignments
|
||||
- **Account Activity**: Track user document access patterns
|
||||
- **Security Events**: Log authentication and authorization events
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Authentication Issues
|
||||
|
||||
#### Local Login Problems
|
||||
|
||||
**Symptom**: "Invalid username or password"
|
||||
**Solutions**:
|
||||
1. **Verify credentials**: Check username/password carefully
|
||||
2. **Account existence**: Confirm account exists in user management
|
||||
3. **Password reset**: Admin can reset user password
|
||||
4. **Account status**: Ensure account is active/enabled
|
||||
|
||||
#### OIDC Login Problems
|
||||
|
||||
**Symptom**: OIDC login fails or redirects incorrectly
|
||||
**Solutions**:
|
||||
1. **Check OIDC configuration**: Verify client ID, secret, and issuer URL
|
||||
2. **Redirect URI**: Ensure redirect URI is registered with OIDC provider
|
||||
3. **Provider status**: Confirm OIDC provider is operational
|
||||
4. **Network connectivity**: Verify Readur can reach OIDC endpoints
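Because Readur reads provider settings from the discovery document, fetching that document from the Readur host is a quick way to check steps 1 and 4. The helper below is a sketch: `oidc_discovery_url` is a hypothetical name and `https://idp.example.com` is a placeholder issuer.

```bash
# Build the OpenID Connect discovery URL for an issuer.
# Usage: oidc_discovery_url <issuer-url>
# NOTE: illustrative helper only, not a Readur command.
oidc_discovery_url() (
  issuer=${1%/}   # drop one trailing slash, if present
  printf '%s/.well-known/openid-configuration\n' "$issuer"
)

# Example (needs network access; replace the placeholder issuer):
#   curl -fsS "$(oidc_discovery_url https://idp.example.com)"
```

A healthy provider returns JSON that includes `issuer`, `authorization_endpoint`, `token_endpoint`, and `jwks_uri`. The `issuer` value in the response should match the configured issuer URL exactly.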
|
||||
|
||||
#### JWT Token Issues
|
||||
|
||||
**Symptom**: "Invalid token" or frequent logouts
|
||||
**Solutions**:
|
||||
1. **Check system time**: Ensure server time is accurate, since clock skew can make valid tokens appear expired
|
||||
2. **JWT secret**: Verify JWT_SECRET environment variable
|
||||
3. **Token expiration**: Tokens expire after 24 hours
|
||||
4. **Browser storage**: Clear localStorage and re-login
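To see whether a token has actually expired, its payload can be decoded locally without the signing secret. The sketch below assumes a standard three-part JWT and a `base64` command that supports `-d`. That Readur's tokens carry the standard `exp` claim is an assumption based on the 24-hour lifetime, and the name `jwt_payload` is hypothetical.

```bash
# Print the decoded payload (second segment) of a JWT.
# Usage: jwt_payload <token>
# NOTE: this only decodes; it does not verify the signature.
jwt_payload() (
  # Take the middle segment and convert base64url to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding omits.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
)
```

If the payload contains `exp`, compare it with the output of `date +%s` on the server: an `exp` value in the past means the token has expired, and a large gap between server and client clocks points to the system time problem in step 1. Decode tokens locally instead of pasting them into online tools.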
|
||||
|
||||
### User Management Issues
|
||||
|
||||
#### Cannot Create Users
|
||||
|
||||
**Symptom**: User creation fails
|
||||
**Solutions**:
|
||||
1. **Admin permissions**: Ensure logged in as administrator
|
||||
2. **Duplicate usernames**: Check for existing username/email
|
||||
3. **Database connectivity**: Verify database connection
|
||||
4. **Input validation**: Ensure all required fields are provided
|
||||
|
||||
#### User Settings Not Saving
|
||||
|
||||
**Symptom**: Settings changes don't persist
|
||||
**Solutions**:
|
||||
1. **Check permissions**: Ensure user has permission to modify settings
|
||||
2. **Database issues**: Verify database write permissions
|
||||
3. **Browser issues**: Try clearing browser cache
|
||||
4. **Network connectivity**: Ensure stable connection during save
|
||||
|
||||
### Role and Permission Issues
|
||||
|
||||
#### Users Cannot Access Features
|
||||
|
||||
**Symptom**: User reports missing functionality
|
||||
**Solutions**:
|
||||
1. **Check user role**: Verify user has appropriate role assignment
|
||||
2. **Permission scope**: Confirm feature is available to user role
|
||||
3. **Session refresh**: User may need to logout/login after role change
|
||||
4. **Feature availability**: Ensure feature is enabled in system configuration
|
||||
|
||||
#### Admin Access Problems
|
||||
|
||||
**Symptom**: Admin cannot access management features
|
||||
**Solutions**:
|
||||
1. **Role verification**: Confirm user has Admin role
|
||||
2. **Token validity**: Ensure JWT token contains correct role information
|
||||
3. **Database consistency**: Verify role is correctly stored in database
|
||||
4. **Login refresh**: Try logging out and logging back in
|
||||
|
||||
### Performance Issues
|
||||
|
||||
#### Slow User Operations
|
||||
|
||||
**Symptom**: User management operations are slow
|
||||
**Solutions**:
|
||||
1. **Database performance**: Check database query performance
|
||||
2. **User count**: Large user counts may require pagination
|
||||
3. **Network latency**: OIDC operations may be affected by provider latency
|
||||
4. **System resources**: Monitor CPU and memory usage
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Configure [OIDC integration](oidc-setup.md) for enterprise authentication
|
||||
- Set up [sources](sources-guide.md) for document synchronization
|
||||
- Review [security best practices](deployment.md#security-considerations)
|
||||
- Explore [advanced search](advanced-search.md) capabilities
|
||||
- Configure [labels and organization](labels-and-organization.md) for document management
|
||||