perf3ct
089d6c2853
fix(tests): resolve issues in unit tests due to dep changes
2025-07-17 16:09:10 +00:00
perf3ct
03bb4bf87d
fix(client): resolve issues with showing user settings on debug page
2025-07-15 16:20:30 +00:00
perf3ct
ccc3bc2ce4
feat(ocr): use ocrmypdf and pdftotext to get OCR layer if it already exists
2025-07-15 15:59:29 +00:00
perf3ct
901942ae74
fix(client): also resolve missing thumbnails
2025-07-15 15:57:14 +00:00
perf3ct
a3f33140ee
feat(dev): drop pdf_extract in favor of ocrmypdf
2025-07-15 14:50:17 +00:00
perf3ct
862eb3217a
fix(tests): resolve issues in integration tests for the new multiple ocr languages
2025-07-14 21:28:55 +00:00
perf3ct
7317fd5ebb
Merge branch 'feat/multiple-ocr-languages' of https://github.com/readur/readur into feat/multiple-ocr-languages
2025-07-14 19:33:51 +00:00
perf3ct
849c9f91c7
feat(lang): update backend to support multiple languages at the same time during OCR
2025-07-14 19:33:43 +00:00
Jon Fuller
f0e39d155e
Merge branch 'main' into feat/multiple-ocr-languages
2025-07-14 11:29:46 -07:00
perf3ct
721a348888
feat(dev): update references of readur to newer versions
2025-07-14 17:17:24 +00:00
perf3ct
6165148e4d
feat(ocr): gracefully handle problematic PDFs in all the ways, create tests so that it doesn't happen again
2025-07-14 16:36:32 +00:00
perf3ct
469ca29f5c
fix(tests): resolve some broken e2e tests?
2025-07-14 01:39:30 +00:00
perf3ct
1bf4a66754
fix(tests): resolve issue in compilation of tests due to multiple ocr languages
2025-07-13 17:26:06 +00:00
perf3ct
e6fd8424d2
fix(dev): merge main into feature
2025-07-13 17:15:59 +00:00
perf3ct
ac49d8c2c9
fix(tests): fix urls used in test
2025-07-12 23:27:27 +00:00
perf3ct
e736d485ee
fix(webdav): fix incorrect concatenation logic when building URLs
2025-07-12 23:26:19 +00:00
perf3ct
3fbf941364
fix(webdav): make sure to normalize URL for double slashes, trailing slashes, etc.
2025-07-12 23:15:14 +00:00
perfectra1n
4a039eac82
feat(client): add new frontend page for admins to view running config settings
2025-07-12 14:06:09 -07:00
perfectra1n
9e143649d4
fix(upload): resolve issue with Axum not having config values set
2025-07-12 14:04:54 -07:00
perf3ct
e16fd1d420
feat(server): rename queue-failed endpoint
2025-07-11 21:28:23 +00:00
perf3ct
b31e1a672d
feat(server): gracefully manage requeue requests for the same document
2025-07-11 21:27:12 +00:00
perf3ct
0b9b935334
fix(server): don't log usernames to log file, I guess
2025-07-11 20:05:56 +00:00
perf3ct
979078b5ac
fix(tests): also resolve document struct in this unit test
2025-07-11 17:50:49 +00:00
perf3ct
8c90c5c3c3
feat(tests): do tests pass now?
2025-07-11 00:39:12 +00:00
perf3ct
fb831e9624
feat(server): implement unit tests for source metadata extraction
2025-07-10 22:02:41 +00:00
perf3ct
305c6f1fb1
feat(server): show source metadata EVEN better
2025-07-10 21:51:30 +00:00
perf3ct
ea43f79a90
feat(server): show source metadata better, and implement tests
2025-07-10 21:40:16 +00:00
perf3ct
b7f1522b4a
feat(server): updating the watcher.rs file to preserve source metadata
2025-07-10 21:18:08 +00:00
perf3ct
4521cc5ac6
feat(client): show new file metadata fields on the client
2025-07-10 21:07:35 +00:00
perf3ct
0465777890
feat(client): show more fields for Documents
2025-07-10 21:02:15 +00:00
perf3ct
96c47af2c0
feat(server/client): make sure that the documents endpoint isn't broken
2025-07-10 19:57:25 +00:00
perf3ct
438e79c441
fix(tests): no way, all the integration tests pass now
2025-07-10 01:38:55 +00:00
perf3ct
29800bdd1f
fix(tests): resolve integration test response format
2025-07-09 20:10:36 +00:00
perf3ct
17f486a8b7
fix(server/client): rename document_id to id in DocumentUploadResponse, again
2025-07-09 01:40:50 +00:00
perf3ct
f2a050458b
fix(stats): create new get_queue_statistics function to avoid conflicts
2025-07-09 00:27:43 +00:00
perf3ct
f0dc0669bd
debug(tests): add some debug lines to see why CI is upset
2025-07-08 22:32:32 +00:00
perf3ct
a6f2b6df09
fix(stats): try to fix the stats extraction, again
2025-07-08 21:18:21 +00:00
perf3ct
e628b0d4d5
fix(server): resolve incorrect document failure titles
2025-07-08 20:24:52 +00:00
perf3ct
12bdeab503
fix(stats): try to fix stats export, again again again
2025-07-08 20:21:46 +00:00
perf3ct
2a59651fb9
fix(stats): try to fix stats export, again again
2025-07-08 20:16:33 +00:00
perf3ct
a7e9f75eab
fix(stats): try to fix stats export, again
2025-07-08 20:03:55 +00:00
perf3ct
03555ed756
fix(tests): fix the crazy metrics collection issue
2025-07-08 16:52:23 +00:00
perf3ct
58b8a71404
fix(tests): and resolve missing endpoint
2025-07-08 04:37:33 +00:00
perf3ct
faf3299119
debug(tests): add some debug lines to see why CI is upset
2025-07-08 00:03:30 +00:00
perf3ct
a4b9626616
fix(web_upload): resolve issue that caused files that were uploaded via the web, to not be added to the queue
2025-07-07 19:28:08 +00:00
perf3ct
b356017484
feat(server): implement better error checking for sources
2025-07-07 19:10:45 +00:00
perf3ct
c6089dd1b2
feat(webdav): resolve failing etag unit tests
2025-07-05 21:16:15 +00:00
perf3ct
26c618f984
feat(webdav): resolve failing etag unit tests
2025-07-05 19:47:21 +00:00
perf3ct
5b21c87675
feat(tests): move integration and unit tests to correct locations
2025-07-04 19:37:43 +00:00
perf3ct
fbd7d561c3
fix(tests): binary and tests at least compile now
2025-07-04 19:07:53 +00:00
Jon Fuller
ac6b4a522f
Merge pull request #96 from readur/feat/deduplicate-test-utils-1
...
feat(tests): deduplicate test functionalities
2025-07-04 09:12:40 -07:00
perf3ct
545046509f
fix(server): fix axum groups
2025-07-04 03:07:28 +00:00
perf3ct
922478d995
fix(server): resolve compilation errors due to splitting up the large files
2025-07-04 03:06:29 +00:00
perf3ct
497b34ce0a
fix(server): resolve type issues and functions for compilation issues
2025-07-04 00:53:32 +00:00
perf3ct
51fb3a7e48
fix(tests): resolve broken test utils
2025-07-04 00:31:53 +00:00
perf3ct
0e84993afa
fix(server): resolve import issues
2025-07-03 23:58:11 +00:00
perf3ct
f862df9a90
feat(dev): break up the large sources.rs file into smaller ones
2025-07-03 23:44:49 +00:00
perf3ct
3a7c8e8bda
feat(dev): break up the large documents.rs file, again
2025-07-03 23:33:53 +00:00
perf3ct
b9e0e5b905
feat(dev): also break up the large webdav_service.rs file into smaller ones
2025-07-03 19:57:31 +00:00
perf3ct
ed942d02c7
feat(dev): break up the large documents.rs file
2025-07-03 19:47:31 +00:00
perf3ct
86bcd613e4
feat(dev): split up large models.rs file to smaller ones
2025-07-03 19:35:36 +00:00
perf3ct
44aaaca5c5
feat(ocr): add even more about the multiple ocr languages
2025-07-03 19:20:19 +00:00
perf3ct
a3f49f9bd7
feat(tests): try to deduplicate test code even more
2025-07-03 19:17:33 +00:00
perf3ct
7993786e18
feat(tests): deduplicate tests too
2025-07-03 17:21:39 +00:00
perf3ct
7074a8d868
feat(webdav): add validation statuses to sources
2025-07-03 14:03:26 +00:00
perf3ct
459b8622bb
feat(webdav): also add some crazy source automatic validation
2025-07-03 05:26:36 +00:00
perf3ct
15b1f40cc1
feat(webdav): make sure to have scanned all subdirectories
2025-07-03 05:02:17 +00:00
perf3ct
69c40c10fa
feat(webdav): gracefully recover webdav from stops/crashes
2025-07-03 04:45:25 +00:00
perf3ct
2297eb8261
feat(webdav): also set up deep scanning button and fix unit tests
2025-07-03 04:24:26 +00:00
perf3ct
b8dd23655d
feat(webdav): directory etag smart checking and all that
2025-07-03 00:26:56 +00:00
perf3ct
be29316ff4
fix(tests): resolve compilation error in tests and source scheduler
2025-07-02 23:49:46 +00:00
perf3ct
d26b9e386b
fix(webdav): resolve issue with webdav subdirectories not being discovered
2025-07-02 23:37:39 +00:00
perf3ct
f7414af15c
fix(tests): resolve silly new ocr retry tests
2025-07-02 22:51:09 +00:00
perf3ct
6d40feadb3
fix(server): resolve issues with the retry ocr tests
2025-07-02 22:47:51 +00:00
perf3ct
ab03b8d73d
fix(server): resolve ocr test functionality failing due to db trigger
2025-07-02 22:38:13 +00:00
perf3ct
dd4bd03af6
feat(tests): fix ocr_retry issues in tests
2025-07-02 21:48:01 +00:00
perf3ct
ffad8c4561
feat(tests): fix ocr_retry issues in tests
2025-07-02 21:30:36 +00:00
perf3ct
3c4e06fa77
feat(tests): fix ocr_retry issues in tests
2025-07-02 18:48:26 +00:00
perf3ct
8cea916abf
feat(server): allow also completed documents to be retried
2025-07-02 18:15:41 +00:00
perf3ct
6bdd6f4a56
feat(server): implement DEBUG environment variable
2025-07-02 17:57:57 +00:00
perf3ct
68aa492a96
fix(server): resolve NUMERIC db type and f64 rust type
2025-07-02 02:26:11 +00:00
perf3ct
d4b57d2ae0
feat(server/client): implement retry functionality for both successful and failed documents
2025-07-02 00:06:47 +00:00
perf3ct
a381cdd12c
feat(webdav): also fix the parser to include directories, and add tests
2025-07-01 22:03:06 +00:00
perf3ct
c1dbd06df2
feat(tests): add unit tests for new webdav functionality
2025-07-01 21:39:31 +00:00
perf3ct
92b21350db
feat(webdav): track directory etags
...
✅ Core Optimizations Implemented
1. 📊 New Database Schema: Added webdav_directories table to track
directory ETags, file counts, and metadata
2. 🔍 Smart Directory Checking: Before deep scans, check directory
ETags with lightweight Depth: 0 PROPFIND requests
3. ⚡ Skip Unchanged Directories: If directory ETag matches, skip the
entire deep scan
4. 🗂️ N-Depth Subdirectory Tracking: Recursively track all
subdirectories found during scans
5. 🎯 Individual Subdirectory Checks: When parent unchanged, check
each known subdirectory individually
🚀 Performance Benefits
Before: Every sync = Full Depth: infinity scan of entire directory
tree
After:
- First sync: Full scan + directory tracking setup
- Subsequent syncs: Quick ETag checks → skip unchanged directories
entirely
- Changed directories: Only scan the specific changed subdirectories
📁 How It Works
1. Initial Request: PROPFIND Depth: 0 on /Documents → get directory
ETag
2. Database Check: Compare with stored ETag for /Documents
3. If Unchanged: Check each known subdirectory (/Documents/2024,
/Documents/Archive) individually
4. If Changed: Full recursive scan + update all directory tracking
data
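The decision flow in the steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `ScanPlan`, `plan_scan`, and the `fetch_etag` callback (standing in for a `Depth: 0` PROPFIND) are hypothetical names, and the `HashMap` stands in for the `webdav_directories` table.

```rust
use std::collections::HashMap;

/// What a sync should do for one tracked root directory (hypothetical type).
#[derive(Debug, PartialEq)]
enum ScanPlan {
    /// Root ETag changed, or the root was never tracked: full recursive scan.
    FullScan(String),
    /// Root ETag unchanged: scan only the known subdirectories whose ETag differs.
    Targeted(Vec<String>),
}

/// `stored` maps directory path -> last seen ETag (stand-in for the database).
/// `fetch_etag` is an assumed callback that performs the lightweight
/// Depth: 0 lookup and returns the directory's current ETag, if any.
fn plan_scan(
    root: &str,
    stored: &HashMap<String, String>,
    fetch_etag: impl Fn(&str) -> Option<String>,
) -> ScanPlan {
    let current = fetch_etag(root);
    match (stored.get(root), current) {
        // Steps 1-3: root ETag matches the stored one, so skip the deep scan
        // and check each known subdirectory individually.
        (Some(old), Some(new)) if *old == new => {
            let prefix = format!("{}/", root.trim_end_matches('/'));
            let mut changed: Vec<String> = stored
                .iter()
                .filter(|(path, _)| path.starts_with(&prefix))
                // A missing or different ETag counts as "changed".
                .filter(|(path, etag)| fetch_etag(path.as_str()).as_ref() != Some(*etag))
                .map(|(path, _)| path.clone())
                .collect();
            changed.sort(); // deterministic order for callers and tests
            ScanPlan::Targeted(changed)
        }
        // Step 4: ETag changed, root unknown, or lookup failed.
        _ => ScanPlan::FullScan(root.to_string()),
    }
}
```

With `/Documents`, `/Documents/2024`, and `/Documents/Archive` stored, a sync where only `/Documents/2024` reports a new ETag would yield a targeted plan for that one path, while a new ETag on `/Documents` itself falls back to a full scan.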
2025-07-01 21:22:16 +00:00
perf3ct
6a23a407bf
feat(client): update swagger ui endpoints
2025-07-01 20:54:45 +00:00
Jon Fuller
2e1a05fc8d
Merge branch 'main' into feat/multiple-ocr-languages
2025-07-01 11:53:42 -07:00
perf3ct
df281f3b26
feat(pdf): implement ocrmypdf to extract text from PDFs
2025-07-01 00:56:48 +00:00
Jon Fuller
706e20f35c
Merge branch 'main' into feat/debug-page
2025-06-30 17:19:31 -07:00
perf3ct
231f88f038
feat(debug): debug page actually works and does something
2025-07-01 00:15:48 +00:00
perf3ct
0052032772
fix(pdf): resolve PDF wordcount error
2025-07-01 00:10:49 +00:00
perf3ct
830f9d0b38
feat(server): mark documents with 0 words as failed, and fix webdav unit tests
2025-06-30 22:43:25 +00:00
perf3ct
69279344cb
fix(tests): fix documents tests
2025-06-30 21:56:21 +00:00
perf3ct
b38c1fca07
feat(server): fix serialization issues
2025-06-30 19:40:05 +00:00
perf3ct
9e43df2fbe
feat(server/client): add metadata to file view
2025-06-30 19:13:16 +00:00
perf3ct
fef28a33c6
feat(server): continue to try to wrangle the failed and ignored documents
2025-06-29 23:27:51 +00:00
perf3ct
87cfab9ff8
fix(tests): resolve compilation error in the multiple OCR functionality
2025-06-29 23:21:42 +00:00
perf3ct
197afc19f4
feat(tests): implement and update tests for multiple OCR languages
2025-06-29 23:03:37 +00:00
perf3ct
6b6890d529
feat(server/client): support multiple OCR languages
2025-06-29 22:51:06 +00:00
perf3ct
fbf89c213d
fix(tests): resolve a whole lot of test issues
2025-06-28 22:50:40 +00:00