Commit Graph

202 Commits

Author SHA1 Message Date
perf3ct 12bdeab503 fix(stats): try to fix stats export, again again again 2025-07-08 20:21:46 +00:00
perf3ct 2a59651fb9 fix(stats): try to fix stats export, again again 2025-07-08 20:16:33 +00:00
perf3ct a7e9f75eab fix(stats): try to fix stats export, again 2025-07-08 20:03:55 +00:00
perf3ct 03555ed756 fix(tests): fix the crazy metrics collection issue 2025-07-08 16:52:23 +00:00
perf3ct 58b8a71404 fix(tests): and resolve missing endpoint 2025-07-08 04:37:33 +00:00
perf3ct faf3299119 debug(tests): add some debug lines to see why CI is upset 2025-07-08 00:03:30 +00:00
perf3ct a4b9626616 fix(web_upload): resolve issue that caused files that were uploaded via the web, to not be added to the queue 2025-07-07 19:28:08 +00:00
perf3ct b356017484 feat(server): implement better error checking for sources 2025-07-07 19:10:45 +00:00
perf3ct c6089dd1b2 feat(webdav): resolve failing etag unit tests 2025-07-05 21:16:15 +00:00
perf3ct 26c618f984 feat(webdav): resolve failing etag unit tests 2025-07-05 19:47:21 +00:00
perf3ct 5b21c87675 feat(tests): move integration and unit tests to correct locations 2025-07-04 19:37:43 +00:00
perf3ct fbd7d561c3 fix(tests): binary and tests at least compile now 2025-07-04 19:07:53 +00:00
Jon Fuller ac6b4a522f Merge pull request #96 from readur/feat/deduplicate-test-utils-1
feat(tests): deduplicate test functionalities
2025-07-04 09:12:40 -07:00
perf3ct 545046509f fix(server): fix axum groups 2025-07-04 03:07:28 +00:00
perf3ct 922478d995 fix(server): resolve compilation errors due to splitting up the large files 2025-07-04 03:06:29 +00:00
perf3ct 497b34ce0a fix(server): resolve type issues and functions for compilation issues 2025-07-04 00:53:32 +00:00
perf3ct 51fb3a7e48 fix(tests): resolve broken test utils 2025-07-04 00:31:53 +00:00
perf3ct 0e84993afa fix(server): resolve import issues 2025-07-03 23:58:11 +00:00
perf3ct f862df9a90 feat(dev): break up the large sources.rs file into smaller ones 2025-07-03 23:44:49 +00:00
perf3ct 3a7c8e8bda feat(dev): break up the large documents.rs file, again 2025-07-03 23:33:53 +00:00
perf3ct b9e0e5b905 feat(dev): also break up the large webdav_service.rs file into smaller ones 2025-07-03 19:57:31 +00:00
perf3ct ed942d02c7 feat(dev): break up the large documents.rs file 2025-07-03 19:47:31 +00:00
perf3ct 86bcd613e4 feat(dev): split up large models.rs file to smaller ones 2025-07-03 19:35:36 +00:00
perf3ct a3f49f9bd7 feat(tests): try to deduplicate test code even more 2025-07-03 19:17:33 +00:00
perf3ct 7993786e18 feat(tests): deduplicate tests too 2025-07-03 17:21:39 +00:00
perf3ct 7074a8d868 feat(webdav): add validation statuses to sources 2025-07-03 14:03:26 +00:00
perf3ct 459b8622bb feat(webdav): also add some crazy source automatic validation 2025-07-03 05:26:36 +00:00
perf3ct 15b1f40cc1 feat(webdav): make sure to have scanned all subdirectories 2025-07-03 05:02:17 +00:00
perf3ct 69c40c10fa feat(webdav): gracefully recover webdav from stops/crashes 2025-07-03 04:45:25 +00:00
perf3ct 2297eb8261 feat(webdav): also set up deep scanning button and fix unit tests 2025-07-03 04:24:26 +00:00
perf3ct b8dd23655d feat(webdav): directory etag smart checking and all that 2025-07-03 00:26:56 +00:00
perf3ct be29316ff4 fix(tests): resolve compilation error in tests and source scheduler 2025-07-02 23:49:46 +00:00
perf3ct d26b9e386b fix(webdav): resolve issue with webdav subdirectories not being discovered 2025-07-02 23:37:39 +00:00
perf3ct f7414af15c fix(tests): resolve silly new ocr retry tests 2025-07-02 22:51:09 +00:00
perf3ct 6d40feadb3 fix(server): resolve issues with the retry ocr tests 2025-07-02 22:47:51 +00:00
perf3ct ab03b8d73d fix(server): resolve ocr test functionality failing due to db trigger 2025-07-02 22:38:13 +00:00
perf3ct dd4bd03af6 feat(tests): fix ocr_retry issues in tests 2025-07-02 21:48:01 +00:00
perf3ct ffad8c4561 feat(tests): fix ocr_retry issues in tests 2025-07-02 21:30:36 +00:00
perf3ct 3c4e06fa77 feat(tests): fix ocr_retry issues in tests 2025-07-02 18:48:26 +00:00
perf3ct 8cea916abf feat(server): allow also completed documents to be retried 2025-07-02 18:15:41 +00:00
perf3ct 6bdd6f4a56 feat(server): implement DEBUG environment variable 2025-07-02 17:57:57 +00:00
perf3ct 68aa492a96 fix(server): resolve NUMERIC db type and f64 rust type 2025-07-02 02:26:11 +00:00
perf3ct d4b57d2ae0 feat(server/client): implement retry functionality for both successful and failed documents 2025-07-02 00:06:47 +00:00
perf3ct a381cdd12c feat(webdav): also fix the parser to include directories, and add tests 2025-07-01 22:03:06 +00:00
perf3ct c1dbd06df2 feat(tests): add unit tests for new webdav functionality 2025-07-01 21:39:31 +00:00
perf3ct 92b21350db feat(webdav): track directory etags
✅ Core Optimizations Implemented

  1. 📊 New Database Schema: Added webdav_directories table to track
directory ETags, file counts, and metadata
  2. 🔍 Smart Directory Checking: Before deep scans, check directory
ETags with lightweight Depth: 0 PROPFIND requests
  3. ⚡ Skip Unchanged Directories: If directory ETag matches, skip the
entire deep scan
  4. 🗂️ N-Depth Subdirectory Tracking: Recursively track all
subdirectories found during scans
  5. 🎯 Individual Subdirectory Checks: When parent unchanged, check
each known subdirectory individually

  🚀 Performance Benefits

  Before: Every sync = Full Depth: infinity scan of entire directory tree
  After:
  - First sync: Full scan + directory tracking setup
  - Subsequent syncs: Quick ETag checks → skip unchanged directories
entirely
  - Changed directories: Only scan the specific changed subdirectories

  📁 How It Works

  1. Initial Request: PROPFIND Depth: 0 on /Documents → get directory
ETag
  2. Database Check: Compare with stored ETag for /Documents
  3. If Unchanged: Check each known subdirectory (/Documents/2024,
/Documents/Archive) individually
  4. If Changed: Full recursive scan + update all directory tracking
data
2025-07-01 21:22:16 +00:00
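The "How It Works" steps in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `plan_sync`, `ScanAction`, `StoredEtags`, and the `fetch_etag` closure are hypothetical names. The closure stands in for a `PROPFIND Depth: 0` request, and the map stands in for the `webdav_directories` table.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the webdav_directories table: path -> stored ETag.
type StoredEtags = HashMap<String, String>;

/// What the sync should do for one directory (illustrative type, not from the commit).
#[derive(Debug, PartialEq)]
enum ScanAction {
    /// ETag differs or the directory is unknown: run the full recursive scan.
    FullScan(String),
    /// ETag matches the stored value: skip the deep scan.
    Skip(String),
}

/// True only when both a stored and a current ETag exist and they are equal.
fn etag_matches(stored: &StoredEtags, path: &str, current: Option<String>) -> bool {
    match (stored.get(path), current) {
        (Some(s), Some(c)) => *s == c,
        _ => false,
    }
}

/// Decide which directories need a deep scan.
/// `fetch_etag` is assumed to wrap a lightweight Depth: 0 PROPFIND request.
fn plan_sync<F>(
    root: &str,
    stored: &StoredEtags,
    known_subdirs: &[&str],
    fetch_etag: F,
) -> Vec<ScanAction>
where
    F: Fn(&str) -> Option<String>,
{
    // Steps 1-2: fetch the root ETag and compare it with the stored one.
    if !etag_matches(stored, root, fetch_etag(root)) {
        // Step 4: root changed (or never seen), so rescan the whole tree.
        return vec![ScanAction::FullScan(root.to_string())];
    }
    // Step 3: root unchanged, so check each known subdirectory individually.
    known_subdirs
        .iter()
        .map(|dir| {
            if etag_matches(stored, dir, fetch_etag(dir)) {
                ScanAction::Skip(dir.to_string())
            } else {
                ScanAction::FullScan(dir.to_string())
            }
        })
        .collect()
}
```

In this sketch, an unchanged `/Documents` with a changed `/Documents/2024` would yield a full scan for that one subdirectory and a skip for `/Documents/Archive`. A missing ETag on either side is treated as "changed", which errs toward rescanning.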
perf3ct 6a23a407bf feat(client): update swagger ui endpoints 2025-07-01 20:54:45 +00:00
perf3ct df281f3b26 feat(pdf): implement ocrmypdf to extract text from PDFs 2025-07-01 00:56:48 +00:00
Jon Fuller 706e20f35c Merge branch 'main' into feat/debug-page 2025-06-30 17:19:31 -07:00
perf3ct 231f88f038 feat(debug): debug page actually works and does something 2025-07-01 00:15:48 +00:00