Commit Graph

51 Commits

Author SHA1 Message Date
renovate[bot] ccbf02aee7
fix(deps): update rust crate testcontainers-modules to 0.14 2025-12-10 11:11:02 +00:00
renovate[bot] 8f7857fb09
fix(deps): update rust crate base64ct to v1.8.1 (#377)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-12-10 05:51:55 +00:00
renovate[bot] 46d0eca517
chore(deps): update rust crate rust_xlsxwriter to 0.92 2025-11-20 01:37:52 +00:00
renovate[bot] fcfa0e101b
chore(deps): update rust crate rust_xlsxwriter to 0.91 2025-10-30 03:52:56 +00:00
renovate[bot] e4c4a0fd2e
chore(deps): update rust crate rust_xlsxwriter to 0.90 2025-10-09 16:42:46 +00:00
perf3ct aa5bd77753
feat(webdav): get rid of complex loop detection 2025-09-09 02:11:57 +00:00
perf3ct 88c376f655
feat(webdav): add some stress test utilities 2025-09-09 01:38:36 +00:00
perf3ct d5d6d2edb4
feat(office): xml extraction seems to work now 2025-09-02 01:22:19 +00:00
perf3ct 774efd1140
refactor(server): remove XML vs library comparison functionality
Remove all comparison-related code used to evaluate XML vs library-based
Office document extraction. The XML approach has proven superior, so the
comparison functionality is no longer needed.

Changes:
- Remove extraction_comparator.rs (entire comparison engine)
- Remove test_extraction_comparison.rs binary
- Remove comparison mode logic from enhanced.rs
- Simplify fallback_strategy.rs to use XML extraction only
- Update OCR service to use XML extraction as primary method
- Clean up database migration to remove comparison-specific settings
- Remove test_extraction binary from Cargo.toml
- Update integration tests to work with simplified extraction

The Office document extraction now flows directly to XML-based extraction
without any comparison checks, maintaining the superior extraction quality
while removing unnecessary complexity.
2025-09-02 01:22:19 +00:00
perf3ct 325731aa04
feat(office): create legitimate office files for testing 2025-09-01 22:07:59 +00:00
perf3ct 78af7e7861
feat(office): use actual packages for extraction 2025-09-01 21:21:22 +00:00
perf3ct 546b41b462
feat(office): try to resolve docx/doc not working 2025-09-01 19:58:06 +00:00
perf3ct b7dd64c8f6
feat(webdav): try to do better webdav errors to not slam webdav endpoints 2025-08-20 21:59:14 +00:00
Jon Fuller ed708cf16f
Merge pull request #141 from readur/renovate/infer-0.x
fix(deps): update rust crate infer to 0.19
2025-08-14 15:13:02 -07:00
renovate[bot] 906c627524
fix(deps): update rust crate sysinfo to 0.37 2025-08-14 19:00:53 +00:00
renovate[bot] 88d38aa2d6
fix(deps): update rust crate infer to 0.19 2025-08-14 19:00:32 +00:00
perf3ct 7da99cd992 feat(server): implement websockets over sse 2025-07-30 02:04:44 +00:00
perf3ct d7a0a1f294 feat(server): do a *much* better job at determining file types thanks to infer rust package 2025-07-29 21:28:33 +00:00
perf3ct cfeb6c5c93 feat(tests): wrap the tests so that even if they fail, they still close their db connections 2025-07-28 18:15:08 +00:00
perf3ct c37014f924 feat(tests): work on resolving tests that don't pass given the large rewrite 2025-07-28 04:13:14 +00:00
perf3ct 023d424293 feat(server/client): I have no words, hopefully this lesser abstraction and webdav tracking works now 2025-07-27 19:29:45 +00:00
perf3ct a3f33140ee feat(dev): drop pdf_extract in favor of ocrmypdf 2025-07-15 14:50:17 +00:00
perf3ct 721a348888 feat(dev): update references of readur to newer versions 2025-07-14 17:17:24 +00:00
renovate[bot] 2371d3fade fix(deps): update rust crate sysinfo to 0.36 2025-07-13 17:17:11 +00:00
perf3ct 26c618f984 feat(webdav): resolve failing etag unit tests 2025-07-05 19:47:21 +00:00
perf3ct f686bc7692 feat(tests): move integration and unit tests to correct locations 2025-07-04 19:50:29 +00:00
perf3ct 5b21c87675 feat(tests): move integration and unit tests to correct locations 2025-07-04 19:37:43 +00:00
perfectra1n 582617ab88 fix(server/client): fix incorrect OCR measurements 2025-06-27 20:23:59 -07:00
renovate[bot] 0678332f0f chore(deps): update rust crate wiremock to 0.6 2025-06-27 17:50:05 +00:00
perf3ct e9496b921e feat(server): set up oidc system and migrations 2025-06-26 18:52:57 +00:00
perf3ct a5ca6e33f2 feat(server): decrease logging verbosity for ingestion 2025-06-25 21:41:46 +00:00
renovate[bot] 7c8d8e95d4 fix(deps): update rust crate base64ct to v1.8.0 2025-06-24 22:53:35 +00:00
perf3ct 555bd9a746 feat(ci): don't use debug or incremental in ci 2025-06-24 21:56:40 +00:00
perf3ct a0e75d4619 feat(server/client): implement feature of ignoring already deleted files, and add failed OCR queue tests 2025-06-24 17:20:33 +00:00
perf3ct 14af90c657 feat(tests): fix the vast majority of both server and client tests 2025-06-17 22:06:12 +00:00
perf3ct c0d3b52865 feat(server): also update versions for deps 2025-06-16 01:28:32 +00:00
perf3ct e5aaf31fdd feat(server/client): working s3 and local source types 2025-06-15 17:51:04 +00:00
perf3ct 7feec817d0 feat(server): fix recursively scanning the uploads folder, and the quick search bar 2025-06-15 04:37:49 +00:00
perf3ct cfc6c85261 feat(server): upgrade all versions and resolve breaking changes 2025-06-15 02:23:35 +00:00
perf3ct d21e51436b feat(server): rewrite nearly everything to be async/follow best practices 2025-06-15 02:06:17 +00:00
perf3ct 9fa45f8891 feat(server): implement better ocr failure and guardrails 2025-06-14 22:13:04 +00:00
perf3ct aa45cd06e0 feat(server): webdav integration nearly done 2025-06-14 16:21:28 +00:00
perf3ct e3f1855711 feat(client/server): add nextcloud/webdav capability, add integration tests 2025-06-13 17:09:05 +00:00
perf3ct e1e949cf65 feat(migrations): try to fix the migrations service 2025-06-13 14:27:31 +00:00
perfectra1n 1a1f886f04 feat(client/server): update search tests, and upgrade OCR 2025-06-12 22:00:14 -07:00
perfectra1n 0abc8f272a feat(client): update the search functionality 2025-06-12 21:31:46 -07:00
perf3ct 52d006d403 feat(unit): fixed the unit tests 2025-06-13 01:32:47 +00:00
perf3ct 90599eed74 feat(server): implement queue system 2025-06-12 20:34:51 +00:00
perfectra1n aa8af7e018 feat(server/client): also update the tests and settings pages 2025-06-11 21:58:25 -07:00
perf3ct 488003c426 fix(everything): wow, it runs 2025-06-12 00:05:43 +00:00