perf3ct | 67ae68745c | fix(dev): remove unneeded docs | 2025-08-13 20:51:13 +00:00 |
perf3ct | 862c36aa72 | feat(storage): further support the s3 storage backend | 2025-08-01 17:57:09 +00:00 |
perf3ct | abd55ef419 | feat(storage): abstract storage to also support s3, along with local filesystem still | 2025-08-01 04:33:08 +00:00 |
perf3ct | 65f42c2cd7 | fix(ocr): use proper failure reasons to avoid constraint violations in failed_documents table | 2025-07-21 20:43:37 +00:00 |
perf3ct | 45ec99a031 | feat(ocr): get rid of managing TESSDATA_PREFIX | 2025-07-20 02:23:06 +00:00 |
perf3ct | ccc3bc2ce4 | feat(ocr): use ocrmypdf and pdftotext to get OCR layer if it already exists | 2025-07-15 15:59:29 +00:00 |
perf3ct | a3f33140ee | feat(dev): drop pdf_extract in favor of ocrmypdf | 2025-07-15 14:50:17 +00:00 |
perf3ct | 862eb3217a | fix(tests): resolve issues in integration tests for the new multiple ocr languages | 2025-07-14 21:28:55 +00:00 |
perf3ct | 7317fd5ebb | Merge branch 'feat/multiple-ocr-languages' of https://github.com/readur/readur into feat/multiple-ocr-languages | 2025-07-14 19:33:51 +00:00 |
perf3ct | 849c9f91c7 | feat(lang): update backend to support multiple languages at the same time during OCR | 2025-07-14 19:33:43 +00:00 |
Jon Fuller | f0e39d155e | Merge branch 'main' into feat/multiple-ocr-languages | 2025-07-14 11:29:46 -07:00 |
perf3ct | 6165148e4d | feat(ocr): gracefully handle problematic PDFs in all the ways, create tests so that it doesn't happen again | 2025-07-14 16:36:32 +00:00 |
perf3ct | e6fd8424d2 | fix(dev): merge main into feature | 2025-07-13 17:15:59 +00:00 |
perf3ct | b31e1a672d | feat(server): gracefully manage requeue requests for the same document | 2025-07-11 21:27:12 +00:00 |
perf3ct | f2a050458b | fix(stats): create new get_queue_statistics function to avoid conflicts | 2025-07-09 00:27:43 +00:00 |
perf3ct | a6f2b6df09 | fix(stats): try to fix the stats extraction, again | 2025-07-08 21:18:21 +00:00 |
perf3ct | e628b0d4d5 | fix(server): resolve incorrect document failure titles | 2025-07-08 20:24:52 +00:00 |
perf3ct | a7e9f75eab | fix(stats): try to fix stats export, again | 2025-07-08 20:03:55 +00:00 |
perf3ct | 03555ed756 | fix(tests): fix the crazy metrics collection issue | 2025-07-08 16:52:23 +00:00 |
perf3ct | 58b8a71404 | fix(tests): and resolve missing endpoint | 2025-07-08 04:37:33 +00:00 |
perf3ct | a4b9626616 | fix(web_upload): resolve issue that caused files that were uploaded via the web, to not be added to the queue | 2025-07-07 19:28:08 +00:00 |
perf3ct | 497b34ce0a | fix(server): resolve type issues and functions for compilation issues | 2025-07-04 00:53:32 +00:00 |
perf3ct | 44aaaca5c5 | feat(ocr): add even more about the multiple ocr languages | 2025-07-03 19:20:19 +00:00 |
perf3ct | 6bdd6f4a56 | feat(server): implement DEBUG environment variable | 2025-07-02 17:57:57 +00:00 |
Jon Fuller | 2e1a05fc8d | Merge branch 'main' into feat/multiple-ocr-languages | 2025-07-01 11:53:42 -07:00 |
perf3ct | df281f3b26 | feat(pdf): implement ocrmypdf to extract text from PDFs | 2025-07-01 00:56:48 +00:00 |
perf3ct | 0052032772 | fix(pdf): resolve PDF wordcount error | 2025-07-01 00:10:49 +00:00 |
perf3ct | 830f9d0b38 | feat(server): mark documents with 0 words as failed, and fix webdav unit tests | 2025-06-30 22:43:25 +00:00 |
perf3ct | fef28a33c6 | feat(server): continue to try to wrangle the failed and ignored documents | 2025-06-29 23:27:51 +00:00 |
perf3ct | 87cfab9ff8 | fix(tests): resolve compilation error in the multiple OCR functionality | 2025-06-29 23:21:42 +00:00 |
perf3ct | 197afc19f4 | feat(tests): implement and update tests for multiple OCR languages | 2025-06-29 23:03:37 +00:00 |
perf3ct | 6b6890d529 | feat(server/client): support multiple OCR languages | 2025-06-29 22:51:06 +00:00 |
perf3ct | 84577806ef | feat(server/client): add failed_documents table to handle failures, and move logic of failures | 2025-06-28 20:52:58 +00:00 |
perfectra1n | 582617ab88 | fix(server/client): fix incorrect OCR measurements | 2025-06-27 20:23:59 -07:00 |
perf3ct | 9a8bf72ff7 | feat(server): reorganize components into their own modules and fix imports | 2025-06-27 18:27:42 +00:00 |