perf3ct | 564c564613 | feat(ocr): use ocrmypdf and pdftotext to get OCR layer if it already exists | 2025-07-15 15:59:29 +00:00
perf3ct | 549c2f8a16 | feat(dev): drop pdf_extract in favor of ocrmypdf | 2025-07-15 14:50:17 +00:00
perf3ct | a393bd030f | fix(tests): resolve issues in integration tests for the new multiple ocr languages | 2025-07-14 21:28:55 +00:00
perf3ct | fd152acb91 | Merge branch 'feat/multiple-ocr-languages' of https://github.com/readur/readur into feat/multiple-ocr-languages | 2025-07-14 19:33:51 +00:00
perf3ct | 9d9488954c | feat(lang): update backend to support multiple languages at the same time during OCR | 2025-07-14 19:33:43 +00:00
Jon Fuller | 8edc4759f1 | Merge branch 'main' into feat/multiple-ocr-languages | 2025-07-14 11:29:46 -07:00
perf3ct | 9c051b6f55 | feat(ocr): gracefully handle problematic PDFs in all the ways, create tests so that it doesn't happen again | 2025-07-14 16:36:32 +00:00
perf3ct | 5671038bd1 | fix(dev): merge main into feature | 2025-07-13 17:15:59 +00:00
perf3ct | f18696d4f8 | feat(server): gracefully manage requeue requests for the same document | 2025-07-11 21:27:12 +00:00
perf3ct | ea94dff8ba | fix(stats): create new get_queue_statistics function to avoid conflicts | 2025-07-09 00:27:43 +00:00
perf3ct | 36b5330622 | fix(stats): try to fix the stats extraction, again | 2025-07-08 21:18:21 +00:00
perf3ct | 8b1cf027a3 | fix(server): resolve incorrect document failure titles | 2025-07-08 20:24:52 +00:00
perf3ct | d51d1f1c78 | fix(stats): try to fix stats export, again | 2025-07-08 20:03:55 +00:00
perf3ct | 05a0355796 | fix(tests): fix the crazy metrics collection issue | 2025-07-08 16:52:23 +00:00
perf3ct | 7d48480cd6 | fix(tests): and resolve missing endpoint | 2025-07-08 04:37:33 +00:00
perf3ct | bf2162ad89 | fix(web_upload): resolve issue that caused files that were uploaded via the web, to not be added to the queue | 2025-07-07 19:28:08 +00:00
perf3ct | 1b984a12c2 | fix(server): resolve type issues and functions for compilation issues | 2025-07-04 00:53:32 +00:00
perf3ct | bdf4f5f8fe | feat(ocr): add even more about the multiple ocr languages | 2025-07-03 19:20:19 +00:00
perf3ct | 8ed8701d5b | feat(server): implement DEBUG environment variable | 2025-07-02 17:57:57 +00:00
Jon Fuller | a88f387aeb | Merge branch 'main' into feat/multiple-ocr-languages | 2025-07-01 11:53:42 -07:00
perf3ct | f7018575d8 | feat(pdf): implement ocrmypdf to extract text from PDFs | 2025-07-01 00:56:48 +00:00
perf3ct | f26ab1e367 | fix(pdf): resolve PDF wordcount error | 2025-07-01 00:10:49 +00:00
perf3ct | dd90e48fd2 | feat(server): mark documents with 0 words as failed, and fix webdav unit tests | 2025-06-30 22:43:25 +00:00
perf3ct | 5f10a8b82c | feat(server): continue to try to wrangle the failed and ignored documents | 2025-06-29 23:27:51 +00:00
perf3ct | 8d1a886139 | fix(tests): resolve compilation error in the multiple OCR functionality | 2025-06-29 23:21:42 +00:00
perf3ct | e0b0f49ba2 | feat(tests): implement and update tests for multiple OCR languages | 2025-06-29 23:03:37 +00:00
perf3ct | b4ddf034b0 | feat(server/client): support multiple OCR languages | 2025-06-29 22:51:06 +00:00
perf3ct | 34bc207e39 | feat(server/client): add failed_documents table to handle failures, and move logic of failures | 2025-06-28 20:52:58 +00:00
perfectra1n | 7f69cd2e5f | fix(server/client): fix incorrect OCR measurements | 2025-06-27 20:23:59 -07:00
perf3ct | cdad6477ed | feat(server): reorganize components into their own modules and fix imports | 2025-06-27 18:27:42 +00:00