Detect a book spine with an oriented bounding box (OpenCV) → deskew → run any OCR engine on demand.
Detection (new): classical CV via OpenCV.js (Canny edges, contour finding, then minAreaRect for an oriented rectangle). Works for tilted spines. Frame the spine clearly; the largest elongated rectangle wins. OpenCV.js (~9MB) is lazy-loaded on first capture.
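A minimal sketch of the "largest elongated rectangle wins" rule, assuming the candidates have already been produced by minAreaRect. The OrientedRect shape loosely mirrors the rotated rectangle OpenCV.js returns; pickSpine, MIN_ASPECT, and MIN_AREA_FRACTION are hypothetical names and values chosen for illustration, not the ones this tool uses.

```typescript
// Shape loosely mirrors the rotated rectangle returned by minAreaRect.
interface OrientedRect {
  center: { x: number; y: number };
  size: { width: number; height: number };
  angle: number; // degrees
}

// Hypothetical thresholds (assumptions, tune for real footage).
const MIN_ASPECT = 2.5;         // long side / short side
const MIN_AREA_FRACTION = 0.02; // fraction of the frame area

function aspectRatio(r: OrientedRect): number {
  const long = Math.max(r.size.width, r.size.height);
  const short = Math.min(r.size.width, r.size.height);
  return short > 0 ? long / short : Infinity;
}

// Returns the largest candidate that is both big enough and elongated,
// or null when nothing in the frame looks like a spine.
function pickSpine(
  candidates: OrientedRect[],
  frameArea: number
): OrientedRect | null {
  let best: OrientedRect | null = null;
  let bestArea = 0;
  for (const r of candidates) {
    const area = r.size.width * r.size.height;
    if (area < frameArea * MIN_AREA_FRACTION) continue; // too small (noise)
    if (aspectRatio(r) < MIN_ASPECT) continue;          // not elongated
    if (area > bestArea) {
      best = r;
      bestArea = area;
    }
  }
  return best;
}
```

Filtering on aspect ratio before comparing areas keeps a large, roughly square shape (a front cover, a shelf panel) from beating the spine.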
Rotation: portrait crops are rotated 90° counter-clockwise so top-of-spine reads as left-of-text. Whether the title ends up right-side-up depends on how the book was shot — there's no way to tell without OCR. Use Flip 180° on the OCR input if the result is upside-down.
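The rotation rule can be sketched as a pair of pure functions. This is an illustration under assumptions, not the tool's actual code: rotationFor and rotatePoint are hypothetical helpers, angles are counter-clockwise, and image coordinates have y pointing down.

```typescript
type QuarterTurn = 0 | 90 | 180 | 270; // degrees, counter-clockwise

// Portrait crops (taller than wide) get 90° CCW; the optional flip adds 180°.
function rotationFor(
  width: number,
  height: number,
  flip180: boolean = false
): QuarterTurn {
  const base = height > width ? 90 : 0;
  return ((base + (flip180 ? 180 : 0)) % 360) as QuarterTurn;
}

// Maps pixel (x, y) of a w-by-h image to its position after the rotation.
function rotatePoint(
  x: number,
  y: number,
  w: number,
  h: number,
  turn: QuarterTurn
): { x: number; y: number } {
  switch (turn) {
    case 90:
      return { x: y, y: w - 1 - x };         // top edge lands on the left edge
    case 180:
      return { x: w - 1 - x, y: h - 1 - y };
    case 270:
      return { x: h - 1 - y, y: x };         // top edge lands on the right edge
    default:
      return { x, y };
  }
}
```

With a 100x600 portrait crop, every pixel on the top row (y = 0) ends up in the left column (x = 0) after the 90° turn, which is the "top-of-spine reads as left-of-text" behavior. When drawing onto a 2D canvas, note that a positive angle passed to rotate() turns clockwise, so a 90° counter-clockwise turn corresponds to -Math.PI / 2.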
Tesseract Fast vs Best: same engine, different traineddata files (~11MB vs ~25MB). Fast is quicker; Best is slower but usually more accurate.
PaddleOCR PP-OCRv4 via esearch-ocr with detector + recognizer ONNX models from paddleocr-browser. Loads onnxruntime-web.
PaddleOCR PP-OCRv5 via ppu-paddle-ocr — newer mobile model. Loads its own onnxruntime-web build internally.
TrOCR models are q8-quantized (~120MB each). Use the printed model for typeset spines and the handwritten model for hand-lettered ones.
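Because these engines and models are large, each is only worth fetching when it is first used. One common way to do that is to memoize the loader's promise; the sketch below assumes nothing about how this tool wires its engines, and lazy is a hypothetical helper name.

```typescript
// Wraps an async loader so it runs at most once per successful load.
// Concurrent callers share the same in-flight promise.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached === null) {
      cached = loader().catch((err: unknown): never => {
        cached = null; // forget a failed load so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}
```

A call site would wrap each engine's setup once, for example `const getEngine = lazy(() => loadSomeEngine())` where loadSomeEngine is a placeholder, and then await getEngine() wherever OCR is requested. The download starts on the first call and later calls reuse the result.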
All client-side. Nothing leaves the device.