Engineering · 2024–25

Guten OCR, Cross-Platform JS OCR Library (Evaluation/Extension)

Overview

Guten OCR is an open-source JavaScript OCR library with targets for Node.js, the browser, React Native, and C++, built on PaddleOCR’s PP-OCRv4 models and ONNX Runtime. This repository is the studio’s working copy of the gutenye/ocr project, evaluated and extended on a long-lived branch (~230 commits) as a candidate engine for the studio’s OCR features.

Why It Exists

Many OCR options require a cloud service or a Python runtime. Guten OCR offers on-device, cross-platform text detection and recognition from JS/TS, which made it worth evaluating, integrating, and extending as a way to embed OCR directly in Node and browser products.

What We Built

We worked within the project’s Bun-based monorepo, whose packages/ directory contains common (the shared detection/recognition pipeline), node (@gutenye/ocr-node, using onnxruntime-node for inference and sharp for image handling), browser (@gutenye/ocr-browser, which loads the ONNX detection and recognition models and the ppocr dictionary), react-native, and models. Tooling includes Biome for linting and formatting and Lefthook for git hooks. The API is small: create an engine with Ocr.create(), then call ocr.detect(image).
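As a rough illustration of how a product might consume that create-then-detect API, the sketch below wraps an engine in a small helper that keeps only confident lines. The TextLine shape (text plus a mean confidence score), the extractText helper, and the 0.5 threshold are assumptions made for this example, not something the library documents here; a stub engine stands in for the real one so the snippet runs without models or native dependencies.

```typescript
// Shape of one recognized line. Field names are assumptions for this sketch.
interface TextLine {
  text: string
  mean: number // assumed: average recognition confidence in [0, 1]
}

// Minimal engine contract: anything exposing an async detect(image) method.
interface OcrEngine {
  detect(image: string): Promise<TextLine[]>
}

// Hypothetical helper: run detection, drop low-confidence lines, join the rest.
async function extractText(
  engine: OcrEngine,
  image: string,
  minConfidence = 0.5,
): Promise<string> {
  const lines = await engine.detect(image)
  return lines
    .filter((line) => line.mean >= minConfidence)
    .map((line) => line.text.trim())
    .join('\n')
}

// Stub engine so the sketch runs on its own; it ignores the image path.
const stubEngine: OcrEngine = {
  async detect(_image: string) {
    return [
      { text: ' Invoice 42 ', mean: 0.97 },
      { text: 'smudge', mean: 0.12 },
      { text: 'Total: 10.00', mean: 0.88 },
    ]
  },
}
```

In a Node product the stub would presumably be replaced by the object returned from Ocr.create() in @gutenye/ocr-node, provided its detect() result matches the shape assumed above.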

Technologies & Approach

TypeScript across a Bun workspace; ONNX Runtime executes the PP-OCRv4 detection and recognition models; sharp handles Node-side image decoding. The package split lets the same core pipeline target Node, browser, and React Native runtimes.
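One way to picture the package split is a shared pipeline that depends only on a small backend interface, with each runtime package supplying its own image decoding and model execution. The sketch below is a simplified illustration under that assumption: PlatformBackend, RawImage, toTensor, and runPipeline are hypothetical names, and the 0-1 scaling is a stand-in for whatever normalization the real pipeline applies.

```typescript
// Decoded image in a runtime-neutral form (assumed layout: RGB, row-major).
interface RawImage {
  width: number
  height: number
  data: Uint8Array
}

// What each runtime package would supply. Both method names are hypothetical.
interface PlatformBackend {
  decodeImage(source: string): Promise<RawImage>
  runModel(input: Float32Array, dims: number[]): Promise<Float32Array>
}

// Shared step: scale 0-255 bytes to 0-1 floats and reorder HWC to CHW.
function toTensor(image: RawImage): Float32Array {
  const { width, height, data } = image
  const plane = width * height
  const out = new Float32Array(3 * plane)
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      out[c * plane + i] = data[i * 3 + c] / 255
    }
  }
  return out
}

// Shared pipeline: depends only on the interface, never on sharp or the DOM.
async function runPipeline(
  backend: PlatformBackend,
  source: string,
): Promise<Float32Array> {
  const image = await backend.decodeImage(source)
  const input = toTensor(image)
  return backend.runModel(input, [1, 3, image.height, image.width])
}

// Fake backend standing in for a Node or browser package: a 2x1 image with
// one red and one green pixel, and a "model" that echoes its input tensor.
const fakeBackend: PlatformBackend = {
  async decodeImage() {
    return { width: 2, height: 1, data: new Uint8Array([255, 0, 0, 0, 255, 0]) }
  },
  async runModel(input) {
    return input
  },
}
```

Under this arrangement a Node package could implement decodeImage with sharp and runModel with onnxruntime-node, while a browser package supplies its own equivalents, and the shared code stays unchanged.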

Outcome / Impact

The project was a hands-on evaluation and extension of a production-grade, on-device OCR engine. It demonstrated that the studio can integrate ONNX-based vision models into JavaScript products without a cloud dependency and can operate confidently inside a multi-package OSS codebase. The work is a fork, evaluation, and extension of an existing open-source project, not an original engine.

Capabilities Demonstrated

  • On-device OCR with ONNX Runtime and PaddleOCR PP-OCRv4
  • Cross-platform JS/TS library design (Node, Browser, React Native)
  • Image preprocessing pipelines with sharp
  • Working within and extending a Bun-based OSS monorepo