Product · 2022

Passport MRZ Scanner & Parser (Tesseract.js)

Overview

A browser-based build that uses Tesseract.js to read the Machine-Readable Zone (MRZ) of a passport from an uploaded photo, then parses the two MRZ lines into structured identity fields. It was evaluated and adapted from the public MRZ-Scanner-JS project.

Why It Exists

Onboarding and KYC flows often need to read passport/ID data from a photo without server round-trips. This build validates that MRZ capture and parsing can run entirely client-side in the browser.

What We Built

A single-page app (index.html) that loads an image, draws it onto a Canvas with a brightness(140%) filter to aid recognition, runs Tesseract.js OCR, and feeds the recognised text to a hand-written mrz-parser.js. The parser implements the ICAO 9303 TD3 layout (two lines of 44 characters), slicing out the document type, issuing country, surname and given names, document number, nationality, date of birth, sex, expiry date, personal number, and the associated check digits.
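The fixed-position slicing described above can be sketched roughly as follows. This is an illustration of the TD3 layout, not the contents of mrz-parser.js: the function names (extractMrzLines, parseTd3), the returned field names, and the OCR clean-up step are all assumptions made for the example. The sample data is the ICAO 9303 specimen passport.

```javascript
// Hypothetical helper: pull the two 44-character TD3 lines out of raw OCR text.
// Assumes the OCR output has one MRZ line per text line.
function extractMrzLines(ocrText) {
  const candidates = ocrText
    .split(/\r?\n/)
    .map((line) => line.replace(/\s+/g, "").toUpperCase())
    .filter((line) => /^[A-Z0-9<]{44}$/.test(line));
  // Take the last two matching lines, or null if fewer than two were found.
  return candidates.length >= 2 ? candidates.slice(-2) : null;
}

// Replace '<' filler characters with spaces and trim the result.
const clean = (s) => s.replace(/</g, " ").trim();

// Hypothetical parser: slice TD3 fields by their fixed positions (0-based).
function parseTd3(line1, line2) {
  // Names occupy positions 5-43; surname and given names are split by "<<".
  const [surname = "", givenNames = ""] = line1.slice(5, 44).split("<<");
  return {
    documentType: clean(line1.slice(0, 2)),
    issuingCountry: clean(line1.slice(2, 5)),
    surname: clean(surname),
    givenNames: clean(givenNames),
    documentNumber: clean(line2.slice(0, 9)),
    documentNumberCheck: line2[9],
    nationality: clean(line2.slice(10, 13)),
    dateOfBirth: line2.slice(13, 19), // YYMMDD
    dateOfBirthCheck: line2[19],
    sex: line2[20],
    expiryDate: line2.slice(21, 27), // YYMMDD
    expiryDateCheck: line2[27],
    personalNumber: clean(line2.slice(28, 42)),
    personalNumberCheck: line2[42],
    compositeCheck: line2[43],
  };
}

// Usage with the ICAO specimen MRZ embedded in some surrounding OCR noise.
const ocrText = [
  "PASSPORT  UTOPIA",
  "P<UTOERIKSSON<<ANNA<MARIA".padEnd(44, "<"),
  "L898902C36UTO7408122F1204159ZE184226B<<<<<10",
].join("\n");

const mrzLines = extractMrzLines(ocrText);
const fields = mrzLines ? parseTd3(mrzLines[0], mrzLines[1]) : null;
console.log(fields);
```

Real OCR output is noisier than this (for example, "0" read as "O", or a dropped filler character), so a production parser would likely need a tolerance or correction step before the strict 44-character match.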

Technologies & Approach

Pure client-side JavaScript with Tesseract.js for OCR and the Canvas API for preprocessing; no backend. The MRZ parser encodes the fixed-position field layout and check-digit structure of machine-readable travel documents.
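As an illustration of the check-digit structure, the sketch below computes MRZ check digits using the ICAO 9303 scheme: each character is mapped to a number (digits as themselves, A-Z as 10-35, the filler "<" as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The function names (charValue, computeCheckDigit, validateTd3Line2) are hypothetical and not taken from mrz-parser.js.

```javascript
// Map one MRZ character to its numeric value: 0-9 as-is, A-Z as 10-35, '<' as 0.
function charValue(ch) {
  if (ch === "<") return 0;
  if (ch >= "0" && ch <= "9") return ch.charCodeAt(0) - 48;
  if (ch >= "A" && ch <= "Z") return ch.charCodeAt(0) - 55;
  throw new Error(`Invalid MRZ character: ${ch}`);
}

// Weighted sum with repeating weights 7, 3, 1, reduced modulo 10.
function computeCheckDigit(field) {
  const weights = [7, 3, 1];
  let sum = 0;
  for (let i = 0; i < field.length; i++) {
    sum += charValue(field[i]) * weights[i % 3];
  }
  return sum % 10;
}

// Hypothetical validator for the second line of a TD3 MRZ (44 characters).
function validateTd3Line2(line2) {
  const matches = (field, digit) => computeCheckDigit(field) === Number(digit);
  // The composite check covers positions 0-9, 13-19 and 21-42 (0-based).
  const composite =
    line2.slice(0, 10) + line2.slice(13, 20) + line2.slice(21, 43);
  // An unused personal number may carry '<' instead of '0' as its check digit.
  const personalCheck = line2[42] === "<" ? "0" : line2[42];
  return {
    documentNumber: matches(line2.slice(0, 9), line2[9]),
    dateOfBirth: matches(line2.slice(13, 19), line2[19]),
    expiryDate: matches(line2.slice(21, 27), line2[27]),
    personalNumber: matches(line2.slice(28, 42), personalCheck),
    composite: matches(composite, line2[43]),
  };
}

// Usage with line 2 of the ICAO specimen passport.
const specimenLine2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10";
console.log(computeCheckDigit("L898902C3")); // 6
console.log(validateTd3Line2(specimenLine2));
```

Because OCR errors usually change a character's value, a failed check digit is a cheap signal that a field was misread and the capture should be retried.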

Outcome / Impact

Demonstrated a fully in-browser passport-reading flow (image preprocessing, OCR, and standards-based MRZ parsing), showing that the approach is viable for privacy-friendly, server-free identity capture. The work is positioned as evaluation/R&D adapted from a public project.

Capabilities Demonstrated

  • In-browser OCR with Tesseract.js
  • ICAO 9303 MRZ parsing and check-digit handling
  • Canvas-based image preprocessing for recognition accuracy
  • Privacy-friendly, server-free document capture