Every organization has at least one workflow that still starts on paper: a signup form, field inspection note, school permission slip, donation form, clinic intake form, KYC form, survey sheet, maintenance report, or old scanned record. The problem is not only reading the handwriting. The real problem is turning that handwriting into structured data that software can use.
I first ran into this through OmmSai, a healthcare automation project built around roughly 15,000 handwritten prescription files for a charitable healthcare event. That project is the origin story and proof point, not the product boundary. Prescriptions were one example of a much broader pattern: paper workflows that need to become automation-ready JSON.
The Broader Workflow Problem
Handwriting JSON is an open-source Python package and CLI for automating handwritten document workflows. It converts forms, notes, and scanned paperwork into structured JSON using vision LLMs and optional schema guidance. The target use cases are intentionally broad: signup forms, field inspection notes, school permission slips, donation forms, clinic intake forms, KYC forms, surveys, maintenance reports, and scanned records.
Why OCR Alone Is Not Enough
OCR can return text, but automation usually needs fields. A CRM, spreadsheet import, compliance workflow, student database, ticketing system, or review queue does not just need a transcription. It needs predictable keys, values, arrays, booleans, and missing-field behavior that software can act on.
Handwriting JSON asks a more useful question than "what text is visible?" It asks, "what structured data should this document become?" That is the difference between reading a scanned form and making the form usable in an automated workflow.
Install
pip install handwriting-jsonPython API
from handwriting_json import extract
result = extract(
"handwritten_registration_form.jpg",
model="anthropic/claude-sonnet-4-5",
schema={
"full_name": "",
"phone": "",
"email": "",
"address": "",
"date": "",
"notes": "",
"signature_present": False,
},
)
print(result.data)Why Schema Guidance Matters
Without a schema, a vision model can describe a document, but the output may not be stable enough for automation. With an example JSON object or JSON Schema, the model has a target contract. The workflow changes from "read this image" to "extract this document into this shape."
CLI Usage
handwriting-json --help
handwriting-json extract \
--input handwritten_signup_form.jpg \
--schema examples/signup_form_schema.json \
--output result.json \
--model anthropic/claude-sonnet-4-5Why LiteLLM
Handwritten document automation depends heavily on vision LLM quality, cost, latency, and provider availability. LiteLLM gives Handwriting JSON a thin provider abstraction without forcing users into one model vendor. A developer can route extraction through Anthropic, OpenAI, or another supported vision model while the package stays focused on inputs, prompts, schema guidance, JSON parsing, and validation warnings.
Why Not LangChain or LangGraph in v0.1
LangChain and LangGraph were intentionally deferred for v0.1 because the first release is not a multi-step agent workflow. It is a focused library: normalize a document input, build a schema-guided extraction prompt, call a vision LLM, parse JSON, and report validation warnings. Adding orchestration before the package needs retries, repair loops, human review routing, OCR fallback, or multi-model routing would make the public API heavier without improving the core handwritten forms-to-JSON use case.
What v0.1 Is and Is Not
Version 0.1.x is a Python package and Typer CLI for local paths, URLs, base64 strings, bytes, and file-like inputs. It supports PDF, PNG, JPG/JPEG, and WebP detection, JSON Schema guidance, example JSON guidance, generic extraction prompts, LiteLLM-backed provider calls, JSON response parsing, and validation warnings.
Docker, a REST API, hosted SaaS, OCR fallback, and heavier orchestration are roadmap candidates, not current features. The public package is deliberately small: handwritten forms, notes, and scanned paperwork in; automation-ready JSON out.