penquify

OCR, but backwards.
Make your documents worse.

Generate photorealistic smartphone photos of any document, with verified ground truth for every field. For OCR training, vision benchmarks, and agentic pipeline testing.

Try the demo
Open source · Ground truth verified · API · CLI · MCP
Interactive Demo

Try the pipeline

Configure your input, select variations, and watch penquify run step by step.

2 free runs · No sign-up required

One command

From zero to verified dataset in 30 seconds

~ penquify
# Install
pip install penquify && playwright install chromium

# Full demo: PDF + 8 verified photo variations
penquify demo

# Upload any PDF, generate realistic photos
penquify photos --image invoice.pdf --presets blurry coffee_stain

# Describe what you want in plain English
penquify config --text "folded paper with grease, shot on old Motorola"

# Batch: 50 documents × 4 variations = 200 verified images
penquify dataset --count 50 --presets full_picture blurry coffee_stain cropped_header

“I needed to test my vision pipeline but I had 12 real documents. I needed 1,200 variations (blurry, folded, stained, cropped) with known ground truth for every field. I also needed the agent to do lookups after extraction, so the data in the photo had to be real and verifiable.”

The pipeline

From structured data to verified synthetic photos in one call. Every generated image has a ground-truth manifest.

01

Input

API · UI

JSON schema, uploaded PDF, or raw image. Define the data you need in the output: OC numbers, item names, quantities, dates. Or upload an existing document and we detect the schema.

02

Render clean document

Jinja2 HTML templates produce a pixel-perfect PDF. Dispatch guides, invoices, POs, BOLs, or bring your own template. Every field maps to a schema key.
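
A minimal sketch of the "every field maps to a schema key" idea. It uses a plain regex from the Python standard library as a stand-in for Jinja2, and the template text, field names, and `render_document` helper are hypothetical, not penquify's actual templates or API.

```python
import re

# Hypothetical template with Jinja2-style {{ key }} placeholders. It is filled
# with a regex here (not Jinja2) so the sketch has no dependencies.
TEMPLATE = "<h1>Dispatch guide {{ guide_number }}</h1><p>Supplier: {{ supplier }}</p>"

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_document(template: str, data: dict) -> str:
    """Fill every placeholder from `data`; fail loudly if a key is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(data))
    if missing:
        raise KeyError(f"template fields without schema keys: {missing}")
    return PLACEHOLDER.sub(lambda match: str(data[match.group(1)]), template)


html = render_document(TEMPLATE, {"guide_number": "GD-0001", "supplier": "ACME"})
print(html)
```

Failing on a missing key, rather than rendering an empty string, keeps the rendered document and its ground truth from drifting apart silently.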

03

Apply photo variations

Gemini generates a realistic photo using your variation config. Camera model, paper deformation, stains, blur, angle: every variable is controllable or randomizable.

04

Verify ground truth

Key

A second model pass reads the generated photo back and checks every schema field against the source data. If a field is wrong, it gets flagged and re-generated. Output: a verified image + ground truth JSON.
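
The field-by-field check can be sketched roughly as follows. This is an illustration under assumptions: `verify_fields`, the field names, and the sample values are hypothetical, and the model call that reads the photo back is replaced by a hard-coded dictionary.

```python
def verify_fields(source: dict, read_back: dict) -> dict:
    """Compare each source field with the value read back from the photo."""

    def norm(value) -> str:
        # Case- and whitespace-insensitive comparison; a real check may be stricter.
        return " ".join(str(value).split()).upper()

    mismatched = {
        key: {"expected": expected, "got": read_back.get(key)}
        for key, expected in source.items()
        if key not in read_back or norm(read_back[key]) != norm(expected)
    }
    return {"verified": not mismatched, "mismatched": mismatched}


# Illustrative data only: `read_back` stands in for the second model pass.
source = {"oc_number": "4500012345", "item": "PAPAS FRITAS 10MM", "qty": "150 KG"}
read_back = {"oc_number": "4500012345", "item": "papas fritas 10mm", "qty": "160 KG"}

report = verify_fields(source, read_back)
print(report)
```

Here only `qty` is reported as mismatched, which is the signal a pipeline could use to flag the image and regenerate it.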

05

Occlusion manifest

Key

If a variation intentionally occludes data (crop, stain, fold), the manifest reports exactly which schema fields are affected and why. Your benchmark knows what should fail.
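
One way to think about the manifest is as a rectangle-overlap test between field positions and occluded regions. The sketch below assumes that each field's position on the page is known; the `Box` type, the coordinates, and the manifest shape are hypothetical and do not describe penquify's real output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page-relative coordinates (0.0 to 1.0)."""
    left: float
    top: float
    right: float
    bottom: float

    def overlaps(self, other: "Box") -> bool:
        return (self.left < other.right and other.left < self.right
                and self.top < other.bottom and other.top < self.bottom)


def occlusion_manifest(fields: dict, occluders: dict) -> dict:
    """Map each affected field to the occluders that cover part of it."""
    manifest = {}
    for name, box in fields.items():
        reasons = [why for why, area in occluders.items() if box.overlaps(area)]
        if reasons:
            manifest[name] = reasons
    return manifest


# Illustrative layout: where each schema field sits on the rendered page.
fields = {
    "guide_number": Box(0.60, 0.02, 0.95, 0.08),
    "supplier": Box(0.05, 0.12, 0.50, 0.18),
    "total_qty": Box(0.60, 0.85, 0.95, 0.90),
}
occluders = {
    "cropped_header": Box(0.0, 0.0, 1.0, 0.10),  # top 10% cut off
    "coffee_stain": Box(0.55, 0.80, 0.80, 0.95),
}

manifest = occlusion_manifest(fields, occluders)
print(manifest)
```

With this layout the header crop hides `guide_number`, the stain touches `total_qty`, and `supplier` stays fully visible.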

06

Output dataset

Clean PDF + N verified photos + ground truth JSON + occlusion manifest per image. Ready for training, benchmarking, or feeding into your agentic pipeline.

Same document. Different nightmares.

Every photo generated from the same clean PDF. Each preset targets a different real-world failure mode.

  • full_picture: Clean handheld, 90% frame
  • folded_skewed: Dog-ear, crease, 6 deg tilt
  • zoomed_detail: Close-up OCR, oblique 25-30 deg
  • blurry: Motion blur, partial legibility
  • coffee_stain: Stain over text, partial obstruction
  • cropped_header: Top 10-15% cut off

Case Study

Automating goods receipt with only purchase orders

A real scenario where penquify was built out of necessity. We had POs in the ERP. We had zero dispatch guides. We needed hundreds of realistic test photos.

The problem

  • Extract data from photos of dispatch guides taken by warehouse workers
  • Match items against purchase orders in the ERP
  • Handle name mismatches: supplier says “PAPA PREFRITA CONGELADA”, ERP says “PAPAS FRITAS 10MM”
  • Handle unit mismatches: guide says 12 CJ, PO says 150 KG, no weight-per-case given
  • Handle quantity discrepancies at two levels: guide vs PO, then physical count vs guide
  • Create ERP material documents with the correct batch, matching PO position

We had POs. We had zero dispatch guides. We needed hundreds of realistic test photos to validate the pipeline end-to-end.

What penquify did

1
Generated dispatch guides from PO data
Read the PO from the ERP (items, quantities, supplier). Generated a realistic dispatch guide PDF with different names β€” the way a real supplier writes them, not ERP master data names.
2
Introduced realistic discrepancies
Some items with different units (CJ vs KG, UN vs KG, with no conversion on the guide). Some with 5% less quantity. Some with supplier-specific jargon.
3
Generated warehouse photos
6 variations per document: clean handheld, folded, blurry, coffee-stained, cropped header, zoomed detail. Each looks like a warehouse worker snapped it at a loading dock.
4
Verified every field
Ground truth JSON for each photo. Occlusion manifest saying exactly which fields are hidden in the coffee-stained or cropped versions.
5
Fed to the agent pipeline
Photo → vision extraction → PO matching → discrepancy detection → ERP write. Tested 100+ scenarios in hours instead of months of real operations.
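
The discrepancy step (step 2 above) could look something like this sketch, which ships 5% less on a random subset of lines and keeps the original PO quantity next to each one as ground truth. The `make_guide_lines` helper, its parameters, and the record layout are hypothetical.

```python
import random


def make_guide_lines(po_lines: list, short_rate: float, seed: int) -> list:
    """Copy PO lines into guide lines, shipping 5% less on a random subset."""
    rng = random.Random(seed)  # seeded so the same dataset can be regenerated
    guide_lines = []
    for line in po_lines:
        short = rng.random() < short_rate
        qty = round(line["qty"] * 0.95, 2) if short else line["qty"]
        # Keep the PO quantity alongside, so the ground truth knows the gap.
        guide_lines.append({**line, "qty": qty, "po_qty": line["qty"], "short": short})
    return guide_lines


po_lines = [
    {"item": "QUESO MOZZARELLA RALLADO", "qty": 120, "unit": "KG"},
    {"item": "PAPAS FRITAS 10MM", "qty": 150, "unit": "KG"},
]

guide_lines = make_guide_lines(po_lines, short_rate=0.5, seed=7)
for line in guide_lines:
    print(line)
```

Seeding the generator is a design choice: a failing agent test can be reproduced by regenerating exactly the same discrepancies.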

The kind of mismatches real documents have

Penquify generates these programmatically. The ground truth knows the correct mapping.

Dispatch Guide (supplier) | Purchase Order (ERP) | Challenge
PAPA PREFRITA CONGELADA CORTE GRUESO, 12 CJ | PAPAS FRITAS 10MM, 150 KG | Different name + different unit. Guide doesn't say weight per case. Agent must ask.
MOZZARELLA RALLADA PREMIUM, 115 KG | QUESO MOZZARELLA RALLADO, 120 KG | Different name + 5 KG short. Agent must flag discrepancy.
JENGIBRE FRESCO PELADO, 2 UN | JENGIBRE, 0.5 KG | Unit mismatch (UN vs KG). No weight per unit on guide. Agent must ask.
LIMON SUTIL FRESCO, 24 KG | LIMON SUTIL, 25 L | KG vs L + 1 unit short. Two problems at once.
MENTA FRESCA ATADO, 10 UN | MENTA FRESCA, 2 KG | Atados vs KG. No weight per atado. Agent must ask.
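
As a rough illustration of what an agent has to decide for each row, the sketch below classifies a guide line against its PO line. It assumes the name matching has already been done, and the `reconcile` function and its result labels are hypothetical.

```python
def reconcile(guide: dict, po: dict) -> str:
    """Classify one guide line against its already-matched PO line."""
    if guide["unit"] != po["unit"]:
        # No conversion factor is printed on the guide, so the quantities
        # cannot be compared; the agent has to ask a human.
        return "ask_conversion"
    if guide["qty"] < po["qty"]:
        return "flag_short"
    if guide["qty"] > po["qty"]:
        return "flag_over"
    return "ok"


# Rows taken from the table above.
papa = reconcile({"unit": "CJ", "qty": 12}, {"unit": "KG", "qty": 150})
mozzarella = reconcile({"unit": "KG", "qty": 115}, {"unit": "KG", "qty": 120})
limon = reconcile({"unit": "KG", "qty": 24}, {"unit": "L", "qty": 25})
print(papa, mozzarella, limon)
```

Checking units first is deliberate: in the LIMON SUTIL row the apparent shortfall of 1 cannot be trusted until KG and L are reconciled, so this sketch reports only the unit problem for that row.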

Everything is configurable

Camera model is free text. So is the angle. And the stain type. And the background. Every field is a knob.

⌁

22 camera presets + free text

Samsung Galaxy S7 through iPhone 14, budget Androids, rugged field devices. Or write anything: "Nokia 3310 with cracked screen"

◫

Paper physics

Curvature, folds (dog-ear, middle vertical, multiple), wrinkles, corner bends, edge curl. Paper behaves like paper.

◉

Damage & contamination

Coffee stains, water damage, grease marks, ink smudges, torn edges, dirt. Each with type, location, opacity, and text obstruction level.

⊿

Geometric distortion

Perspective skew, keystone distortion, rotation (0-15 deg), oblique angles up to 45 deg. Simulate every handheld capture angle.

◐

Lighting & artifacts

Glare hotspots, uneven lighting, shadow bands, finger shadows, JPEG compression (none to heavy), motion blur with direction control.

⊞

Natural language config

"Blurry photo with coffee stain, strong angle, old Samsung, paper folded in half": AI converts your description to variation JSON.

✓

Ground truth verification

Every generated image is read back and verified against the source schema. Garbled fields get flagged and regenerated. You get verified data.

◎

Occlusion tracking

When a crop, stain, or fold hides data, the manifest says exactly which fields are affected and why. Your benchmark knows what should fail.

✋

Hand & context

Hand visible or not, grip type (thumb corner, pinched edge, both hands), gloves (warehouse, latex, none). Background as blurred context.
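
To make the "controllable or randomizable" idea concrete, here is a small sketch that samples a variation within the ranges quoted above (rotation 0-15 deg, oblique angle up to 45 deg). The `Variation` fields, the intermediate JPEG levels, and the stain names are assumptions for illustration and not penquify's real variation schema.

```python
import json
import random
from dataclasses import asdict, dataclass
from typing import Optional

CAMERAS = ["Samsung Galaxy S7", "iPhone 14", "Nokia 3310 with cracked screen"]
STAINS = [None, "coffee", "grease", "water"]
JPEG_LEVELS = ["none", "light", "medium", "heavy"]  # middle levels are assumed


@dataclass
class Variation:
    camera: str            # free text
    rotation_deg: float    # 0 to 15
    oblique_deg: float     # 0 to 45
    stain: Optional[str]   # None means a clean page
    jpeg_compression: str


def sample_variation(seed: int) -> Variation:
    """Draw one random variation; the same seed gives the same variation."""
    rng = random.Random(seed)
    return Variation(
        camera=rng.choice(CAMERAS),
        rotation_deg=round(rng.uniform(0, 15), 1),
        oblique_deg=round(rng.uniform(0, 45), 1),
        stain=rng.choice(STAINS),
        jpeg_compression=rng.choice(JPEG_LEVELS),
    )


variation = sample_variation(seed=42)
print(json.dumps(asdict(variation), indent=2))
```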

Every entry point

Start from wherever you are in your pipeline.

From scratch

JSON Schema → Photos

Send document data as JSON. Get back PDF + verified photos with ground truth.

Existing document

Upload PDF → Variations

Upload any PDF or image. We detect the schema, then generate N realistic variations with known ground truth.

Reference photo

Seed Image → Style Transfer

Provide a reference photo (lighting, background, camera style). New documents match that visual style.

Batch

N Docs × M Variations

Generate thousands of document-variation combinations. Progress tracking, S3 upload, parallel generation.

Fine-tuning

Training Pairs

Output image + ground truth pairs in formats ready for fine-tuning vision models: JSONL, COCO, or custom schema.

Pipeline testing

E2E Agent Tests

Generate documents with known data, feed them to your agent, verify extractions + downstream lookups match ground truth.
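
The batch and training-pair entry points come down to a cross product of documents and presets, followed by one label record per image. The sketch below shows that arithmetic with hypothetical helper names, file names, and record fields; it generates no images and does not reflect penquify's actual export format.

```python
import itertools
import json


def build_jobs(doc_ids: list, presets: list) -> list:
    """One job per (document, preset) pair: N docs x M presets."""
    return [
        {"doc_id": doc_id, "preset": preset, "image": f"{doc_id}_{preset}.jpg"}
        for doc_id, preset in itertools.product(doc_ids, presets)
    ]


def to_jsonl(jobs: list, ground_truth: dict) -> str:
    """Serialize image/label pairs, one JSON object per line."""
    return "\n".join(
        json.dumps({
            "image": job["image"],
            "preset": job["preset"],
            "ground_truth": ground_truth[job["doc_id"]],
        })
        for job in jobs
    )


doc_ids = [f"doc_{i:03d}" for i in range(50)]
presets = ["full_picture", "blurry", "coffee_stain", "cropped_header"]
jobs = build_jobs(doc_ids, presets)

# Placeholder labels; a real run would carry the full schema for each document.
ground_truth = {doc_id: {"guide_number": f"GD-{i:04d}"} for i, doc_id in enumerate(doc_ids)}
jsonl = to_jsonl(jobs, ground_truth)
print(len(jobs))  # 50 documents x 4 presets = 200 jobs
```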

Built for real problems

Not another demo tool. Built because we needed it.

OCR Training

Generate 10,000 training pairs

N documents × M variations with ground truth labels. Export as JSONL for fine-tuning. Systematically vary difficulty: clean → blurry → stained → cropped.

Vision Benchmark

Measure extraction accuracy

Run your OCR/vision model on a controlled dataset. Know exactly which fields should be extractable vs occluded. Compare models objectively.
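
A benchmark that knows which fields are occluded can score a model only on what was actually readable. The sketch below is one possible scoring rule, not a prescribed metric; `score_extraction`, the field names, and the sample values are hypothetical.

```python
def score_extraction(truth: dict, predicted: dict, occluded: set) -> dict:
    """Score only visible fields; report values invented for hidden ones."""
    scorable = [key for key in truth if key not in occluded]
    correct = [key for key in scorable if predicted.get(key) == truth[key]]
    # A value returned for a hidden field is treated as a guess, not a hit.
    invented = sorted(key for key in occluded if predicted.get(key) is not None)
    return {
        "accuracy": len(correct) / len(scorable) if scorable else None,
        "missed": sorted(set(scorable) - set(correct)),
        "invented": invented,
    }


truth = {"guide_number": "GD-0001", "supplier": "ACME",
         "item": "MENTA FRESCA ATADO", "qty": "10 UN"}
predicted = {"guide_number": "GD-0007", "supplier": "ACME",
             "item": "MENTA FRESCA ATADO", "qty": "18 UN"}

result = score_extraction(truth, predicted, occluded={"guide_number"})
print(result)
```

In this example the cropped `guide_number` does not count against accuracy, but the model's guess for it is listed under `invented`, and the misread `qty` is listed under `missed`.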

Agentic Testing

End-to-end agent validation

Generate documents with known data. Feed photos to your agent. Verify it extracts correctly AND does the right downstream actions.

Cold Start

Not enough real documents

You have 12 real invoices but need 1,200 test cases. Generate synthetic variations that cover every edge case your production pipeline will see.

Every interface

Terminal
CLI
penquify demo
Import
Python
from penquify import ...
HTTP
REST API
FastAPI + Swagger
AI Agent
MCP Server
5 tools for Claude
Plugin
Agent SDK
Drop-in tool list
Cloud
penquify.com
UI + API credits

Pricing that scales with you

Self-host free forever. Cloud for convenience. Enterprise for compliance.

Open Source
$0 forever

Self-hosted. Bring your own Gemini key.

  • All document templates
  • All photo presets + custom
  • CLI + Python library
  • REST API (self-hosted)
  • MCP server + Agent SDK
  • Ground truth verification
  • Community support
Most Popular
Pro
$29/mo

500 credits/mo. UI dashboard. No API key needed.

  • Everything in Open Source
  • 500 image credits/month
  • Web UI with live preview
  • Hosted API (no infra needed)
  • Dataset history + downloads
  • Priority generation queue
  • Team credit pool (shared)
  • Email support
Enterprise
Custom

Volume credits, SLA, VPC deployment, SSO.

  • Everything in Pro
  • Volume credit pricing
  • Custom document templates
  • VPC / on-premise deployment
  • SSO (SAML/OIDC) + RBAC
  • SOC 2 Type II compliance
  • Zero data retention policy
  • Dedicated support + SLA

Need more credits?

Buy additional credit packs anytime. Credits never expire and stack with your subscription.

MIT Licensed · No credit card required for trial · BYOK on all tiers · Cancel anytime

Ready to build better vision AI?

Start generating verified document datasets in 30 seconds. Open source, or let us handle the infra.