OCR, but backwards.
Make your documents worse.
Generate photorealistic smartphone photos of any document, with verified ground truth for every field. For OCR training, vision benchmarks, and agentic pipeline testing.


Same data. Drag to compare.
Try the pipeline
Configure your input, select variations, and watch penquify run, step by step.
2 free runs · No sign-up required
One command
From zero to verified dataset in 30 seconds
"I needed to test my vision pipeline but I had 12 real documents. I needed 1,200 variations (blurry, folded, stained, cropped) with known ground truth for every field. I also needed the agent to do lookups after extraction, so the data in the photo had to be real and verifiable."
The pipeline
From structured data to verified synthetic photos in one call. Every generated image has a ground-truth manifest.
Input
API / UI: JSON schema, uploaded PDF, or raw image. Define the data you need in the output: OC numbers, item names, quantities, dates. Or upload an existing document and we detect the schema.
Render clean document
Jinja2 HTML templates produce a pixel-perfect PDF. Dispatch guides, invoices, POs, BOLs, or bring your own template. Every field maps to a schema key.
Apply photo variations
Gemini generates a realistic photo using your variation config. Camera model, paper deformation, stains, blur, angle: every variable is controllable or randomizable.
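One way to picture "controllable or randomizable" is a config in which each field is either pinned to a value or drawn from a pool of options. The sketch below is illustrative only: the field names (`camera`, `blur`, `angle_deg`, `stain`), the option pools, and the `resolve_variation` helper are hypothetical and are not penquify's actual config schema.

```python
import random

# Hypothetical option pools; the real preset names and fields may differ.
OPTIONS = {
    "camera": ["Samsung Galaxy S7", "iPhone 14", "budget Android"],
    "blur": ["none", "slight", "heavy"],
    "angle_deg": [0, 5, 10, 15],
    "stain": ["none", "coffee", "grease"],
}

def resolve_variation(config, seed=None):
    """Return a concrete variation: pinned values are kept as given, and any
    field set to "random" (or left out) is drawn from OPTIONS."""
    rng = random.Random(seed)  # seeded so a dataset can be reproduced
    resolved = {}
    for field, choices in OPTIONS.items():
        value = config.get(field, "random")
        resolved[field] = rng.choice(choices) if value == "random" else value
    return resolved

# Camera is pinned with free text; everything else is randomized.
variation = resolve_variation(
    {"camera": "Nokia 3310 with cracked screen", "blur": "random"}, seed=7
)
```

Seeding the generator is a design choice in this sketch: the same seed and config always resolve to the same variation, which keeps a benchmark repeatable.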
Verify ground truth
Key: A second model pass reads the generated photo back and checks every schema field against the source data. If a field is wrong, it gets flagged and regenerated. Output: a verified image + ground truth JSON.
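The read-back check can be thought of as a field-by-field comparison between the source data and whatever the second pass extracted. A minimal sketch, assuming both sides are flat dictionaries keyed by schema field; `find_mismatches`, the normalization rule, and the sample values are made up for illustration and are not penquify's implementation.

```python
def normalize(value):
    """Compare values as trimmed, case-insensitive strings."""
    return str(value).strip().casefold()

def find_mismatches(source, read_back):
    """Return the schema keys whose read-back value is missing or differs
    from the source value after normalization."""
    return [
        key
        for key, expected in source.items()
        if key not in read_back
        or normalize(read_back[key]) != normalize(expected)
    ]

# Made-up sample data: the item differs only in case and whitespace,
# while the quantity was read back incorrectly.
source = {"oc_number": "4500012345", "item": "PAPAS FRITAS 10MM", "quantity": 150}
read_back = {"oc_number": "4500012345", "item": "papas fritas 10mm ", "quantity": "160"}

mismatches = find_mismatches(source, read_back)  # fields to flag
```

Here only `quantity` would be flagged; how strict the normalization should be (case, whitespace, number formats) is a judgment call that depends on the schema.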
Occlusion manifest
Key: If a variation intentionally occludes data (crop, stain, fold), the manifest reports exactly which schema fields are affected and why. Your benchmark knows what should fail.
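To make the idea concrete, here is what a per-image manifest and its use might look like. The structure, key names (`occluded_fields`, `cause`, `detail`), and file name are hypothetical; the page does not document the real manifest format.

```python
# Hypothetical manifest for one generated photo.
manifest = {
    "image": "photo_003.jpg",
    "occluded_fields": [
        {"field": "quantity", "cause": "stain",
         "detail": "coffee stain over the quantity column"},
        {"field": "date", "cause": "crop",
         "detail": "top edge cut off"},
    ],
}

def expected_readable(schema_fields, manifest):
    """Return the schema fields a model should still be able to read,
    i.e. those not listed as occluded in the manifest."""
    hidden = {entry["field"] for entry in manifest["occluded_fields"]}
    return [field for field in schema_fields if field not in hidden]

fields = ["oc_number", "item", "quantity", "date"]
readable = expected_readable(fields, manifest)
```

With this manifest, a benchmark would expect `oc_number` and `item` to be extracted and would treat a miss on `quantity` or `date` as expected rather than as a model failure.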
Output dataset
Clean PDF + N verified photos + ground truth JSON + occlusion manifest per image. Ready for training, benchmarking, or feeding into your agentic pipeline.
Same document. Different nightmares.
Every photo generated from the same clean PDF. Each preset targets a different real-world failure mode.






Automating receipt with only purchase orders
A real scenario where penquify was built out of necessity. We had POs in the ERP. We had zero dispatch guides. We needed hundreds of realistic test photos.
The problem
- Extract data from photos of dispatch guides taken by warehouse workers
- Match items against purchase orders in the ERP
- Handle name mismatches: supplier says "PAPA PREFRITA CONGELADA", ERP says "PAPAS FRITAS 10MM"
- Handle unit mismatches: guide says 12 CJ, PO says 150 KG, no weight-per-case given
- Handle quantity discrepancies at two levels: guide vs PO, then physical count vs guide
- Create ERP material documents with the correct batch, matching PO position
We had POs. We had zero dispatch guides. We needed hundreds of realistic test photos to validate the pipeline end-to-end.
What penquify did
The kind of mismatches real documents have
Penquify generates these programmatically. The ground truth knows the correct mapping.
| Dispatch Guide (supplier) | Purchase Order (ERP) | Challenge |
|---|---|---|
| PAPA PREFRITA CONGELADA CORTE GRUESO 12 CJ | PAPAS FRITAS 10MM 150 KG | Different name + different unit. Guide doesn't say weight per case. Agent must ask. |
| MOZZARELLA RALLADA PREMIUM 115 KG | QUESO MOZZARELLA RALLADO 120 KG | Different name + 5 KG short. Agent must flag discrepancy. |
| JENGIBRE FRESCO PELADO 2 UN | JENGIBRE 0.5 KG | Unit mismatch (UN vs KG). No weight per unit on guide. Agent must ask. |
| LIMON SUTIL FRESCO 24 KG | LIMON SUTIL 25 L | KG vs L + 1 unit short. Two problems at once. |
| MENTA FRESCA ATADO 10 UN | MENTA FRESCA 2 KG | Atados vs KG. No weight per atado. Agent must ask. |
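The decisions in the table (ask, flag, accept) can be sketched as a small classification step. This is an illustration of the reasoning an agent under test has to perform, not penquify code; the `Line` type, the `reconcile` function, and the `per_unit` conversion factor are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Line:
    name: str
    quantity: float
    unit: str

def reconcile(guide: Line, po: Line, per_unit: Optional[float] = None) -> str:
    """Classify a dispatch-guide line against its PO line.

    per_unit is the number of PO units in one guide unit (for example
    KG per case), if someone has supplied it.
    """
    if guide.unit != po.unit:
        if per_unit is None:
            return "ask"  # units differ and the guide gives no conversion
        guide_qty = guide.quantity * per_unit
    else:
        guide_qty = guide.quantity
    if guide_qty != po.quantity:
        return f"discrepancy {guide_qty - po.quantity:+g} {po.unit}"
    return "ok"

papa = reconcile(Line("PAPA PREFRITA CONGELADA CORTE GRUESO", 12, "CJ"),
                 Line("PAPAS FRITAS 10MM", 150, "KG"))
mozzarella = reconcile(Line("MOZZARELLA RALLADA PREMIUM", 115, "KG"),
                       Line("QUESO MOZZARELLA RALLADO", 120, "KG"))
```

For the first row the result is `"ask"`, and for the mozzarella row it is `"discrepancy -5 KG"`. The exact equality check is deliberately naive; a real pipeline would likely apply a tolerance and would also need name matching, which this sketch leaves out.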
Everything is configurable
Camera model is free text. So is the angle. And the stain type. And the background. Every field is a knob.
22 camera presets + free text
Samsung Galaxy S7 through iPhone 14, budget Androids, rugged field devices. Or write anything: "Nokia 3310 with cracked screen"
Paper physics
Curvature, folds (dog-ear, middle vertical, multiple), wrinkles, corner bends, edge curl. Paper behaves like paper.
Damage & contamination
Coffee stains, water damage, grease marks, ink smudges, torn edges, dirt. Each with type, location, opacity, and text obstruction level.
Geometric distortion
Perspective skew, keystone distortion, rotation (0-15 deg), oblique angles up to 45 deg. Simulate every handheld capture angle.
Lighting & artifacts
Glare hotspots, uneven lighting, shadow bands, finger shadows, JPEG compression (none to heavy), motion blur with direction control.
Natural language config
"Blurry photo with coffee stain, strong angle, old Samsung, paper folded in half" → AI converts your description to variation JSON.
Ground truth verification
Every generated image is read back and verified against the source schema. Garbled fields get flagged and regenerated. You get verified data.
Occlusion tracking
When a crop, stain, or fold hides data, the manifest says exactly which fields are affected and why. Your benchmark knows what should fail.
Hand & context
Hand visible or not, grip type (thumb corner, pinched edge, both hands), gloves (warehouse, latex, none). Background as blurred context.
Every entry point
Start from wherever you are in your pipeline.
JSON Schema → Photos
Send document data as JSON. Get back PDF + verified photos with ground truth.
Upload PDF → Variations
Upload any PDF or image. We detect the schema, then generate N realistic variations with known ground truth.
Seed Image → Style Transfer
Provide a reference photo (lighting, background, camera style). New documents match that visual style.
N Docs × M Variations
Generate thousands of document-variation combinations. Progress tracking, S3 upload, parallel generation.
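The N × M expansion is a Cartesian product of documents and variations. A minimal sketch of how such a job list could be planned; `plan_jobs` and the identifiers are hypothetical, and progress tracking, S3 upload, and parallel execution are out of scope here.

```python
from itertools import product

def plan_jobs(doc_ids, variation_names):
    """Pair every document with every variation and number the jobs."""
    return [
        {"job": index, "doc": doc, "variation": variation}
        for index, (doc, variation) in enumerate(product(doc_ids, variation_names))
    ]

jobs = plan_jobs(
    ["doc_001", "doc_002", "doc_003"],
    ["blurry", "folded", "stained", "cropped"],
)
```

Three documents and four variations yield twelve jobs, ordered document by document, which makes it easy to resume a batch from a given job index.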
Training Pairs
Output image + ground truth pairs in formats ready for fine-tuning vision models: JSONL, COCO, or custom schema.
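As an illustration of the JSONL case, each line is one JSON object pairing an image with its ground truth. The record keys (`image`, `ground_truth`), file names, and field values below are assumptions for the sake of the example, not the tool's documented export format.

```python
import io
import json

def write_jsonl(pairs, stream):
    """Write one JSON object per line: an image path plus its ground truth."""
    for image_path, ground_truth in pairs:
        record = {"image": image_path, "ground_truth": ground_truth}
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")

# Made-up pairs for illustration.
pairs = [
    ("photos/doc_001_blurry.jpg", {"item": "PAPAS FRITAS 10MM", "quantity": 12, "unit": "CJ"}),
    ("photos/doc_001_stained.jpg", {"item": "PAPAS FRITAS 10MM", "quantity": 12, "unit": "CJ"}),
]

buffer = io.StringIO()  # stands in for an open file
write_jsonl(pairs, buffer)
lines = buffer.getvalue().splitlines()
```

Writing to any text stream keeps the helper usable with a file opened via `open(path, "w", encoding="utf-8")` as well as with an in-memory buffer.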
E2E Agent Tests
Generate documents with known data, feed them to your agent, verify extractions + downstream lookups match ground truth.
Built for real problems
Not another demo tool. Built because we needed it.
Generate 10,000 training pairs
N documents × M variations with ground truth labels. Export as JSONL for fine-tuning. Systematically vary difficulty: clean → blurry → stained → cropped.
Measure extraction accuracy
Run your OCR/vision model on a controlled dataset. Know exactly which fields should be extractable vs occluded. Compare models objectively.
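Knowing which fields are occluded lets a benchmark score a model only on what it could reasonably read. A rough sketch of that scoring, assuming flat dictionaries for ground truth and predictions and a set of occluded field names; the `score` function and the sample values are hypothetical.

```python
def score(ground_truth, predicted, occluded):
    """Accuracy over fields that should be readable; occluded fields are
    counted separately instead of being scored as errors."""
    visible = [key for key in ground_truth if key not in occluded]
    correct = sum(1 for key in visible if predicted.get(key) == ground_truth[key])
    return {
        "accuracy": correct / len(visible) if visible else None,
        "visible": len(visible),
        "occluded": len(ground_truth) - len(visible),
    }

# Made-up example: "unit" is hidden by a stain, and the model misreads
# one digit of the order number.
ground_truth = {"oc_number": "4500012345", "item": "JENGIBRE", "quantity": 0.5, "unit": "KG"}
predicted = {"oc_number": "4500012845", "item": "JENGIBRE", "quantity": 0.5}
result = score(ground_truth, predicted, occluded={"unit"})
```

In this example the model gets two of three visible fields right, and the missing `unit` does not count against it. Running the same function over two models on the same dataset gives a like-for-like comparison.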
End-to-end agent validation
Generate documents with known data. Feed photos to your agent. Verify it extracts correctly AND does the right downstream actions.
Not enough real documents
You have 12 real invoices but need 1,200 test cases. Generate synthetic variations that cover every edge case your production pipeline will see.
Every interface
Pricing that scales with you
Self-host free forever. Cloud for convenience. Enterprise for compliance.
Open Source
Self-hosted. Bring your own Gemini key.
- All document templates
- All photo presets + custom
- CLI + Python library
- REST API (self-hosted)
- MCP server + Agent SDK
- Ground truth verification
- Community support
Pro
500 credits/mo. UI dashboard. No API key needed.
- Everything in Open Source
- 500 image credits/month
- Web UI with live preview
- Hosted API (no infra needed)
- Dataset history + downloads
- Priority generation queue
- Team credit pool (shared)
- Email support
Enterprise
Volume credits, SLA, VPC deployment, SSO.
- Everything in Pro
- Volume credit pricing
- Custom document templates
- VPC / on-premise deployment
- SSO (SAML/OIDC) + RBAC
- SOC 2 Type II compliance
- Zero data retention policy
- Dedicated support + SLA
Need more credits?
Buy additional credit packs anytime. Credits never expire and stack with your subscription.
Ready to build better vision AI?
Start generating verified document datasets in 30 seconds. Open source, or let us handle the infra.