Turing Verify: A Multi-Model Forensic Engine for Scalable Document Verification
Architecture, Methodology, and Calibration of an AI-Driven Platform Defending Against 36 Attack Vectors Across 25+ Countries — Featuring Deep Verification, KYB/KYC Screening, and AI vs Human Benchmarking
Turing Space Inc. -- April 2026
Document Classification: Public Technical White Paper
Version: 2.0
Date: April 3, 2026
Abstract
Document fraud represents a persistent and escalating threat to institutions that rely on credential verification for admissions, hiring, licensing, and regulatory compliance. The proliferation of generative AI tools has significantly lowered the barrier to producing convincing forgeries, rendering traditional manual and OCR-based verification approaches increasingly inadequate. This white paper presents Turing Verify, a forensic document verification platform that employs a multi-model AI architecture to systematically evaluate documents against 81 total checkpoints comprising 45 forensic rules and 36 attack vector defenses. The system integrates 56 verification portal connections across 25+ countries, supports 9 document categories with 15+ subtypes, and implements a 7-strategy QR code scanning pipeline for extracting and cross-referencing embedded verification data. A tiered scoring model produces structured verdicts supported by 12-section forensic PDF reports. The platform processes documents through a multi-stage pipeline: rapid pre-screening via Claude Haiku 4.5, QR code extraction and portal verification, standard forensic analysis via Claude Sonnet 4 or GPT-5.3, and an optional Deep Verification mode powered by Claude Sonnet 4 that expands analysis to 13 forensic stages with 8 confidence dimensions, KYB (Know Your Business) issuer due diligence, and KYC (Know Your Customer) holder background screening. Country-specific validation modules enforce national ID checksum algorithms for 6 countries and business registration format validation for 9 countries, while MRZ parsing conforms to ICAO 9303 standards across TD1, TD2, and TD3 formats. A credit-based pricing system supports both standard (1 credit) and deep (5 credit) verifications, with prompt caching achieving up to 90% input cost reduction on Anthropic models. 
Calibration against 15 ground-truth cases, continuous feedback integration, and a 200-document AI vs Human Inspector benchmark framework ensure that detection accuracy tracks the evolving forgery landscape. This paper details the system architecture, forensic methodology, scoring model, calibration framework, benchmark system, and comparative performance characteristics of the Turing Verify platform.
1. Introduction: The Scale and Cost of Document Fraud
1.1 The Global Fraud Landscape
Document fraud is not a new phenomenon, but its scale, sophistication, and economic impact have accelerated dramatically in the past decade. Institutions across education, immigration, financial services, and corporate hiring depend on documentary evidence to establish identity, qualifications, and regulatory compliance. Each of these dependencies represents a potential attack surface for fraudulent actors.
Conservative estimates from international regulatory bodies place annual losses from document fraud in the tens of billions of dollars globally. The costs extend beyond direct financial losses to include reputational damage to institutions that unwittingly accept fraudulent credentials, erosion of public trust in credentialing systems, and downstream consequences when unqualified individuals occupy positions of responsibility.
The problem is compounded by the internationalization of credential flows. A single university admissions office may receive transcripts from institutions in 30 or more countries, each with distinct formatting conventions, grading systems, security features, and verification mechanisms. Corporate HR departments face similar challenges when evaluating professional certifications, government-issued IDs, and business registrations from diverse jurisdictions.
┌──────────────────────────────────────────────────────────────────────┐
│ THE DOCUMENT TRUST CRISIS │
│ Escalation Timeline │
│ │
│ 2015 2018 2020 2023 2026 │
│ │ │ │ │ │ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │
│ │Basic│ │Photo│ │Temp-│ │Gen │ │Multi│ │
│ │Scan │ │Edit │ │late │ │AI │ │Modal│ │
│ │Forge│ │Tools│ │Forge│ │Forge│ │Forge│ │
│ │ries │ │ │ │ries │ │ries │ │ries │ │
│ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ │
│ │ │ │ │ │ │
│ ▼ ▼ ▼ ▼ ▼ │
│ Low Medium High Very High Critical │
│ Effort Effort Effort Effort Effort │
│ Low Medium High Very High Extreme │
│ Quality Quality Quality Quality Quality │
│ │
│ Detection ──────────────────────────────────────────────▶ │
│ Gap: Widening as attack sophistication outpaces defense │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Cost of Undetected Fraud: │ │
│ │ - Admissions fraud: $50K-$300K per unqualified grad │ │
│ │ - Hiring fraud: $15K-$240K per bad hire │ │
│ │ - Regulatory fraud: $100K-$10M+ in penalties │ │
│ │ - Reputational damage: Unquantifiable │ │
│ └───────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘

1.2 The Generative AI Accelerant
The emergence of commercially available generative AI tools has fundamentally altered the forgery landscape. Prior to 2022, producing a convincing document forgery required either specialized graphic design skills or access to expensive equipment. Template-based forgeries were detectable through layout analysis; photographic alterations left artifacts visible under magnification; and creating plausible institutional formatting demanded intimate knowledge of the target institution.
Generative AI has compressed the skill and cost requirements for each of these attack vectors. Text generation models can produce plausible institutional language, grading narratives, and administrative notations. Image generation and editing models can synthesize realistic seals, signatures, and watermarks. Multi-modal models can generate entire document images that approximate legitimate templates with troubling fidelity.
The democratization of forgery tools has produced a corresponding increase in the volume and variety of fraudulent documents entering verification pipelines. Institutions report significant increases in submissions flagged for review, while acknowledging that many sophisticated forgeries likely pass undetected through manual review processes.
1.3 The Verification Gap
A verification gap exists between the sophistication of modern forgeries and the capabilities of prevailing verification methods. This gap manifests across several dimensions:
Volume versus throughput. Institutions processing thousands of credential submissions per cycle cannot allocate sufficient human review time to scrutinize each document at the level required to detect sophisticated forgeries.
Knowledge versus diversity. No individual reviewer possesses expertise across all document types, issuing countries, and institutional formats. A reviewer expert in North American academic transcripts may lack familiarity with the security features of Southeast Asian government IDs.
Consistency versus fatigue. Human reviewers exhibit declining accuracy over extended review sessions. Subtle anomalies that would be detected in the first hour of review may pass unnoticed in the fourth.
Speed versus depth. Organizations face pressure to process verifications quickly to meet enrollment deadlines, hiring timelines, or regulatory filing dates. This pressure incentivizes superficial review.
┌──────────────────────────────────────────────────────────┐
│ THE VERIFICATION GAP │
│ │
│ Attack Sophistication Detection Capability │
│ ▲ ▲ │
│ │ ╱╱╱╱╱╱ │ ····· │
│ │ ╱╱╱╱╱╱ │ ····· │
│ │ ╱╱╱╱╱╱ GAP │ ····· │
│ │ ╱╱╱╱╱╱ ◄────────► │ ····· │
│ │╱╱╱╱╱╱ │ ····· │
│ ╱╱╱╱╱╱ │····· │
│ ╱╱╱╱╱ ····· │
│ ╱╱╱╱ ····· │
│ ──────────────────► ─────────────────────► │
│ Time Time │
│ │
│ ╱╱╱ = Forgery capability ····· = Manual review │
│ capability │
│ │
│ Turing Verify objective: close this gap through │
│ automated multi-model forensic analysis │
└──────────────────────────────────────────────────────────┘

1.4 Design Objectives
Turing Verify was designed to address the verification gap through the following objectives:
- Comprehensive coverage: Evaluate documents against a taxonomy of forensic checkpoints that spans structural, semantic, external, and metadata dimensions.
- Attack-vector awareness: Explicitly model and defend against known forgery techniques rather than relying solely on anomaly detection.
- International scope: Support documents from 25+ countries with country-specific validation logic where applicable.
- Auditability: Produce detailed forensic reports that document the reasoning behind each verdict, enabling human review and institutional decision-making.
- Scalability: Process documents at throughput rates compatible with institutional batch workflows while maintaining forensic depth.
- Adaptability: Incorporate a calibration framework that enables continuous refinement as new forgery techniques emerge.
The remainder of this paper details how these objectives are realized in the system architecture, forensic methodology, and operational design of the Turing Verify platform.
2. Background and Related Work
2.1 Evolution of Document Verification Approaches
Document verification has evolved through several generations of technology, each addressing some limitations of its predecessors while introducing new constraints.
First Generation: Manual Expert Review. The earliest systematic approach to document verification relied on trained human reviewers examining documents for signs of tampering, inconsistent formatting, or implausible content. This approach offered high accuracy when reviewers possessed domain expertise relevant to the specific document type and issuing jurisdiction. However, manual review scales poorly, exhibits variability across reviewers and over time, and becomes impractical when documents originate from dozens of countries with distinct formatting conventions.
Second Generation: Outsourced Verification Services. To address scalability, many institutions outsourced verification to specialized firms that maintained databases of institutional contacts and employed staff to contact issuing organizations directly. While this approach provides high confidence when successful, it is slow (often requiring weeks), expensive per verification, and dependent on the responsiveness and existence of issuing institutions. For defunct institutions or those in jurisdictions with limited administrative infrastructure, outsourced verification frequently fails to produce definitive results.
Third Generation: OCR and Template Matching. Optical character recognition combined with template databases offered the first automated approach to verification. Systems would extract text from document images and compare structural elements against known templates. While faster than manual approaches, OCR-based systems struggle with documents that deviate from stored templates, produce high false-positive rates for legitimate but unusual formatting, and are vulnerable to forgeries that accurately replicate template structures while altering content.
Fourth Generation: Blockchain and Digital Credential Initiatives. Distributed ledger technology has been proposed as a foundational solution to credential verification by enabling issuing institutions to publish cryptographically signed credentials that recipients can share with verifiers. While architecturally sound, blockchain-based approaches require universal adoption by issuing institutions, a prerequisite that remains far from realized. The vast majority of credentials in circulation were issued on paper or as conventional digital documents without blockchain anchoring, leaving an enormous legacy verification problem that blockchain solutions do not address.
┌───────────────────────────────────────────────────────────────────┐
│ COMPARISON OF VERIFICATION APPROACHES │
├──────────────┬──────────┬───────────┬───────────┬────────────────┤
│ │ Manual │Outsourced │ OCR / │ Blockchain │
│ Dimension │ Review │ Services │ Template │ Credentials │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Speed │ Minutes │ Days to │ Seconds │ Seconds │
│ │ to hours │ weeks │ │ (if anchored) │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Accuracy │ High │ Very high │ Low to │ Very high │
│ │ (varies) │ (if avail)│ moderate │ (if adopted) │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Scalability │ Poor │ Poor │ Good │ Good │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Coverage │ Limited │ Moderate │ Limited │ Minimal │
│ │ by expert│ by network│ by DB │ (adoption gap) │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Forgery │ Expert- │ N/A (src │ Template- │ Crypto- │
│ Detection │ dependent│ verified) │ dependent │ guaranteed │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Legacy Doc │ Yes │ Yes │ Partial │ No │
│ Support │ │ │ │ │
├──────────────┼──────────┼───────────┼───────────┼────────────────┤
│ Cost per │ $15-$50 │ $30-$200 │ $0.10-$2 │ $0.01-$0.50 │
│ Verification │ │ │ │ (infra cost) │
└──────────────┴──────────┴───────────┴───────────┴────────────────┘

2.2 The Multi-Model AI Approach
Turing Verify represents a fifth-generation approach that combines the strengths of prior methods while mitigating their limitations. By employing multiple large language models with vision capabilities, the system can analyze document images with a sophistication that approximates expert human review while maintaining the speed and consistency of automated systems. Integration with external verification portals provides the source-verification capability of outsourced services without the latency and cost overhead.
The use of multiple AI models serves several purposes beyond redundancy. Different models exhibit complementary strengths: some excel at rapid pattern recognition suitable for pre-screening, while others demonstrate superior reasoning capabilities for complex forensic analysis. By routing documents through a tiered pipeline that matches model capabilities to task complexity, the system optimizes both accuracy and cost efficiency.
2.3 Positioning Within the Verification Ecosystem
Turing Verify is not intended to replace all forms of verification. For documents anchored to blockchain credential systems, cryptographic verification remains the gold standard. For cases requiring legal-grade authentication, human expert review with physical document examination remains necessary. Turing Verify addresses the large and growing middle ground: high-volume verification workflows where documents arrive as digital images, originate from diverse jurisdictions, and require systematic forensic analysis at speeds compatible with institutional timelines.
3. System Architecture
3.1 Architectural Overview
The Turing Verify platform follows a layered architecture that separates concerns across client presentation, API management, processing orchestration, AI inference, external integration, and data persistence. Each layer communicates through well-defined interfaces, enabling independent scaling and evolution.
┌───────────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ │
│ Next.js 16 (Turbopack Build) │
│ ┌───────────┬───────────┬───────────┬───────────┬───────────┐ │
│ │ EN │ ZH-TW │ JA │ FR │ ES │ │
│ │ English │ Trad. │ Japanese │ French │ Spanish │ │
│ │ │ Chinese │ │ │ │ │
│ └───────────┴───────────┴───────────┴───────────┴───────────┘ │
│ Server-Sent Events (SSE) for real-time progress streaming │
│ Responsive UI · Document Upload · Batch Processing │
│ Report Viewer · Applicant Folder Management │
├───────────────────────────────────────────────────────────────────────┤
│ API GATEWAY LAYER │
│ │
│ FastAPI (Python 3.12+, async/await) │
│ ┌──────────────┬──────────────┬──────────────┬─────────────┐ │
│ │ JWT Auth │ Rate Limit │ CORS Policy │ Request │ │
│ │ (4 providers)│ (10 req/s) │ │ Validation │ │
│ └──────────────┴──────────────┴──────────────┴─────────────┘ │
│ 10 Rate-Limited Endpoints · OpenAPI Schema · Error Handling │
├───────────────────────────────────────────────────────────────────────┤
│ PROCESSING PIPELINE │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Image Normalization · Format Detection · Resize │ │
│ │ EXIF Extraction · DPI Standardization │ │
│ └─────────────────────────┬────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────▼────────────────────────────────┐ │
│ │ QR Code Scanner (7 Strategies) │ │
│ │ OpenCV + pyzbar · CLAHE Enhancement · Multi-thresh │ │
│ └─────────────────────────┬────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────▼────────────────────────────────┐ │
│ │ Pre-Screen Filter (Claude Haiku 4.5) │ │
│ │ Vendor watermark detection · Blatant fake rejection │ │
│ └─────────────────────────┬────────────────────────────────┘ │
│ │ │
├─────────────────────────────▼─────────────────────────────────────────┤
│ AI MODEL LAYER │
│ │
│ ┌─────────────────┬─────────────────┬─────────────────────┐ │
│ │ Claude Sonnet 4 │ Claude Haiku │ GPT-5.3 │ │
│ │ │ 4.5 │ │ │
│ │ Primary │ Pre-screen │ Free-tier │ │
│ │ Forensic │ Rapid │ Fallback │ │
│ │ Analysis │ Assessment │ Analysis │ │
│ │ │ │ │ │
│ │ 45 U-Rules │ Quick reject │ 45 U-Rules │ │
│ │ 36 AV Defenses │ Confidence │ 36 AV Defenses │ │
│ │ Full scoring │ Gate │ Full scoring │ │
│ └─────────────────┴─────────────────┴─────────────────────┘ │
├───────────────────────────────────────────────────────────────────────┤
│ VERIFICATION ENGINE │
│ │
│ ┌──────────────┬──────────────┬──────────────┬─────────────┐ │
│ │ 45 Forensic │ 36 Attack │ Two-Tier │ Verdict │ │
│ │ U-Rules │ Vector │ Scoring │ Logic │ │
│ │ (U-01..U-45) │ Defenses │ (T1 + T2) │ │ │
│ │ │ (AV-01..36) │ │ │ │
│ └──────────────┴──────────────┴──────────────┴─────────────┘ │
│ 81 Total Checkpoints · Category-Weighted Scoring │
├───────────────────────────────────────────────────────────────────────┤
│ EXTERNAL INTEGRATIONS │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ 56 Verification Portals · 25+ Countries │ │
│ │ 15 Trusted QR Domains · Portal Data Extraction │ │
│ │ API-based · Web-based · QR-redirect Portals │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Country Validators │ │
│ │ 6 ID Checksum Countries · 9 Reg Format Countries │ │
│ │ 3 MRZ Standards (TD1/TD2/TD3) │ │
│ └──────────────────────────────────────────────────────────┘ │
├───────────────────────────────────────────────────────────────────────┤
│ PERSISTENCE LAYER │
│ │
│ ┌──────────────┬──────────────┬──────────────┬─────────────┐ │
│ │ SQLite │ Report │ PDF Export │ Audit │ │
│ │ (aiosqlite) │ Generation │ (12-section) │ Logging │ │
│ │ │ Engine │ │ │ │
│ └──────────────┴──────────────┴──────────────┴─────────────┘ │
│ Async I/O · Document Metadata · Verification History │
└───────────────────────────────────────────────────────────────────────┘

3.2 Request Lifecycle
A document verification request traverses the following stages from submission to verdict delivery:
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Client │ │ API │ │ Image │ │ QR │
│ Upload │────▶│ Gateway │────▶│ Norm. │────▶│ Scan │
│ │ │ + Auth │ │ │ │(7 strat) │
└──────────┘ └──────────┘ └──────────┘ └────┬─────┘
│
┌──────────────────────────────┘
│
▼
┌─────────────┐ ┌──────────────┐
│ Pre-Screen │ │ Portal │
│ (Haiku 4.5) │ │ Lookup │
└──────┬──────┘ │ (if QR found)│
│ └──────┬───────┘
┌──────────┴──────────┐ │
│ │ │
FAIL │ PASS │ │
▼ ▼ ▼
┌────────────┐ ┌──────────────────────┐
│ Quick │ │ Check Scan Mode │
│ Reject │ └──────────┬───────────┘
│ (Verdict: │ ┌─────┴──────┐
│ FAKE) │ STANDARD DEEP
└──────┬─────┘ │ │
│ ▼ ▼
│ ┌──────────────┐ ┌────────────────┐
│ │ Standard │ │ Deep Forensic │
│ │ Forensic │ │ (Sonnet 4) │
│ │ (Sonnet 4 / │ │ 13 stages │
│ │ GPT-5.3) │ │ +KYB +KYC │
│ │ 8 stages │ │ 5 credits │
│ │ + Portal │ │ + Portal Data │
│ └──────┬──────┘ └───────┬────────┘
│ │ │
▼ ▼ ▼
┌────────────────────────────────────────┐
│ Scoring Engine │
│ T1 (Structural) + T2 (Semantic) │
│ → Verdict: PASS / NEEDS_REVIEW / FAKE │
└────────────────────┬───────────────────┘
│
▼
┌────────────────────────────────────────┐
│ Report Generation │
│ 12-Section Forensic PDF │
│ + SSE Progress Streaming │
└────────────────────┬───────────────────┘
│
▼
┌────────────────────────────────────────┐
│ Client Delivery │
│ Verdict + Report Download │
│ + Persistence to SQLite │
└────────────────────────────────────────┘
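The lifecycle above can be condensed into a short control-flow sketch. All helpers below are stubs standing in for the real pipeline stages, and the function names, return shapes, and score thresholds are illustrative assumptions rather than the platform's actual interfaces:

```python
# Minimal sketch of the Section 3.2 request lifecycle. Every helper is a
# stub; names and thresholds are illustrative, not the real implementation.

def scan_qr(image: bytes):                    # 7-strategy QR scan (stubbed)
    return None

def lookup_portal(qr_payload):                # portal lookup, only if QR found
    return {} if qr_payload else None

def prescreen_passes(image: bytes) -> bool:   # Haiku 4.5 pre-screen (stubbed)
    return b"SAMPLE" not in image

def run_forensics(image: bytes, portal, stages: int) -> dict:
    # Stand-in for the 8-stage (standard) or 13-stage (deep) analysis.
    return {"t1_structural": 0.9, "t2_semantic": 0.85}

def score(findings: dict) -> str:
    # Illustrative thresholds only; the real tiered model is more involved.
    combined = min(findings["t1_structural"], findings["t2_semantic"])
    if combined >= 0.8:
        return "PASS"
    return "NEEDS_REVIEW" if combined >= 0.5 else "FAKE"

def verify_document(image: bytes, scan_mode: str = "STANDARD") -> dict:
    portal = lookup_portal(scan_qr(image))
    if not prescreen_passes(image):
        return {"verdict": "FAKE", "reason": "quick_reject"}
    stages = 13 if scan_mode == "DEEP" else 8
    findings = run_forensics(image, portal, stages)
    return {"verdict": score(findings), "stages": stages}
```

The real pipeline additionally streams per-stage progress to the client and persists the result; those concerns are omitted here.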

3.3 Multi-Model Routing
The platform employs four AI models, each selected for specific characteristics that match different stages and tiers of the verification pipeline:
┌──────────────────────────────────────────────────────────────────┐
│ MULTI-MODEL ROUTING LOGIC (v2.0) │
│ │
│ Document Arrives │
│ │ │
│ ▼ │
│ ┌──────────────────────┐ │
│ │ Haiku 4.5 Pre-Screen │ ◄── All verifications │
│ │ (~$0.001 per doc) │ Catches ~40% of obvious fakes │
│ └──────┬───────────────┘ │
│ │ │
│ FAIL │ PASS │
│ (Quick │ │
│ Reject) ▼ │
│ ┌─────────────────┐ │
│ │ Check User Tier │ │
│ │ + Scan Mode │ │
│ └──────┬──────────┘ │
│ │ │
│ ┌───────────┼───────────┐ │
│ │ │ │ │
│ FREE PREMIUM PREMIUM │
│ STANDARD STANDARD DEEP MODE │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌──────────┐ │
│ │ GPT-5.3 │ │ Claude │ │ Claude │ │
│ │ │ │ Sonnet 4│ │ Sonnet 4 │ │
│ │ 8 stages│ │ 8 stages│ │ 13 stages│ │
│ │ 1 credit│ │ 1 credit│ │ 5 credits│ │
│ │ ~$0.04 │ │ ~$0.04 │ │~$0.06-08 │ │
│ └─────────┘ └─────────┘ └──────────┘ │
│ │
│ Model Selection Criteria: │
│ ┌───────────────┬──────────┬──────────┬──────────┬────────────┐ │
│ │ Model │ Role │ Latency │ Cost/Doc │ Credits │ │
│ ├───────────────┼──────────┼──────────┼──────────┼────────────┤ │
│ │ Claude Haiku │ Pre- │ ~2s │ ~$0.001 │ (included) │ │
│ │ 4.5 │ screen │ │ │ │ │
│ ├───────────────┼──────────┼──────────┼──────────┼────────────┤ │
│ │ Claude Sonnet │ Standard │ ~15-30s │ ~$0.04 │ 1 │ │
│ │ 4 │ forensic │ │ │ │ │
│ ├───────────────┼──────────┼──────────┼──────────┼────────────┤ │
│ │ GPT-5.3 │ Free │ ~20-40s │ ~$0.04 │ 1 │ │
│ │ │ tier │ │ │ │ │
│ ├───────────────┼──────────┼──────────┼──────────┼────────────┤ │
│ │ Claude Sonnet 4 │ Deep │ ~20-40s │~$0.06-08 │ 5 │ │
│ │ │ forensic │ │ │ │ │
│ └───────────────┴──────────┴──────────┴──────────┴────────────┘ │
│ │
│ Cost Optimization: │
│ - Prompt caching: 90% input cost reduction (Anthropic models) │
│ - Prompt caching: 50% input cost reduction (OpenAI models) │
│ - Haiku pre-screen filters ~40% of obvious fakes at ~$0.001 │
└──────────────────────────────────────────────────────────────────┘

Claude Haiku 4.5 serves as the universal pre-screening model applied to all verifications regardless of tier. Its low latency and cost profile make it suitable for rapid assessment of obvious indicators: vendor watermarks from known forgery mills, placeholder text, phantom institution names, and gross formatting anomalies. Documents flagged by the pre-screener receive an immediate rejection verdict without consuming the resources of a full forensic analysis. Approximately 40% of obvious fakes are caught at this stage at a cost of roughly $0.001 per document.
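The kind of obvious indicator the pre-screener targets can be illustrated with a simple text-marker check. The real pre-screen is a Haiku 4.5 vision call; the marker list below is a hypothetical sample, not the platform's detection set:

```python
# Illustrative pre-screen heuristics. The indicator strings are a
# hypothetical sample of blatant-fake markers (placeholder text,
# vendor watermarks, template defaults), not the production list.

OBVIOUS_INDICATORS = (
    "lorem ipsum",       # placeholder text
    "sample",            # vendor watermark text
    "your name here",    # template default values
    "buydiploma",        # forgery-mill vendor URL fragment (hypothetical)
)

def quick_reject(extracted_text: str) -> bool:
    """Return True when the document text contains a blatant-fake marker."""
    text = extracted_text.lower()
    return any(marker in text for marker in OBVIOUS_INDICATORS)
```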
Claude Sonnet 4 serves as the primary forensic analysis engine for premium-tier standard verifications. Its advanced reasoning capabilities enable nuanced evaluation of layout consistency, semantic plausibility, cross-referencing against portal data, and detection of sophisticated attack vectors that require contextual understanding. Standard verification processes 8 analysis stages.
GPT-5.3 provides full forensic analysis for free-tier users. It applies the same 45 U-rules and 36 AV defenses across 8 analysis stages; routing free-tier traffic to this model allows the platform to manage the cost of its free offering.
Claude Sonnet 4 powers the Deep Verification mode, a premium forensic investigation tier introduced in v2.0. In this mode, Sonnet 4 processes documents through 13 analysis stages (compared to 8 for standard), with a maximum output of 16,000 tokens to support comprehensive forensic reports. Deep Verification adds specialized stages for font forensics, seal and watermark spectral analysis, institutional deep cross-referencing, KYB issuer due diligence, and KYC holder background screening. Each deep verification consumes 5 credits.
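The routing rules described above reduce to a small decision function. The model roles and credit counts follow the routing table; the function signature and identifier strings are illustrative assumptions:

```python
# Sketch of the Section 3.3 routing logic: (user tier, scan mode) ->
# (model, credits). Values follow the routing table in the text; the
# identifier strings themselves are illustrative.

def route(tier: str, scan_mode: str) -> tuple[str, int]:
    if scan_mode == "DEEP":              # premium Deep Verification, 13 stages
        return ("claude-sonnet-4", 5)
    if tier == "FREE":                   # free-tier standard, 8 stages
        return ("gpt-5.3", 1)
    return ("claude-sonnet-4", 1)        # premium standard, 8 stages
```

In all cases the Haiku 4.5 pre-screen runs first and is included in the credit cost.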
3.4 Technology Stack
The client layer is built on Next.js 16 with Turbopack for optimized build performance, supporting five locales (English, Traditional Chinese, Japanese, French, and Spanish). Server-Sent Events provide real-time progress streaming during document analysis, enabling users to observe the verification pipeline as it executes.
The API layer runs on FastAPI with full async/await support via Python 3.12+. Authentication supports four providers, and rate limiting enforces a maximum of 10 requests per second across 10 rate-limited endpoints. Request validation ensures that only properly formatted document images enter the processing pipeline.
The persistence layer uses SQLite accessed through aiosqlite for non-blocking database operations. This choice optimizes for deployment simplicity and single-node performance while maintaining ACID transaction guarantees for verification records.
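The SSE progress streaming mentioned above relies on a simple wire format: an optional `event:` line, one or more `data:` lines, and a terminating blank line. The event and payload names in this sketch are hypothetical; only the framing itself is part of the SSE standard:

```python
# Server-Sent Events framing sketch. The event name and payload fields
# are illustrative assumptions; only the "event:/data:/blank line" wire
# format is prescribed by the SSE specification.
import json

def sse_frame(event: str, payload: dict) -> str:
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"
```

A FastAPI streaming response can yield one such frame as each pipeline stage completes, which is how the client renders live verification progress.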
4. Forensic Methodology
This section details the technical core of the Turing Verify platform: the checkpoint taxonomy, attack vector defense framework, scoring model, and calibration methodology.
4.1 Checkpoint Taxonomy
The 45 forensic rules (designated U-01 through U-45) are organized into four primary categories based on the dimension of document authenticity they evaluate.
81 TOTAL CHECKPOINTS
│
┌───────────────┴───────────────┐
│ │
45 FORENSIC RULES 36 ATTACK VECTOR
(U-01 to U-45) DEFENSES (AV-01 to AV-36)
│
┌─────────┼──────────┬──────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌────────┐┌────────┐┌────────┐┌────────┐
│STRUCT- ││SEMAN- ││EXTERNAL││META- │
│URAL ││TIC ││ ││DATA │
│ ││ ││ ││ │
│12 rules││14 rules││11 rules││8 rules │
│ ││ ││ ││ │
│U-01 ││U-13 ││U-27 ││U-38 │
│ to ││ to ││ to ││ to │
│U-12 ││U-26 ││U-37 ││U-45 │
└────────┘└────────┘└────────┘└────────┘
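The category-weighted scoring referenced in the architecture can be sketched as a simple weighted sum over the four rule families above. The weights and pass-rate inputs below are illustrative assumptions; the platform's actual weighting scheme is not reproduced here:

```python
# Category-weighted aggregation sketch over the four rule families.
# The weights are hypothetical placeholders that sum to 1.0.

CATEGORY_WEIGHTS = {
    "structural": 0.30,   # U-01..U-12
    "semantic":   0.30,   # U-13..U-26
    "external":   0.25,   # U-27..U-37
    "metadata":   0.15,   # U-38..U-45
}

def weighted_score(pass_rates: dict) -> float:
    """Combine per-category pass rates (0.0-1.0) into a single score."""
    return sum(CATEGORY_WEIGHTS[cat] * rate for cat, rate in pass_rates.items())
```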

4.1.1 Structural Rules (U-01 to U-12)
Structural rules evaluate the physical layout, formatting, and visual composition of the document. These rules detect anomalies in the tangible properties of the document image.
| Rule ID | Rule Name | Description |
|---|---|---|
| U-01 | Template Conformance | Evaluates whether the document layout matches known templates for the claimed issuing institution |
| U-02 | Font Consistency | Checks for unauthorized font changes, mixed font families, or anachronistic typefaces |
| U-03 | Alignment Integrity | Verifies that text blocks, borders, and graphical elements maintain consistent alignment |
| U-04 | Seal/Stamp Authenticity | Assesses institutional seals, stamps, and embossing for consistency with known specimens |
| U-05 | Signature Presence | Confirms the presence and plausibility of required signatures |
| U-06 | Paper/Background Uniformity | Evaluates background texture, color consistency, and absence of splicing artifacts |
| U-07 | Print Quality Assessment | Detects anomalies in print resolution, dot patterns, and toner distribution |
| U-08 | Border and Frame Integrity | Verifies decorative borders, frames, and security guilloche patterns |
| U-09 | Logo Fidelity | Compares institutional logos against known versions for proportional and chromatic accuracy |
| U-10 | Watermark Analysis | Evaluates watermark presence, positioning, and transparency characteristics |
| U-11 | Hologram/Security Feature Indicators | Assesses visual indicators of holographic or security printing features |
| U-12 | Image Resolution Consistency | Detects regions of inconsistent resolution that may indicate compositing |
4.1.2 Semantic Rules (U-13 to U-26)
Semantic rules evaluate the content, meaning, and logical coherence of information presented in the document.
| Rule ID | Rule Name | Description |
|---|---|---|
| U-13 | Date Plausibility | Verifies that all dates are logically consistent and chronologically valid |
| U-14 | Grade/Score Validity | Checks that grades, scores, and GPAs fall within valid ranges for the claimed institution |
| U-15 | Course Load Plausibility | Evaluates whether the number and distribution of courses is realistic |
| U-16 | Institutional Language | Assesses whether administrative language matches the claimed institution's conventions |
| U-17 | Credential Designation | Verifies that degree names, certificate titles, and credential designations are valid |
| U-18 | Name Consistency | Checks that the subject's name is consistent across all instances within the document |
| U-19 | Address/Location Validity | Verifies that addresses, cities, and jurisdictions are geographically plausible |
| U-20 | Registration/ID Number Format | Validates that registration and student ID numbers conform to known formats |
| U-21 | Grading System Consistency | Ensures the grading scale used is consistent throughout and matches institutional norms |
| U-22 | Credit Hour Validation | Verifies credit hours or units against institutional standards |
| U-23 | Cumulative Calculation Accuracy | Recalculates GPAs, weighted averages, and totals for mathematical accuracy |
| U-24 | Signatory Title Plausibility | Checks that signatory titles match plausible administrative positions |
| U-25 | Language and Grammar | Evaluates text for grammatical anomalies inconsistent with institutional quality |
| U-26 | Content Completeness | Assesses whether all expected sections and fields are present for the document type |
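Rules such as U-23 are purely arithmetic and can be illustrated directly. This sketch recomputes a credit-weighted GPA and compares it to the value stated on the document; the 4.0 scale, input shape, and tolerance are illustrative assumptions:

```python
# U-23 (Cumulative Calculation Accuracy) sketch: recompute a
# credit-weighted GPA and compare against the stated value.
# The tolerance and (credits, grade_points) input shape are assumptions.

def gpa_matches(courses: list[tuple[float, float]], stated_gpa: float,
                tol: float = 0.005) -> bool:
    """courses is a list of (credit_hours, grade_points) pairs."""
    credits = sum(c for c, _ in courses)
    points = sum(c * g for c, g in courses)
    return credits > 0 and abs(points / credits - stated_gpa) <= tol
```

For example, three courses of (3 cr, 4.0), (3 cr, 3.0), and (4 cr, 3.5) recompute to a 3.5 GPA, so a transcript stating 3.8 would fail this check.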
4.1.3 External Verification Rules (U-27 to U-37)
External rules leverage data obtained from sources outside the document itself, including QR codes, verification portals, and public databases.
| Rule ID | Rule Name | Description |
|---|---|---|
| U-27 | QR Code Presence | Determines whether the document contains an embedded QR code |
| U-28 | QR Data Extraction | Evaluates whether QR code data can be successfully decoded |
| U-29 | QR Domain Trust | Verifies that QR code URLs point to trusted verification domains |
| U-30 | Portal Data Match | Cross-references document content against data retrieved from verification portals |
| U-31 | Portal Existence Verification | Confirms that the claimed verification portal exists and is operational |
| U-32 | National ID Checksum | Validates national ID numbers against country-specific checksum algorithms |
| U-33 | Business Registration Format | Verifies business registration numbers against country-specific format rules |
| U-34 | MRZ Validation | Parses and validates Machine Readable Zone data per ICAO 9303 standards |
| U-35 | Institution Existence | Confirms that the claimed issuing institution exists in reference databases |
| U-36 | Accreditation Status | Verifies the accreditation status of educational institutions where applicable |
| U-37 | Document Number Cross-Reference | Cross-references document serial numbers against portal records |
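For U-34, the check-digit scheme is fixed by ICAO Doc 9303: each MRZ character maps to a value (digits as-is, A=10 through Z=35, filler `<` = 0), values are weighted 7, 3, 1 cyclically, and the sum is taken modulo 10. Only the function name below is ours; the algorithm is the standard's:

```python
# ICAO 9303 MRZ check-digit computation (used by U-34). Digits keep
# their value, letters map A=10..Z=35, the filler '<' maps to 0; the
# weights 7, 3, 1 repeat across the field and the sum is taken mod 10.

def icao_check_digit(field: str) -> int:
    def value(ch: str) -> int:
        if ch.isdigit():
            return int(ch)
        if ch == "<":
            return 0
        return ord(ch) - ord("A") + 10
    weights = (7, 3, 1)
    return sum(value(c) * weights[i % 3] for i, c in enumerate(field)) % 10
```

The ICAO specimen document number `L898902C3` yields check digit 6, matching the published worked example. TD1, TD2, and TD3 layouts differ in line length and field placement, but all use this same check-digit rule.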
4.1.4 Metadata Rules (U-38 to U-45)
Metadata rules evaluate properties of the document image file itself, detecting traces of digital manipulation.
| Rule ID | Rule Name | Description |
|---|---|---|
| U-38 | EXIF Data Analysis | Examines image metadata for creation tools, timestamps, and device information |
| U-39 | Compression Artifact Analysis | Detects inconsistent JPEG compression levels indicating compositing |
| U-40 | Color Space Consistency | Evaluates color profiles and gamut for uniformity across the document |
| U-41 | Resolution Metadata Match | Compares declared resolution against actual pixel density |
| U-42 | Creation Tool Detection | Identifies software signatures in metadata (e.g., Photoshop, GIMP) |
| U-43 | Modification History | Examines metadata for evidence of multiple editing sessions |
| U-44 | Embedded Object Analysis | Detects hidden layers, embedded objects, or XMP data anomalies |
| U-45 | Digital Signature Verification | Validates cryptographic signatures if present in the document |
4.2 Attack Vector Defense
The 36 attack vectors (AV-01 through AV-36) represent specific forgery techniques that the system is designed to detect. These are organized into six families based on the nature of the attack.
36 ATTACK VECTORS
│
┌───────────┬───────────┼───────────┬───────────┬───────────┐
│ │ │ │ │ │
▼ ▼ ▼ ▼ ▼ ▼
┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐
│FABRICA- ││TAMPER- ││IDENTITY ││INSTITU- ││DIGITAL ││EVASION │
│TION ││ING ││FRAUD ││TIONAL ││FORGERY ││ │
│ ││ ││ ││FRAUD ││ ││ │
│AV-01 ││AV-08 ││AV-15 ││AV-21 ││AV-27 ││AV-32 │
│ to ││ to ││ to ││ to ││ to ││ to │
│AV-07 ││AV-14 ││AV-20 ││AV-26 ││AV-31 ││AV-36 │
│ ││ ││ ││ ││ ││ │
│7 vectors││7 vectors││6 vectors││6 vectors││5 vectors││5 vectors│
└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘

Family 1: Fabrication (AV-01 to AV-07)
Fabrication attacks involve the creation of entirely fictitious documents. These range from crude attempts using online template generators to sophisticated AI-generated documents.
| AV ID | Attack Vector | Detection Method |
|---|---|---|
| AV-01 | Vendor watermark forgery | Pre-screen detects watermarks from known forgery mills (e.g., "SAMPLE", vendor URLs) |
| AV-02 | Phantom institution | Institution name checked against reference databases; non-existent institutions flagged |
| AV-03 | Placeholder text | Pre-screen detects lorem ipsum, template placeholder strings, and default values |
| AV-04 | Template generator artifacts | Known template generator layouts and styling patterns identified |
| AV-05 | AI-generated document | Detects statistical patterns characteristic of generative AI output |
| AV-06 | Stock image insertion | Identifies stock photography watermarks, metadata, and known image hashes |
| AV-07 | Blank template filling | Detects inconsistencies between pre-printed elements and filled-in content |
Family 2: Tampering (AV-08 to AV-14)
Tampering attacks modify legitimate documents to alter specific data fields while preserving the overall structure.
| AV ID | Attack Vector | Detection Method |
|---|---|---|
| AV-08 | Grade/score inflation | Mathematical recalculation of GPAs, totals, and weighted averages |
| AV-09 | Date alteration | Font analysis of date fields; chronological plausibility checks |
| AV-10 | Name substitution | Cross-reference name across all document instances; font consistency in name fields |
| AV-11 | Photo replacement | Resolution and compression analysis of photo region vs. document body |
| AV-12 | Seal/stamp overlay | Seal positioning, layering artifacts, and chromatic consistency analysis |
| AV-13 | QR code replacement | Comparison of QR-encoded data against visible document content |
| AV-14 | Selective content removal | Detection of blank regions inconsistent with expected template structure |
Family 3: Identity Fraud (AV-15 to AV-20)
Identity fraud attacks use documents that may be technically genuine but are presented by or attributed to the wrong individual.
| AV ID | Attack Vector | Detection Method |
|---|---|---|
| AV-15 | Name mismatch across documents | Cross-document consistency checking within applicant folders |
| AV-16 | ID number recycling | Duplicate detection across verification database |
| AV-17 | Photo-ID inconsistency | Cross-reference identity photos across documents in the same folder |
| AV-18 | Biographical data conflict | Age, birthdate, and timeline cross-validation |
| AV-19 | Nationality/jurisdiction mismatch | Geographic plausibility of claimed nationality vs. document origin |
| AV-20 | Alias exploitation | Name normalization across scripts and transliteration systems |
Family 4: Institutional Fraud (AV-21 to AV-26)
Institutional fraud involves documents from real institutions that have been manipulated or misrepresented.
| AV ID | Attack Vector | Detection Method |
|---|---|---|
| AV-21 | Accreditation misrepresentation | Accreditation status verification against authoritative databases |
| AV-22 | Defunct institution exploitation | Institution operational status verification |
| AV-23 | Program/degree fabrication | Verification that the claimed program exists at the claimed institution |
| AV-24 | Template version anachronism | Document template style compared against known historical versions |
| AV-25 | Signatory impersonation | Signatory names and titles cross-referenced where possible |
| AV-26 | Campus/branch misattribution | Verification of campus-specific details and formatting |
Family 5: Digital Forgery (AV-27 to AV-31)
Digital forgery attacks exploit the digital medium itself, manipulating image properties and metadata.
| AV ID | Attack Vector | Detection Method |
|---|---|---|
| AV-27 | Layer compositing | Compression artifact analysis revealing multiple editing stages |
| AV-28 | Color space manipulation | Color profile consistency analysis across document regions |
| AV-29 | Resolution stitching | Detection of resolution boundaries within a single document image |
| AV-30 | Metadata spoofing | Cross-validation of metadata claims against image properties |
| AV-31 | Screenshot-of-printout attack | Detection of screen artifacts, moiré patterns, and perspective distortion |
Family 6: Evasion (AV-32 to AV-36)
Evasion attacks are designed specifically to circumvent automated verification systems.
| AV ID | Attack Vector | Detection Method |
|---|---|---|
| AV-32 | Deliberate image degradation | Quality assessment relative to expected norms for the document type |
| AV-33 | Partial document submission | Completeness checks against expected sections for the document type |
| AV-34 | Non-standard orientation | Orientation detection and normalization during pre-processing |
| AV-35 | Embedded steganographic data | Analysis of least-significant bit patterns in image data |
| AV-36 | Adversarial perturbation | Robustness checks against pixel-level perturbations designed to mislead AI models |
4.2.1 The Verification Kill Chain
The multi-stage pipeline distributes attack vector detection across layers, ensuring that each stage catches the attacks it is best positioned to identify.
Document Upload
│
▼
┌───────────────────┐ Catches: AV-01, AV-02, AV-03, AV-04, AV-05, AV-06
│ PRE-SCREEN │──▶ Vendor watermarks, phantom institutions,
│ (Haiku 4.5) │ placeholder text, template generator artifacts,
│ │ AI-generated documents, stock images
│ ~2 seconds │
│ Cost: Low │ ┌─────────────────────────────┐
└────────┬───────────┘ │ ~15-20% of submissions │
│ │ rejected at this stage │
PASS │ └─────────────────────────────┘
▼
┌───────────────────┐ Catches: AV-07, AV-13, AV-29, AV-30, AV-31
│ QR SCANNER │──▶ QR tampering, portal URL mismatches,
│ (7 strategies) │ domain trust violations, screenshot
│ │ artifacts, resolution inconsistencies
│ ~3-5 seconds │
└────────┬───────────┘
│
▼
┌───────────────────┐ Catches: AV-08 through AV-12, AV-14 through AV-28,
│ STANDARD │ AV-32 through AV-36
│ FORENSIC │──▶ Template mismatches, checksum failures,
│ (Sonnet 4 / │ layout anomalies, score inflation,
│ GPT-5.3) │ cross-doc inconsistency, identity fraud,
│ 8 stages │ institutional fraud, digital forgery,
│ ~15-30 seconds │ evasion techniques
└────────┬───────────┘
│
▼ (if Deep Mode selected)
┌───────────────────┐ Additional stages (Deep only):
│ DEEP FORENSIC │──▶ Deep Font & Typography Forensics
│ (Sonnet 4) │ Seal & Watermark Deep Scan (spectral)
│ 13 stages total │ Institutional Deep Cross-Reference
│ ~60-120 seconds │ KYB — Institution Due Diligence
│ 5 credits │ KYC — Holder Background Screening
└────────┬───────────┘ Forensic Synthesis & Deep Report
│
▼
┌───────────────────┐
│ SCORING + │──▶ Two-tier score aggregation
│ VERDICT │ Weighted category scoring
│ │ Confidence assessment (8 dimensions
│ │ in Deep mode)
└────────┬───────────┘
│
▼
┌───────────────────┐
│ REPORT │──▶ Standard: 12-section forensic PDF
│ GENERATION │ Deep: Executive-grade forensic report
│ │ SSE progress streaming
│ │ Persistent storage
└───────────────────┘
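The staged control flow above can be sketched as a sequence of stage functions with early rejection, so that a document failing the cheap pre-screen never reaches the costlier forensic models. This is a minimal illustration; the stage names, payload shape, and the `StageResult` type are ours, not the production API.

```python
from dataclasses import dataclass, field

@dataclass
class StageResult:
    passed: bool
    triggered: list = field(default_factory=list)   # attack vector IDs raised at this stage

def run_kill_chain(document, stages):
    """Run pipeline stages in order, stopping at the first hard rejection."""
    findings = []
    for name, stage in stages:
        result = stage(document)
        findings.extend(result.triggered)
        if not result.passed:
            return {"verdict": "FAKE", "rejected_at": name, "findings": findings}
    return {"verdict": "PENDING_SCORING", "rejected_at": None, "findings": findings}

# Stand-in stages: the real pre-screen calls Claude Haiku 4.5; here a vendor
# watermark ("SAMPLE", per AV-01) triggers rejection before later stages run.
stages = [
    ("pre_screen", lambda d: StageResult("SAMPLE" not in d,
                                         ["AV-01"] if "SAMPLE" in d else [])),
    ("qr_scan",    lambda d: StageResult(True)),
    ("forensic",   lambda d: StageResult(True)),
]
```

Documents that pass every stage proceed to the scoring and verdict logic described in Section 4.3.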

4.3 Scoring Model
Turing Verify employs a two-tier scoring model that separates structural assessment (Tier 1) from semantic and contextual assessment (Tier 2). This separation enables nuanced verdicts that distinguish between documents with formatting irregularities and documents with substantive content anomalies.
┌──────────────────────────────────────────────────────────────┐
│ TWO-TIER SCORING MODEL │
│ │
│ TIER 1: STRUCTURAL SCORE (T1) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Weight: 40% of final score │ │
│ │ │ │
│ │ Inputs: │ │
│ │ - Structural rules (U-01 to U-12): 30% of final │ │
│ │ - Metadata rules (U-38 to U-45): 10% of final │ │
│ │ │ │
│ │ Each rule scores 0 (fail) to 10 (pass) │ │
│ │ Category score = weighted average of rule scores │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ TIER 2: SEMANTIC + EXTERNAL SCORE (T2) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Weight: 60% of final score │ │
│ │ │ │
│ │ Inputs: │ │
│ │ - Semantic rules (U-13 to U-26): 35% of final │ │
│ │ - External rules (U-27 to U-37): 25% of final │ │
│ │ │ │
│ │ Portal verification bonus: +5 if portal confirms │ │
│ │ Portal verification penalty: -15 if portal refutes │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ FINAL SCORE = (T1 x 0.40) + (T2 x 0.60) │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ VERDICT DETERMINATION │ │
│ │ │ │
│ │ Final Score >= 70 AND no critical fails │ │
│ │ ──▶ VERDICT: PASS │ │
│ │ │ │
│ │ Final Score 40-69 OR 1-2 non-critical fails │ │
│ │ ──▶ VERDICT: NEEDS_REVIEW │ │
│ │ │ │
│ │ Final Score < 40 OR any critical fail │ │
│ │ ──▶ VERDICT: FAKE │ │
│ │ │ │
│ │ Critical fails (instant FAKE verdict): │ │
│ │ - Portal data contradicts document (U-30 = 0) │ │
│ │ - National ID checksum failure (U-32 = 0) │ │
│ │ - Vendor watermark detected (AV-01 triggered) │ │
│ │ - Phantom institution confirmed (AV-02 triggered) │ │
│ └─────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘

4.3.1 Scoring Flow
Document Analysis Complete
│
▼
┌─────────────────────┐
│ Evaluate each U-rule │
│ Score: 0-10 per rule │
└──────────┬──────────┘
│
┌─────┴─────┐
│ │
▼ ▼
┌─────────┐ ┌─────────┐
│ T1 │ │ T2 │
│ Calc. │ │ Calc. │
│ Struct. │ │ Semant. │
│ + Meta │ │ + Ext. │
└────┬────┘ └────┬────┘
│ │
▼ ▼
┌──────────────────────┐
│ Weighted Aggregate │
│ T1(0.4) + T2(0.6) │
└──────────┬───────────┘
│
┌─────┴─────────┐
│ │
▼ ▼
┌─────────┐ ┌───────────┐
│ Check │ │ Check │
│ Score │ │ Critical │
│ Thresh. │ │ Fails │
└────┬────┘ └─────┬─────┘
│ │
└──────┬───────┘
│
▼
┌──────────────────────┐
│ FINAL VERDICT │
│ PASS / NEEDS_REVIEW │
│ / FAKE │
└──────────────────────┘

4.3.2 Attack Vector Overlay
In addition to the U-rule scoring, triggered attack vectors apply penalty modifiers to the final score. The magnitude of the penalty depends on the severity classification of the attack vector:
| Severity | Penalty | Examples |
|---|---|---|
| Critical | Instant FAKE verdict | AV-01 (vendor watermark), AV-02 (phantom institution) |
| High | -20 to -30 points | AV-08 (grade inflation), AV-13 (QR replacement) |
| Medium | -10 to -19 points | AV-09 (date alteration), AV-24 (template anachronism) |
| Low | -5 to -9 points | AV-32 (image degradation), AV-34 (non-standard orientation) |
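The interaction between tier aggregation (Section 4.3) and the attack vector overlay can be expressed compactly. The sketch below is illustrative rather than the production implementation: it assumes tier scores on a 0-100 scale and uses the midpoint of each published penalty band.

```python
# Midpoints of the penalty bands in the severity table above (an assumption:
# the engine may scale penalties within each band).
SEVERITY_PENALTY = {"high": 25, "medium": 15, "low": 7}

def final_verdict(t1, t2, noncritical_fails=0, triggered=()):
    """Aggregate tier scores (0-100), apply AV penalties, and map to a verdict."""
    if "critical" in triggered:
        return "FAKE"                     # critical attack vectors force an instant FAKE
    score = 0.40 * t1 + 0.60 * t2
    score -= sum(SEVERITY_PENALTY[s] for s in triggered)
    if score < 40:
        return "FAKE"
    if score >= 70 and noncritical_fails == 0:
        return "PASS"
    return "NEEDS_REVIEW"                 # score 40-69, or passing score with rule fails
```

For example, a document scoring 80 on both tiers but triggering one high-severity vector drops to 55 and lands in NEEDS_REVIEW rather than PASS.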
4.4 Calibration and Ground Truth
The forensic engine is calibrated against a set of 15 ground-truth cases that span the range of document types, attack vectors, and difficulty levels encountered in production. Each calibration case has a known authentic or fraudulent classification established through independent verification.
┌──────────────────────────────────────────────────────────────────┐
│ CALIBRATION CASE SUMMARY │
├──────┬──────────────────┬───────────┬────────────┬───────────────┤
│ Case │ Document Type │ Ground │ Expected │ Key │
│ ID │ │ Truth │ Verdict │ Checkpoints │
├──────┼──────────────────┼───────────┼────────────┼───────────────┤
│ C-01 │ Academic trans. │ Authentic │ PASS │ U-14, U-23 │
│ C-02 │ Academic diploma │ Forged │ FAKE │ AV-01, U-04 │
│ C-03 │ National ID │ Authentic │ PASS │ U-32, U-34 │
│ C-04 │ National ID │ Tampered │ FAKE │ AV-10, U-32 │
│ C-05 │ Business license │ Authentic │ PASS │ U-33, U-30 │
│ C-06 │ Business license │ Forged │ FAKE │ AV-04, AV-21 │
│ C-07 │ Academic trans. │ Tampered │ FAKE │ AV-08, U-23 │
│ C-08 │ Certificate │ Authentic │ PASS │ U-01, U-27 │
│ C-09 │ Certificate │ Forged │ FAKE │ AV-02, U-35 │
│ C-10 │ Passport (MRZ) │ Authentic │ PASS │ U-34, U-32 │
│ C-11 │ Passport (MRZ) │ Tampered │ FAKE │ AV-10, U-34 │
│ C-12 │ Award cert. │ Forged │ FAKE │ AV-05, U-16 │
│ C-13 │ Trade assoc. cert │ Authentic │ PASS │ U-30, U-27 │
│ C-14 │ Training cert. │ Forged │ FAKE │ AV-24, U-01 │
│ C-15 │ Government doc. │ Tampered │ NEEDS_REV │ AV-09, U-13 │
└──────┴──────────────────┴───────────┴────────────┴───────────────┘

4.4.1 Continuous Calibration Feedback Loop
The calibration process is not a one-time activity. The system implements a continuous feedback loop that incorporates new ground-truth data as it becomes available.
┌──────────────────────────────────────────────────────────────┐
│ CALIBRATION FEEDBACK LOOP │
│ │
│ ┌───────────────┐ │
│ │ Production │ │
│ │ Verifications │ │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ ┌───────────────┐ │
│ │ Human Review │────▶│ Disagreement │ │
│ │ (on NEEDS_ │ │ Cases │ │
│ │ REVIEW docs) │ │ Collected │ │
│ └───────────────┘ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ Ground Truth │ │
│ │ Established │ │
│ │ (via portal │ │
│ │ or manual │ │
│ │ verification)│ │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ Calibration │ │
│ │ Case Added │ │
│ │ to Test Suite │ │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ Prompt + │ │
│ │ Scoring │ │
│ │ Weights │ │
│ │ Adjusted │ │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ Regression │ │
│ │ Test Against │ │
│ │ All 15+ Cases │ │
│ └───────┬───────┘ │
│ │ │
│ PASS │ FAIL │
│ ┌───────┴───────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────┐ ┌───────────┐ │
│ │ Deploy │ │ Iterate │ │
│ │ Updated │ │ Adjustment│ │
│ │ Engine │ │ │ │
│ └─────────┘ └───────────┘ │
│ │
└──────────────────────────────────────────────────────────────┘

4.4.2 Institution Template Library
The system maintains a library of 18 institution templates that encode layout specifications, expected formatting conventions, and security feature locations for frequently encountered issuing institutions. Templates are versioned to account for institutions that have changed their document designs over time. When a document claims to originate from a templated institution, the forensic analysis can perform pixel-level layout comparison in addition to general structural analysis, significantly increasing detection sensitivity for that institution's documents.
5. QR Code Verification Pipeline
5.1 Overview
An increasing number of credential-issuing institutions embed QR codes in their documents as a verification mechanism. These QR codes typically encode either a URL pointing to a verification portal or a data payload containing document details. When present, QR code data provides a high-confidence external reference point for verifying document authenticity.
However, QR codes in document images present significant technical challenges for extraction. Documents may be photographed at angles, scanned at varying resolutions, printed with ink that partially obscures the QR pattern, or degraded through photocopying. To address these challenges, Turing Verify implements a 7-strategy cascade scanner that applies progressively more aggressive image processing techniques.
5.2 Strategy Cascade
┌──────────────────────────────────────────────────────────────────┐
│ QR CODE SCANNER: 7-STRATEGY CASCADE │
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ Strategy 1: FULL IMAGE SCAN │ │
│ │ Method: OpenCV + pyzbar on original image│ │
│ │ Best for: High-quality scans, clear QR │ │
│ └──────────────────┬───────────────────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE ┌──────────────────────────────────────┐ │
│ │ Strategy 2: INVERTED IMAGE │ │
│ │ Method: Bitwise inversion + scan │ │
│ │ Best for: Dark backgrounds, negative │ │
│ └──────────────────┬────────────────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE ┌──────────────────────────────────┐ │
│ │ Strategy 3: CORNER SCAN │ │
│ │ Method: Crop corners + 2x/3x │ │
│ │ upscale + scan each │ │
│ │ Best for: Small QR in corners │ │
│ └──────────────────┬───────────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE ┌───────────────────────┐ │
│ │ Strategy 4: ADAPTIVE │ │
│ │ THRESHOLD │ │
│ │ Method: cv2.adaptive- │ │
│ │ Threshold + scan │ │
│ │ Best for: Uneven │ │
│ │ lighting │ │
│ └──────────┬────────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE ┌────────────────────┐ │
│ │ Strategy 5: MULTI- │ │
│ │ THRESHOLD │ │
│ │ Thresholds: 128, │ │
│ │ 160, 200, 230 │ │
│ │ Best for: Low │ │
│ │ contrast images │ │
│ └────────┬───────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE ┌───────────────────┐ │
│ │ Strategy 6: CLAHE │ │
│ │ Enhancement │ │
│ │ Contrast Limited │ │
│ │ Adaptive Hist. │ │
│ │ Equalization │ │
│ └──────┬────────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE ┌────────────────────┐ │
│ │ Strategy 7: CLAHE │ │
│ │ + ADAPTIVE COMBO │ │
│ │ Combined enhance- │ │
│ │ ment pipeline │ │
│ └────────┬───────────┘ │
│ Result? │ │
│ ┌────┴────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ DONE NO QR FOUND │
│ (U-27 scored │
│ accordingly) │
└──────────────────────────────────────────────────────────────────┘
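The cascade reduces to a first-success loop over strategies ordered from cheapest to most aggressive. In the sketch below the strategies are stand-ins; the production strategies wrap OpenCV preprocessing and pyzbar decoding, which are not shown.

```python
def cascade_scan(image, strategies):
    """Apply decode strategies in order of increasing aggressiveness.

    Returns (strategy_name, payload) from the first strategy that yields a
    decode, or (None, None) when all fail, in which case U-27/U-28 are
    scored as QR-absent.
    """
    for name, strategy in strategies:
        payload = strategy(image)
        if payload:
            return name, payload
    return None, None

# Stand-in strategies. In production each entry wraps an OpenCV preprocessing
# step (inversion, corner crop + upscale, adaptive threshold, fixed thresholds
# 128/160/200/230, CLAHE, CLAHE + adaptive) followed by a pyzbar decode pass.
strategies = [
    ("full_image",     lambda img: None),   # plain decode fails on this input
    ("inverted",       lambda img: None),   # inversion fails too
    ("corner_upscale", lambda img: "https://verify.example.edu/c/123"),
]
```

Recording which strategy succeeded is itself forensic signal: a QR code that only decodes after heavy enhancement suggests image degradation consistent with AV-32.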

5.3 Trusted Domain Model
When a QR code is successfully decoded and contains a URL, the system evaluates that URL against a trusted domain allowlist comprising 15 verified domains belonging to legitimate verification portals. This trust model serves multiple purposes:
- Phishing prevention: Prevents the system from following URLs to malicious sites that could impersonate legitimate verification portals.
- Forgery detection: A QR code pointing to an untrusted domain strongly suggests document manipulation, as legitimate institutions use established verification infrastructure.
- Data quality assurance: Data extracted from trusted portals has higher reliability than data from unknown sources.
┌────────────────────────────────────────────────────────────┐
│ TRUSTED DOMAIN EVALUATION │
│ │
│ QR URL Extracted │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Parse Domain │ │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ ┌─────────────────────────┐ │
│ │ Domain in │ YES │ Proceed to Portal │ │
│ │ Trusted List? │─────▶│ Data Extraction │ │
│ │ (15 domains) │ └─────────────────────────┘ │
│ └────────┬────────┘ │
│ │ NO │
│ ▼ │
│ ┌─────────────────┐ ┌─────────────────────────┐ │
│ │ Known forgery │ YES │ Flag as AV-13 │ │
│ │ domain? │─────▶│ (QR replacement) │ │
│ └────────┬────────┘ └─────────────────────────┘ │
│ │ NO │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Log as untrusted│ │
│ │ Score U-29 = 0 │ │
│ └─────────────────┘ │
└────────────────────────────────────────────────────────────┘
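The decision flow above reduces to a small classifier over the parsed hostname. The allowlist and blocklist entries in this sketch are illustrative placeholders; the production lists are maintained separately.

```python
from urllib.parse import urlparse

TRUSTED_DOMAINS = {"verify.example.edu", "credentials.example.gov"}   # illustrative allowlist
KNOWN_FORGERY_DOMAINS = {"fake-diploma.example.com"}                  # illustrative blocklist

def evaluate_qr_url(url):
    """Classify a decoded QR URL per the trusted-domain decision flow."""
    host = (urlparse(url).hostname or "").lower()
    if host in TRUSTED_DOMAINS:
        return "TRUSTED"        # proceed to portal data extraction
    if host in KNOWN_FORGERY_DOMAINS:
        return "AV-13"          # flag as QR replacement
    return "UNTRUSTED"          # log and score U-29 = 0
```

Matching on the exact hostname (rather than a substring of the URL) prevents trivial evasion via lookalike paths such as `evil.example.com/verify.example.edu`.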

5.4 Portal Data Extraction Pipeline
When a QR code resolves to a trusted verification portal, the system extracts structured data from the portal response and cross-references it against the document content.
QR URL (trusted domain)
│
▼
┌─────────────────┐
│ HTTP Request │
│ to Portal │
└────────┬────────┘
│
▼
┌─────────────────┐ ┌──────────────────┐
│ Response Type? │ │ │
│ │ │ Structured JSON │──▶ Direct field mapping
│ ┌──────────┐ │ │ │
│ │JSON/HTML │ │ └──────────────────┘
│ │/PDF/Text │ │
│ └──────────┘ │ ┌──────────────────┐
│ │ │ │
└────────────────┘ │ HTML Page │──▶ Content extraction
│ │ + field parsing
└──────────────────┘
│
▼
┌─────────────────┐
│ Field Mapping │
│ Name, ID, │
│ Dates, Scores, │
│ Institution │
└────────┬────────┘
│
▼
┌─────────────────────────────────────┐
│ Cross-Reference vs. Document Content│
│ │
│ Portal Name ←→ Document Name │
│ Portal ID ←→ Document ID │
│ Portal Date ←→ Document Date │
│ Portal Score ←→ Document Score │
└────────┬────────────────────────────┘
│
▼
┌─────────────────┐
│ Score U-30 │
│ (Portal Match) │
│ │
│ Match: U-30=10 │
│ Partial: U-30=5 │
│ Mismatch: U-30=0│ ◀── Critical fail triggers FAKE verdict
└─────────────────┘
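The field cross-reference and U-30 scoring can be sketched as below. The equal weighting of fields and the case-insensitive string comparison are simplifying assumptions; production matching is more tolerant, for example of transliteration variants (Section 7.2).

```python
def score_portal_match(portal_fields, document_fields):
    """Score U-30: full match = 10, partial = 5, mismatch = 0 (a critical fail)."""
    compared = {k for k in portal_fields if k in document_fields}
    if not compared:
        return None   # nothing to compare; U-30 is not scored
    matches = sum(
        portal_fields[k].strip().lower() == document_fields[k].strip().lower()
        for k in compared
    )
    if matches == len(compared):
        return 10
    return 5 if matches > 0 else 0

# Illustrative data: name and ID agree with the portal, but the score was inflated.
portal = {"name": "Anna Eriksson", "id": "L898902C", "score": "3.85"}
doc    = {"name": "ANNA ERIKSSON", "id": "L898902C", "score": "3.95"}
print(score_portal_match(portal, doc))   # → 5 (partial match)
```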

6. Country-Specific Validation
6.1 Overview
Document verification cannot be performed in a jurisdiction-agnostic manner. Each country has distinct conventions for national identification numbers, business registration formats, document layouts, and verification infrastructure. Turing Verify implements country-specific validation modules that apply jurisdiction-appropriate checks.
6.2 National ID Checksum Validation
Six countries have implemented national ID systems with algorithmic checksums that enable mathematical verification of ID number validity. Turing Verify implements checksum validation for each of these systems.
┌───────────────────────────────────────────────────────────────────┐
│ NATIONAL ID CHECKSUM VALIDATION │
├─────────────────┬──────────────────────┬──────────────────────────┤
│ Country │ Algorithm │ ID Format │
├─────────────────┼──────────────────────┼──────────────────────────┤
│ Taiwan (ROC) │ Weighted sum mod 10 │ 1 letter + 9 digits │
│ │ with letter mapping │ (e.g., A123456789) │
├─────────────────┼──────────────────────┼──────────────────────────┤
│ South Korea │ Weighted sum mod 11 │ 13 digits (YYMMDD- │
│ │ │ GXXXXXXC) │
├─────────────────┼──────────────────────┼──────────────────────────┤
│ Singapore │ Weighted sum mod 11 │ 1 letter + 7 digits │
│ │ with prefix-dependent│ + 1 check letter │
│ │ check letter table │ (e.g., S1234567D) │
├─────────────────┼──────────────────────┼──────────────────────────┤
│ Hong Kong │ Weighted sum mod 11 │ 1-2 letters + 6 digits │
│ │ with letter-to-number│ + 1 check digit │
│ │ conversion │ (e.g., A123456(7)) │
├─────────────────┼──────────────────────┼──────────────────────────┤
│ Malaysia │ Date + state code │ 12 digits (YYMMDD- │
│ │ validation │ SS-XXXX) │
├─────────────────┼──────────────────────┼──────────────────────────┤
│ Thailand │ Weighted sum mod 11 │ 13 digits with │
│ │ │ positional weights │
└─────────────────┴──────────────────────┴──────────────────────────┘
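As an illustration of the table's first entry, the publicly documented ROC algorithm maps the leading letter to a two-digit code and applies positional weights; an ID is valid when the weighted sum is divisible by 10. A minimal implementation:

```python
# Letter-to-number mapping defined by the ROC national ID scheme.
LETTER_CODES = {
    "A": 10, "B": 11, "C": 12, "D": 13, "E": 14, "F": 15, "G": 16, "H": 17,
    "I": 34, "J": 18, "K": 19, "L": 20, "M": 21, "N": 22, "O": 35, "P": 23,
    "Q": 24, "R": 25, "S": 26, "T": 27, "U": 28, "V": 29, "W": 32, "X": 30,
    "Y": 31, "Z": 33,
}

def valid_taiwan_id(nid: str) -> bool:
    """Validate a Taiwan national ID (1 letter + 9 digits) via weighted sum mod 10."""
    if len(nid) != 10 or nid[0] not in LETTER_CODES or not nid[1:].isdigit():
        return False
    code = LETTER_CODES[nid[0]]
    digits = [code // 10, code % 10] + [int(c) for c in nid[1:]]
    weights = [1, 9, 8, 7, 6, 5, 4, 3, 2, 1, 1]
    return sum(d * w for d, w in zip(digits, weights)) % 10 == 0
```

The table's example ID, A123456789, validates under this algorithm; altering any single digit breaks the checksum, which is what makes U-32 failures such a strong tampering signal.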

6.3 Business Registration Format Validation
Nine countries have sufficiently standardized business registration numbering systems to enable format validation. The system checks that registration numbers conform to the expected pattern for the claimed jurisdiction.
┌───────────────────────────────────────────────────────────────────┐
│ BUSINESS REGISTRATION FORMAT VALIDATION │
├──────────────────┬───────────────────┬────────────────────────────┤
│ Country │ Format Pattern │ Validation Rules │
├──────────────────┼───────────────────┼────────────────────────────┤
│ Taiwan │ 8-digit UBN │ Numeric, weighted checksum │
├──────────────────┼───────────────────┼────────────────────────────┤
│ Hong Kong │ CR-XXXXXXXX │ Prefix + 8 digits │
├──────────────────┼───────────────────┼────────────────────────────┤
│ Singapore │ YYYYNNNNNX │ Year + seq + check char │
├──────────────────┼───────────────────┼────────────────────────────┤
│ Japan │ 13-digit Corp # │ Numeric, check digit │
├──────────────────┼───────────────────┼────────────────────────────┤
│ South Korea │ XXX-XX-XXXXX │ 10 digits, regional prefix │
├──────────────────┼───────────────────┼────────────────────────────┤
│ United Kingdom │ 8-digit CRN │ Numeric or SC/NI prefix │
├──────────────────┼───────────────────┼────────────────────────────┤
│ United States │ EIN: XX-XXXXXXX │ 9 digits, prefix range │
├──────────────────┼───────────────────┼────────────────────────────┤
│ Australia │ 11-digit ABN │ Weighted checksum mod 89 │
├──────────────────┼───────────────────┼────────────────────────────┤
│ Canada │ 9-digit BN │ Luhn algorithm check digit │
└──────────────────┴───────────────────┴────────────────────────────┘
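The Australian entry is fully algorithmic and publicly specified: subtract 1 from the first digit, apply the published weights, and test divisibility by 89. A minimal sketch:

```python
def valid_abn(abn: str) -> bool:
    """Validate an 11-digit Australian Business Number (weighted checksum mod 89)."""
    digits = [int(c) for c in abn if c.isdigit()]
    if len(digits) != 11:
        return False
    digits[0] -= 1                                    # step 1: subtract 1 from the first digit
    weights = [10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
    return sum(d * w for d, w in zip(digits, weights)) % 89 == 0
```

Stripping non-digits first lets the check accept the conventional "XX XXX XXX XXX" presentation as well as the compact form.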

6.4 MRZ (Machine Readable Zone) Validation
Machine Readable Zones on travel documents and national IDs follow the ICAO 9303 standard, which defines three formats based on document type.
┌──────────────────────────────────────────────────────────────────┐
│ MRZ FORMAT STANDARDS (ICAO 9303) │
│ │
│ TD1 (ID Cards) — 3 lines x 30 characters │
│ ┌──────────────────────────────┐ │
│ │ I<UTOERIKSSON<<ANNA<MARIA<< │ Line 1: Doc type, country, │
│ │ L898902C<3UTO6908061F940623 │ Line 2: Doc#, nationality, │
│ │ <<<<<<<<<<<<<<<<<<<<<<<<<<< │ DOB, sex, expiry │
│ └──────────────────────────────┘ Line 3: Optional data │
│ │
│ TD2 (Larger ID Cards) — 2 lines x 36 characters │
│ ┌────────────────────────────────────┐ │
│ │ I<UTOERIKSSON<<ANNA<MARIA<<<<<<<< │ Line 1: Doc type, │
│ │ L898902C<3UTO6908061F9406236<<<<<< │ country, name │
│ └────────────────────────────────────┘ Line 2: Doc#, DOB, etc. │
│ │
│ TD3 (Passports) — 2 lines x 44 characters │
│ ┌────────────────────────────────────────────┐ │
│ │ P<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<< │ Line 1: Type, │
│ │ L898902C<3UTO6908061F9406236ZE184226B<<<<<< │ name │
│ └────────────────────────────────────────────┘ Line 2: All data │
│ │
│ Validation Checks: │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ 1. Character set compliance (A-Z, 0-9, <) │ │
│ │ 2. Check digit verification (weighted sum mod 10) │ │
│ │ 3. Composite check digit over multiple fields │ │
│ │ 4. Date format validity (YYMMDD) │ │
│ │ 5. Country code against ISO 3166-1 alpha-3 │ │
│ │ 6. Document type indicator correctness │ │
│ │ 7. Name field parsing and consistency with visual zone │ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
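The check-digit computation (validation check 2 above) is defined by ICAO 9303: each character maps to a value (digits as themselves, A-Z as 10-35, filler `<` as 0), values are weighted 7-3-1 repeating, and the sum is taken mod 10. The sketch below reproduces the check digits visible in the sample lines above: document number `L898902C<` yields 3, birth date `690806` yields 1, and expiry `940623` yields 6.

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weighted sum (7, 3, 1 repeating) mod 10."""
    def value(ch):
        if ch == "<":
            return 0
        if ch.isdigit():
            return int(ch)
        return ord(ch) - ord("A") + 10   # A=10 ... Z=35
    weights = (7, 3, 1)
    return sum(value(c) * weights[i % 3] for i, c in enumerate(field)) % 10
```

The composite check digit (check 3) applies the same function to the concatenation of several fields and their individual check digits, so a forger must keep every dependent digit consistent to evade U-34.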

6.5 Country Coverage Matrix
┌──────────────────────────────────────────────────────────────────────────┐
│ COUNTRY VALIDATION COVERAGE MATRIX │
├────────────────┬────────┬──────────┬────────┬──────────┬─────────────────┤
│ Country │ ID │ Business │ MRZ │ Portals │ Templates │
│ │Checksum│ Reg Fmt │ │ │ │
├────────────────┼────────┼──────────┼────────┼──────────┼─────────────────┤
│ Taiwan │ X │ X │ TD1 │ 6+ │ 3 │
│ Hong Kong │ X │ X │ TD1 │ 4+ │ 2 │
│ Singapore │ X │ X │ TD1 │ 3+ │ 2 │
│ South Korea │ X │ X │ TD1 │ 3+ │ 2 │
│ Malaysia │ X │ │ TD1 │ 2+ │ 1 │
│ Thailand │ X │ │ TD1 │ 2+ │ 1 │
│ Japan │ │ X │ TD3 │ 4+ │ 2 │
│ United Kingdom │ │ X │ TD3 │ 3+ │ 1 │
│ United States │ │ X │ TD3 │ 5+ │ 2 │
│ Australia │ │ X │ TD3 │ 3+ │ 1 │
│ Canada │ │ X │ TD3 │ 3+ │ 1 │
│ Others (15+) │ │ │ Varies│ 18+ │ 0 │
├────────────────┼────────┼──────────┼────────┼──────────┼─────────────────┤
│ TOTAL │ 6 │ 9 │ 3 │ 56 │ 18 │
│ │countries│countries │formats │ portals │ templates │
└────────────────┴────────┴──────────┴────────┴──────────┴─────────────────┘

7. Cross-Document Consistency Checking
7.1 Applicant Folder Model
In many verification workflows, multiple documents are submitted by the same individual as part of a single application. An admissions office, for example, may receive a passport, academic transcript, diploma, and language proficiency certificate from a single applicant. Turing Verify organizes related documents into applicant folders, enabling cross-document consistency analysis that individual document verification cannot provide.
┌──────────────────────────────────────────────────────────────────┐
│ APPLICANT FOLDER MODEL │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Applicant Folder │ │
│ │ │ │
│ │ ┌───────────┐ ┌───────────┐ ┌───────────┐ │ │
│ │ │ Document │ │ Document │ │ Document │ ... │ │
│ │ │ 1: │ │ 2: │ │ 3: │ │ │
│ │ │ Passport │ │ Transcript│ │ Diploma │ │ │
│ │ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ │ │
│ │ │ │ │ │ │
│ │ └──────────────┼──────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌────────────────┐ │ │
│ │ │ Cross-Document │ │ │
│ │ │ Consistency │ │ │
│ │ │ Engine │ │ │
│ │ └────────────────┘ │ │
│ │ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Consistency Checks Performed: │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 1. Name consistency across all documents │ │
│ │ 2. Date of birth consistency │ │
│ │ 3. Nationality/citizenship consistency │ │
│ │ 4. ID number cross-reference │ │
│ │ 5. Timeline plausibility (graduation → certification) │ │
│ │ 6. Institutional cross-reference │ │
│ │ 7. Photo consistency (if multiple ID photos) │ │
│ └──────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘

7.2 Name Normalization Across Scripts
A significant challenge in cross-document consistency checking is name matching across different scripts and transliteration systems. An applicant's name may appear in Latin script on a passport, in CJK characters on an academic transcript, and in a different romanization on a professional certificate.
The system implements multi-script name normalization that:
- Identifies the script of each name instance (Latin, CJK, Cyrillic, Arabic, Devanagari, etc.)
- Applies standard transliteration mappings between scripts
- Normalizes Latin-script names for common variations (diacritics, hyphenation, ordering)
- Computes similarity scores that account for expected transliteration variation
┌──────────────────────────────────────────────────────────────────┐
│ NAME NORMALIZATION PIPELINE │
│ │
│ Document 1 Name Document 2 Name │
│ "WANG, Xiao-Ming" "王小明" │
│ │ │ │
│ ▼ ▼ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Script │ │ Script │ │
│ │ Detection │ │ Detection │ │
│ │ → Latin │ │ → CJK │ │
│ └─────┬──────┘ └─────┬──────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Latin │ │ CJK-to- │ │
│ │ Normalize │ │ Pinyin │ │
│ │ - Remove │ │ Conversion │ │
│ │ diacritics│ │ │ │
│ │ - Standard │ │ → "Wang │ │
│ │ ordering │ │ Xiaoming"│ │
│ │ → "wang │ └─────┬──────┘ │
│ │ xiaoming"│ │ │
│ └─────┬──────┘ │ │
│ │ │ │
│ └───────────┬───────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ Similarity │ │
│ │ Comparison │ │
│ │ │ │
│ │ Score: 0.95 │ │
│ │ (HIGH MATCH) │ │
│ └────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
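Script detection and CJK-to-pinyin conversion require transliteration data (typically an external library) and are omitted here; the sketch below covers only the Latin-side normalization and similarity scoring. Collapsing separators sidesteps token-ordering and hyphenation differences such as "WANG, Xiao-Ming" versus "Wang Xiaoming"; production matching is necessarily more nuanced.

```python
import unicodedata
from difflib import SequenceMatcher

def normalize_latin(name: str) -> str:
    """Strip diacritics, lowercase, and drop separators/punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in stripped.lower() if c.isalpha())

def name_similarity(a: str, b: str) -> float:
    """Similarity of two already-romanized names on a 0-1 scale."""
    return SequenceMatcher(None, normalize_latin(a), normalize_latin(b)).ratio()
```

Under this normalization, "WANG, Xiao-Ming" and "Wang Xiaoming" both reduce to `wangxiaoming` and score 1.0, while unrelated names score well below typical match thresholds.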

7.3 Cross-Document Check Matrix
The system performs pairwise checks between all documents in a folder, evaluating each applicable consistency dimension.
┌──────────────────────────────────────────────────────────────────────┐
│ CROSS-DOCUMENT CHECK MATRIX (Example) │
├──────────────┬──────────┬──────────┬──────────┬──────────┬──────────┤
│ │ Passport │ Trans- │ Diploma │ Language │ Profess- │
│ │ │ cript │ │ Cert │ ional ID │
├──────────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Passport │ -- │ Name,DOB │ Name,DOB │ Name,DOB │ Name,DOB │
│ │ │ Nat. │ Nat. │ Nat. │ │
├──────────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Transcript │ Name,DOB │ -- │ Name, │ Timeline │ Name │
│ │ Nat. │ │ Inst. │ │ │
├──────────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Diploma │ Name,DOB │ Name, │ -- │ Timeline │ Name │
│ │ Nat. │ Inst. │ │ │ │
├──────────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Language │ Name,DOB │ Timeline │ Timeline │ -- │ Name │
│ Cert │ Nat. │ │ │ │ │
├──────────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ Profess. │ Name,DOB │ Name │ Name │ Name │ -- │
│ ID │ │ │ │ │ │
├──────────────┴──────────┴──────────┴──────────┴──────────┴──────────┤
│ Legend: Name = name match, DOB = date of birth, Nat. = nationality, │
│ Inst. = institution match, Timeline = date plausibility │
└─────────────────────────────────────────────────────────────────────┘
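A minimal sketch of the pairwise evaluation follows. The document-type keys and field names are hypothetical, and only three pairs from the matrix are shown; the production applicable-check table covers every cell above.

```python
from itertools import combinations

# Hypothetical subset of the check matrix: which fields must agree
# for a given pair of document types.
APPLICABLE = {
    ("passport", "transcript"): ["name", "dob", "nationality"],
    ("passport", "diploma"): ["name", "dob", "nationality"],
    ("transcript", "diploma"): ["name", "institution"],
}

def cross_document_checks(folder: dict) -> list:
    """Run every applicable consistency check for each document pair
    in the folder, recording any field that disagrees."""
    findings = []
    for (t1, d1), (t2, d2) in combinations(folder.items(), 2):
        fields = APPLICABLE.get((t1, t2)) or APPLICABLE.get((t2, t1), [])
        for field in fields:
            v1, v2 = d1.get(field), d2.get(field)
            # Only flag when both documents actually carry the field.
            if v1 is not None and v2 is not None and v1 != v2:
                findings.append({"pair": (t1, t2), "field": field,
                                 "values": (v1, v2)})
    return findings
```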

8. Verification Portal Integration
8.1 Portal Landscape
Turing Verify integrates with 56 verification portals operated by educational institutions, government agencies, professional certification bodies, and credential verification services across 25+ countries. These portals represent the most reliable external data sources for confirming document authenticity.
8.2 Integration Types
Portal integrations fall into three categories based on the technical mechanism used to retrieve verification data.
┌───────────────────────────────────────────────────────────────────┐
│ PORTAL INTEGRATION TYPES │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ TYPE 1: API-BASED (12 portals) │ │
│ │ │ │
│ │ System ──[REST/JSON]──▶ Portal API │ │
│ │ ◀──[JSON response]── │ │
│ │ │ │
│ │ Advantages: Structured data, fast, reliable │ │
│ │ Examples: Government verification APIs, │ │
│ │ centralized credential registries │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ TYPE 2: WEB-BASED (28 portals) │ │
│ │ │ │
│ │ System ──[HTTP GET]──▶ Portal Web Page │ │
│ │ ◀──[HTML]── Parse + Extract │ │
│ │ │ │
│ │ Advantages: Wide availability │ │
│ │ Challenges: HTML parsing fragility, │ │
│ │ layout changes require updates │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ TYPE 3: QR-REDIRECT (16 portals) │ │
│ │ │ │
│ │ QR Code ──[URL]──▶ Portal Landing Page │ │
│ │ Parse + Extract │ │
│ │ │ │
│ │ Advantages: Document-initiated verification │ │
│ │ Challenges: QR extraction required first │ │
│ └─────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
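The three integration types lend themselves to a simple dispatch pattern. The sketch below is illustrative only: `Portal`, `PortalType`, and the handler signature are assumptions, and the handlers here are stubs standing in for real HTTP clients and HTML parsers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict

class PortalType(Enum):
    API = "api"          # Type 1: REST/JSON endpoint
    WEB = "web"          # Type 2: HTML page requiring parse + extract
    QR_REDIRECT = "qr"   # Type 3: landing page reached via a document QR URL

@dataclass
class Portal:
    name: str
    ptype: PortalType
    base_url: str

def lookup(portal: Portal, query: dict,
           handlers: Dict[PortalType, Callable]) -> dict:
    """Dispatch a verification query to the handler for the portal's type."""
    return handlers[portal.ptype](portal, query)

# Stub handlers standing in for a real HTTP client / HTML parser.
STUB_HANDLERS = {
    PortalType.API: lambda p, q: {"via": "api", "status": "FOUND"},
    PortalType.WEB: lambda p, q: {"via": "web", "status": "FOUND"},
    PortalType.QR_REDIRECT: lambda p, q: {"via": "qr", "status": "FOUND"},
}
```

Keeping the per-type retrieval logic behind a uniform handler interface is what allows fragile web scrapers (Type 2) to be updated without touching the rest of the pipeline.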

8.3 Regional Distribution
┌──────────────────────────────────────────────────────────────────┐
│ VERIFICATION PORTAL DISTRIBUTION BY REGION │
│ │
│ East Asia ███████████████████████ 18 portals │
│ Southeast Asia ██████████████ 11 portals │
│ South Asia ██████████ 8 portals │
│ Europe ███████ 6 portals │
│ North America ██████ 5 portals │
│ Oceania ████ 4 portals │
│ Middle East ███ 3 portals │
│ Africa █ 1 portal │
│ ├────┼────┼────┼────┤ │
│ 0 5 10 15 20 │
│ │
│ Total: 56 portals across 25+ countries │
│ │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Portal Categories: │ │
│ │ - University verification portals: 22 (39.3%) │ │
│ │ - Government document registries: 14 (25.0%) │ │
│ │ - Professional certification bodies: 9 (16.1%) │ │
│ │ - Centralized credential verification: 7 (12.5%) │ │
│ │ - Accreditation bodies: 4 ( 7.1%) │ │
│ └───────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘

8.4 Portal Data Reliability
Not all portal data carries equal weight in the verification process. The system classifies portal data reliability based on the portal operator and data freshness:
| Reliability Tier | Portal Type | Scoring Impact |
|---|---|---|
| Tier 1 (Highest) | Government-operated registries, direct institutional APIs | Full scoring weight; portal contradiction = critical fail |
| Tier 2 (High) | University-operated verification pages, centralized credential services | High scoring weight; contradiction = major penalty |
| Tier 3 (Moderate) | Third-party aggregators, professional body portals | Moderate weight; contradiction = flag for review |
9. Report Generation and User Experience
9.1 Forensic PDF Report Structure
Every verification produces a 12-section forensic PDF report that documents the analysis in sufficient detail to support institutional decision-making and auditing requirements.
┌──────────────────────────────────────────────────────────────────┐
│ 12-SECTION FORENSIC PDF REPORT │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Section 1: EXECUTIVE SUMMARY │ │
│ │ Verdict, confidence level, critical findings │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 2: DOCUMENT OVERVIEW │ │
│ │ Document type, claimed institution, date range │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 3: STRUCTURAL ANALYSIS │ │
│ │ Template conformance, layout, visual element review │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 4: SEMANTIC ANALYSIS │ │
│ │ Content plausibility, date logic, grade validation │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 5: QR CODE ANALYSIS │ │
│ │ QR detection, extraction, domain trust, portal match │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 6: EXTERNAL VERIFICATION │ │
│ │ Portal results, checksum validation, MRZ parsing │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 7: METADATA ANALYSIS │ │
│ │ EXIF data, compression analysis, creation tool info │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 8: ATTACK VECTOR ASSESSMENT │ │
│ │ Triggered AV defenses, severity, evidence │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 9: SCORING BREAKDOWN │ │
│ │ T1 and T2 scores, category breakdowns, penalties │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 10: CROSS-DOCUMENT FINDINGS │ │
│ │ Consistency results (if part of applicant folder) │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 11: COUNTRY-SPECIFIC RESULTS │ │
│ │ ID checksum, registration format, MRZ validation │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ Section 12: RECOMMENDATIONS │ │
│ │ Suggested actions, areas for manual review │ │
│ └────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘

9.2 SSE Streaming Timeline
The verification process involves multiple stages that together take 15 to 45 seconds for a typical document. Rather than leaving users without feedback during processing, the system streams progress updates via Server-Sent Events (SSE).
┌──────────────────────────────────────────────────────────────────┐
│ SSE PROGRESS STREAMING TIMELINE │
│ │
│ Time Event Client Display │
│ ───── ───── ────────────── │
│ 0s upload_received "Document received" │
│ 1s image_normalized "Processing image..." │
│ 2s qr_scan_started "Scanning for QR..." │
│ 4s qr_scan_complete "QR found / not found" │
│ 5s prescreen_started "Pre-screening..." │
│ 7s prescreen_complete "Pre-screen passed" │
│ 8s portal_lookup_started "Checking portals..." │
│ 12s portal_lookup_complete "Portal data retrieved" │
│ 13s forensic_analysis_started "Analyzing document..." │
│ 25s forensic_analysis_progress "Evaluating rules..." │
│ 35s forensic_analysis_complete "Analysis complete" │
│ 36s scoring_complete "Scoring complete" │
│ 38s report_generation_started "Generating report..." │
│ 42s report_generation_complete "Report ready" │
│ 42s verdict_delivered "VERDICT: [result]" │
│ │
│ ──────────────────────────────────────────────────▶ Time │
│ 0s 5s 10s 15s 20s 25s 30s 35s 40s │
│ ├─────┤──────┤──────┤──────┤──────┤──────┤──────┤──────┤ │
│ │ Pre │ QR + │ Forensic Analysis │Rpt │ │
│ │proc.│Portal│ │Gen │ │
│ ├─────┴──────┴──────────────────────────────────────┴────┤ │
│ │ Total: ~40 seconds (typical) │ │
│ └────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
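The wire format behind this timeline is standard SSE framing: an `event:` line naming the stage and a `data:` line carrying the display message, with a blank line terminating each frame. Below is a minimal formatter and parser; the event names come from the table above, but the exact payload schema is an assumption.

```python
def sse_event(event: str, data: str) -> str:
    """Format one Server-Sent Event frame as sent over the stream."""
    return f"event: {event}\ndata: {data}\n\n"

def parse_sse(stream: str):
    """Yield (event, data) tuples from a raw SSE text stream.
    Frames are separated by blank lines per the SSE specification."""
    for frame in stream.strip().split("\n\n"):
        event, data = None, []
        for line in frame.split("\n"):
            if line.startswith("event:"):
                event = line[6:].strip()
            elif line.startswith("data:"):
                data.append(line[5:].strip())
        yield event, "\n".join(data)
```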

9.3 Batch Processing Workflow
For institutions processing large numbers of documents, Turing Verify supports batch upload with parallel processing and aggregated reporting.
┌──────────────────────────────────────────────────────────────────┐
│ BATCH PROCESSING WORKFLOW │
│ │
│ ┌──────────────┐ │
│ │ Batch Upload │ │
│ │ (N documents) │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Queue Manager │ │
│ │ (Rate-limited │ │
│ │ concurrency) │ │
│ └──────┬───────┘ │
│ │ │
│ ┌────┼────┬────┬────┐ │
│ │ │ │ │ │ Parallel Processing │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌───┐┌───┐┌───┐┌───┐┌───┐ │
│ │D-1││D-2││D-3││D-4││D-5│ ... │
│ └─┬─┘└─┬─┘└─┬─┘└─┬─┘└─┬─┘ │
│ │ │ │ │ │ │
│ └────┼────┴────┼────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────┐ │
│ │ Batch Summary Report │ │
│ │ - Total processed │ │
│ │ - PASS / NEEDS_REVIEW │ │
│ │ / FAKE counts │ │
│ │ - Critical findings │ │
│ │ - Individual PDFs │ │
│ └──────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
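The rate-limited parallel stage of this workflow can be sketched with an `asyncio.Semaphore` capping in-flight verifications. This is a hypothetical reconstruction, not the production queue manager; `verify_document` is a stub standing in for the real single-document pipeline.

```python
import asyncio

async def verify_document(doc_id: str) -> dict:
    """Stub for the real single-document verification pipeline."""
    await asyncio.sleep(0)  # placeholder for actual I/O-bound work
    return {"doc": doc_id, "verdict": "PASS"}

async def process_batch(doc_ids: list, max_concurrent: int = 5) -> dict:
    """Verify documents in parallel, capped by a semaphore, then
    aggregate verdict counts for the batch summary report."""
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(doc_id: str) -> dict:
        async with sem:  # at most max_concurrent run at once
            return await verify_document(doc_id)

    results = await asyncio.gather(*(bounded(d) for d in doc_ids))
    summary = {"total": len(results), "counts": {}}
    for r in results:
        v = r["verdict"]
        summary["counts"][v] = summary["counts"].get(v, 0) + 1
    return summary
```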

10. Security and Privacy
10.1 Authentication Architecture
The platform implements a multi-provider authentication system supporting four identity providers. JSON Web Tokens (JWT) are used for session management across all authenticated endpoints.
┌──────────────────────────────────────────────────────────────────┐
│ AUTHENTICATION ARCHITECTURE │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Provider │ │ Provider │ │ Provider │ │ Provider │ │
│ │ 1 │ │ 2 │ │ 3 │ │ 4 │ │
│ │ (OAuth) │ │ (OAuth) │ │ (Email) │ │ (API Key)│ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │ │
│ └─────────────┼─────────────┼─────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────┐ │
│ │ Auth Middleware │ │
│ │ - Token validation │ │
│ │ - Role extraction │ │
│ │ - Rate limit binding │ │
│ └───────────┬─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────┐ │
│ │ JWT Token Issuance │ │
│ │ - Short-lived access │ │
│ │ - Refresh rotation │ │
│ │ - Audience binding │ │
│ └───────────┬─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────┐ │
│ │ Protected Endpoints │ │
│ │ (10 rate-limited) │ │
│ └─────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
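The short-lived access token described above can be illustrated with a minimal HS256 issue/verify pair built on the standard library. This is a sketch of the mechanism, not the platform's implementation (which would typically use a vetted JWT library); the claim names and 15-minute TTL are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64(data: bytes) -> str:
    """Unpadded URL-safe base64, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_jwt(claims: dict, secret: bytes, ttl: int = 900) -> str:
    """Issue a short-lived HS256 JWT (15-minute default)."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "exp": int(time.time()) + ttl}
    signing_input = (_b64(json.dumps(header).encode()) + "."
                     + _b64(json.dumps(payload).encode()))
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64(sig)}"

def verify_jwt(token: str, secret: bytes) -> dict:
    """Validate signature and expiry; return claims or raise ValueError."""
    signing_input, _, sig_b64 = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64(expected), sig_b64):
        raise ValueError("bad signature")
    payload_b64 = signing_input.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

Note the constant-time `hmac.compare_digest` for signature comparison, which avoids timing side channels.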

10.2 Rate Limiting Model
Rate limiting is enforced at 10 requests per second per authenticated user across 10 protected endpoints. The rate limiter uses a sliding window algorithm that provides smooth throttling without the burst characteristics of fixed-window approaches.
┌──────────────────────────────────────────────────────────────────┐
│ RATE LIMITING MODEL │
│ │
│ Request Flow: │
│ │
│ Client Request ──▶ Rate Limit Check ──▶ Endpoint │
│ │ │
│ ┌─────┴─────┐ │
│ │ │ │
│ ALLOWED REJECTED │
│ │ (429) │
│ ▼ │
│ ┌───────────┐ │
│ │ Process │ │
│ │ Request │ │
│ └───────────┘ │
│ │
│ Protected Endpoints (10): │
│ ┌──────────────────────────────────────────────────┐ │
│ │ 1. POST /api/verify (single doc) │ │
│ │ 2. POST /api/verify/batch (batch upload) │ │
│ │ 3. GET /api/reports/{id} (report fetch) │ │
│ │ 4. GET /api/reports/{id}/pdf (PDF download) │ │
│ │ 5. GET /api/folders (list folders) │ │
│ │ 6. POST /api/folders (create folder) │ │
│ │ 7. GET /api/history (verify history)│ │
│ │ 8. POST /api/auth/token (token refresh) │ │
│ │ 9. GET /api/portals/status (portal health) │ │
│ │ 10. POST /api/feedback (verdict fdbk) │ │
│ └──────────────────────────────────────────────────┘ │
│ │
│ Rate: 10 requests/second per user (sliding window) │
└──────────────────────────────────────────────────────────────────┘
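A sliding-window limiter of this kind can be sketched as a per-user deque of request timestamps: timestamps older than the window are evicted before the count is checked. Illustrative only; the production limiter's storage backend and clock source are not specified here.

```python
import time
from collections import deque
from typing import Optional

class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per user."""

    def __init__(self, limit: int = 10, window: float = 1.0):
        self.limit, self.window = limit, window
        self.hits = {}  # user id -> deque of request timestamps

    def allow(self, user: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(user, deque())
        # Evict timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # caller responds with HTTP 429
        q.append(now)
        return True
```

Because the window slides continuously, a burst of 10 requests at the end of one second cannot be followed by 10 more at the start of the next, which is the fixed-window artifact the text refers to.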

10.3 Data Lifecycle
Document images and verification data follow a defined lifecycle that balances operational requirements with privacy considerations.
┌──────────────────────────────────────────────────────────────────┐
│ DATA LIFECYCLE │
│ │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ UPLOAD │──▶│PROCESS │──▶│ STORE │──▶│ PURGE │ │
│ └────────┘ └────────┘ └────────┘ └────────┘ │
│ │
│ Stage 1: UPLOAD │
│ ┌──────────────────────────────────────────┐ │
│ │ - Image received via TLS-encrypted conn │ │
│ │ - Validated for format and size │ │
│ │ - Assigned unique verification ID │ │
│ │ - Stored in temporary processing buffer │ │
│ └──────────────────────────────────────────┘ │
│ │
│ Stage 2: PROCESS │
│ ┌──────────────────────────────────────────┐ │
│ │ - Image normalized and analyzed │ │
│ │ - QR codes extracted │ │
│ │ - Portal data retrieved │ │
│ │ - AI models process document │ │
│ │ - All processing in-memory │ │
│ └──────────────────────────────────────────┘ │
│ │
│ Stage 3: STORE │
│ ┌──────────────────────────────────────────┐ │
│ │ - Verdict and scores persisted to SQLite │ │
│ │ - PDF report generated and stored │ │
│ │ - Original image reference maintained │ │
│ │ - Audit trail recorded │ │
│ └──────────────────────────────────────────┘ │
│ │
│ Stage 4: PURGE │
│ ┌──────────────────────────────────────────┐ │
│ │ - Configurable retention policy │ │
│ │ - Original images purged first │ │
│ │ - Reports retained per policy │ │
│ │ - Audit logs retained longest │ │
│ └──────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘

10.4 Privacy Considerations
The system is designed with several privacy safeguards:
- No biometric storage: The system does not extract, store, or process biometric data (fingerprints, facial recognition, iris patterns) from identity documents.
- Minimal data retention: Only verification results and metadata are retained long-term; original document images are purged according to configurable policies.
- No cross-user data sharing: Verification data from one user or institution is never shared with or visible to other users or institutions.
- Encrypted transit: All data in transit is encrypted via TLS. API communications with AI model providers are similarly encrypted.
- Audit trail: All verification actions are logged for compliance and accountability purposes.
11. Comparative Analysis: Automated vs Manual Verification
11.1 Performance Dimensions
A structured comparison between Turing Verify's automated approach and traditional manual verification reveals distinct advantages and trade-offs across multiple performance dimensions.
┌───────────────────────────────────────────────────────────────────────┐
│ AUTOMATED (TURING VERIFY) vs. MANUAL VERIFICATION │
├───────────────────┬─────────────────────┬─────────────────────────────┤
│ Dimension │ Turing Verify │ Manual Review │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Time per document │ 15-45 seconds │ 15-60 minutes │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Throughput │ ~100-200 docs/hour │ 4-8 docs/hour per reviewer │
│ (per worker) │ │ │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Consistency │ Deterministic per │ Varies by reviewer, │
│ │ model version │ time of day, fatigue │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Country coverage │ 25+ countries, │ Limited to reviewer's │
│ │ 56 portals │ jurisdiction expertise │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Checkpoint depth │ 81 checkpoints per │ ~10-20 checks typical │
│ │ document │ per manual review │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Auditability │ 12-section PDF with │ Written notes, variable │
│ │ full reasoning │ detail and format │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Cost per doc │ $0.50-$3.00 │ $15-$50+ │
│ (estimated) │ │ │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ QR verification │ 7-strategy auto │ Manual URL entry, │
│ │ extraction + lookup │ if attempted at all │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ ID checksum │ Algorithmic (6 │ Rarely performed │
│ validation │ countries) │ manually │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Cross-doc checks │ Systematic, all │ Depends on reviewer │
│ │ pairwise │ thoroughness │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ 24/7 availability │ Yes │ Business hours only │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Physical document │ No (digital only) │ Yes (UV, watermark, paper) │
│ examination │ │ │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Legal testimony │ Report as evidence │ Expert witness testimony │
├───────────────────┼─────────────────────┼─────────────────────────────┤
│ Novel attack │ Dependent on model │ Expert intuition may catch │
│ detection │ training + rules │ unprecedented patterns │
└───────────────────┴─────────────────────┴─────────────────────────────┘

11.2 Throughput Comparison
┌──────────────────────────────────────────────────────────────────┐
│ THROUGHPUT COMPARISON (per 8-hour shift) │
│ │
│ Documents Verified: │
│ │
│ Turing Verify ████████████████████████████████ 800-1600 │
│ (single instance) │
│ │
│ Expert Reviewer ████ 32-64 │
│ (single person) │
│ │
│ Junior Reviewer ██ 16-32 │
│ (single person) │
│ │
│ ├──────┼──────┼──────┼──────┤ │
│ 0 400 800 1200 1600 │
│ │
│ Cost per 1000 documents: │
│ │
│ Turing Verify ██ $500-$3,000 │
│ │
│ Expert Reviewers ████████████████████████████ $15,000-$50,000│
│ │
│ ├──────┼──────┼──────┼──────┤ │
│ $0 $12.5K $25K $37.5K $50K │
└──────────────────────────────────────────────────────────────────┘

11.3 AI vs Human Inspector Benchmark System
Version 2.0 introduces a comprehensive benchmark framework for quantitatively comparing AI verification performance against human document inspectors. The benchmark provides empirical evidence for the platform's effectiveness and identifies areas where human expertise remains superior.
11.3.1 Benchmark Methodology
The benchmark employs a 200-document dataset stratified across four difficulty tiers: trivial (obvious fakes detectable by untrained observers), easy (fakes with clear indicators visible to trained inspectors), medium (sophisticated forgeries requiring expert analysis), and hard (state-of-the-art forgeries designed to evade both human and automated detection). Documents span all 9 supported document categories and originate from 15+ countries.
Binary classification framework. Both AI and human inspectors produce verdicts that are mapped to a binary classification for benchmark comparison. REJECTED and SUSPECT verdicts are classified as "fraud detected," while VERIFIED verdicts are classified as "no fraud detected." This mapping enables standard binary classification metrics.
Human inspector protocol. Human inspectors participate under controlled conditions: blind testing (no prior knowledge of document provenance), randomized presentation order, automatic timing of each inspection, and fatigue monitoring with mandatory breaks. Inspectors are classified into three tiers based on experience: Junior (0-2 years), Senior (3-7 years), and Expert (8+ years).
11.3.2 Metric Categories
The benchmark evaluates performance across five metric categories, each weighted in a composite score:
| Category | Weight | Metrics | Description |
|---|---|---|---|
| Accuracy | 40% | F1 score, precision, recall | Correctness of fraud/no-fraud classification |
| Speed | 15% | Latency per document, throughput per hour | Time efficiency of the verification process |
| Cost | 15% | Cost per document, cost per fraud caught | Economic efficiency of the approach |
| Consistency | 15% | Multi-run agreement (AI), inter-rater agreement (human) | Reproducibility of results across repeated evaluations |
| Coverage | 15% | Abstention rate | Proportion of documents for which the system provides a definitive verdict |
Statistical significance testing. Benchmark comparisons employ McNemar's test for paired binary classification comparison and bootstrap confidence intervals (1000 resamples) for all metric estimates. Differences are reported as statistically significant at p < 0.05.
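Both techniques are standard and can be sketched directly. `mcnemar_chi2` computes the continuity-corrected statistic from the two discordant cells of the paired 2x2 table, and `bootstrap_ci` returns a percentile interval; the cell counts and sample values in the usage below are invented for illustration.

```python
import random

def mcnemar_chi2(b: int, c: int) -> float:
    """McNemar chi-square statistic with continuity correction.
    b = cases AI classified correctly and the human did not;
    c = cases the human classified correctly and the AI did not."""
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

def bootstrap_ci(values, stat=lambda v: sum(v) / len(v),
                 n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic
    (default: the mean), using n_resamples resamples with replacement."""
    rng = random.Random(seed)
    stats = sorted(stat([rng.choice(values) for _ in values])
                   for _ in range(n_resamples))
    lo = stats[int(alpha / 2 * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

With 1 degree of freedom, a chi-square statistic above 3.84 corresponds to p < 0.05.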
11.3.3 Key Findings
The benchmark dashboard presents results through confusion matrices, ROC curves, and cost-efficiency plots for each inspector tier and AI model configuration.
| Metric | AI (Standard) | AI (Deep) | Human Expert | Human Senior | Human Junior |
|---|---|---|---|---|---|
| F1 Score | 0.91 | 0.95 | 0.93 | 0.85 | 0.72 |
| Precision | 0.94 | 0.97 | 0.96 | 0.89 | 0.75 |
| Recall | 0.88 | 0.93 | 0.90 | 0.82 | 0.69 |
| Avg. latency | ~30s | ~90s | ~25 min | ~18 min | ~12 min |
| Cost/doc | ~$0.04 | ~$0.30 | ~$45 | ~$30 | ~$18 |
| Consistency | 98% | 99% | 87% | 79% | 68% |
| Abstention | 5% | 2% | 3% | 8% | 15% |
On the figures above, standard AI verification processes documents roughly 50 times faster than expert human inspectors (~30 seconds versus ~25 minutes) at a per-document cost about three orders of magnitude lower (~$0.04 versus ~$45). Deep Verification mode approaches or exceeds expert-level accuracy while maintaining the speed and consistency advantages of automated analysis.
11.4 Complementary Roles
The comparison is not intended to suggest that automated verification should entirely replace human review. The two approaches serve complementary roles in a comprehensive verification strategy:
- Automated verification (Turing Verify) excels at high-volume initial screening, systematic checkpoint evaluation, external portal cross-referencing, and consistent documentation of findings.
- Manual expert review excels at evaluating physical document properties, applying contextual judgment to ambiguous cases, providing expert testimony, and detecting truly novel attack patterns that fall outside known categories.
The NEEDS_REVIEW verdict category in Turing Verify's output is explicitly designed to bridge these approaches, flagging documents that warrant human expert attention while providing the automated analysis as a starting point for the reviewer.
12. Case Studies
The following case studies are drawn from the system's operational history. All identifying details have been anonymized, and institution names have been replaced with generic descriptors to protect the privacy of all parties involved.
12.1 Case A: Forged Academic Transcript -- Portal Mismatch Detection
Document type: Academic transcript (undergraduate) Claimed origin: A university in East Asia Attack vector family: Fabrication + Tampering
The submitted transcript appeared visually consistent with known templates from the claimed institution. Font usage, layout structure, and seal placement were within expected parameters. The structural analysis (Tier 1) produced a score in the acceptable range.
However, the document contained an embedded QR code that resolved to the institution's legitimate verification portal. When the system extracted data from the portal, the name and student ID matched, but the grades retrieved from the portal differed significantly from those displayed on the submitted transcript. Specifically, the cumulative GPA shown on the document was substantially higher than the portal-confirmed figure, and three course grades had been altered from their authentic values.
Checkpoints triggered: U-23 (Cumulative Calculation Accuracy), U-30 (Portal Data Match), AV-08 (Grade/Score Inflation). Notably, AV-13 (QR Code Replacement) was not triggered: the QR code itself was authentic -- only the visible grades had been altered.
Outcome: Verdict FAKE. The portal data mismatch on U-30 constituted a critical fail. The forensic report detailed the specific grade discrepancies, enabling the reviewing institution to make an informed decision.
12.2 Case B: Sophisticated Template Forgery -- Layout Anomaly Detection
Document type: Professional certification Claimed origin: An international certification body in Europe Attack vector family: Fabrication
This case represented a high-sophistication forgery that replicated the overall appearance of a legitimate professional certification with considerable fidelity. The forger had used accurate institutional branding, plausible administrative language, and a realistic certificate number format.
The system detected the forgery through a combination of structural anomalies invisible to casual inspection. The margin widths deviated from the institution's template by several millimeters, the font used for the certificate number was from a different typographic family than the institution's standard, and the institution's seal was positioned at coordinates inconsistent with any known template version. Additionally, the metadata analysis revealed that the document had been created using consumer image editing software rather than the institutional printing system.
Checkpoints triggered: U-01 (Template Conformance), U-02 (Font Consistency), U-04 (Seal/Stamp Authenticity), U-09 (Logo Fidelity), U-42 (Creation Tool Detection), AV-04 (Template Generator Artifacts).
Outcome: Verdict FAKE. The accumulation of structural anomalies drove the Tier 1 score below the threshold even before semantic analysis. The report documented each deviation with specific measurements and comparisons.
12.3 Case C: Cross-Document Inconsistency in Batch Submission
Document type: Multi-document applicant folder (passport, transcript, diploma, language certificate) Claimed origin: Multiple institutions across two countries Attack vector family: Identity Fraud
A batch submission for a single applicant contained four documents. Each document, when analyzed individually, produced acceptable scores. The passport was genuine, the transcript appeared legitimate, the diploma format matched known templates, and the language certificate contained a verifiable QR code.
Cross-document consistency analysis revealed the forgery. The name on the passport (in Latin script) and the name on the transcript (in CJK characters) did not correspond under standard transliteration mappings. The date of birth on the passport differed from that on the diploma by one year. Additionally, the graduation date on the transcript implied a timeline inconsistent with the issuance date of the language certificate, suggesting that documents from different individuals had been combined into a single application.
Checkpoints triggered: AV-15 (Name Mismatch Across Documents), AV-18 (Biographical Data Conflict), AV-20 (Alias Exploitation).
Outcome: Verdict FAKE for the folder, with individual document verdicts revised. The report identified which documents were likely authentic and which were likely substituted, providing the reviewing institution with actionable detail.
12.4 Case D: National ID Checksum Failure
Document type: National identity card Claimed origin: A country in Southeast Asia Attack vector family: Tampering
The submitted national ID card had been altered to change the identity number. The forger had carefully modified several digits in the ID number while maintaining the visual appearance of the original document. The structural analysis found no obvious visual tampering artifacts, and the general formatting matched expected templates.
The country-specific validation module applied the national checksum algorithm to the displayed ID number and found the check digit invalid for the given digit sequence. Because the checksum is deterministic, this failure established with mathematical certainty that the number was never validly issued, ruling out image quality or OCR error as explanations.
Checkpoints triggered: U-32 (National ID Checksum), AV-10 (Name Substitution -- the number change correlated with a likely identity swap), U-20 (Registration/ID Number Format).
Outcome: Verdict FAKE. The checksum failure constituted a critical fail. The report included the mathematical calculation demonstrating the invalidity of the presented ID number.
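Check-digit schemes of this kind generally compute a weighted sum of the identifier's digits modulo a small constant. The sketch below uses a generic weighted mod-11 scheme with made-up weights; each country's actual algorithm (weights, modulus, and check-digit mapping) differs and is not reproduced here.

```python
def weighted_check_digit(digits: str, weights: list, modulus: int = 11) -> int:
    """Compute a check digit as (modulus - weighted sum mod modulus)
    mod modulus. Illustrative scheme only, not any specific national
    algorithm."""
    total = sum(int(d) * w for d, w in zip(digits, weights))
    return (modulus - total % modulus) % modulus

def validate_id(id_number: str, weights: list, modulus: int = 11) -> bool:
    """True if the final digit matches the checksum of the preceding
    digits -- the test applied in Case D."""
    body, check = id_number[:-1], int(id_number[-1])
    return weighted_check_digit(body, weights, modulus) == check
```

Because any single-digit alteration changes the weighted sum, a tampered number almost always fails validation regardless of how cleanly the visual edit was performed.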
12.5 Case Study Summary
┌───────────────────────────────────────────────────────────────────────┐
│ CASE STUDY SUMMARY TABLE │
├──────┬───────────────┬───────────────┬───────────────┬────────────────┤
│ Case │ Document Type │ Attack Family │ Key Detection │ Verdict │
│ │ │ │ Method │ │
├──────┼───────────────┼───────────────┼───────────────┼────────────────┤
│ A │ Academic │ Fabrication + │ Portal data │ FAKE │
│ │ transcript │ Tampering │ mismatch │ (critical: │
│ │ │ │ (U-30) │ portal refute)│
├──────┼───────────────┼───────────────┼───────────────┼────────────────┤
│ B │ Professional │ Fabrication │ Template + │ FAKE │
│ │ certification │ │ metadata │ (structural │
│ │ │ │ anomalies │ failure) │
├──────┼───────────────┼───────────────┼───────────────┼────────────────┤
│ C │ Multi-doc │ Identity │ Cross-doc │ FAKE │
│ │ folder (4 │ Fraud │ name + date │ (folder-level) │
│ │ documents) │ │ inconsistency │ │
├──────┼───────────────┼───────────────┼───────────────┼────────────────┤
│ D │ National ID │ Tampering │ Checksum │ FAKE │
│ │ card │ │ algorithm │ (critical: │
│ │ │ │ failure (U-32)│ checksum fail)│
└──────┴───────────────┴───────────────┴───────────────┴────────────────┘

13. Deep Verification Mode
13.1 Overview
Version 2.0 introduces Deep Verification, a premium forensic investigation tier that employs Claude Sonnet 4 to perform an expanded 13-stage analysis pipeline. Deep Verification is designed for high-stakes verification scenarios where standard analysis is insufficient, such as executive hiring, large financial transactions, immigration adjudication, and institutional accreditation reviews.
Deep Verification consumes 5 credits per document (compared to 1 credit for standard verification) and produces an executive-grade forensic report with significantly greater depth than the standard 12-section report.
13.2 Verification Stages
The Deep Verification pipeline includes all 8 stages of the standard pipeline plus 5 additional specialized stages:
| Stage | Name | Standard | Deep | Description |
|---|---|---|---|---|
| 1 | Document Classification | Yes | Yes | Identify document type, country, and issuing institution |
| 2 | Structural Analysis | Yes | Yes | Template conformance, layout, visual element review |
| 3 | Semantic Analysis | Yes | Yes | Content plausibility, date logic, grade validation |
| 4 | QR Code Analysis | Yes | Yes | QR detection, extraction, domain trust, portal match |
| 5 | External Verification | Yes | Yes | Portal results, checksum validation, MRZ parsing |
| 6 | Metadata Analysis | Yes | Yes | EXIF data, compression analysis, creation tool info |
| 7 | Attack Vector Assessment | Yes | Yes | Triggered AV defenses, severity, evidence |
| 8 | Scoring and Verdict | Yes | Yes | T1/T2 scoring, verdict determination |
| 9 | Deep Font & Typography Forensics | No | Yes | Typeface identification, glyph metrics, kerning analysis, baseline alignment |
| 10 | Seal & Watermark Deep Scan | No | Yes | Spectral signature analysis, embossing detection, holographic feature analysis |
| 11 | Institutional Deep Cross-Reference | No | Yes | Template matching, signatory verification, accreditation chain analysis |
| 12 | KYB -- Institution Due Diligence | No | Yes | Issuer credibility scoring, fraud rate assessment, regulatory standing |
| 13 | KYC -- Holder Background Screening | No | Yes | Sanctions check, PEP screening, adverse media, identity consistency |
13.3 Deep Confidence Dimensions
Standard verification produces a single-dimensional confidence score. Deep Verification expands this to 8 distinct confidence dimensions, each scored independently on a 0-100 scale:
| Dimension | Description | Weight |
|---|---|---|
| visual_integrity | Physical appearance, layout, print quality, and visual consistency | 15% |
| content_accuracy | Semantic correctness, date logic, grade validity, and content plausibility | 15% |
| institutional_match | Template conformance, signatory verification, accreditation chain | 15% |
| temporal_consistency | Date plausibility, timeline coherence, version anachronism detection | 10% |
| security_features | Seal authenticity, watermark analysis, holographic indicators, spectral signatures | 15% |
| holder_verification | Name consistency, identity coherence, cross-document agreement | 10% |
| kyb_credibility | Issuing institution credibility, accreditation status, fraud rate, regulatory standing | 10% |
| kyc_clearance | Holder sanctions screening, PEP status, adverse media, credibility risk | 10% |
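The composite deep confidence score is then a weighted average of the eight dimension scores. A direct sketch using the weights from the table (the function name and error handling are assumptions):

```python
# Weights from the table above; they sum to 1.0.
DIMENSION_WEIGHTS = {
    "visual_integrity": 0.15,
    "content_accuracy": 0.15,
    "institutional_match": 0.15,
    "temporal_consistency": 0.10,
    "security_features": 0.15,
    "holder_verification": 0.10,
    "kyb_credibility": 0.10,
    "kyc_clearance": 0.10,
}

def composite_confidence(scores: dict) -> float:
    """Weighted average of the eight 0-100 dimension scores."""
    missing = set(DIMENSION_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(scores[d] * w for d, w in DIMENSION_WEIGHTS.items())
```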
13.4 Deep Forensic Report Structure
The Deep Verification report expands significantly beyond the standard 12-section report to include the following sections:
- Executive Summary -- High-level verdict, confidence level, and critical findings with business-ready language
- Confidence Breakdown -- Detailed scoring across all 8 confidence dimensions with radar visualization
- Font Forensics -- Typeface identification, glyph metric analysis, kerning consistency, baseline alignment measurements
- Seal & Watermark Analysis -- Spectral signatures, embossing depth assessment, holographic feature mapping
- Cross-Reference Report -- Template matching results, signatory verification, accreditation chain validation
- KYB Report -- Institution Due Diligence -- Credibility score (0-100), fraud rate assessment, accreditation status, ranking tier (Top 100 Global, National Tier 1, Regional, etc.), regulatory standing and compliance history
- KYC Report -- Holder Background Screening -- Sanctions check (OFAC, EU, UN), PEP screening, adverse media analysis, identity consistency assessment, credibility risk profiling (Low/Medium/High)
- Risk Assessment -- Attack vector analysis, forgery technique identification, professional forensic opinion
- Document Lineage -- Provenance analysis and document history reconstruction
- Comparative Analysis -- Comparison against known authentic specimens from the same institution
- Methodology Notes -- Detailed description of analysis techniques applied and their confidence contributions
- Standard Forensic Sections -- All 12 sections from the standard report are included as appendix material
13.5 Premium User Interface
Deep Verification features a distinct visual experience to differentiate it from standard analysis:
- Purple-themed scanning animation with orbiting particle effects during the 13 analysis stages
- Premium "OPUS" badge and stage-specific labels ("DEEP SCAN", "KYB", "KYC") displayed during processing
- Gradient shimmer effects on the deep report header
- Purple X-ray scan beam with grid overlay and corner brackets during document analysis
- Collapsible forensic report sections with purple accent borders for easy navigation of the expanded report
14. KYB -- Know Your Business (Issuer Verification)
14.1 Overview
KYB (Know Your Business) is a Deep Verification stage that evaluates the credibility and legitimacy of the document-issuing institution. While standard verification confirms that an institution exists (U-35) and checks its accreditation status (U-36), KYB performs a comprehensive due diligence assessment that mirrors the institutional vetting conducted by financial regulators and accreditation bodies.
14.2 Assessment Dimensions
The KYB module evaluates issuing institutions across the following dimensions:
Credibility Score (0-100). A composite score reflecting the overall trustworthiness of the issuing institution, derived from accreditation status, ranking position, regulatory history, and fraud incident records.
Accreditation Database Cross-Reference. The system checks against authoritative accreditation databases including CHEA (Council for Higher Education Accreditation), QS World Rankings, Times Higher Education, national government registries, and regional accreditation bodies. Institutions are classified by their highest confirmed accreditation level.
Ranking Tier Classification. Institutions are classified into ranking tiers that provide context for the credibility score:
| Tier | Description | Example Indicators |
|---|---|---|
| Top 100 Global | Institutions consistently ranked in major global rankings | QS/THE Top 100, high research output |
| National Tier 1 | Leading institutions within their country | Top 10 nationally, strong accreditation |
| National Tier 2 | Established institutions with solid credentials | National accreditation, moderate ranking |
| Regional | Institutions primarily serving regional populations | Regional accreditation, limited rankings |
| Unranked/Unaccredited | Institutions without recognized accreditation | No verified accreditation, potential diploma mill indicators |
Fraud Rate Assessment. The system evaluates the historical rate of fraudulent documents claiming to originate from the institution, drawing on internal verification data and publicly available reports of credential fraud.
Regulatory Standing. Assessment of the institution's compliance with relevant regulatory frameworks, including any known compliance issues, sanctions, investigations, scandals, or enforcement actions. Government deregistrations and license revocations are flagged as critical findings.
14.3 KYB Output
The KYB section of the Deep Verification report includes:
- Institution credibility score with confidence interval
- Accreditation status summary with source citations
- Ranking tier with supporting evidence
- Fraud rate assessment (Low/Medium/High/Critical)
- Regulatory standing summary
- Known compliance issues or fraud incidents, if any
- Professional opinion on institutional credibility risk
15. KYC -- Know Your Customer (Holder Screening)
15.1 Overview
KYC (Know Your Customer) is a Deep Verification stage that performs background screening on the document holder -- the individual whose name appears on the document. This screening is modeled on financial sector KYC practices and provides an additional layer of verification beyond document-centric analysis.
15.2 Screening Components
Name Consistency Verification. The KYC module extends the standard name consistency check (U-18) by analyzing name variations across all document fields, including printed name, signature block, MRZ data, QR-encoded data, and any secondary name fields. Discrepancies that fall within expected transliteration variation are distinguished from those that suggest identity manipulation.
Sanctions and Watchlist Screening. The holder's name is screened against major sanctions lists:
| List | Jurisdiction | Coverage |
|---|---|---|
| OFAC SDN | United States | Specially Designated Nationals and Blocked Persons |
| EU Consolidated | European Union | Persons subject to EU restrictive measures |
| UN Security Council | International | Individuals subject to UN sanctions |
Politically Exposed Person (PEP) Screening. The system checks whether the document holder matches profiles of politically exposed persons, defined as individuals who hold or have recently held prominent public functions. PEP status does not constitute a negative finding but is flagged as a risk factor requiring enhanced due diligence.
Adverse Media Screening. Analysis of publicly available information for negative news coverage, legal proceedings, regulatory actions, or other adverse information associated with the holder's name and identifying details.
Identity Consistency Analysis. Comprehensive cross-validation of all identity-related data points across the document and any associated documents in the applicant folder, including age plausibility, biographical timeline coherence, and geographic consistency.
15.3 Credibility Risk Profiling
The KYC module produces a credibility risk classification for the document holder:
| Risk Level | Criteria | Recommended Action |
|---|---|---|
| Low | No sanctions hits, no PEP status, no adverse media, consistent identity | Standard processing |
| Medium | Minor adverse media, PEP associate, or minor identity discrepancies | Enhanced review recommended |
| High | Sanctions match, PEP principal, significant adverse media, or identity inconsistencies | Manual review required |
16. Credit-Based Pricing System
16.1 Credit Model
Version 2.0 introduces a credit-based pricing system that provides flexible access to both standard and deep verification capabilities. Credits serve as the universal unit of consumption across all verification tiers.
| Verification Type | Credit Cost | AI Model | Stages | Report Type |
|---|---|---|---|---|
| Standard | 1 credit | Sonnet 4 / GPT-5.3 | 8 | 12-section PDF |
| Deep | 5 credits | Sonnet 4 | 13 | Executive forensic report |
Credit consumption is tracked in the database via the credit_cost column on each verification record, with quota usage calculated as SUM(COALESCE(credit_cost, 1)) to maintain backward compatibility with pre-v2.0 records.
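The COALESCE fallback can be demonstrated directly. In this self-contained sketch (table layout beyond the credit_cost column is illustrative), pre-v2.0 rows with NULL credit_cost are counted as 1 credit each:

```python
import sqlite3

# In-memory demo of the backward-compatible quota formula quoted above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE verifications (id INTEGER PRIMARY KEY, credit_cost INTEGER)"
)
conn.executemany(
    "INSERT INTO verifications (credit_cost) VALUES (?)",
    [(None,), (None,), (1,), (5,)],  # 2 legacy rows, 1 standard, 1 deep
)
(used,) = conn.execute(
    "SELECT SUM(COALESCE(credit_cost, 1)) FROM verifications"
).fetchone()
print(used)  # 2*1 + 1 + 5 = 8
```

Without COALESCE, the two legacy NULL rows would be ignored by SUM and the quota would undercount by two credits.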
16.2 Pricing Tiers
Subscription Plans:
| Plan | Monthly Price | Credits/Month | Per-Credit Cost | Target User |
|---|---|---|---|---|
| Free | $0 | 5 | N/A | Individual evaluation |
| Personal | $9 | 15 | $0.60 | Freelance recruiters, small offices |
| Pro | $29 | 200 | $0.145 | HR departments, admissions offices |
| Business | $99 | 1,000 | $0.099 | Large institutions, agencies |
Marketplace Credit Packs (one-time purchase):
| Pack | Price | Credits | Per-Credit Cost |
|---|---|---|---|
| Single | $0.50 | 1 | $0.50 |
| 10-Pack | $4.00 | 10 | $0.40 |
| 25-Pack | $7.50 | 25 | $0.30 |
16.3 Cost Economics
The platform achieves strong unit economics through the multi-model architecture and prompt caching optimizations:
| Metric | Standard Verification | Deep Verification |
|---|---|---|
| AI inference cost | ~$0.04 | ~$0.25-$0.35 |
| Infrastructure overhead | ~$0.005 | ~$0.01 |
| Total COGS | ~$0.045 | ~$0.26-$0.36 |
| Revenue (Pro plan rate) | $0.145 | $0.725 |
| Gross margin | ~69% | ~50-64% |
| Revenue (marketplace rate) | $0.30-$0.50 | $1.50-$2.50 |
| Gross margin (marketplace) | ~85-91% | ~76-86% |
In blended usage scenarios that mix standard and deep verifications, overall gross margin falls between the per-tier figures above, ranging from roughly 50% (deep verifications at the Pro plan rate) to 91% (standard verifications at marketplace rates) depending on the verification mix and purchase channel.
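The margin figures in the table follow from the standard gross-margin formula, margin = (revenue − COGS) / revenue, applied to the per-credit rates and COGS estimates above:

```python
# Reproduces the gross-margin arithmetic from the table above.
def gross_margin(revenue: float, cogs: float) -> float:
    return (revenue - cogs) / revenue

# Standard verification at the Pro per-credit rate ($0.145, COGS ~$0.045):
print(round(gross_margin(0.145, 0.045), 2))      # ~0.69
# Deep verification (5 credits) at the Pro rate, high-end COGS ($0.36):
print(round(gross_margin(5 * 0.145, 0.36), 2))   # ~0.50
```

The same function applied to the marketplace rates ($0.30-$0.50 per credit) yields the 85-91% standard and 76-86% deep figures quoted in the table.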
17. Limitations and Future Work
17.1 Current Limitations
The Turing Verify platform, while comprehensive in its current implementation, operates within several known constraints:
Physical document analysis. The system processes document images, not physical documents. Security features that require physical inspection -- such as UV-reactive inks, tactile embossing, or microprinting below image resolution -- cannot be evaluated. Documents that pass automated verification may still warrant physical examination in high-stakes scenarios.
Template coverage. The 18 institution templates cover frequently encountered issuers but represent a small fraction of the global population of credential-issuing institutions. Documents from untemplated institutions receive general structural analysis without the enhanced detection sensitivity that template matching provides.
Portal availability. The 56 integrated portals provide verification coverage for many common document sources, but the majority of credential-issuing institutions worldwide do not operate public verification portals. For documents without portal verification, the system relies entirely on forensic analysis without external data corroboration.
Language limitations. While the platform supports 5 interface languages, the forensic analysis is conducted in the language of the underlying AI models. Documents in less common languages may receive less precise semantic analysis than documents in widely spoken languages.
Adversarial evolution. As with any security system, the effectiveness of specific checkpoints may degrade as forgers develop techniques to evade them. The calibration feedback loop mitigates this, but the system's detection capabilities at any point in time reflect its current training state.
AI model dependency. The system's forensic capabilities are fundamentally dependent on the capabilities of the underlying AI models. Changes in model behavior across versions, API rate limits, or service disruptions directly impact the system's operation.
17.2 Expansion Roadmap
┌──────────────────────────────────────────────────────────────────────┐
│ TECHNOLOGY ROADMAP │
│ │
│ 2026 Q2 2026 Q3 2026 Q4 2027 Q1 │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │Phase │ │Phase │ │Phase │ │Phase │ │
│ │ 1 │ │ 2 │ │ 3 │ │ 4 │ │
│ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │- 20 addtl │ │- Dedicated │ │- Physical │ │- Blockchain│ │
│ │ portal │ │ fine-tuned│ │ security │ │ credential│ │
│ │ integratns│ │ document │ │ feature │ │ anchoring │ │
│ │ │ │ analysis │ │ detection │ │ support │ │
│ │- 3 addtl │ │ model │ │ via high- │ │ │ │
│ │ ID chksum │ │ │ │ res image │ │- Enterprise│ │
│ │ countries │ │- Batch │ │ analysis │ │ API tier │ │
│ │ │ │ analytics │ │ │ │ │ │
│ │- 5 addtl │ │ dashboard │ │- 3 addtl │ │- 10 addtl │ │
│ │ languages │ │ │ │ interface │ │ languages │ │
│ │ │ │- Webhook │ │ languages │ │ │ │
│ │- 30 addtl │ │ notifica- │ │ │ │- On-premise│ │
│ │ inst. │ │ tions │ │- Custom │ │ deploy- │ │
│ │ templates │ │ │ │ template │ │ ment │ │
│ │ │ │- SAML SSO │ │ builder │ │ option │ │
│ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
│ │
│ Ongoing: Calibration case expansion, AV catalog updates, │
│ portal maintenance, model version tracking │
│ │
│ ──────────────────────────────────────────────────────▶ Time │
│ Portal Coverage: 56──▶76──────▶90──────────▶100+ │
│ Template Library: 18──▶48──────▶60──────────▶80+ │
│ Language Support: 5──▶ 5──────▶ 8──────────▶18 │
│ ID Checksum: 6──▶ 9──────▶ 9──────────▶12 │
└──────────────────────────────────────────────────────────────────────┘

17.3 Research Directions
Several areas of ongoing research may inform future system capabilities:
Dedicated document analysis models. While general-purpose multi-modal models provide strong forensic analysis capabilities, models fine-tuned specifically for document verification may achieve superior performance on domain-specific tasks such as seal authentication and font analysis.
Physical security feature inference. Research into inferring the presence of physical security features (embossing, UV ink, microprinting) from high-resolution document scans may eventually enable partial physical feature analysis from digital images.
Generative AI watermarking. As generative AI models increasingly embed invisible watermarks in their outputs, the ability to detect these watermarks in document images may provide a direct signal for AI-generated forgeries.
Federated verification networks. Collaboration between verification platforms could enable cross-platform intelligence sharing about emerging forgery patterns without compromising individual document privacy.
Continuous model evaluation. Systematic evaluation frameworks that track model performance on verification tasks across model versions will be essential for maintaining confidence as underlying AI capabilities evolve.
18. Conclusion
Document fraud represents a systemic threat to the institutions and processes that depend on credential verification. The convergence of generative AI capabilities with readily available document templates has expanded the accessibility and sophistication of forgery techniques, creating a widening gap between attack capabilities and traditional detection methods.
Turing Verify addresses this challenge through a multi-model forensic architecture that combines rapid pre-screening, systematic checkpoint evaluation, external portal verification, and country-specific validation into a unified verification pipeline. The platform's 81 total checkpoints -- comprising 45 forensic rules and 36 attack vector defenses -- provide comprehensive coverage across structural, semantic, external, and metadata dimensions of document authenticity.
Version 2.0 significantly expands the platform's capabilities with the introduction of Deep Verification mode, powered by Claude Sonnet 4. The 13-stage deep analysis pipeline extends standard forensic evaluation with specialized font forensics, spectral seal and watermark analysis, institutional deep cross-referencing, KYB issuer due diligence, and KYC holder background screening. These additions transform Turing Verify from a document-centric verification tool into a comprehensive due diligence platform that evaluates the document, its issuing institution, and its holder as an integrated whole.
The AI vs Human Inspector benchmark framework, evaluated across a 200-document dataset with four difficulty tiers, demonstrates that the platform processes documents approximately 400 times faster than human inspectors at roughly 150 times lower cost per fraud caught, while achieving accuracy comparable to expert human inspectors. Deep Verification mode approaches or exceeds expert-level F1 scores across all difficulty tiers.
The system's integration with 56 verification portals across 25+ countries enables external corroboration that significantly strengthens verification confidence. Country-specific modules for national ID checksum validation (6 countries), business registration format validation (9 countries), and MRZ parsing (3 ICAO 9303 standards) provide jurisdiction-appropriate checks that generalized approaches cannot match.
The two-tier scoring model and three-verdict output (PASS, NEEDS_REVIEW, FAKE) are designed to support institutional decision-making rather than replace it. Standard verifications produce 12-section forensic PDF reports, while Deep Verifications generate executive-grade forensic reports with confidence breakdowns across 8 dimensions. The NEEDS_REVIEW category explicitly bridges automated and manual verification, identifying documents that warrant expert attention while providing the automated analysis as context for that review.
The credit-based pricing system introduced in v2.0 provides flexible access to both standard (1 credit) and deep (5 credit) verification capabilities, with subscription plans ranging from free individual use to enterprise-scale deployment. Prompt caching optimizations achieve up to 90% input cost reduction on Anthropic models, sustaining strong gross margins across both subscription and marketplace channels.
Calibration against 15 ground-truth cases with a continuous feedback loop ensures that the system's detection capabilities track the evolving forgery landscape. The architecture's modularity -- with separable client, API, processing, AI, integration, and persistence layers -- enables independent evolution of each component as requirements, capabilities, and threats change.
The platform does not claim to eliminate document fraud or replace all forms of verification. Physical document examination, direct source verification from issuing institutions, and blockchain-anchored credentials each retain important roles in a comprehensive verification ecosystem. Turing Verify's contribution is to the high-volume, time-sensitive middle ground where institutions need systematic forensic analysis at scale, delivered consistently, documented thoroughly, and integrated with the growing network of institutional verification portals.
As the tools available to forgers continue to advance, the tools available to verifiers must advance correspondingly. Multi-model AI architectures, integrated with external verification infrastructure, calibrated against known ground truth, and augmented with KYB/KYC due diligence capabilities, represent the current frontier of scalable document verification technology.
Appendix A: Forensic Checkpoint Reference
Complete catalog of all 45 forensic rules (U-01 to U-45), organized by category.
A.1 Structural Rules
| Rule ID | Rule Name | Category | Weight | Critical? |
|---|---|---|---|---|
| U-01 | Template Conformance | Structural | High | No |
| U-02 | Font Consistency | Structural | Medium | No |
| U-03 | Alignment Integrity | Structural | Medium | No |
| U-04 | Seal/Stamp Authenticity | Structural | High | No |
| U-05 | Signature Presence | Structural | Medium | No |
| U-06 | Paper/Background Uniformity | Structural | Medium | No |
| U-07 | Print Quality Assessment | Structural | Low | No |
| U-08 | Border and Frame Integrity | Structural | Low | No |
| U-09 | Logo Fidelity | Structural | Medium | No |
| U-10 | Watermark Analysis | Structural | High | No |
| U-11 | Hologram/Security Feature Indicators | Structural | Medium | No |
| U-12 | Image Resolution Consistency | Structural | Medium | No |
A.2 Semantic Rules
| Rule ID | Rule Name | Category | Weight | Critical? |
|---|---|---|---|---|
| U-13 | Date Plausibility | Semantic | High | No |
| U-14 | Grade/Score Validity | Semantic | High | No |
| U-15 | Course Load Plausibility | Semantic | Medium | No |
| U-16 | Institutional Language | Semantic | Medium | No |
| U-17 | Credential Designation | Semantic | Medium | No |
| U-18 | Name Consistency | Semantic | High | No |
| U-19 | Address/Location Validity | Semantic | Medium | No |
| U-20 | Registration/ID Number Format | Semantic | High | No |
| U-21 | Grading System Consistency | Semantic | Medium | No |
| U-22 | Credit Hour Validation | Semantic | Low | No |
| U-23 | Cumulative Calculation Accuracy | Semantic | High | No |
| U-24 | Signatory Title Plausibility | Semantic | Low | No |
| U-25 | Language and Grammar | Semantic | Low | No |
| U-26 | Content Completeness | Semantic | Medium | No |
A.3 External Verification Rules
| Rule ID | Rule Name | Category | Weight | Critical? |
|---|---|---|---|---|
| U-27 | QR Code Presence | External | Medium | No |
| U-28 | QR Data Extraction | External | Medium | No |
| U-29 | QR Domain Trust | External | High | No |
| U-30 | Portal Data Match | External | Very High | Yes |
| U-31 | Portal Existence Verification | External | High | No |
| U-32 | National ID Checksum | External | Very High | Yes |
| U-33 | Business Registration Format | External | High | No |
| U-34 | MRZ Validation | External | Very High | Yes |
| U-35 | Institution Existence | External | High | No |
| U-36 | Accreditation Status | External | Medium | No |
| U-37 | Document Number Cross-Reference | External | High | No |
A.4 Metadata Rules
| Rule ID | Rule Name | Category | Weight | Critical? |
|---|---|---|---|---|
| U-38 | EXIF Data Analysis | Metadata | Medium | No |
| U-39 | Compression Artifact Analysis | Metadata | Medium | No |
| U-40 | Color Space Consistency | Metadata | Low | No |
| U-41 | Resolution Metadata Match | Metadata | Low | No |
| U-42 | Creation Tool Detection | Metadata | High | No |
| U-43 | Modification History | Metadata | Medium | No |
| U-44 | Embedded Object Analysis | Metadata | Medium | No |
| U-45 | Digital Signature Verification | Metadata | High | No |
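The catalog above, together with the Critical Fail semantics defined in the glossary (a critical-rule failure forces a FAKE verdict regardless of score), can be sketched as a verdict function. The numeric weight mapping, thresholds, and aggregation below are illustrative assumptions, not the platform's calibrated T1/T2 parameters:

```python
# Hypothetical verdict logic driven by the Appendix A catalog.
WEIGHT = {"Low": 1, "Medium": 2, "High": 3, "Very High": 4}  # assumed mapping
CRITICAL_RULES = {"U-30", "U-32", "U-34"}  # the Critical? = Yes rows (A.3)

def verdict(results: dict[str, tuple[bool, str]]) -> str:
    """results: rule_id -> (passed, weight_label). Returns PASS,
    NEEDS_REVIEW, or FAKE."""
    # Any critical-rule failure forces FAKE regardless of the score.
    if any(not ok for rid, (ok, _) in results.items() if rid in CRITICAL_RULES):
        return "FAKE"
    total = sum(WEIGHT[w] for _, w in results.values())
    earned = sum(WEIGHT[w] for ok, w in results.values() if ok)
    score = 100 * earned / total
    if score >= 85:                       # illustrative thresholds
        return "PASS"
    return "NEEDS_REVIEW" if score >= 60 else "FAKE"
```

Under this sketch, a failed U-32 (National ID Checksum) short-circuits to FAKE even if every other checkpoint passes, while a handful of low-weight failures lands in NEEDS_REVIEW for manual escalation.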
Appendix B: Attack Vector Catalog
Complete catalog of all 36 attack vectors (AV-01 to AV-36), organized by family.
B.1 Fabrication Family
| AV ID | Attack Vector | Severity | Detection Stage | Primary Checkpoints |
|---|---|---|---|---|
| AV-01 | Vendor Watermark Forgery | Critical | Pre-Screen | U-10, U-06 |
| AV-02 | Phantom Institution | Critical | Pre-Screen | U-35, U-36 |
| AV-03 | Placeholder Text | High | Pre-Screen | U-16, U-26 |
| AV-04 | Template Generator Artifacts | High | Pre-Screen | U-01, U-08 |
| AV-05 | AI-Generated Document | High | Pre-Screen | U-38, U-42 |
| AV-06 | Stock Image Insertion | Medium | Pre-Screen | U-06, U-12 |
| AV-07 | Blank Template Filling | Medium | Full Forensic | U-02, U-07 |
B.2 Tampering Family
| AV ID | Attack Vector | Severity | Detection Stage | Primary Checkpoints |
|---|---|---|---|---|
| AV-08 | Grade/Score Inflation | High | Full Forensic | U-14, U-23 |
| AV-09 | Date Alteration | Medium | Full Forensic | U-13, U-02 |
| AV-10 | Name Substitution | High | Full Forensic | U-18, U-32 |
| AV-11 | Photo Replacement | High | Full Forensic | U-12, U-39 |
| AV-12 | Seal/Stamp Overlay | Medium | Full Forensic | U-04, U-06 |
| AV-13 | QR Code Replacement | High | QR Scanner | U-28, U-30 |
| AV-14 | Selective Content Removal | Medium | Full Forensic | U-26, U-03 |
B.3 Identity Fraud Family
| AV ID | Attack Vector | Severity | Detection Stage | Primary Checkpoints |
|---|---|---|---|---|
| AV-15 | Name Mismatch Across Documents | High | Cross-Doc | U-18 (cross) |
| AV-16 | ID Number Recycling | High | Full Forensic | U-20, U-37 |
| AV-17 | Photo-ID Inconsistency | High | Cross-Doc | U-12 (cross) |
| AV-18 | Biographical Data Conflict | Medium | Cross-Doc | U-13, U-19 |
| AV-19 | Nationality/Jurisdiction Mismatch | Medium | Full Forensic | U-19, U-34 |
| AV-20 | Alias Exploitation | Medium | Cross-Doc | U-18 (cross) |
B.4 Institutional Fraud Family
| AV ID | Attack Vector | Severity | Detection Stage | Primary Checkpoints |
|---|---|---|---|---|
| AV-21 | Accreditation Misrepresentation | High | Full Forensic | U-36 |
| AV-22 | Defunct Institution Exploitation | Medium | Full Forensic | U-35 |
| AV-23 | Program/Degree Fabrication | Medium | Full Forensic | U-17, U-35 |
| AV-24 | Template Version Anachronism | Medium | Full Forensic | U-01, U-13 |
| AV-25 | Signatory Impersonation | Low | Full Forensic | U-24, U-05 |
| AV-26 | Campus/Branch Misattribution | Low | Full Forensic | U-19, U-01 |
B.5 Digital Forgery Family
| AV ID | Attack Vector | Severity | Detection Stage | Primary Checkpoints |
|---|---|---|---|---|
| AV-27 | Layer Compositing | High | Full Forensic | U-39, U-12 |
| AV-28 | Color Space Manipulation | Medium | Full Forensic | U-40 |
| AV-29 | Resolution Stitching | Medium | QR Scanner | U-12, U-41 |
| AV-30 | Metadata Spoofing | Medium | Full Forensic | U-38, U-41 |
| AV-31 | Screenshot-of-Printout Attack | Low | QR Scanner | U-07, U-12 |
B.6 Evasion Family
| AV ID | Attack Vector | Severity | Detection Stage | Primary Checkpoints |
|---|---|---|---|---|
| AV-32 | Deliberate Image Degradation | Low | Full Forensic | U-07, U-12 |
| AV-33 | Partial Document Submission | Medium | Full Forensic | U-26 |
| AV-34 | Non-Standard Orientation | Low | Pre-Processing | U-03 |
| AV-35 | Embedded Steganographic Data | Low | Full Forensic | U-44 |
| AV-36 | Adversarial Perturbation | Medium | Full Forensic | U-38, U-44 |
Appendix C: Verification Portal Directory
Catalog of 56 verification portals organized by region. Portal names are generalized to protect specific integration details.
C.1 East Asia (18 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 1 | Taiwan | University verification | QR-redirect | Transcripts, diplomas |
| 2 | Taiwan | University verification | Web-based | Transcripts, diplomas |
| 3 | Taiwan | University verification | QR-redirect | Transcripts, diplomas |
| 4 | Taiwan | Government credential registry | API | Professional licenses |
| 5 | Taiwan | University verification | Web-based | Transcripts |
| 6 | Taiwan | Certification body | Web-based | Professional certs |
| 7 | Hong Kong | University verification | QR-redirect | Transcripts, diplomas |
| 8 | Hong Kong | University verification | Web-based | Transcripts |
| 9 | Hong Kong | Government registry | API | Business registrations |
| 10 | Hong Kong | Professional body | Web-based | Professional certs |
| 11 | Japan | University verification | Web-based | Transcripts, diplomas |
| 12 | Japan | University verification | Web-based | Transcripts |
| 13 | Japan | Government registry | API | Business registrations |
| 14 | Japan | Certification body | Web-based | Professional certs |
| 15 | South Korea | University verification | QR-redirect | Transcripts, diplomas |
| 16 | South Korea | University verification | Web-based | Transcripts |
| 17 | South Korea | Government registry | API | National IDs |
| 18 | China | Credential verification service | Web-based | Academic credentials |
C.2 Southeast Asia (11 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 19 | Singapore | University verification | QR-redirect | Transcripts, diplomas |
| 20 | Singapore | Government registry | API | Business registrations |
| 21 | Singapore | Professional body | Web-based | Professional certs |
| 22 | Malaysia | University verification | Web-based | Transcripts, diplomas |
| 23 | Malaysia | Government registry | Web-based | National IDs |
| 24 | Thailand | University verification | Web-based | Transcripts |
| 25 | Thailand | Government registry | Web-based | National IDs |
| 26 | Philippines | University verification | Web-based | Transcripts, diplomas |
| 27 | Philippines | Government registry | Web-based | Professional licenses |
| 28 | Vietnam | University verification | Web-based | Transcripts |
| 29 | Indonesia | Government registry | API | National IDs |
C.3 South Asia (8 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 30 | India | University verification | Web-based | Transcripts, diplomas |
| 31 | India | University verification | Web-based | Transcripts |
| 32 | India | Government registry | API | National IDs |
| 33 | India | Professional body | Web-based | Professional certs |
| 34 | India | Certification body | Web-based | Training certs |
| 35 | Sri Lanka | University verification | Web-based | Transcripts |
| 36 | Pakistan | University verification | Web-based | Transcripts |
| 37 | Bangladesh | University verification | Web-based | Transcripts |
C.4 Europe (6 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 38 | United Kingdom | University verification | QR-redirect | Transcripts, diplomas |
| 39 | United Kingdom | Government registry | API | Business registrations |
| 40 | United Kingdom | Professional body | Web-based | Professional certs |
| 41 | Germany | University verification | Web-based | Transcripts |
| 42 | France | Government credential registry | API | Professional licenses |
| 43 | Netherlands | University verification | Web-based | Transcripts |
C.5 North America (5 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 44 | United States | University verification | QR-redirect | Transcripts, diplomas |
| 45 | United States | University verification | Web-based | Transcripts |
| 46 | United States | Government registry | API | Business registrations |
| 47 | Canada | University verification | Web-based | Transcripts, diplomas |
| 48 | Canada | Government registry | API | Business registrations |
C.6 Oceania (4 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 49 | Australia | University verification | QR-redirect | Transcripts, diplomas |
| 50 | Australia | Government registry | API | Business registrations |
| 51 | Australia | Professional body | Web-based | Professional certs |
| 52 | New Zealand | University verification | Web-based | Transcripts |
C.7 Middle East (3 portals)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 53 | UAE | Government credential registry | API | Professional licenses |
| 54 | Saudi Arabia | Government registry | Web-based | Business registrations |
| 55 | Israel | University verification | Web-based | Transcripts |
C.8 Africa (1 portal)
| # | Country | Portal Type | Integration | Document Types |
|---|---|---|---|---|
| 56 | South Africa | University verification | Web-based | Transcripts, diplomas |
Appendix D: Country Validation Matrix
Comprehensive matrix of country-specific validation capabilities.
┌──────────────────────────────────────────────────────────────────────────────────┐
│ COUNTRY VALIDATION MATRIX │
├───────────────┬────────┬────────┬────────┬────────┬────────┬────────┬────────────┤
│ │ ID │ ID │ Biz │ Biz │ MRZ │ Portal │ Institution│
│ Country │Checksum│ Format │ Reg │ Reg │ Support│ Count │ Templates │
│ │ Algo │ Valid. │ Format │ Valid. │ │ │ │
├───────────────┼────────┼────────┼────────┼────────┼────────┼────────┼────────────┤
│ Taiwan │ Yes │ Yes │ Yes │ Yes │ TD1 │ 6 │ 3 │
│ Hong Kong │ Yes │ Yes │ Yes │ Yes │ TD1 │ 4 │ 2 │
│ Singapore │ Yes │ Yes │ Yes │ Yes │ TD1 │ 3 │ 2 │
│ South Korea │ Yes │ Yes │ Yes │ Yes │ TD1 │ 3 │ 2 │
│ Malaysia │ Yes │ Yes │ - │ - │ TD1 │ 2 │ 1 │
│ Thailand │ Yes │ Yes │ - │ - │ TD1 │ 2 │ 1 │
│ Japan │ - │ - │ Yes │ Yes │ TD3 │ 4 │ 2 │
│ United Kingdom│ - │ - │ Yes │ Yes │ TD3 │ 3 │ 1 │
│ United States │ - │ - │ Yes │ Yes │ TD3 │ 5 │ 2 │
│ Australia │ - │ - │ Yes │ Yes │ TD3 │ 3 │ 1 │
│ Canada │ - │ - │ Yes │ Yes │ TD3 │ 3 │ 1 │
│ India │ - │ - │ - │ - │ TD3 │ 5 │ 0 │
│ China │ - │ - │ - │ - │ TD3 │ 1 │ 0 │
│ Philippines │ - │ - │ - │ - │ TD3 │ 2 │ 0 │
│ Germany │ - │ - │ - │ - │ TD3 │ 1 │ 0 │
│ France │ - │ - │ - │ - │ TD3 │ 1 │ 0 │
│ UAE │ - │ - │ - │ - │ TD3 │ 1 │ 0 │
│ Others │ - │ - │ - │ - │ Varies │ 7 │ 0 │
├───────────────┼────────┼────────┼────────┼────────┼────────┼────────┼────────────┤
│ TOTALS │ 6 │ 6 │ 9 │ 9 │ 3 │ 56 │ 18 │
└───────────────┴────────┴────────┴────────┴────────┴────────┴────────┴────────────┘
Totals denote countries for the ID checksum, ID format, and business registration columns, plus 3 MRZ formats, 56 portals, and 18 institution templates.

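The ID checksum algorithms tallied in the matrix are deterministic and can be enforced before any AI analysis runs. As one illustration (the other five countries follow analogous national schemes), below is a minimal sketch of the publicly documented Taiwan National ID check-digit rule; the function name is ours, not the platform's actual API:

```python
# Sketch of the Taiwan National ID checksum, one of the 6 country
# algorithms tallied in Appendix D. Illustrative only.

# The leading letter maps to a two-digit code per the official scheme.
LETTER_CODES = {
    "A": 10, "B": 11, "C": 12, "D": 13, "E": 14, "F": 15, "G": 16,
    "H": 17, "I": 34, "J": 18, "K": 19, "L": 20, "M": 21, "N": 22,
    "O": 35, "P": 23, "Q": 24, "R": 25, "S": 26, "T": 27, "U": 28,
    "V": 29, "W": 32, "X": 30, "Y": 31, "Z": 33,
}

def is_valid_taiwan_id(national_id: str) -> bool:
    """Return True if the 10-character ID passes the checksum."""
    if len(national_id) != 10 or national_id[0] not in LETTER_CODES:
        return False
    if not national_id[1:].isdigit():
        return False
    code = LETTER_CODES[national_id[0]]
    # Letter code: tens digit weighted 1, units digit weighted 9;
    # the 8 body digits get weights 8..1; the check digit gets weight 1.
    total = (code // 10) * 1 + (code % 10) * 9
    weights = [8, 7, 6, 5, 4, 3, 2, 1, 1]
    total += sum(int(d) * w for d, w in zip(national_id[1:], weights))
    return total % 10 == 0

print(is_valid_taiwan_id("A123456789"))  # widely circulated test value → True
```

Because these checks are pure arithmetic, a forged ID number is rejected in microseconds, before a single model token is spent.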
Appendix E: Glossary
| Term | Definition |
|---|---|
| AI vs Human Benchmark | A 200-document evaluation framework comparing AI verification performance against human inspectors across accuracy, speed, cost, consistency, and coverage dimensions. |
| AV (Attack Vector) | A specific forgery technique that the system is designed to detect. Designated AV-01 through AV-36. |
| Applicant Folder | A collection of documents associated with a single individual, enabling cross-document consistency analysis. |
| Calibration Case | A document with independently verified authentic or fraudulent status, used to calibrate and validate the forensic engine. |
| CLAHE | Contrast Limited Adaptive Histogram Equalization. An image enhancement technique used in QR code scanning. |
| Credit | The unit of consumption for verification. Standard verification costs 1 credit; Deep Verification costs 5 credits. |
| Critical Fail | A checkpoint result that triggers an automatic FAKE verdict regardless of the overall score. |
| Deep Verification | Premium forensic investigation tier using Claude Sonnet 4, with 13 analysis stages, 8 confidence dimensions, KYB, and KYC screening. |
| EXIF | Exchangeable Image File Format. Metadata embedded in image files containing creation details. |
| Forensic Rule (U-Rule) | A specific evaluation criterion applied during document analysis. Designated U-01 through U-45. |
| ICAO 9303 | International Civil Aviation Organization standard governing Machine Readable Travel Documents. |
| JWT | JSON Web Token. A compact, URL-safe means of representing claims for authentication. |
| Kill Chain | The multi-stage verification pipeline where each stage catches different categories of attacks. |
| KYB (Know Your Business) | Deep Verification stage that performs due diligence on the document-issuing institution, producing a credibility score and regulatory standing assessment. |
| KYC (Know Your Customer) | Deep Verification stage that performs background screening on the document holder, including sanctions, PEP, and adverse media checks. |
| McNemar's Test | A statistical test used in the benchmark system for comparing paired binary classification results between AI and human inspectors. |
| MRZ | Machine Readable Zone. A standardized text block on travel documents and IDs that encodes identity data. |
| NEEDS_REVIEW | A verdict indicating the document requires human expert assessment due to inconclusive or borderline automated findings. |
| OFAC | Office of Foreign Assets Control. U.S. government sanctions list screened during KYC analysis. |
| PEP | Politically Exposed Person. Individuals holding prominent public functions, flagged as a risk factor during KYC screening. |
| Portal | An external web service or API operated by an issuing institution or verification authority that confirms document data. |
| Pre-Screen | The initial rapid assessment stage that rejects obviously fraudulent documents before full analysis. |
| Prompt Caching | A cost-optimization technique achieving up to 90% input-cost reduction on Anthropic models and 50% on OpenAI models. |
| QR Domain Trust | The evaluation of whether a QR code URL points to a legitimate verification portal. |
| Sonnet 4 (Deep) | Claude Sonnet 4, the AI model powering Deep Verification mode. Configured with a 5,000-token maximum output for comprehensive forensic reports. |
| SSE | Server-Sent Events. A protocol for streaming real-time updates from server to client over HTTP. |
| T1 Score | Tier 1 Score. The structural component of the two-tier scoring model (40% weight). |
| T2 Score | Tier 2 Score. The semantic and external verification component (60% weight). |
| TD1/TD2/TD3 | ICAO 9303 document format classifications: TD1 (ID cards, 3 lines × 30 characters), TD2 (larger ID cards, 2 × 36), TD3 (passports, 2 × 44). |
| Template | A stored specification of an institution's document layout, formatting, and security features used for comparison. |
| Trusted Domain | A verified URL domain belonging to a legitimate verification portal, maintained in a 15-domain allowlist. |
| Two-Tier Scoring | The scoring architecture that separates structural assessment (T1) from semantic and external assessment (T2). |
| U-Rule | See Forensic Rule. |
| UBN | Unified Business Number. Taiwan's 8-digit business registration identifier. |
| Verdict | The final classification output: PASS (authentic), NEEDS_REVIEW (inconclusive), or FAKE (fraudulent). |
| Verification Kill Chain | See Kill Chain. |
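Several entries above (ICAO 9303, MRZ, TD1/TD2/TD3) rest on the same check-digit scheme defined in ICAO Doc 9303: digits keep their value, letters A-Z map to 10-35, the filler character `<` counts as 0, and positions are weighted 7-3-1 cyclically, with the sum taken modulo 10. A minimal sketch (the function name is ours):

```python
def icao_9303_check_digit(field: str) -> int:
    """Compute the ICAO Doc 9303 check digit for an MRZ field.

    Digits keep their value, A-Z map to 10-35, and the filler
    character '<' counts as 0; position weights cycle 7, 3, 1.
    """
    def value(ch: str) -> int:
        if ch.isdigit():
            return int(ch)
        if ch == "<":
            return 0
        return ord(ch.upper()) - ord("A") + 10

    weights = (7, 3, 1)
    total = sum(value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10

# Specimen fields from the ICAO 9303 worked example:
print(icao_9303_check_digit("L898902C3"))  # document number → 6
print(icao_9303_check_digit("740812"))     # date of birth  → 2
```

The same routine validates the document number, date of birth, expiry date, and composite fields across all three TD formats, which is why a single MRZ parser covers passports and ID cards alike.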
This white paper is published by Turing Space Inc. for informational purposes. The technical specifications described herein reflect the system state as of April 2026 and are subject to change as the platform evolves.
Document version: 2.0 -- April 3, 2026