Why the seven signals exist
For most of the last fifty years, spotting a fake diploma was a matter of looking. A registrar held the document up to the light, checked the seal, called the issuing university, and settled the matter within a week. That world is gone. The diploma you are about to evaluate may have been generated in fifteen minutes by software that did not exist last year, sold for forty dollars on a Telegram channel, and printed on archival paper that costs less than a takeout meal.
The seven signals below are how forensic engines and trained human investigators get the truth back. None is sufficient on its own. Together they decide the case.
The seven signals at a glance
Here is the full checklist, ordered the way our engine runs it. Use it as a manual triage guide for any diploma that lands on your desk.
Signal 1: Font kerning and weight
University typesetting is a deeply boring craft. The same registrar's office uses the same template, the same fonts, and the same kerning tables for decades. Real diplomas have mathematically consistent letter spacing, identical baselines across lines, and weight that does not waver between characters.
Fakes produced in Word, Canva, or by hand in Photoshop give themselves away. A capital W spaced wider than the rest of the line. A serif font where one letter sits heavier because it was pasted from a different source. Anti-aliasing that breaks across the document. Our engine measures kerning at sub-pixel precision. A trained eye catches the crude examples and misses the rest.
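As a rough illustration of what "consistent kerning" means in practice, the sketch below assumes the gaps between letter pairs have already been measured in pixels by some upstream step. The function name, the input format, and the 0.25-pixel tolerance are all hypothetical; this is not the engine's actual method, only one way to express the idea that the same letter pair should always be spaced the same way.

```python
from collections import defaultdict


def inconsistent_pairs(measurements, tolerance_px=0.25):
    """Return letter pairs whose measured gaps vary by more than tolerance_px.

    measurements: iterable of (pair, gap_px) tuples, e.g. ("Wa", 1.50).
    The tolerance is an illustrative assumption, not a calibrated value.
    """
    gaps = defaultdict(list)
    for pair, gap in measurements:
        gaps[pair].append(gap)

    flagged = {}
    for pair, values in gaps.items():
        if len(values) < 2:
            continue  # a single occurrence gives nothing to compare against
        spread = max(values) - min(values)
        if spread > tolerance_px:
            flagged[pair] = spread
    return flagged


sample = [("Wa", 1.50), ("Wa", 1.52), ("of", 2.00), ("of", 2.90), ("Un", 1.10)]
print(inconsistent_pairs(sample))
```

In this sample only the pair "of" is flagged, because its two occurrences differ by almost a full pixel while "Wa" stays within tolerance.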
Signal 2: Vector seal geometry
Institutional seals are vector art. The concentric circles in a real seal land on the same center to within a fraction of a pixel. The arcs are smooth bezier curves. The line weights stay identical across the design.
Counterfeit seals fail this three ways. Concentric circles drift off-center because the forger eyeballed the alignment. Curves wobble because they were traced by hand or upscaled from a low-resolution source. Line weights vary because the original was a JPEG that lost its vector definition. AI-generated seals struggle here too. Diffusion models produce seals that look right at a glance and fail vector geometry on close inspection.
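The concentricity test can be sketched in a few lines, assuming the seal's circles have already been fitted and are available as (center x, center y, radius) tuples in pixels. The function names and the half-pixel threshold are hypothetical choices for illustration, not values taken from the engine.

```python
from math import hypot


def center_drift(circles):
    """Largest distance (in pixels) between any circle's center and the mean center.

    circles: list of (cx, cy, radius) tuples from a prior circle-fitting step.
    """
    mean_x = sum(c[0] for c in circles) / len(circles)
    mean_y = sum(c[1] for c in circles) / len(circles)
    return max(hypot(cx - mean_x, cy - mean_y) for cx, cy, _ in circles)


def seal_is_concentric(circles, max_drift_px=0.5):
    """True when every center sits within max_drift_px of the shared center.

    max_drift_px is an assumed threshold for illustration only.
    """
    return center_drift(circles) <= max_drift_px


genuine = [(100.0, 100.0, 40.0), (100.1, 100.0, 60.0), (100.0, 99.9, 80.0)]
eyeballed = [(100.0, 100.0, 40.0), (103.0, 101.0, 60.0), (100.0, 100.0, 80.0)]
print(seal_is_concentric(genuine), seal_is_concentric(eyeballed))
```

The second seal fails because its middle ring sits about two pixels away from the shared center, the kind of drift that comes from aligning rings by eye.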
Signal 3: Registrar signature curvature
A real signature is a biomechanical artifact. Pressure varies with the writer's muscle tension. Curvature follows the physical limits of a human wrist. The pen lifts at predictable points. Even when the same registrar signs a thousand diplomas, the rhythm is recognizable, and the variations between signatures cluster within a known range.
Forgeries fall into two camps. Traced signatures have uniform line weight and unnaturally smooth curves. The biomechanical noise is missing. AI-generated signatures have the opposite problem: they reproduce the noise but get the global shape wrong, so the signature looks plausible but does not match the reference samples for that registrar.
Signal 4: PDF producer metadata
The cheapest, fastest, most underrated signal in the set. Open the PDF, look at the document properties, and read the producer string. A real diploma issued as a PDF comes from an enterprise tool the registrar uses every day: Adobe LiveCycle, an institutional registrar system, or a specific high-end print driver. The fonts embedded in the PDF are the institution's licensed fonts.
A diploma whose producer string says Microsoft Word, Canva, iLovePDF, or any generic web-to-PDF converter is far more likely to be fake. Our engine keeps a blocklist of producer strings tied to known forgery toolkits. That one check rules out a meaningful percentage of forgeries before any other signal runs.
The producer string is a fifteen-second check that catches fifteen percent of forgeries. No other signal has that ratio.
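A minimal sketch of the blocklist check follows, assuming the producer string has already been read from the PDF's document properties. The three blocklist entries are just the tools named above; the engine's real blocklist is not published here, and the function name and the "flag" / "missing" / "pass" labels are hypothetical.

```python
import re

# Illustrative entries only, taken from the tools named in this article.
SUSPICIOUS_PRODUCERS = ("microsoft word", "canva", "ilovepdf")


def classify_producer(producer):
    """Classify a PDF producer string as "flag", "missing", or "pass".

    Punctuation and symbols are collapsed to spaces so that variants such as
    "Microsoft® Word" still match the plain-text blocklist entry.
    """
    if producer is None:
        return "missing"
    normalized = re.sub(r"[^a-z0-9]+", " ", producer.lower()).strip()
    if not normalized:
        return "missing"
    if any(entry in normalized for entry in SUSPICIOUS_PRODUCERS):
        return "flag"
    return "pass"


print(classify_producer("Microsoft® Word for Microsoft 365"))
print(classify_producer("Adobe LiveCycle Designer ES 10.0"))
```

A "pass" here only means the string is not on the blocklist. Producer metadata is easy to edit, so the result feeds the other signals and does not replace them.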
Signal 5: Internal data logic
This is where most AI-generated diplomas die. The graphics are flawless. The seal is gorgeous. The signature is convincing. Then the diploma says the candidate graduated in June 2024, and the transcript says their final course ended in August 2024. Or the GPA is 3.92, but the course weights cannot arithmetically produce that number. Or the document is from a British university but uses Latin honors phrasing that British universities do not use.
Generative models produce plausible-looking documents. They rarely produce internally consistent ones. A trained registrar catches these by intuition. Forensic software catches them by brute-force consistency checking, and it never gets tired.
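Two of the consistency checks described above, the GPA arithmetic and the date ordering, are simple enough to sketch directly. The example assumes a credit-weighted GPA on a 4.0 scale and a stated GPA rounded to two decimals; the function names, the course data, and the tolerance are hypothetical.

```python
from datetime import date


def weighted_gpa(courses):
    """Credit-weighted GPA from a list of (grade_points, credits) tuples."""
    total_credits = sum(credits for _, credits in courses)
    return sum(points * credits for points, credits in courses) / total_credits


def gpa_is_consistent(stated_gpa, courses, tolerance=0.005):
    """True when the stated GPA matches the computed one within rounding."""
    return abs(weighted_gpa(courses) - stated_gpa) <= tolerance


def dates_are_consistent(graduation, last_course_end):
    """True when the final course ended on or before the graduation date."""
    return last_course_end <= graduation


# Hypothetical transcript: (grade points, credit hours) per course.
courses = [(4.0, 3), (3.7, 4), (4.0, 3)]
print(round(weighted_gpa(courses), 2))            # computed GPA
print(gpa_is_consistent(3.92, courses))           # stated GPA does not match
print(dates_are_consistent(date(2024, 6, 15), date(2024, 8, 20)))
```

With these inputs the computed GPA is 3.88, so a stated 3.92 fails, and a final course ending in August fails against a June graduation.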
Signal 6: Registry cross-check
The decisive signal. The single check that overrides the rest. The engine queries the authoritative registry for the issuing institution's country: HESA in the UK, CHESICC in China, MOE in Taiwan, UGC and AICTE in India, the National Student Clearinghouse in the US, and dozens of country-specific equivalents. The question is simple: does the credential exist?
A perfectly rendered diploma from an institution that does not exist in any registry is still fake. A smudged, crumpled, poorly scanned diploma whose credential ID matches the registry record for the candidate's name and graduation date is almost certainly real. Visual signals narrow suspicion. The registry settles it.
See our companion guide on how AI document verification works end-to-end for the full pipeline.
Signal 7: Microtext and UV hints
On physical scans of high-security diplomas, two final signals remain. Microtext (sub-millimeter text printed at the edge of the seal or border) is a feature of high-security templates that no consumer printer can reproduce. UV-reactive features (watermarks visible only under ultraviolet light) require specialized inks and presses that forgers rarely have.
These signals are decisive when present. Most modern diplomas, especially those issued as PDFs, do not carry them. Treat them as bonus checks, not core ones. When you see them and they pass, confidence is very high. When they are absent, the verdict rests on signals one through six.
How to combine the signals
The seven signals are independent, and the verdict depends on how many of them fail. Here is the triage rule our team uses, and what we recommend for HR and admissions reviewers:
- 0 failures: authentic. Proceed with normal workflow.
- 1 failure: escalate. Request a higher-resolution scan or the original document. Many false alarms resolve here.
- 2 failures: suspicious. Hold the application pending a second forensic review or a direct registrar contact.
- 3 or more failures: almost certainly a forgery. Document the forensic evidence and route to your fraud-handling process.
- Registry failure (signal 6): regardless of other signals, treat the document as unverifiable. A missing registry record is the single most decisive negative signal in document forensics.
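The triage rule above can be written as a small function. This is a sketch under stated assumptions: signals are identified by their numbers 1 through 7, the verdict labels are shorthand invented for this example, and an absent signal 7 is treated as not run instead of failed.

```python
def triage(failed_signals):
    """Map the set of failed signal numbers (1-7) to a triage verdict.

    The labels are illustrative shorthand for the rule in the list above.
    A signal that could not be run (e.g. no microtext present) should not
    be included in failed_signals.
    """
    failed = set(failed_signals)
    if not failed <= set(range(1, 8)):
        raise ValueError("signal numbers must be between 1 and 7")

    # The registry check overrides the failure count.
    if 6 in failed:
        return "unverifiable"

    count = len(failed)
    if count == 0:
        return "authentic"
    if count == 1:
        return "escalate"
    if count == 2:
        return "suspicious"
    return "likely forgery"


print(triage([]))         # authentic
print(triage([1]))        # escalate
print(triage([1, 5]))     # suspicious
print(triage([2, 3, 5]))  # likely forgery
print(triage([6]))        # unverifiable
```

Checking the registry failure first keeps the override explicit: a document with a missing registry record never comes back as "escalate", however clean the other signals look.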
Frequently asked questions
Can I run these checks myself, by hand?
You can run signals 4 and 5 by hand in fifteen minutes per document: open the PDF properties, then check the dates and the GPA math. Signals 1, 2, 3, and 7 need pixel-level analysis that isn't realistic without software. Signal 6 requires registry API access.
What is the cheapest signal that catches the most forgeries?
The PDF producer-metadata check (signal 4). Near free, takes seconds, and rules out roughly fifteen percent of forgeries before any deeper analysis runs.
Why do some real diplomas fail one or two signals?
Real diplomas sometimes fail individual checks because of poor scan quality, archival aging, unusual templates, or registrar systems slow to update. That is why one failure is an escalation, not a rejection.
Is the seven-signal model also valid for transcripts and certificates?
The framework is the same. The weights shift. Transcripts lean on signals 5 and 6. Professional certificates lean on signals 4 and 6. The seven categories are the right map for nearly all academic and professional credentials.
How does this catch AI-generated diplomas specifically?
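AI-generated diplomas usually pass signal 1 and look convincing on signal 2 at a glance, although their seals can still fail close geometric inspection. They tend to fail signals 3 (signature biomechanics), 5 (data logic), and 6 (registry). The combination catches them. The registry signal alone is usually decisive.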
Should I share my findings with the candidate?
That depends on jurisdiction. In the US, FCRA-style disclosure rules apply when adverse action is taken on the basis of a consumer report. In the EU, GDPR rights of access apply. Either way, document the forensic evidence. “The AI said so” is not a defensible answer.