How We Test HumanizeAI: Methodology

Every score we publish comes from a documented test run: real sample texts, one pass through balanced mode, scored 0 to 100 by our own AI detector, with the date attached. This page is the standing record of how those tests work and what the numbers can and cannot tell you. Last updated: June 11, 2026.

The sample texts

We test with passages generated by a ChatGPT-style model, in the formats people actually paste into a humanizer. The June 11, 2026 run used three: an academic essay paragraph of 137 words, a blog intro of 99 words, and a marketing email of 102 words.

We keep passages in the 90-to-150-word range on purpose. Below roughly 50 words, detectors have too little signal to score reliably, and a test built on unscoreable inputs would flatter everyone.
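The length rule above can be expressed as a small check. This is a minimal sketch, not our production tooling: the function names, the constant names, and the whitespace-based word count are assumptions made for illustration, and the 50-word floor is the approximate figure from the text.

```python
# Hypothetical helper names; thresholds come from the paragraph above.
MIN_SCOREABLE_WORDS = 50    # below roughly this, detector scores are unreliable
TARGET_RANGE = (90, 150)    # word range the published tests stay within


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (a simple convention, assumed here)."""
    return len(text.split())


def classify_length(n_words: int) -> str:
    """Label a passage length against the test's length rule."""
    if n_words < MIN_SCOREABLE_WORDS:
        return "too short"
    if TARGET_RANGE[0] <= n_words <= TARGET_RANGE[1]:
        return "in range"
    return "out of range"


if __name__ == "__main__":
    # Word counts of the three June 11, 2026 samples.
    for n in (137, 99, 102):
        print(n, classify_length(n))
```

All three June 11, 2026 samples (137, 99, and 102 words) fall in the "in range" bucket under this check.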

The settings

Every published test uses balanced mode, single pass, default settings. No aggressive mode, no manual editing of the output, no rerunning until a flattering number appears. The score we publish is the score that run produced. When we test other modes in the future, the mode will be named next to the number.

How scoring works

We score each text before and after humanizing with our own AI detector on a 0-to-100 scale, where 0 reads as fully human and 100 as fully AI. The detector weighs three signals: burstiness (how much sentence length varies), vocabulary variance (how widely word choices are spread), and perplexity (how predictable each next word is).
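To make two of those signals concrete, here is a rough sketch of how burstiness and vocabulary spread could be measured. It is not the detector's actual implementation: the formulas (coefficient of variation of sentence length, and type-token ratio), the function names, and the sentence-splitting heuristic are all assumptions chosen for illustration. Perplexity is left out because it requires a language model to estimate next-word probabilities.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence, splitting after ., !, or ? (a rough heuristic)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Assumed proxy: standard deviation of sentence length divided by the mean."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # a single sentence has no variation to measure
    return statistics.pstdev(lengths) / statistics.mean(lengths)


def type_token_ratio(text: str) -> float:
    """Assumed proxy for vocabulary spread: distinct words / total words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


if __name__ == "__main__":
    uniform = "One two three. Four five six. Seven eight nine."
    varied = "Short. This sentence is quite a bit longer than that."
    print(burstiness(uniform), burstiness(varied))
    print(type_token_ratio("the cat the dog"))
```

In this sketch, text with evenly sized sentences gets a burstiness of zero, while text that mixes short and long sentences gets a higher value. How a real detector combines such signals into one 0-to-100 score is not shown here.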

The June 11, 2026 results, given as before-to-after scores: essay paragraph 99 to 1, blog intro 99 to 15, marketing email 99 to 14. All three flipped from likely AI to likely human in one pass.
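The before-and-after figures above can be turned into point drops with simple arithmetic. The sketch below records the three results and computes the drop for each; the `Result` class and `mean_drop` helper are hypothetical names used only for illustration, and the average across three samples is a convenience figure, not a statistic we publish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One sample's detector score before and after a single balanced-mode pass."""
    sample: str
    words: int
    before: int
    after: int

    @property
    def drop(self) -> int:
        # Points removed from the 0-to-100 AI score by the rewrite.
        return self.before - self.after


# Figures from the June 11, 2026 run.
RUN_2026_06_11 = [
    Result("academic essay paragraph", 137, 99, 1),
    Result("blog intro", 99, 99, 15),
    Result("marketing email", 102, 99, 14),
]


def mean_drop(results: list[Result]) -> float:
    """Average point drop across the samples in a run."""
    return sum(r.drop for r in results) / len(results)


if __name__ == "__main__":
    for r in RUN_2026_06_11:
        print(f"{r.sample}: {r.before} -> {r.after} (drop {r.drop})")
    print("mean drop:", mean_drop(RUN_2026_06_11))
```

For this run the drops are 98, 84, and 85 points, which averages to 89.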

What we never claim

Three standing rules.

  • We never publish competitor benchmark results we did not run ourselves. If a comparison page cites scores, they come from our own runs on our own tool; otherwise no scores appear.
  • We never present our detector's score as another detector's verdict. GPTZero, Turnitin, and Originality.ai use different models and will score the same text differently.
  • We never promise a pass. Detectors retrain constantly; no humanizer passes every detector every time. We show a live score on every rewrite precisely because the outcome can only be checked, not guaranteed.

When we run per-detector tests (for example against GPTZero), those pages cite that detector's actual scores from that run, with the date.

Test log

  • June 11, 2026: three ChatGPT-style passages, balanced mode, scored with our own detector. Academic essay paragraph (137 words): 99 to 1. Blog intro (99 words): 99 to 15. Marketing email (102 words): 99 to 14.

New runs are added here as they happen, and pages that cite a run link back to this log.
