How to Investigate a High AI Detection Score Fairly
A student's work comes back with a high AI detection score. Your first instinct might be to act on it immediately — but the score alone is not enough. Treating a likelihood score as a verdict can cause real harm to students and, in the worst cases, put you in breach of your school's academic integrity policy. This guide gives you a structured, fair process for investigating a high score, talking to the student, and deciding what to do next.
What a High Score Actually Tells You
AI detection tools analyse the statistical and linguistic properties of a piece of text and compare them against patterns associated with AI-generated writing. A score of 0% suggests writing that closely resembles natural human output. A score approaching 100% suggests writing that closely resembles output from tools like ChatGPT or Claude. Everything between those extremes is probabilistic — not definitive.
A high score — anything above 70% — warrants attention. It does not warrant immediate accusation. The same patterns that flag AI-generated writing can appear in work produced by students who are highly proficient writers, students who use English as an additional language, students who have revised their work heavily after detailed teacher feedback, and students whose subjects demand a formal, structured register. GradeOrbit returns a likelihood score alongside a confidence label and a short explanation of the specific signals that contributed to the result. Reading that explanation carefully is your first step.
What Commonly Triggers High Scores (Even Without AI Use)
Before you begin any investigation, it helps to understand what can push a score upward in work that is entirely authentic.
Highly proficient writers
Students who have been coached intensively for A-Level or who are natural writers produce prose with the same qualities that detection models flag: well-organised paragraphs, consistent structure, controlled vocabulary, clear analytical flow. Academic writing at its best shares characteristics with AI-generated writing. A Year 13 student who is genuinely excellent at English Literature may score highly not because they used ChatGPT but because their writing is exceptionally good.
EAL students and translation
Students who speak English as an additional language sometimes draft in their first language and translate, either manually or using a translation tool. The resulting English often has a formal, slightly over-structured quality that detection models flag. This is a sensitive category — applying the same evidential standard you would to a fluent English speaker risks being both unfair and potentially discriminatory.
Subjects with formal registers
Religious studies, philosophy, law, and certain science assessments all require students to adopt a specific, structured register. A student who has genuinely mastered that register and applied it consistently may produce writing that appears machine-like precisely because they have succeeded at the task. A high detection score in a subject with a naturally formal register carries less evidential weight than the same score in a subject where informal personal voice is expected.
Heavy revision after teacher feedback
A student who has worked through several drafts, received detailed feedback, and refined their writing over multiple weeks may produce a final submission that is smoother and more consistent than early drafts. Good teaching can make writing look more like AI output to a statistical model. That is not evidence of anything except that your feedback worked.
Four Investigation Steps Before Any Conversation
A structured response to a high score starts with what you already know — before you say anything to the student.
Step 1: Compare against the student's previous work
Pull up whatever previous writing you have from this student: class exercises, rough drafts, timed in-class tasks, mock papers. Ask yourself whether this piece is consistent with what they normally produce. A sudden and unexplained shift in vocabulary range, argument structure, or analytical depth — particularly if it coincides with a high score — is worth pursuing. If the work is consistent with the student's usual standard, that context significantly changes how seriously you treat the score.
Step 2: Read the text carefully for linguistic signals
AI-generated writing tends to exhibit characteristic patterns: uniform sentence lengths, a lack of personal voice, over-reliance on hedging phrases, and a generic competence without authentic personality. Read the flagged work slowly and ask whether you can hear this student's individual voice in it. These are not conclusive indicators on their own, but they either reinforce or reduce your concern when combined with the score.
Step 3: Consider the submission context
When was the work submitted? Has this student submitted unusually polished work at the last minute before? Is there a pattern of inconsistency between in-class and out-of-class performance? A high score combined with a late submission, from a student who typically produces weaker work, represents a stronger case for investigation than the same score from a confident, consistent writer who submitted early and whose in-class work matches the quality of what was submitted.
Step 4: Run the deeper detection if you haven't already
GradeOrbit offers two detection modes: a one-credit option for a fast check, and a three-credit option for a more thorough analysis with greater depth of reasoning. If you initially used the one-credit mode and your concern remains after reviewing the score, running the three-credit analysis gives you a more detailed evidence base to work from before any conversation takes place.
How to Have the Conversation With the Student
If multiple indicators point in the same direction, a direct conversation is your most powerful investigative tool. The structure of this conversation matters. It should be non-accusatory, exploratory, and grounded in what the student produced — not in a score.
Begin by asking the student to talk you through their argument. Not "explain yourself" — that phrase puts them on the defensive immediately. Instead: "I'd like to understand how you approached this. Can you walk me through the main argument in your own words?" A student who genuinely wrote the work will be able to engage with the ideas, explain their evidence, and discuss their reasoning. A student who submitted something they did not write will typically struggle to articulate ideas in any depth, or will give answers that do not match the sophistication of what they submitted.
You might also ask them to write a short paragraph on the same topic in class — not as a punishment, but framed as a follow-up exercise. This gives you a direct comparison between in-class and out-of-class writing that is far more evidentially robust than a detection score alone.
Take notes during the conversation. If the matter progresses to any formal stage, a contemporaneous record of what the student said and how they engaged with the work will be important.
When to Escalate — and When to Drop It
Not every high score leads to escalation. Most investigations should result in a teacher making a professional judgement and either moving on or having a supportive educational conversation with the student about appropriate AI use — not a formal process.
Escalation becomes appropriate when the evidence is convergent: a high score, a significant departure from the student's previous standard, an inability to discuss the work in conversation, and a submission context that raises additional concerns. At that point, you involve a senior colleague before taking any further action, and you document everything carefully.
A detection score alone is not sufficient grounds for formal sanction under most schools' academic integrity policies. Check your school's policy before escalating. If your school does not yet have a clear AI use policy, this is a gap worth raising with leadership — the absence of a policy is not the same as there being no rules.
How GradeOrbit's Detection Works
GradeOrbit's built-in AI detection analyses submitted work and returns a likelihood score from 0–100%, a confidence label (Low, Medium, or High), the specific linguistic signals that contributed to the score, and a brief reasoning paragraph summarising the overall assessment. You can submit work as pasted text, an uploaded image, or a scanned physical document.
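For readers who think in code, the four output fields described above can be sketched as a simple data structure. This is purely an illustration of the shape of such a result — the `DetectionResult` class, its field names, and the `warrants_review` helper are assumptions made for this sketch, not GradeOrbit's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum

class Confidence(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"

@dataclass
class DetectionResult:
    # Likelihood that the text resembles AI output, 0-100.
    score: int
    # Overall confidence label attached to the score.
    confidence: Confidence
    # Specific linguistic signals that contributed to the score.
    signals: list[str] = field(default_factory=list)
    # Brief reasoning paragraph summarising the assessment.
    reasoning: str = ""

    def warrants_review(self) -> bool:
        # This article treats scores above 70 as worth a closer look --
        # worth investigating, never grounds for accusation on their own.
        return self.score > 70
```

The point of the sketch is the threshold comment: a result object like this answers "should I look closer?", not "did the student cheat?".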
Detection is available in two modes: a one-credit option for a quick initial check, and a three-credit option for a more thorough analysis when you need a deeper evidence base. Student work is never stored on GradeOrbit's servers — content is sent to the AI for analysis and immediately discarded. Before submitting, you can use GradeOrbit's built-in redaction tool to draw black boxes over student names and any other identifying information.
For broader context on building a fair detection policy across your school, our guide on how to use AI detection in school fairly covers policy and procedure in more depth.
Try GradeOrbit's AI Detection Tool
A high AI detection score is the beginning of an investigation, not the end of one. Used alongside your professional knowledge of the student, a careful reading of the text, and a direct conversation where needed, GradeOrbit's detection tool gives you the evidence you need to make a fair, well-grounded judgement.
GradeOrbit's detection is built directly into your dashboard — no separate account or subscription required. Try GradeOrbit today and see how it supports your approach to academic integrity.