How to Redact Student Information Before AI Detection

GradeOrbit Team·Education Technology
6 min read

When a teacher submits a piece of student writing to an AI detection tool, the most visible part of the process is the result — the likelihood score, the confidence rating, the reasoning. What is less visible, and less often thought about, is what the tool receives in order to generate that result. If the submission has not been prepared carefully, it may contain personal information that the student never intended to share with a third-party AI service.

Stripping a name from the top of a document is a reasonable instinct, but it is not the same as meaningfully redacting a piece of student work. Knowing what actually needs to go is what separates a compliant submission from one that exposes the school to unnecessary risk under UK GDPR.

What Counts as Personal Data in Student Work

Under UK GDPR, personal data is any information that relates to an identified or identifiable individual. The definition is intentionally broad, and it applies to student work in more ways than teachers typically consider.

The obvious category is direct identifiers: a name on the title page, a student number, a class reference, a school name. These are the details teachers most commonly think to remove. But personal data in student writing extends further than this.

A student writing about their own family, their experiences of a historical event, their personal response to a text, or their own health or circumstances is producing content that is personally identifiable even without a name attached — particularly in a context where it is linked to a specific assignment, class group, or school. A Year 11 essay about a personal bereavement, a sociology piece drawing on the student's own experience of social inequality, or an English Language creative writing response that is clearly autobiographical all carry identifying information that makes the writer traceable even when the obvious identifiers are absent.

Teacher annotations and margin comments on a document can also constitute personal data — particularly where they reference the student by name, make observations about the student's specific learning needs, or contain notes about an individual's circumstances. A scanned paper submission with a teacher's handwritten comments is not the same as a clean typed submission, and the additional content in those comments may be just as identifying as the student's name.

Finally, form group references, subject-specific details, and school-identifiable content — references to a named teacher, a specific school event, or a distinctive local context — can allow a student to be identified indirectly. This matters most in smaller schools or year groups where the pool of potential authors is limited.

The Risk of Submitting Unredacted Work

The risk of submitting unredacted student work to an AI detection tool is not purely theoretical. When a teacher uses a tool that relies on a third-party AI API, which is the case for most commercially available detection tools, the student's work passes through at least one additional organisation before a result is returned. Depending on how that organisation handles submitted content, information in the submission may be retained, logged, or processed in ways that neither the teacher nor the school has agreed to.

Under UK GDPR, the school is the data controller for student work. That means it is responsible for ensuring that personal data submitted to third-party services is handled lawfully and with appropriate safeguards. If a teacher submits an unredacted piece of student work (containing names, personal reflections, and teacher comments) to a detection tool whose provider has no valid data processing agreement in place with the school, the school is likely to be in breach of its data protection obligations. The fact that the teacher was acting in good faith does not resolve the compliance issue.

There is also a particular risk for students from marginalised groups. A student who has disclosed sensitive personal information in their written work, such as a health condition, a family circumstance, or a personal identity, has a reasonable expectation that this information will not be shared with commercial platforms they have no knowledge of or relationship with. The duty of care schools owe to students extends to how their personal data is handled, not just how they are treated in the classroom.

What to Redact and What You Can Leave

Effective redaction is about removing anything that makes the text identifiable, not just removing a name. The following categories should be redacted before any student work is submitted to an AI detection tool.

Direct identifiers: student name, student number, candidate number, class group, teacher name, school name, and any reference to a specific centre or exam session. These should be removed from both the body of the text and from any cover sheet, header, or footer.

Biographical content: specific personal details the student has included about themselves — their own health, family members by name or relationship, home location, or specific personal experiences — where these details could be used to identify the writer. The test is whether someone who knows the student could recognise them from the content. If so, the detail should be removed or generalised.

Teacher annotations: any handwritten or typed comments on the document that reference the student by name, make observations about the student's specific circumstances, or contain information that is not part of the student's own writing.

What does not need to be redacted is the substantive academic content — the argument, the evidence, the analysis — that the detection tool actually needs to do its job. Redaction is not about making the text unreadable; it is about removing the identifying layer that sits around the academic content. A well-redacted piece of student work is still a complete and assessable piece of writing. It is simply one that cannot be traced back to a specific named individual.
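For typed submissions, the direct identifiers listed above could be swapped for neutral placeholders before the text is sent anywhere. The TypeScript below is a minimal sketch of that idea, not a description of any particular tool: the `Identifier` type, the `redactIdentifiers` function, and the placeholder labels are all hypothetical names chosen for illustration. It only catches identifiers the teacher supplies, so biographical content still needs a human read-through.

```typescript
// Hypothetical helper for illustration; not part of any detection tool's API.
interface Identifier {
  value: string;       // exact text to remove, e.g. a student's name
  placeholder: string; // neutral replacement, e.g. "[STUDENT]"
}

// Escape regex metacharacters so values such as "St. Mary's" match literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every whole-word, case-insensitive occurrence of each identifier.
function redactIdentifiers(text: string, identifiers: Identifier[]): string {
  // Longest values first, so a full name is handled before a first name alone.
  const ordered = [...identifiers].sort(
    (a, b) => b.value.length - a.value.length,
  );
  let result = text;
  for (const { value, placeholder } of ordered) {
    if (value.trim() === "") continue; // skip empty entries
    // The lookarounds stop "Sam" from matching inside a word like "Same".
    const pattern = new RegExp(
      `(?<![\\p{L}\\p{N}])${escapeRegExp(value)}(?![\\p{L}\\p{N}])`,
      "giu",
    );
    // A replacer function keeps "$" in a placeholder from being interpreted.
    result = result.replace(pattern, () => placeholder);
  }
  return result;
}

// Example usage with made-up details.
const redacted = redactIdentifiers(
  "Priya Shah, 11B. Priya's essay for Mr Okafor at Northfield Academy.",
  [
    { value: "Priya Shah", placeholder: "[STUDENT]" },
    { value: "Priya", placeholder: "[STUDENT]" },
    { value: "11B", placeholder: "[CLASS]" },
    { value: "Mr Okafor", placeholder: "[TEACHER]" },
    { value: "Northfield Academy", placeholder: "[SCHOOL]" },
  ],
);
console.log(redacted);
```

Using placeholders rather than deleting the text outright keeps sentences readable, so the academic content the detection tool needs is left intact.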

How GradeOrbit's Client-Side Redaction Works

GradeOrbit includes a built-in redaction tool designed specifically for this purpose. Teachers preparing student work for AI detection can use the redaction tool to draw black boxes directly over any content in the submission image — a student name, a class reference, a teacher annotation — before the work is processed.

The redaction happens entirely in the browser, using the Canvas API to permanently overlay the selected areas before anything is transmitted. By the time the submission leaves the teacher's device, the redacted content is gone — not hidden, not greyed out, but permanently removed from the image. The AI model that processes the submission for detection receives only the redacted version. The original unredacted content never leaves the device and is never seen by any third-party service.

This approach addresses the core limitation of name-stripping as a privacy measure. Name-stripping is a manual step that relies on the teacher remembering to do it, knowing what counts as identifying, and having a way to carry it out for different types of submission: typed text, photographed handwritten work, scanned PDFs. GradeOrbit's Canvas-based redaction works on images directly, which means it applies equally to handwritten submissions photographed with a mobile device and to scanned paper scripts uploaded via QR code. The teacher draws a box; the content is gone.

Because the redaction is applied before transmission — not as a post-processing step after the content has already been sent — it provides a meaningful privacy guarantee rather than a procedural one. Teachers do not need to trust that the platform will not use the redacted content; they can verify that the redacted content was never transmitted in the first place.
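To make the difference between hiding and removing concrete, here is a minimal TypeScript sketch of destructive redaction on raw pixel data. It is an illustration under stated assumptions, not GradeOrbit's actual code: the `RedactionBox` type and the `redactPixels` function are hypothetical names, and the buffer is assumed to use the RGBA layout of the browser's `ImageData.data` (four bytes per pixel, row by row).

```typescript
// Hypothetical types and names for illustration only.
interface RedactionBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Overwrite every pixel inside each box with opaque black, in place.
// `pixels` is assumed to follow the RGBA layout of ImageData.data.
function redactPixels(
  pixels: Uint8ClampedArray,
  imageWidth: number,
  imageHeight: number,
  boxes: RedactionBox[],
): void {
  for (const box of boxes) {
    // Clamp the box to the image so it cannot write outside the buffer.
    const left = Math.max(0, Math.floor(box.x));
    const top = Math.max(0, Math.floor(box.y));
    const right = Math.min(imageWidth, Math.ceil(box.x + box.width));
    const bottom = Math.min(imageHeight, Math.ceil(box.y + box.height));
    for (let row = top; row < bottom; row++) {
      for (let col = left; col < right; col++) {
        const i = (row * imageWidth + col) * 4;
        pixels[i] = 0;       // red
        pixels[i + 1] = 0;   // green
        pixels[i + 2] = 0;   // blue
        pixels[i + 3] = 255; // alpha: fully opaque
      }
    }
  }
}

// Example: a 4x4 mid-grey image with a 2x2 box redacted in the centre.
const demoPixels = new Uint8ClampedArray(4 * 4 * 4).fill(200);
redactPixels(demoPixels, 4, 4, [{ x: 1, y: 1, width: 2, height: 2 }]);
```

In a browser, a buffer like this could be read with `getImageData` and written back with `putImageData` on a canvas 2D context, or the same effect could be had by drawing filled black rectangles with `fillRect`; the canvas would then be exported (for example with `canvas.toBlob`) and only that export uploaded. Because the original pixel values are overwritten rather than covered by a separate layer, nothing in the exported image can be used to recover what was underneath.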

Building Redaction Into Your Routine

The best redaction practice is one that happens as a standard step in the submission workflow, not as an occasional check on suspicious pieces. If redaction is treated as something to do when a submission looks especially sensitive, it will be applied inconsistently — and the submissions that seem least sensitive are not necessarily the ones with the least personal content.

For departments implementing AI detection as a regular part of their workflow, it is worth building a brief pre-submission checklist into the process. Before any student work is submitted: have identifying details been removed from headers, footers, and cover sheets? Does the body text contain biographical content that should be generalised? Are there teacher annotations on the document that reference the student by name? This does not need to be a lengthy process — for most submissions, a thirty-second check is sufficient. For handwritten work being photographed, the physical submission itself can be checked before the image is taken.

Schools developing a formal AI detection policy should include redaction as a stated requirement rather than a recommended practice. If a school's approach to AI detection is documented — in a staff handbook, a departmental protocol, or the school's broader AI use policy — redaction before submission should appear explicitly. This protects individual teachers by giving them a clear procedural basis for their actions and protects the school by ensuring a consistent standard is applied across departments.

For more on building a whole-school approach to AI detection that teachers can apply consistently, see our guide on how schools can implement AI detection consistently.

Try GradeOrbit's Privacy-First AI Detection

GradeOrbit's AI detection tool is built into your marking dashboard and includes client-side redaction as a standard part of the submission workflow. Submit work as a photographed image, a scanned document, or typed text — redact any identifying content before the AI sees it — and receive a likelihood score from 0 to 100% with a confidence label and full explanatory reasoning. Student work is never stored. Your first scans are free.

If you are currently submitting student work to a detection tool without a clear redaction step in your workflow, now is a good time to build one in. The obligation to protect student data does not disappear because the tool is convenient to use — but with the right workflow, protection and convenience are not in conflict.

Create your free GradeOrbit account and run your first privacy-safe detection scan today.
