Introducing Cereby Mini: Your Document AI
An assistant that lives inside the document: chat, edits, and analysis on one surface, with a diff before anything lands.
The problem with switching surfaces
The standard AI writing loop goes something like this: open a document, copy a chunk of text, switch to a chat window, get a suggestion, copy the suggestion, switch back, and paste. If the suggestion needs adjusting, do it again. Students writing a lab report or working through a problem set hit this cycle a dozen times per session. Each round trip introduces friction, and the paste step hides what actually changed.
There are two deeper problems underneath that friction. First, the model does not see the same structure the user sees. It gets raw text, not a formatted document with headings, tables, and inline code. Second, when a suggested edit lands, there is no record of the before. The user has to remember what was there, or undo repeatedly to find it.
Cereby Mini is our answer to both. It runs inside the document and it never silently modifies anything.
How it works
The assistant lives in two places on the same surface. An inline bar appears at the caret or around a selection for quick tasks: explain a term, tighten a sentence, fix grammar on a span. A full Document AI panel opens for longer work: full-page rewrites, multi-file context, conversations that need more than a few exchanges. Both paths share the same model routing and the same mutation semantics.
The key commitment sits at the end of both paths. When Cereby Mini proposes a change to the document, it shows a diff first. The user can accept, reject, or send the proposal back to chat for refinement. Nothing applies until the user says so. That diff is the contract between the assistant and the document, and it is what makes larger, more aggressive suggestions feel safe to try.
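The diff-gate contract above can be sketched as a tiny state machine. This is an illustrative sketch, not Cereby Mini's actual API: the names (Proposal, Decision, apply_proposal) are made up, and the point is only that the document mutates on an explicit accept and on nothing else.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    PENDING = "pending"  # diff shown, user has not decided
    ACCEPT = "accept"    # the only state that mutates the document
    REJECT = "reject"
    REFINE = "refine"    # send the proposal back to chat for another round

@dataclass
class Proposal:
    before: str                          # exact text the edit would replace
    after: str                           # the assistant's suggested replacement
    decision: Decision = Decision.PENDING

def apply_proposal(document: str, p: Proposal) -> str:
    """Apply the edit only on an explicit accept; every other state is a no-op."""
    if p.decision is Decision.ACCEPT:
        return document.replace(p.before, p.after, 1)
    return document
```

Reject and refine both leave the document byte-identical, which is what makes the gate trustworthy: there is no code path from "model suggested" to "document changed" that skips the user.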
Context the model actually sees
Three things shape what the assistant has access to during a session.
The active document is always in scope. If a user selects a paragraph, the model works on that paragraph. If no selection exists, it works on the full page. Either way, the model is operating on the same text the user is looking at, including structure.
@ references let users pull in other pages from their open documents. This is useful when one note connects to another, when a draft references an outline, or when a lab report needs the original problem set in context. The model prompt is scoped to those handles explicitly, so long notebooks do not collapse into an undifferentiated blob.
File attachments (uploaded documents or files from the library) work the same way. OCR and PDF parsing quality varies, so the system surfaces failures as actionable errors rather than returning empty responses and leaving the user to wonder what went wrong.
Study-specific workflows
Generic AI tools treat a lab report the same as a tweet. Cereby Mini ships first-class paths for the kinds of writing students actually do.
Structured creation produces Cornell notes, outlines, and Q&A sets as formatted pages, not prose to reformat. STEM helpers handle symbols, units, and image-backed problem solving; a student can attach a photo of a problem set and ask the assistant to work through it in the document. Extraction, document analysis, and the AI text detector and humanizer round out the set, all inside the same surface, all ending with the same diff gate.
Keeping the detector and humanizer in-document, rather than in a separate vendor tab, matters most when authenticity checks and editing are part of the same pass. Sending users out breaks the flow at exactly the moment they need it to hold together.
The honest operational constraints
Large selections can exceed practical context windows. The assistant biases toward page-scoped or chunked proposals rather than silent truncation. If a selection is too large to handle cleanly, that surfaces as a prompt to narrow scope, not as a mysteriously low-quality result.
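The "surface, don't truncate" behavior amounts to a guard run before any model call. A rough sketch, with an invented token budget and a crude characters-to-tokens estimate (both are assumptions for illustration, not Cereby Mini's real limits):

```python
MAX_TOKENS = 8000  # illustrative budget, not the actual context window

def check_scope(selection: str, tokens_per_char: float = 0.25) -> int:
    """Refuse oversized selections up front instead of truncating silently."""
    est = int(len(selection) * tokens_per_char)
    if est > MAX_TOKENS:
        # Surfaced as a prompt to narrow scope, not a degraded result.
        raise ValueError(
            f"selection is ~{est} tokens (budget {MAX_TOKENS}); "
            "narrow the selection or work page by page")
    return est
```

The design choice is that the failure happens before the request, where the user can act on it, rather than after, where it shows up as a mysteriously low-quality result.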
Wholesale rewrites produce large diffs. When the before and after are very different, the diff itself can be hard to read quickly. The reject and chat-refine paths need to be faster than re-prompting from scratch, which means the UI around the diff matters as much as the diff itself.
What we learned building this
The diff gate is the feature. Early in the design, we thought of it as a safety rail, something to add after getting the assistant working. It turned out to be the reason users trusted the assistant with consequential edits. Trust scales with review, not with model confidence.
Inline speed and panel depth cannot fork behavior. An inline action that produces a different kind of result than a panel action would confuse users about what the assistant is actually doing. The surfaces look different; the semantics are the same.
