Private by design.

By Yashesh Bharti · May 19, 2026

Most apps that promise to "make your inbox smarter" forward your email to a model someone else operates. Purelymail Calendar doesn't. We run the language model inside your browser instead. This is the longer argument for why, and why we think every product touching personal data will eventually have to make the same call.

The thing nobody likes to say out loud

When you paste a paragraph into a cloud chatbot, you are not just asking a model a question. You are handing a copy of that paragraph to a company. That copy lives somewhere. It is logged. On most consumer plans it is retained for at least a month, sometimes longer, and on the free tiers of several big providers it is also used to improve the model, which is the polite way to say it goes into the training pile for the next version unless you found and ticked the right opt-out checkbox.

This is fine if what you're typing is "explain monads" or "what's a good risotto recipe". It's a different story if what you're typing is a forwarded email from your doctor, an NDA your lawyer sent over, or the salary spreadsheet your CFO accidentally cc'd you on.

Inbox products that wrap a cloud LLM have the same shape, just without the paste step. They forward your mail to the model on your behalf. Every message you receive, every newsletter, every receipt, every personal thread, becomes input the provider sees. The product's privacy page will usually say "we don't train on customer data". Sometimes that's true. Sometimes it's true today and changes quietly in a policy update next quarter. Sometimes the model provider has its own policy on the product's traffic that the product company doesn't fully control.

The provider isn't necessarily evil. They're running a real business. But the structural fact stays the same: your private correspondence is now a third party's operational data. You're trusting their retention windows, their access controls, their breach notifications, their subprocessors, and their next CEO.

The best way to keep your email out of a model's training set is to keep it out of the model's network entirely.

The shape of the wrong incentive

Cloud-LLM email tools, and personal-data products generally, are built on an incentive that's hard to be honest about: the more your data flows through the provider, the better the provider's product gets, and the better the provider's models get, and the more the provider can charge the next customer. Your inbox is feedstock. The product is the intermediate consumer; the model is the long-term beneficiary.

This is the same shape that gave us the last decade of ad-tech. Free product, pay with your behaviour. People learned, slowly and painfully, that "the data never leaves our servers" is not the same as "the data never affects you". Profiles get sold. Models get trained. Policies change. Companies get acquired and the new owner has different ideas.

We are now repeating the pattern with AI, faster and with a much more intimate substrate. The 2010s ad-tech leak was that your browsing got sold. The 2020s AI leak is that your writing, your actual thoughts in their original sentences, becomes training material. That's a step up in stakes.

The technical alternative existed all along

For a long time, "small models run on the device" was a research demo, not a shipping product. That changed. The smartphone in your pocket now has a hardware tensor accelerator. Apple Intelligence runs on a Mac. Microsoft's Phi family runs on consumer laptops. And, most relevant to the Tasks feature in this app, Google quietly shipped Gemini Nano as a built-in capability of the Chrome browser on desktop.

Gemini Nano isn't a frontier model. It won't write your screenplay or do agentic coding. It's small, and it's smart enough for the kind of constrained classification job that "is this email an action item or a newsletter?" actually is. Crucially: the weights live on your computer, and the inference runs on your CPU or GPU. Once Chrome has downloaded the model the first time, it works without a network connection at all.

So we built the Tasks feature on top of it. When you click Scan inbox, the browser pulls your recent messages through our backend, runs each one through Nano locally, and gets back a structured list of action items. The only thing that travels back to our server is the extracted fields: a title, an optional due date, an optional owner. Your email bodies stop at your browser tab. They are never sent to OpenAI, Anthropic, Google, or anyone else. They are not stored on our side either. The wiki page walks through this step by step.
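For readers who think in code, the flow above can be sketched roughly as follows. This is an illustration, not the code in the repository: the Email and TaskFields shapes, the function names, and the runLocalModel callback (a stand-in for the on-device model call that returns raw JSON text) are all assumptions made for the example.

```typescript
// Hypothetical shapes; the app's real types are not shown in this post.
interface Email {
  id: string;
  body: string;
}

interface TaskFields {
  title: string;
  dueDate?: string; // date format is an assumption
  owner?: string;
}

// Copy only the three allowed fields out of whatever the model returned.
// Any other key (for example, echoed email text) is dropped here.
function pickTaskFields(raw: unknown): TaskFields | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  if (typeof r.title !== "string" || r.title.trim() === "") return null;
  const task: TaskFields = { title: r.title.trim() };
  if (typeof r.dueDate === "string") task.dueDate = r.dueDate;
  if (typeof r.owner === "string") task.owner = r.owner;
  return task;
}

// Run each email through a local model function and collect the results.
// `runLocalModel` is a placeholder for the in-browser inference call.
async function scanInbox(
  emails: Email[],
  runLocalModel: (email: Email) => Promise<string>,
): Promise<TaskFields[]> {
  const tasks: TaskFields[] = [];
  for (const email of emails) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(await runLocalModel(email));
    } catch {
      continue; // model error or unparseable output: skip this email
    }
    const task = pickTaskFields(parsed);
    if (task !== null) tasks.push(task);
  }
  return tasks; // in this sketch, the only data that would be uploaded
}
```

The point of the shape is that the email body is an input to scanInbox and never part of its return value, so whatever posts the result to the server has nothing but titles, dates, and owners to send.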

§ § §

"Trust us, but also verify"

A privacy claim is worthless without a way to check it. The nice thing about doing inference in the browser is that the check is trivial. Open DevTools, switch to the Network tab, and run a scan. You will see a request to our backend for the email list. You will see a request back to our backend with the extracted tasks. You will not see a request to any third-party model host, because there isn't one to make.
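If you would rather script the check than eyeball it, here is a small sketch that takes a list of request URLs and reports any host other than the app's own. The function name and the example origin are made up for illustration, and it assumes the app and its backend share one origin; if they don't, the backend host would need to be allowed as well.

```typescript
// List the hosts in `requestUrls` that differ from the app's own origin.
// `appOrigin` is wherever the app is served from (an assumption of this
// sketch: frontend and backend live on the same origin).
function thirdPartyHosts(requestUrls: string[], appOrigin: string): string[] {
  const own = new URL(appOrigin).host;
  const others = new Set<string>();
  for (const raw of requestUrls) {
    let url: URL;
    try {
      url = new URL(raw);
    } catch {
      continue; // skip entries that are not absolute URLs
    }
    // data:, blob:, and similar URLs are not requests to a remote host.
    if (url.protocol !== "http:" && url.protocol !== "https:") continue;
    if (url.host !== own) others.add(url.host);
  }
  return Array.from(others).sort();
}
```

In a browser, the URL list could come from the Resource Timing API after a scan, for example performance.getEntriesByType("resource").map((e) => e.name), with location.origin as the second argument. That buffer has a size limit and is not a complete audit, so the Network tab remains the better witness; an empty result here is only a quick sanity check.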

The source for the AI code is small enough to read in fifteen minutes. chromeAi.ts is the feature detection. extractTasks.ts is the per-email loop. schema.ts is the JSON schema we constrain the model to. That's it. There is no other AI module hidden somewhere in the backend. The backend doesn't even import an AI library.
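To give a feel for what "constraining the model to a JSON schema" means, here is a hypothetical schema with the three fields this post mentions, plus a call that passes it along. It is not a copy of schema.ts. The field names, the instruction text, and the PromptSession interface are assumptions; in particular, the sketch assumes the Prompt API's prompt() accepts a responseConstraint option holding a JSON Schema, which is worth confirming against Chrome's current documentation.

```typescript
// Hypothetical schema: a required title, optional due date and owner,
// and no other keys allowed.
const TASK_SCHEMA = {
  type: "object",
  properties: {
    title: { type: "string" },
    dueDate: { type: "string" },
    owner: { type: "string" },
  },
  required: ["title"],
  additionalProperties: false,
} as const;

// Minimal slice of a Prompt API session, declared locally so the sketch
// type-checks without browser typings. The `responseConstraint` option
// is an assumption about the API, not something this post confirms.
interface PromptSession {
  prompt(
    input: string,
    options?: { responseConstraint?: object },
  ): Promise<string>;
}

// Ask the session for one email's action item, constrained to the schema.
// Returns the model's raw JSON text; parsing and checking happen elsewhere.
async function extractTask(
  session: PromptSession,
  emailText: string,
): Promise<string> {
  const instruction = "Extract the action item from this email as JSON.\n\n";
  return session.prompt(instruction + emailText, {
    responseConstraint: TASK_SCHEMA,
  });
}
```

Setting additionalProperties to false is the design choice that matters: a schema that only lists the wanted fields still permits extra ones, while this form asks for exactly those fields and nothing else.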

What we give up, honestly

On-device models are smaller than cloud ones, sometimes much smaller. Nano gets "extract titles from short emails" right. It would be worse than a frontier model at, say, summarising a 40-message thread or writing a polite decline. We make that trade deliberately for an inbox. The privacy floor matters more than the ceiling on cleverness, here.

On-device models also need device support. The Chrome Prompt API is currently desktop-only and gated to relatively recent versions. The first run downloads several gigabytes of weights, which Chrome does once and then shares across every site that uses the API. For users on older machines or unsupported browsers, the Tasks tab stays hidden. We'd rather not have the feature than have it leak data on the way to "working".
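A minimal sketch of that gating, not the code in chromeAi.ts. It assumes the browser exposes a global LanguageModel object whose availability() method resolves to one of four states, as Chrome's Prompt API documentation describes at the time of writing. The tab-state names and both function names are invented for the example.

```typescript
// States the Prompt API is assumed to report (per Chrome's docs).
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

// Only the part of the API this sketch touches.
interface LanguageModelLike {
  availability(): Promise<Availability>;
}

// Look the API up on a passed-in scope so the function also runs
// outside Chrome. A missing or throwing API counts as unavailable.
async function detectAvailability(
  scope: { LanguageModel?: LanguageModelLike },
): Promise<Availability> {
  if (!scope.LanguageModel) return "unavailable";
  try {
    return await scope.LanguageModel.availability();
  } catch {
    return "unavailable";
  }
}

// Hypothetical UI states for the Tasks tab.
type TasksTabState = "hidden" | "needs-download" | "ready";

// Map availability to what the UI shows. Anything unexpected hides the
// tab: fail closed rather than fall back to a remote model.
function tasksTabState(a: Availability): TasksTabState {
  switch (a) {
    case "available":
      return "ready";
    case "downloadable":
    case "downloading":
      return "needs-download";
    default:
      return "hidden";
  }
}
```

In a page this might be called as detectAvailability(globalThis as { LanguageModel?: LanguageModelLike }). Note what is absent: there is no branch that routes an unsupported browser to a cloud endpoint, which is the whole trade described above.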

Why this is the future, not a niche

Three forces are pushing every personal-data product the same direction.

One. Devices keep getting faster at running small models. The watershed already happened on phones for image and audio; it's happening now on laptops for language. The window where "we have to ship it to the cloud" was the only credible answer is closing.

Two. Regulation is catching up. The EU AI Act has provisions for transparency about training data. Several US states are passing laws that require explicit consent before personal communications can be used to train models. The "we trained on whatever flowed through us" era has a sell-by date.

Three. Users are getting tired. The shine on cloud chatbots is wearing off as people read the privacy pages, see the breach headlines, and realise that the convenience is the product and they are the payment. The category that will win the next decade isn't "AI that does more"; it's "AI that does enough, on the user's own hardware, without phoning home".

Personal AI should feel less like sending a letter to a stranger and more like a tool that lives on your desk. You wouldn't post every email you receive to a public bulletin board so a clever intern could organise it for you. The current default for cloud-LLM inbox products is closer to that than anyone wants to admit.

The smallest possible promise

We don't ask you to trust us about anything important. We made the feature checkable from your own DevTools and put the source on GitHub. The model weights live on your machine. The prompt lives in your tab. The output is structured fields, not free-form text that the model could hide secrets in. Your email bodies never cross a third-party network because no third party is in the path.

If we ever change that, you'll see it in the source first. If a cloud model is added to anything, it'll be opt-in, named clearly, and easy to turn off. And if you don't trust us, you can run this app yourself. The whole thing is one Python service and one Vite bundle. The README has the steps.

Personal data should belong to the person. Their conversations shouldn't become someone else's training data. The technology to keep it that way already exists; the only thing missing has been the willingness to use it. We hope this is a small example of what that looks like.

Further reading.

On-device AI in Purelymail Calendar: the technical reference on the GitHub wiki.

About Purelymail Calendar: the rest of the app.

Source on GitHub: every line.