How Typeahead processes text locally

Last updated March 18, 2026

Typeahead generates every suggestion on your Mac, not in the cloud. Here is exactly how that works.

The model runs on your device

Typeahead uses llama.cpp, an open-source engine for running language models locally. The model file is in GGUF format and lives in your Application Support folder. When Typeahead starts, it loads the model into memory, where it stays until you quit.
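
To make the GGUF detail concrete, here is a minimal Python sketch of how an app might sanity-check a model file before handing it to llama.cpp. The only format facts used are from the GGUF spec: every GGUF file begins with the 4-byte ASCII magic "GGUF" followed by a little-endian uint32 version. The function name is illustrative, not part of Typeahead's or llama.cpp's actual API.

```python
import os
import struct
import tempfile

GGUF_MAGIC = b"GGUF"  # every GGUF file begins with this 4-byte magic


def check_gguf_header(path):
    """Return the GGUF format version, or raise if the file is not GGUF.

    Illustrative helper, not a real Typeahead or llama.cpp function.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != GGUF_MAGIC:
            raise ValueError(f"not a GGUF file: magic was {magic!r}")
        # The magic is followed by a little-endian uint32 version field.
        (version,) = struct.unpack("<I", f.read(4))
        return version


# Demo with a tiny fake header (a real model file continues with
# tensor counts and metadata after these first 8 bytes).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(GGUF_MAGIC + struct.pack("<I", 3))
print(check_gguf_header(demo_path))  # → 3
os.remove(demo_path)
```

A check like this fails fast on a truncated download instead of crashing deep inside the loader.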

Inference runs entirely on your CPU and GPU (via Metal on Apple Silicon). Nothing leaves your Mac at inference time.

How it reads your text

Typeahead uses the macOS Accessibility API to read the text in the active text field. It sends that text to the local model as a prompt, generates a completion and inserts the result at the cursor.
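
The read → complete → insert flow can be sketched abstractly. In this Python sketch the generate argument stands in for the local model call; every name here is illustrative, not Typeahead's actual API.

```python
def complete_at_cursor(text, cursor, generate):
    """Sketch of the read -> complete -> insert loop.

    `generate` stands in for the on-device model; in the real app this
    would be a llama.cpp inference call. Illustrative names only.
    """
    prompt = text[:cursor]         # only the active field's text is read
    suggestion = generate(prompt)  # runs entirely on-device
    # Insert the completion at the cursor position.
    return text[:cursor] + suggestion + text[cursor:]


# Demo with a trivial stand-in "model".
fake_model = lambda prompt: " world" if prompt.endswith("Hello") else ""
print(complete_at_cursor("Hello!", 5, fake_model))  # → Hello world!
```

Note the only input is the text of the field itself, which matches the scoping described below.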

It reads only the text field you are actively typing in. It cannot see other open windows, your clipboard or the rest of your screen.

Network activity

Typeahead makes only two kinds of network requests, both narrow in scope:

  • License verification: one API call when you first activate your key. After that, no further verification is needed and the app works offline.
  • Update checks: periodic requests to check for new versions, if automatic updates are enabled. Only a version number is sent. No content or personal data.
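
The update-check claim above can be illustrated by constructing the request without sending it. The endpoint URL below is hypothetical, not Typeahead's documented update server; the point is that the query string carries exactly one field.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_update_check_url(app_version):
    """Build an update-check URL carrying only the app version.

    The endpoint is a made-up placeholder for illustration.
    """
    return "https://example.com/typeahead/updates?" + urlencode(
        {"version": app_version}
    )


url = build_update_check_url("2.4.1")
# The query string contains exactly one field: the version number.
print(parse_qs(urlparse(url).query))  # → {'version': ['2.4.1']}
```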

There is no telemetry, no automatic crash reporting and no in-app analytics. Your text, your suggestions and your usage patterns are never transmitted anywhere.

How to verify

You can confirm this with macOS Activity Monitor. Open it, select the Network tab and watch Typeahead's Sent Bytes and Rcvd Bytes columns: they stay flat while suggestions are being generated. For a live per-connection view, you can also run the nettop command in Terminal.

The privacy policy at typeahead.ai/privacy covers this in more detail.

Still have questions?

Contact Support